I am looking for postdocs, especially in (but not limited to) causal inference, Bayesian modeling, end-user programming, data visualization, and data science education! If you are interested, please email me.
I am also looking for PhD students interested in HCI, statistics/data science, and programming languages! If you are interested, please apply to UCLA CS. I especially encourage those who have backgrounds/majors in areas outside of CS to apply.
Research mission
My research lab is working towards a future where anyone can use technology and data effectively to learn and make informed decisions that advance society. Our mission is to understand real-world users, design usable abstractions, and develop interactive reasoning approaches that empower people to make data-informed decisions. These days, we have been focused on supporting scientists through computational abstractions and reasoning that promote valid data analysis, robust experimental design, and transparent reporting.
Selected projects
Tools for reliable statistical analysis
Tisane is a mixed-initiative system that guides users to author generalized linear models (GLMs) and generalized linear models with mixed effects (GLMMs). Tisane (i) provides a high-level study design specification language for expressing conceptual and data measurement relationships between variables, (ii) represents variable relationships in an internal graph used to generate a space of possible linear models (based on causal modeling), and (iii) guides users through an interactive disambiguation process in which they answer questions to arrive at a final linear model. Tisane outputs a script for fitting a single model and diagnosing it via visualization.
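The graph-based idea behind Tisane can be illustrated with a small sketch. This is not Tisane's actual API; the class and relationship names here are hypothetical, and the candidate-generation rule is a deliberate simplification of the causal reasoning Tisane performs:

```python
# Hypothetical sketch of Tisane's core idea: represent conceptual
# relationships between variables as a directed graph, then derive
# candidate explanatory variables for a model of the outcome.
# Names and structure are illustrative, not Tisane's real API.

from collections import defaultdict

class StudyGraph:
    def __init__(self):
        self.edges = defaultdict(set)  # cause -> {effects}

    def causes(self, cause, effect):
        self.edges[cause].add(effect)

    def candidate_main_effects(self, outcome):
        # Simplification: any variable with a directed path to the
        # outcome is a candidate main effect in the model space.
        def reaches(v, target, seen):
            if v == target:
                return True
            seen.add(v)
            return any(reaches(n, target, seen)
                       for n in self.edges[v] if n not in seen)
        return {var for var in list(self.edges)
                if var != outcome and reaches(var, outcome, set())}

g = StudyGraph()
g.causes("tutoring", "test_score")    # conceptual relationship
g.causes("motivation", "tutoring")    # motivation influences uptake
g.causes("motivation", "test_score")
print(sorted(g.candidate_main_effects("test_score")))
```

An interactive system like Tisane would then ask the user disambiguating questions (e.g., about confounding or measurement) to narrow this candidate space down to a single final model.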
Tea is a high-level domain-specific language for directly expressing study designs, assumptions about data, and hypotheses that are used to infer valid Null Hypothesis Significance Tests. Tea's runtime system compiles the input program into a set of logical constraints. Tea solves these constraints to identify a set of statistical tests that will test a user's hypothesis and respect statistical test assumptions about the data (e.g., distribution, variance).
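Tea's constraint-solving step can be sketched in miniature. The real system uses a constraint solver over a richer property language; this simplified version (test names and property labels are hypothetical) keeps only the core idea that each test declares assumptions and survives only if the data satisfies all of them:

```python
# Illustrative sketch (not Tea's real implementation) of constraint-based
# test selection: each statistical test declares the properties it
# requires, and we keep the tests whose requirements hold for the data.

TESTS = {
    "students_t":     {"dv_continuous", "two_groups", "normal", "equal_variance"},
    "welchs_t":       {"dv_continuous", "two_groups", "normal"},
    "mann_whitney_u": {"dv_continuous", "two_groups"},
    "chi_square":     {"dv_categorical", "two_groups"},
}

def valid_tests(properties):
    """Return the tests whose assumptions are all satisfied."""
    return sorted(name for name, required in TESTS.items()
                  if required <= properties)

# Continuous outcome, two groups, normally distributed, but
# unequal variances: Student's t is ruled out.
props = {"dv_continuous", "two_groups", "normal"}
print(valid_tests(props))
```

In Tea itself, the properties are inferred from the user's declared study design, stated assumptions, and checks run against the data, rather than supplied by hand.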
Theory of statistical analysis authoring
Hypothesis formalization is a dual-search process that involves iterating on conceptual and statistical models. To formalize hypotheses, analysts must align three sets of concerns: conceptual domain knowledge, data collection details, and statistical methods and implementation. Existing tools do not explicitly scaffold this hypothesis formalization process. Current tools require statistical expertise to navigate idiosyncratic taxonomies of statistical methods and programming expertise to use APIs, which may be overloaded and unintuitive. The considerations and skills required of analysts are often demanding enough that analysts seek shortcuts--adapting their hypotheses to their limited statistical knowledge, choosing suboptimal analysis methods--and make mistakes. This is why new programming tools that explicitly scaffold the hypothesis formalization process could not only ease analysts' experiences but also reduce the errors and suboptimal analysis choices they make. Tea and Tisane are two example systems that directly support hypothesis formalization.