+ Welcome!

Email: melodyhuang@fas.harvard.edu

Twitter: @melodyyhuang

I'm currently a Postdoctoral Fellow at Harvard, working with Kosuke Imai. I will be joining Yale as an Assistant Professor in Fall 2024. My research broadly focuses on developing robust statistical methods to credibly estimate causal effects under real-world complications.

Before this, I received my Ph.D. in Statistics at the University of California, Berkeley, where I was fortunate to be advised by Erin Hartman.

Recent News

  • [Jul. 2023] My paper with Dan Soriano and Sam Pimentel on design sensitivity for weighted observational studies is now available on arXiv!
  • [Mar. 2023] My paper with Erin Hartman on sensitivity analysis for survey weighting will be appearing in Political Analysis!
  • [Mar. 2023] I will be presenting on variance-based sensitivity models at the Online Causal Inference Seminar!
  • [Jan. 2023] Our paper on sensitivity analysis for survey weighting was chosen as one of the ASA 2023 Student Paper Competition winners!
  • [Nov. 2022] I will be presenting my work on sensitivity analysis at the University of Pennsylvania's Causal Inference Seminar.

+ Research


Sensitivity Analysis in the Generalization of Experimental Results


Randomized controlled trials (RCTs) allow researchers to estimate causal effects in an experimental sample with minimal identifying assumptions. However, to generalize a causal effect from an RCT to a target population, researchers must adjust for a set of treatment effect moderators. In practice, it is impossible to know whether the set of moderators has been properly accounted for. In the following paper, I propose a three-parameter sensitivity analysis for the generalization of experimental results using weighted estimators, with several advantages over existing methods. First, the framework does not require any parametric assumption on either the selection or treatment effect heterogeneity mechanism. Second, I show that the sensitivity parameters are guaranteed to be bounded and propose (1) a diagnostic for researchers to determine how robust a point estimate is to killer confounders, and (2) an adjusted calibration approach for researchers to accurately benchmark the parameters using existing data. Finally, I demonstrate that the proposed framework can be easily extended to the class of doubly robust, augmented weighted estimators. The sensitivity analysis framework is applied to a set of Jobs Training Program experiments.

Variance-based Sensitivity Analysis for Weighted Estimators Result in More Informative Bounds
with Sam Pimentel


Weighting methods are popular tools for estimating causal effects; assessing their robustness under unobserved confounding is important in practice. In the following paper, we introduce a new set of sensitivity models called "variance-based sensitivity models." Variance-based sensitivity models characterize the bias from omitting a confounder by bounding the distributional differences that arise in the weights from omitting a confounder, with several notable innovations over existing approaches. First, the variance-based sensitivity models can be parameterized with respect to a simple R^2 parameter that is both standardized and bounded. We introduce a formal benchmarking procedure that allows researchers to use observed covariates to reason about plausible parameter values in an interpretable and transparent way. Second, we show that researchers can estimate valid confidence intervals under a set of variance-based sensitivity models, and provide extensions for researchers to incorporate their substantive knowledge about the confounder to help tighten the intervals. Last, we highlight the connection between our proposed approach and existing sensitivity analyses, and demonstrate, both empirically and theoretically, that variance-based sensitivity models can provide improvements on both the stability and tightness of the estimated confidence intervals over existing methods. We illustrate our proposed approach on a study examining blood mercury levels using the National Health and Nutrition Examination Survey (NHANES).

Design Sensitivity and Its Implication on Weighted Observational Studies
with Dan Soriano and Sam Pimentel


Sensitivity to unmeasured confounding is not typically a primary consideration in designing treated-control comparisons in observational studies. We introduce a framework allowing researchers to optimize robustness to omitted variable bias at the design stage using a measure called design sensitivity. Design sensitivity, which describes the asymptotic power of a sensitivity analysis, allows transparent assessment of the impact of different estimation strategies on sensitivity. We apply this general framework to two commonly used sensitivity models, the marginal sensitivity model and the variance-based sensitivity model. By comparing design sensitivities, we interrogate how key features of weighted designs, including choices about trimming of weights and model augmentation, impact robustness to unmeasured confounding, and how these impacts may differ for the two different sensitivity models. We illustrate the proposed framework on a study examining drivers of support for the 2016 Colombian peace agreement.


Leveraging Population Outcomes to Improve the Generalization of Experimental Results
with Naoki Egami, Erin Hartman, and Luke Miratrix
Annals of Applied Statistics (2023)

Article Pre-Print

Randomized controlled trials are often considered the gold standard in causal inference due to their high internal validity. Despite its importance, generalizing experimental results to a target population is challenging in the social and biomedical sciences. Recent papers clarify the assumptions necessary for generalization and develop various weighting estimators for the population average treatment effect (PATE). However, in practice, many of these methods result in large variance and little statistical power, thereby limiting the value of the PATE inference. In this article, we propose post-residualized weighting, in which information about the outcome measured in the observational population data is used to improve the efficiency of many existing popular methods without making additional assumptions. We empirically demonstrate the efficiency gains through simulations and apply our proposed method to a set of jobs training program experiments.

Improving Precision in the Design and Analysis of Experiments with Non-Compliance
with Erin Hartman
Political Science Research and Methods (2023)

Article Code

Even in the best-designed experiment, noncompliance with treatment assignment can complicate analysis. Under one-way noncompliance, researchers typically rely on an instrumental variables approach, under an exclusion restriction assumption, to identify the complier average causal effect (CACE). This approach suffers from high variance, particularly when the experiment has a low compliance rate. The following paper suggests blocking designs that can help overcome precision losses in the face of high rates of noncompliance in experiments when a placebo-controlled design is infeasible. We also introduce the principal ignorability assumption and a class of principal score weighted estimators, which are largely absent from the experimental political science literature. We then introduce the "block principal ignorability" assumption which, when combined with a blocking design, suggests a simple difference-in-means estimator for estimating the CACE. We show that blocking can improve precision of both IV and principal score weighting approaches, and further show that our simple, design-based solution has superior performance to both principal score weighting and instrumental variables under blocking. Finally, in a re-evaluation of the Gerber, Green, and Nickerson (2003) study, we find that blocked, principal ignorability approaches to estimation of the CACE, including our blocked difference-in-means and principal score weighting estimators, result in confidence intervals roughly half the size of traditional instrumental variable approaches.
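
For readers unfamiliar with the traditional instrumental variables approach mentioned above, the following is a minimal sketch of the textbook Wald (ratio) estimator of the CACE under one-way noncompliance. It is not the blocked or principal score estimator proposed in the paper; the function name `wald_cace` and the toy data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def wald_cace(assigned, treated, outcome):
    """Textbook Wald (IV) estimate of the CACE.

    assigned: 0/1 treatment assignment for each unit
    treated:  0/1 treatment actually received for each unit
    outcome:  observed outcome for each unit
    """
    y_assigned = [y for z, y in zip(assigned, outcome) if z == 1]
    y_control = [y for z, y in zip(assigned, outcome) if z == 0]
    d_assigned = [d for z, d in zip(assigned, treated) if z == 1]
    d_control = [d for z, d in zip(assigned, treated) if z == 0]

    # Intent-to-treat effect of assignment on the outcome.
    itt_outcome = mean(y_assigned) - mean(y_control)
    # Effect of assignment on treatment receipt (the compliance rate
    # when no unit in the control arm can take the treatment).
    itt_receipt = mean(d_assigned) - mean(d_control)

    if itt_receipt == 0:
        raise ValueError("Assignment does not change treatment receipt.")
    return itt_outcome / itt_receipt


# Toy data (made up): half of the assigned units comply.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [3, 5, 1, 1, 1, 1, 2, 0]
cace = wald_cace(assigned, treated, outcome)
```

Because the estimate divides by the compliance rate, any noise in the intent-to-treat effect is inflated when compliance is low, which is the precision problem the abstract describes.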

Sensitivity Analysis for Survey Weighting
with Erin Hartman
Political Analysis (2023)

Article Pre-Print Code

Survey weighting allows researchers to account for bias in survey samples, due to unit nonresponse or convenience sampling, using measured demographic covariates. Unfortunately, in practice, it is impossible to know whether the estimated survey weights are sufficient to alleviate concerns about bias due to unobserved confounders or incorrect functional forms used in weighting. In the following paper, we propose two sensitivity analyses for the exclusion of important covariates: (1) a sensitivity analysis for partially observed confounders (i.e., variables measured across the survey sample, but not the target population) and (2) a sensitivity analysis for fully unobserved confounders (i.e., variables not measured in either the survey or the target population). We provide graphical and numerical summaries of the potential bias that arises from such confounders, and introduce a benchmarking approach that allows researchers to quantitatively reason about the sensitivity of their results. We demonstrate our proposed sensitivity analyses using state-level 2020 U.S. Presidential Election polls.

Higher Moments for Optimal Balance Weighting in Causal Estimation
with Brian Vegetabile, Lane Burgette, Claude Setodji, and Beth Ann Griffin
Epidemiology (2022)


We expand upon a simulation study that compared three promising methods for estimating weights for assessing the average treatment effect on the treated for binary treatments: generalized boosted models, covariate-balancing propensity scores, and entropy balance. The original study showed that generalized boosted models can outperform covariate-balancing propensity scores and entropy balance when there are likely to be non-linear associations in both the treatment assignment and outcome models and when the other two models are fine-tuned to obtain balance only on first-order moments. We explore the potential benefit of using higher-order moments in the balancing conditions for covariate-balancing propensity scores and entropy balance. Our findings showcase that these two models should, by default, include higher-order moments, and that focusing only on first moments can result in substantial bias in treatment effect estimates from both models that could be avoided using higher moments.
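
As a rough illustration of why balance on first moments alone can be insufficient, the sketch below compares the treated-group mean of x^k with the weighted control-group mean of x^k. It is a simple diagnostic only, not an implementation of covariate-balancing propensity scores or entropy balance; the helper `moment_imbalance` and the toy data are hypothetical.

```python
def moment_imbalance(x_treated, x_control, control_weights, order):
    """Treated mean of x**order minus the weighted control mean of x**order."""
    treated_moment = sum(x ** order for x in x_treated) / len(x_treated)
    weighted_total = sum(
        w * x ** order for w, x in zip(control_weights, x_control)
    )
    control_moment = weighted_total / sum(control_weights)
    return treated_moment - control_moment


# Toy data (made up): both groups have mean zero, but the treated
# covariate is more spread out than the control covariate.
x_treated = [-2, 2, -1, 1]
x_control = [-1, 1, -1, 1]
weights = [1, 1, 1, 1]

first_moment_gap = moment_imbalance(x_treated, x_control, weights, order=1)
second_moment_gap = moment_imbalance(x_treated, x_control, weights, order=2)
```

In this toy example the weights balance the mean exactly while leaving a gap in the second moment, so an outcome that depends on x squared would still be compared across imbalanced groups.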

+ Teaching

University of California, Berkeley

  • STAT 232: Experimental Design (Spring 2023)
  • POLI SCI 236B: Quantitative Methodology in the Social Sciences (Spring 2022)

University of California, Los Angeles

  • STAT 100C: Linear Models (Spring 2019)
  • ECON 412: Fundamentals of Big Data (Spring 2019)