Research
Research Interests
High-dimensional Econometrics
Micro-econometrics
Boundary (Constrained) Estimation and Inference
Causal Inference
Job Market Paper
"High-dimensional Random-Coefficient Multinomial Logit Model and Demand Estimation" (2024)
Abstract: Random-coefficient multinomial logit models are widely used to study discrete choices in economics. By allowing coefficients to vary across individuals, these models account for unobserved individual heterogeneity and yield more realistic substitution patterns than standard logit models. In this paper, I find that random coefficients can become undetectable (i.e., their estimated variances are zero) even when they exist, once many observed individual covariates are incorporated. Zero variance estimates not only bias the estimates of other parameters but also raise the concern of parameters on the boundary of the parameter space. To address these issues, I propose l1-regularized maximum likelihood estimation for simultaneous covariate selection and develop a debiased machine learning estimator that corrects the regularization bias while accounting for parameter constraints, such as the non-negativity of variances. I derive non-asymptotic probability bounds for the regularized estimator and limiting distributions for the debiased estimator. Finally, I validate the estimators with thorough Monte Carlo simulations and illustrate the impact of high-dimensional covariates in an application to soft-drink markets in North Carolina.
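The first step of the proposed approach can be illustrated with a short simulation. Below is a minimal sketch, not the paper's code, of l1-penalized simulated maximum likelihood for a random-coefficient logit with one product characteristic, a normally distributed random coefficient, and many observed individual covariates entering through interaction terms. The data-generating process, variable names, and tuning constants are all illustrative assumptions.

```python
# Minimal sketch (not the paper's code): l1-penalized simulated maximum likelihood
# for a random-coefficient logit. One product characteristic x has a random
# coefficient beta_i ~ N(beta, sigma^2); observed individual covariates z enter
# through interaction coefficients gamma, whose sparsity the l1 penalty targets.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N, J, R, p = 500, 4, 100, 20          # individuals, alternatives, simulation draws, dim(z)
x = rng.normal(size=(N, J))           # product characteristic
z = rng.normal(size=(N, p))           # observed individual covariates
gamma_true = np.r_[1.0, -1.0, np.zeros(p - 2)]   # sparse interaction effects (assumption)
beta_i = 1.0 + 0.5 * rng.normal(size=N)          # true random coefficient
util = (beta_i[:, None] + z @ gamma_true[:, None]) * x + rng.gumbel(size=(N, J))
y = util.argmax(axis=1)               # observed choices

draws = rng.normal(size=(N, R))       # simulation draws for the random coefficient

def neg_sim_loglik(theta, lam):
    beta, sigma, gamma = theta[0], theta[1], theta[2:]
    coef = beta + sigma * draws + (z @ gamma)[:, None]        # (N, R)
    v = coef[:, :, None] * x[:, None, :]                      # (N, R, J)
    v -= v.max(axis=2, keepdims=True)                         # numerical stability
    prob = np.exp(v) / np.exp(v).sum(axis=2, keepdims=True)
    chosen = prob[np.arange(N), :, y].mean(axis=1)            # simulated choice probabilities
    return -np.log(np.maximum(chosen, 1e-300)).mean() + lam * np.abs(gamma).sum()

theta0 = np.r_[0.0, 0.5, np.zeros(p)]                         # start sigma away from the boundary
bounds = [(None, None), (0.0, None)] + [(None, None)] * p     # sigma >= 0 (boundary constraint)
fit = minimize(neg_sim_loglik, theta0, args=(0.05,), method="L-BFGS-B", bounds=bounds)
print("beta, sigma:", fit.x[:2].round(3))
print("selected covariates:", np.flatnonzero(np.abs(fit.x[2:]) > 1e-3))
```

Because the l1 term is non-smooth, a serious implementation would use a proximal-gradient or coordinate-descent solver rather than L-BFGS-B; the bound on sigma enforces the non-negativity constraint that motivates the boundary analysis in the paper.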
Publications
"A Bootstrapped Test of Covariance Stationarity Based on Walsh Functions", with Jonathan B. Hill, 2024, Bernoulli, forthcoming. https://arxiv.org/pdf/2210.14086
"Research Collaboration beyond the Boundary: Evidence from University Patents in China" with Jingbo Cui and Zhenxuan Wang, 2023, Journal of Regional Science, 63, 674-702. https://doi.org/10.1111/jors.12635.
Working Papers
"High-dimensional Inference when the True Parameter is On or Near the Boundary", 2024, Available upon requests.
Abstract: I revisit the estimation of and inference on a low-dimensional target parameter in the presence of a high-dimensional nuisance parameter, a problem that has been studied recently under the assumption that the true parameter is an interior point of the parameter space. As a natural extension of the double/debiased machine learning (DML) method proposed by Chernozhukov et al. (2018), I derive the non-standard limiting distribution for a high-dimensional linear model when (a subset of) the true parameter is on or near the boundary. I prove that the solution to the DML problem with boundary constraints can be approximated, almost surely, by the minimizer of a quadratic function, which can be viewed as a projection onto the parameter space. Based on node-wise LASSO, I provide consistent variance estimators, propose a quasi-likelihood ratio test for boundary hypotheses with simulated critical values, and prove the validity of the test. Connecting to the recent literature, I also study the asymptotic size and construct a uniform test for sub-vector inference using a Bonferroni procedure. My results allow the dimension to grow sub-exponentially fast (i.e., p = o(exp(n^c)) for 0 < c < 1/2) and are supported by thorough Monte Carlo simulations. Finally, an empirical application to the Pennsylvania Reemployment Bonus Demonstration (1988-1989) indicates more significant average treatment effects when boundary information and high-dimensional covariates are utilized.
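To make the boundary projection and the simulated critical values concrete, here is a minimal sketch, not the paper's implementation: cross-fitted DML in a partially linear model where the target parameter is constrained to be non-negative, the constrained estimate is obtained by projecting the unconstrained DML estimate onto the parameter space, and the critical value of a quasi-likelihood ratio test is simulated from the projected Gaussian limit. The data-generating process and all tuning choices are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): cross-fitted DML for a partially
# linear model y = d*theta + g(x) + e with high-dimensional x, where theta >= 0 is a
# known boundary constraint. The constrained estimator projects the unconstrained DML
# estimate onto [0, inf); the QLR critical value is simulated from max(Z, 0)^2.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, p, theta_true = 400, 200, 0.0                 # true theta sits on the boundary (assumption)
X = rng.normal(size=(n, p))
d = X[:, 0] + rng.normal(size=n)                 # treatment depends on a few covariates
y = theta_true * d + X[:, 1] + rng.normal(size=n)

# Cross-fitted partialling out: residualize y and d on x with LASSO.
res_y, res_d = np.empty(n), np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    res_y[test] = y[test] - LassoCV(cv=3).fit(X[train], y[train]).predict(X[test])
    res_d[test] = d[test] - LassoCV(cv=3).fit(X[train], d[train]).predict(X[test])

theta_hat = (res_d @ res_y) / (res_d @ res_d)    # unconstrained DML estimate
eps = res_y - theta_hat * res_d
se = np.sqrt(res_d**2 @ eps**2) / (res_d @ res_d)
theta_con = max(theta_hat, 0.0)                  # projection onto the parameter space

# Quasi-LR statistic for H0: theta = 0 against theta > 0, with a simulated critical value.
qlr = (theta_con / se) ** 2
cv = np.quantile(np.maximum(rng.normal(size=100_000), 0.0) ** 2, 0.95)
print(f"theta_hat={theta_hat:.3f}, se={se:.3f}, QLR={qlr:.2f}, 5% critical value={cv:.2f}")
```

The simulated critical value here reproduces the familiar half-chi-squared limit for a single constrained parameter; the paper's construction generalizes this to high-dimensional nuisance parameters and sub-vector hypotheses.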
Work in Progress
 "Regularized Extremum Estimators On the Boundary", 2022