My research has primarily focused on applying statistical modeling to problems across a variety of fields.
Gibbs Sampling for LDA and Applications to RAG (Undergraduate Thesis)
In my thesis, I describe a method for deriving the posterior distribution used in Latent Dirichlet Allocation (LDA) and build a hybrid model that combines LDA with a retrieval-augmented generation (RAG) model. I find that this hybrid model outperforms a baseline RAG model in several areas, including accuracy and processing time.
AI for Justice
At UCLA’s Computational and Applied Mathematics REU, I worked with faculty, peers, and the Innocence Center to develop a model that predicts wrongful convictions using topic modeling algorithms such as non-negative matrix factorization. All analysis was conducted in Python.
Effect of ECMO Duration on Post-Transplant Survival
Through the USC Biostatistics and Data Science Summer Training Program, I collaborated with Biostatistics faculty and peers to analyze whether the time patients spent on extracorporeal membrane oxygenation (ECMO, a life-support device) before heart transplant affected post-transplant outcomes, applying models such as Cox regression and multinomial logistic regression in R.
Analyzing Missing Data in the Stanford Open Policing Project
Through Pomona College’s Summer Undergraduate Research Program, I worked with faculty and peers to analyze 23 traffic-stop datasets from the Stanford Open Policing Project and assess bias from missing racial data. I used odds ratios and R-based visualizations to evaluate potential imputations and show how missingness affects the results.