I am an M.S. Computer Science student at the Tandon School of Engineering at New York University. I hold a B.S. from UCSB with a double major in Mathematics and Statistics & Data Science.
With a foundation in mathematical and statistical modeling, my research interests lie in applying machine learning and deep learning models to practical problems. I have experience modeling complex, large-scale datasets and extracting insights from them across domains including business, finance, sports, environmental science, and the game industry.
I am actively looking for 2026 New Graduate opportunities in Machine Learning, Software Engineering, or Quantitative Research roles!

Jul. 2025 - Aug. 2025
Shanghai, China
Jul. 2025 - Aug. 2025
May 2024 - Aug. 2024

Jun. 2023 - Mar. 2024
Santa Barbara, California
Advisor: Prof. Kelly Caylor | Graduate Advisor: Anna Boser
Sep. 2023 - Mar. 2024
Jun. 2023 - Sep. 2023

Jun. 2022 - Jun. 2023
Shanghai, China
Advisor: Prof. Ding-jiang Huang
Aug. 2022 - Jun. 2023
Jun. 2022 - Aug. 2022
 
New York University, Tandon School of Engineering | 2024 - 2026 | M.S. in Computer Science | GPA: 3.9 out of 4

UCSB | 2020 - 2024 | B.S. in Mathematics / Statistics and Data Science

No. 2 High School Attached to East China Normal University | 2017 - 2020 | Secondary School
A full-stack platform that helps connect NYC residents with vocational programmes: Django-powered backend (PostgreSQL, WebSockets), Google Maps API for geo-search, AWS Elastic Beanstalk + Nginx deployment, and a Travis-driven CI/CD pipeline.
Built a search engine from scratch on MS MARCO (block-compressed inverted index, DAAT BM25 ranking), then refactored the pipeline with HNSW and advanced reordering to lift F1 by 25% and systematically probe “lost-in-the-middle” bias in RAG workflows.
The project performs semantic segmentation of small, fine-scale fields in satellite imagery. Several pixel-level time-series classification models are implemented on time series data from Sentinel satellites. Experiments explore how well the models’ trained weights transfer and adapt across different locations and time periods.
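As a rough illustration of the document-at-a-time (DAAT) BM25 ranking named above, here is a minimal sketch. It is not the project's code: the index is uncompressed and held in memory, tokenization is a plain whitespace split, and the names `build_index` and `bm25_daat`, the parameters `k1=1.2` and `b=0.75`, and the toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an in-memory inverted index (hypothetical helper).

    Returns (postings, doc_len), where postings[term] is a list of
    (doc_id, term_frequency) pairs sorted by doc_id.
    """
    postings = defaultdict(list)
    doc_len = []
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()
        doc_len.append(len(tokens))
        # Documents are visited in increasing doc_id order, so every
        # postings list stays sorted by doc_id.
        for term, tf in Counter(tokens).items():
            postings[term].append((doc_id, tf))
    return postings, doc_len


def bm25_daat(query, postings, doc_len, k1=1.2, b=0.75, top_k=3):
    """Score documents one at a time by advancing a cursor per postings list."""
    n_docs = len(doc_len)
    avgdl = sum(doc_len) / n_docs
    terms = [t for t in set(query.lower().split()) if t in postings]
    lists = [postings[t] for t in terms]
    idf = [math.log(1 + (n_docs - len(pl) + 0.5) / (len(pl) + 0.5)) for pl in lists]
    cursors = [0] * len(lists)
    results = []
    while True:
        # The smallest doc_id under any cursor is the next document to score.
        current = [pl[c][0] for pl, c in zip(lists, cursors) if c < len(pl)]
        if not current:
            break
        doc_id = min(current)
        score = 0.0
        for i, pl in enumerate(lists):
            c = cursors[i]
            if c < len(pl) and pl[c][0] == doc_id:
                tf = pl[c][1]
                norm = tf + k1 * (1 - b + b * doc_len[doc_id] / avgdl)
                score += idf[i] * tf * (k1 + 1) / norm
                cursors[i] += 1  # this list is done with doc_id
        results.append((score, doc_id))
    # Highest score first; ties broken by doc_id for determinism.
    results.sort(key=lambda r: (-r[0], r[1]))
    return results[:top_k]


docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "inverted index compression for search engines",
    "search engine ranking with bm25",
]
postings, doc_len = build_index(docs)
ranked = bm25_daat("search ranking", postings, doc_len)
```

A production index would compress the postings into blocks and keep only a bounded top-k heap; both are left out here to keep the traversal logic visible.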

The paper introduces a novel, data-efficient workflow to address data scarcity in remote sensing. The model is evaluated on the VHR-10 dataset and outperforms traditional models such as YOLOv8 when trained on small datasets. Results were presented at the NeurIPS 2023 CCAI workshop.

The project forecasts U.S. candy production from time series data covering the past 45 years. Techniques such as Box-Cox transformation and differencing are applied to achieve stationarity, and the optimal SARIMA model is identified using ACF/PACF analysis and Maximum Likelihood Estimation. The model is further validated through comprehensive diagnostic tests and spectral analysis.
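The stationarity steps mentioned above (Box-Cox transformation followed by differencing) can be sketched in a few lines. This is an illustrative sketch on a synthetic series, not the project's code: the helper names `box_cox` and `difference`, the seasonal period of 12, and the value of the Box-Cox parameter are assumptions.

```python
import math


def box_cox(series, lam):
    """Box-Cox transform: log(x) when lam == 0, else (x**lam - 1) / lam."""
    if any(x <= 0 for x in series):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(x) for x in series]
    return [(x ** lam - 1) / lam for x in series]


def difference(series, lag=1):
    """Lagged difference y[t] - y[t - lag]; lag=12 removes a yearly pattern in monthly data."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Synthetic monthly series (assumed): linear trend plus a fixed 12-month pattern.
season = [5, 3, 0, -2, -4, -1, 2, 6, 9, 12, 15, 10]
series = [100 + 2 * t + season[t % 12] for t in range(48)]

transformed = box_cox(series, 1.0)                  # lam = 1 only shifts the data
deseasonalized = difference(transformed, lag=12)    # removes the seasonal pattern
stationary = difference(deseasonalized, lag=1)      # removes the remaining trend
```

On real data the Box-Cox parameter would be estimated rather than fixed, and the differenced series would be checked with ACF/PACF plots before choosing the SARIMA orders.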

Replicated 16 SOTA SR models, identified key architectural bottlenecks, and co-created EvaSR, an attention-augmented CNN that matches SOTA PSNR while cutting parameters by 85% and FLOPs by 70%.
Automated the collection and parsing of thousands of government PDFs, then fine-tuned a TrOCR-base model with custom template matching to reach over 95% character accuracy on challenging handwritten forms, reducing manual data entry time from hours to seconds.

Engineered a rich feature set from match statistics of the top five leagues and benchmarked eight machine learning models, including KNN, Random Forest, and Gradient-boosted Trees. The results are reported alongside an exploratory data analysis written in R Markdown.

Conducted an in-depth study of the fundamental groups and covering spaces of three-dimensional orbifolds. The results were presented in a poster featuring orbifold visualizations.
Authored a lightweight Python tool that symbolically computes curvature, torsion, and Christoffel symbols, turning pages of manual tensor algebra into one-line scripts.