Yang Hu

Experiences

Abama Private Fund Investment Management Co., Ltd.

Jul. 2025 - Aug. 2025

Shanghai, China

Quantitative Research Intern

Jul. 2025 - Aug. 2025

Responsibilities:

Designed a parallelized data pipeline to transform raw event-level financial data into 300+ engineered features to feed high-frequency trading ML models, cutting data preparation from 8 h to 15 min.
Delivered a live-updating factor store database powering large-scale model training and backtests; increased experiment throughput by ~10× and enabled apples-to-apples model comparison during strategy reviews.
Fine-tuned Transformers for trading signal prediction, improving annualized return by 3% in backtests.
Built an automated feature-discovery framework that iteratively surfaces noise-robust, signal-strengthening factors, accelerating model iteration and reducing manual screening.

Unity

R&D Software Engineering Intern

May 2024 - Aug. 2024

Responsibilities:

Scaled the RAG knowledge base for Unity’s AI assistant via community and forum ingestion and LLM-based quality filtering; expanded answer coverage for Editor/API issues by ~10×, improving first-response resolution.
Built MuseBench, an evaluation platform that pairs an LLM-as-a-judge with a curated benchmark of Q&A pairs, enabling consistent scoring of RAG results across different model versions and cutting evaluation time by 95%.
Deployed a local-LLM pipeline to parse 100k+-line Unity Cloud Build logs and extract root-cause signals; reduced triage from hours to minutes and accelerated time-to-fix for recurring build failures.

The Waves Lab at UCSB

Jun. 2023 - Mar. 2024

Santa Barbara, California

Advisor: Prof. Kelly Caylor | Graduate Advisor: Anna Boser

Undergraduate Researcher

Sep. 2023 - Mar. 2024

Responsibilities:

Developed a novel Segment-Then-Classify Strategy leveraging the Segment Anything Model and Vision Transformer for instance segmentation in remote sensing, reducing manual labeling and training costs.
Applied multiple time series classification models on Sentinel satellite imagery to achieve pixel-level semantic segmentation, enhancing the detection accuracy of smallholder irrigation fields.

Bren Leaders and Internship Program

Jun. 2023 - Sep. 2023

Responsibilities:

Implemented an instance segmentation model for automated mapping of center-pivot irrigation systems in Sub-Saharan Africa for a deeper understanding of irrigation adoption and its impacts in the region.

Data Science & Engineering School at East China Normal University

Jun. 2022 - Jun. 2023

Shanghai, China

Advisor: Prof. Ding-jiang Huang

Undergraduate Research Assistant

Aug. 2022 - Jun. 2023

Responsibilities:

Co-authored the ACM MM '24 paper "Compacter: A Lightweight Transformer for Image Restoration", achieving state-of-the-art PSNR across image restoration tasks with ~50-65% fewer parameters.
Co-designed Compact Adaptive Self-Attention, enabling omnidirectional spatial–channel information flow through cross-modulation of global context to strengthen long-range dependencies while preserving local detail.
Proposed a Dual Selective Gated Module that dynamically injects global context into each pixel for context-adaptive aggregation, amplifying informative features and suppressing noise.
Built the PyTorch training/benchmarking pipeline and ran ablation experiments, enabling reproducible results and efficient comparison to baselines.

Summer Intern

Jun. 2022 - Aug. 2022

Responsibilities:

Conducted literature reviews on Image Super-Resolution deep learning models.

Education

New York University

2024 - 2026

M.S. in Computer Science

GPA: 3.9 out of 4

Taken Courses:

Design and Analysis of Algorithms
Web Search Engines
Computer Networking
Operating Systems
Principles of Database Systems
Software Engineering
Blockchain and Distributed Ledger Technology
Big Data
Network Security

University of California, Santa Barbara

2020 - 2024

B.S. in Mathematics / Statistics and Data Science

Publications:

Compacter: A Lightweight Transformer for Image Restoration
Segment-then-Classify: Few-shot instance segmentation for environmental remote sensing

Taken Courses:

CS | Data Structures and Algorithms
CS | Algorithms Engineering
CS | Deep Learning
DS | Statistical Machine Learning
DS | Statistical Data Science
DS | Big Data Analytics
STAT | Probability and Statistics A-B-C
STAT | Time Series
STAT | Regression Analysis
STAT | Applied Stochastic Processes
STAT | Design and Analysis of Experiments
MATH | Linear Algebra A-B
MATH | Real Analysis A-B
MATH | Differential Geometry
MATH | Non-Euclidean Geometry
MATH | Abstract Algebra A-B-C
MATH | Number Theory
FINANCE | Financial Mathematics
FINANCE | Mathematics of Fixed Income Markets

Extracurricular Activities:

Chinese Students and Scholars Association (CSSA), lead event planner
Data Science Club
Snow Club
Intramural soccer player

No. 2 High School Attached to East China Normal University

2017 - 2020

Secondary School

Projects

VocationalNYC

Jan. 2025 - May 2025

A full-stack platform that connects NYC residents with vocational programs: Django-powered backend (PostgreSQL, WebSockets), Google Maps API for geo-search, AWS Elastic Beanstalk + Nginx deployment, and a Travis-driven CI/CD pipeline.

Web Development Django PostgreSQL AWS DevOps JavaScript Python

Information Retrieval System & Retrieval-Augmented Generation

Sep. 2024 - Dec. 2024

Built a search engine from scratch on MS MARCO (block-compressed inverted index, DAAT BM25 ranking), then refactored the pipeline with HNSW and advanced reordering to lift F1 by 25% and systematically probe “lost-in-the-middle” bias in RAG workflows.

Deep Learning Machine Learning Information Retrieval RAG Search Engine LLM Python

Semantic Segmentation by Pixel-level Time Series Classification

Nov. 2023 - Mar. 2024

The project performs semantic segmentation on delicate, small fields using their satellite imagery. Various pixel-level time series classification models are implemented utilizing time series data from Sentinel satellites. Experiments are conducted to explore the transferability and adaptability of the models’ trained weights across different locations and time periods.

Deep Learning Machine Learning Time Series Computer Vision Image Segmentation Remote Sensing TensorFlow Python

Few-shot Instance Segmentation for Remote Sensing

First author Jun. 2023 - Oct. 2023

The paper introduces a novel, data-efficient workflow to address data scarcity challenges in remote sensing. The model is evaluated on the VHR-10 dataset and outperforms traditional models such as YOLOv8 when trained on small datasets. Results were presented at the NeurIPS 2023 CCAI workshop.

Deep Learning Machine Learning Computer Vision Image Segmentation Remote Sensing PyTorch Python

Time Series Forecasting of U.S. Candy Production

Time Series Course Project Sep. 2023 - Nov. 2023

The project forecasts U.S. candy production from 45 years of time series data. Box-Cox transformation and differencing are applied to achieve stationarity, and the optimal SARIMA model is identified using ACF/PACF analysis and Maximum Likelihood Estimation. The model is further validated through comprehensive diagnostic tests and spectral analysis.

Data Science Time Series Statistical Modeling R

Efficient Visual Attention Design for Image Super-Resolution

AAAI 2024 submission Mar. 2022 - May 2023

Replicated 16 SOTA SR models, identified key architectural bottlenecks, and co-created EvaSR, an attention-augmented CNN that matches SOTA PSNR while cutting parameters by 85% and FLOPs by 70%.

Deep Learning Machine Learning Computer Vision Image Super Resolution PyTorch Python

Digitizing Handwritten Data with OCR

Oct. 2023 - Nov. 2023

Automated the collection and parsing of thousands of government PDFs, then fine-tuned a TrOCR-base model with custom template-matching to reach 95%+ character accuracy on challenging handwritten forms, reducing manual data entry time from hours to seconds.

Deep Learning Machine Learning Computer Vision Python

Soccer player transfer market value prediction

Machine Learning Course Project Sep. 2022 - Jan. 2023

Engineered a rich feature set from top-5-league match stats and benchmarked eight machine learning models, including KNN, Random Forest, and Gradient-Boosted Trees. The results are reported alongside an exploratory data analysis in R Markdown.

Data Science Machine Learning R

Exploring SO(3) through the lens of Orbifolds

UCSB MATH Directed Reading Program Jan. 2023 - Mar. 2023

Conducted an in-depth study of the fundamental groups and covering spaces of three-dimensional orbifolds. Findings were presented in a poster that included orbifold visualizations.

Math

Differential Geometry Calculator

Authored a lightweight Python tool that symbolically computes curvature, torsion, and Christoffel symbols, turning pages of manual tensor algebra into one-line scripts.

Math Python

Hi, I am Yang

Graduate student at New York University

Skills

Python

C++

R

SQL

Docker & Kubernetes

DevOps & CI/CD

Cloud & Serverless

Git & Version Control
