Aaron Chen

Software & AI Engineer

Synthesizing distributed systems with applied AI.
Junior at Washington University in St. Louis.

About

I am pursuing a double major in Computer Science and Mathematics at WashU, with a 3.95 GPA. My engineering philosophy sits at the intersection of rigorous distributed systems and pragmatic AI application.

I don't just tune models; I build the infrastructure that makes them viable in production. From scaling Kafka-backed async processing to support 5k+ concurrent sessions, to optimizing RAG pipelines that cut operational costs by 27%, I focus on impact.

When I'm not deploying Kubernetes clusters or building Llama-Vision scanners, I'm competing in math competitions (AIME qualifier), singing, or exploring Jazz Studies.

Experience

  1. May — Aug 2025

    Engineered full-stack LLM solutions and scalable backend infrastructure. Built a hybrid ATS scoring engine combining TF-IDF with LLM rewrite guidance. Scaled Node/Express systems using Kafka-backed async processing to handle 5.1k concurrent sessions. Implemented circuit-breaker-style fallbacks and strict timeouts to harden LLM API calls.

    • Postgres
    • Kafka
    • Redis
    • Elasticsearch
    • AWS
  2. May — Aug 2024

    Owned the end-to-end RAG platform for offline news analysis. Integrated RouteLLM to route queries to local models, reducing cloud inference costs by ~38%. Built a self-serve admin dashboard (MongoDB/SSO) for retrieval tuning.

    • LangChain
    • MongoDB
    • RAG
    • Azure
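As a rough illustration of the circuit-style fallbacks and timeouts mentioned in the May to Aug 2025 role, here is a simplified Python sketch. It is not the production code (that ran on Node/Express); the names `CircuitBreaker`, `call_with_fallback`, `llm_call`, and `fallback`, and all thresholds, are hypothetical and chosen only for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class CircuitBreaker:
    """Hypothetical helper: opens after `max_failures` consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds (assumed value)
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Cool-down elapsed: allow one trial call (half-open state).
            # A single further failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
            return False
        return True

    def record(self, ok):
        if ok:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def call_with_fallback(breaker, llm_call, fallback, timeout_s=2.0):
    """Run `llm_call` with a timeout; use `fallback` on error or open circuit."""
    if breaker.is_open():
        return fallback()  # skip the upstream call entirely
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        result = pool.submit(llm_call).result(timeout=timeout_s)
    except Exception:  # upstream errors and timeouts are treated alike
        breaker.record(ok=False)
        return fallback()
    finally:
        pool.shutdown(wait=False)  # do not block on a call that timed out
    breaker.record(ok=True)
    return result
```

The point of the pattern is that once the upstream LLM API has failed a few times in a row, requests stop waiting on it and go straight to the fallback (for example, a cached or rule-based response) until the cool-down passes.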

Selected Projects

  • NL Video Scanner (Llama Stack Finalist)

    Privacy-preserving scanner running at 1 FPS using Llama-Vision + OpenCV. Achieved 85%+ Hit@5 with zero data egress. Built local retrieval with LlamaIndex and optimized OpenCV preprocessing for low-light scenes.

    • Llama-Vision
    • LlamaIndex
    • NoSQL
    • OpenCV
    • FastAPI
    • Docker
  • E-Commerce Recommendation Engine

    Two-stage recommender (recall → ranking) processing 10M+ interactions. Served a Two-Tower DNN via TorchServe on Kubernetes (p99 of 9 ms at 1.2k RPS). Implemented Spark Structured Streaming for real-time feature caching.

    • PyTorch
    • Kubernetes
    • Spark
    • Redis
  • Distributed RPC Framework

    Designed a low-latency RPC core in Spring Boot + Vert.x. Reached <5 ms median latency at 5k+ RPS. Implemented etcd-based service discovery and fault-tolerant failover, validated on a multi-node Linux cluster.

    • Distributed Systems
    • HPC
    • Spring Boot
    • etcd
    • Vert.x
  • ML Comparative Analysis

    Conducted a comparative study of supervised learning models (SVM, Logistic Regression, Random Forest) on physicochemical datasets. Performed rigorous EDA to identify key correlations (e.g., alcohol vs. quality). Achieved 87.2% accuracy with Random Forest, and tuned the SVM for macro F1 to balance performance across minority classes.

    • Scikit-Learn
    • Pandas
    • Matplotlib
  • SoloSync Travel App (iOS)

    Built a social travel platform that helps solo travelers connect and feel less isolated. Integrated MapKit and Google Places for interactive route planning and real-time annotation sharing. Engineered a scalable Node.js/Express backend on AWS EC2 with MariaDB for reliable data persistence.

    • SwiftUI
    • Node.js
    • AWS
    • MapKit
    • Google Places
  • Hollow Archer (2D Action Platformer)

    Engineered a Metroidvania-style platformer in Unity. Architected a hierarchical Enemy AI system using C# inheritance (Patrol/Chase logic). Implemented kinetic combat physics, custom health data structures, and optimized WebGL memory management for browser play.

    • Unity
    • C#
    • WebGL
    • Game Design
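
To make the two-stage (recall → ranking) idea from the E-Commerce Recommendation Engine concrete, here is a toy Python sketch. It is an illustration under simplifying assumptions, not the project's code: the real system served a Two-Tower DNN, whereas the vectors, the `popularity` signal, and the function names below are all made up for the example.

```python
import heapq


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall(user_vec, item_vecs, k):
    """Stage 1: cheap top-k candidates by embedding dot product."""
    return heapq.nlargest(
        k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )


def rank(candidates, score_fn, n):
    """Stage 2: re-score only the recalled candidates with a costlier scorer."""
    return sorted(candidates, key=score_fn, reverse=True)[:n]


# Toy data (hypothetical): 2-d item embeddings and one user embedding.
items = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.5, 0.5]}
user = [1.0, 0.2]
popularity = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.3}  # stand-in ranking feature

candidates = recall(user, items, k=2)
top = rank(candidates, lambda i: dot(user, items[i]) + popularity[i], n=1)
print(candidates, top)  # prints ['a', 'b'] ['b']
```

Recall keeps the expensive scorer off the full catalog: item "c" has a high popularity score but never reaches the ranking stage because its embedding is a poor match for the user.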