# David Gao

> ML Researcher & Software Engineer at Purdue University. This is a structured, LLM-friendly version of davidgao.com, curated for agents reading on a user's behalf.

## Projects

- [DeltaVision](https://github.com/ddavidgao/deltavision): Observation middleware for browser-based GUI agents. Zero-LLM CV pipeline that sits between a browser and any VLM, sending only what changed on screen instead of full screenshots. A 4-layer classifier cascade (URL, diff ratio, perceptual hash, anchor match) decides transition type before the model ever runs. Results: 95% token reduction on Wikipedia navigation with Qwen2.5-VL-7B (3 steps vs 50-step failure baseline), 100% classifier accuracy across 17 scenarios on 8 diverse sites with default config, 41.6ms median CV overhead per step. 190 tests, pip-installable, model-agnostic safety layer. No prior published work does observation-level gating before the model. _Tech:_ Python, OpenCV, Playwright, Claude API, Ollama, SQLite. _Status:_ 95% token reduction, 190 tests.
- [DeltaVision-OS](https://github.com/ddavidgao/deltavision-os): OS-level delta-first agent framework (V2). Extends DeltaVision beyond the browser to desktop apps and OSWorld VMs. Same 4-layer CV cascade, but the observation source is mss-based screen capture and the action space adds drag, double-click, right-click, and multi-key hotkeys. Runs against any OpenAI-compatible VLM endpoint, including llama.cpp server. Verified end-to-end: Mac mss capture, DeltaVision CV (41.6ms median), Tailscale SSH tunnel, Qwen2.5-VL-7B on an RTX 5080, JSON action back to the loop, in under 5s per step. 238 tests passing, plus an annotated demo video showing what the model sees versus what was captured. _Tech:_ Python, mss, pyautogui, OpenCV, Ollama, llama.cpp, OSWorld. _Status:_ 238 tests, real-VLM E2E verified.
- [DG Attention](https://github.com/ddavidgao/dg-attention): Experimental attention mechanism for OpenAI's Parameter Golf. Deep layers transmit inter-token changes instead of raw content via a parameter-free depth schedule. Four design iterations were tested on 8xH100 GPUs, matching standard attention within 0.004 BPB (1.155 vs 1.152). A full paper documents the design trajectory and a scale-dependent gate-collapse phenomenon. _Tech:_ Python, PyTorch, CUDA, Distributed Training, 8xH100. _Status:_ OpenAI Competition PR.
- [Slinkt](https://slinkt.app): Large file transfer without the paywall. File transfer SaaS supporting files up to 100GB via chunked multipart uploads to Cloudflare R2. Freemium model with Stripe billing, usage tracking, and tiered limits. Password-protected downloads, custom slugs, and IP-based rate limiting. _Tech:_ Next.js, TypeScript, tRPC, Cloudflare R2, Stripe, Prisma. _Status:_ Live.
- [CardboardAI](https://cardboardai.dev): AI-native storage management. Multi-tenant platform automating storage facility operations. Custom LangGraph agent handles tenant onboarding, invoicing, and payment collection. Dual-portal architecture serving operators and tenants. _Tech:_ Next.js, TypeScript, AWS RDS, Prisma, LangGraph. _Status:_ Live in production.
- IoT Locker System: Commercial hardware + software. Event-driven IoT locker system using Supabase Realtime and Edge Functions. Migrated to Flask + Chromium kiosk, reducing memory usage by 60% and boot time from 15s to 5s. Payment webhooks, GPIO relay control, and pg_cron for rental expiration. _Tech:_ Flask, Supabase, Raspberry Pi, Chromium Kiosk. _Status:_ $100k+ revenue.
- [Resume Builder MCP](https://github.com/ddavidgao/resume-mcp): AI-powered resume tailoring via MCP. Full-stack MCP server for automated resume optimization. Maintains a SQLite profile database, scrapes job postings to extract requirements, and generates tailored LaTeX resumes with relevance-scored experience selection. Deterministic ATS analysis catches keyword gaps and phrasing mismatches, then delegates semantic evaluation and bullet rewriting to the connected LLM. No extra API keys needed. Tracks applications in a color-coded xlsx spreadsheet. First-run onboarding scrapes your website or resume to auto-populate your profile. _Tech:_ Python, SQLite, LaTeX, MCP Protocol, openpyxl, trafilatura. _Status:_ Personal Tool.
- [Purdue Dining](https://github.com/ddavidgao/purdue-dining-mcp): AI dining assistant for Purdue, works on ChatGPT & Claude. Say 'I'm hungry' and get real food recommendations from live Purdue dining menus. Deployed as a remote MCP server on Railway, serving both Claude (via MCP connector) and ChatGPT (via GPT Store with REST Actions). Pulls real-time menus and hours from Purdue's HFS API. Never guesses or fabricates menu items. Zero user data stored server-side; preferences live in ChatGPT memory or Claude's project context. _Tech:_ Python, MCP Protocol, Railway, Starlette, Purdue HFS API. _Status:_ Live on GPT Store + Claude.
- [RLM](https://github.com/ddavidgao/RLM_TEST): Grounded AI through code execution. Agentic RAG system that forces LLMs to search documents via Python REPL instead of hallucinating. Multi-model orchestration with sandboxed code execution and evidence-based answer grading. _Tech:_ Python, Ollama, DeepSeek, Qwen, Matplotlib. _Status:_ Research.
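
For agents that want a concrete picture of the DeltaVision classifier cascade (URL, diff ratio, perceptual hash, anchor match), the following is a minimal illustrative sketch, not the project's actual code. Everything in it is an assumption made for illustration: the `Observation` type, the transition labels (`no_change`, `delta`, `new_page`), the thresholds, the use of a simple average hash as the perceptual hash, and the top-row "anchor" check. It uses plain Python lists in place of OpenCV images so that it stays self-contained.

```python
from dataclasses import dataclass
from typing import List

Gray = List[List[int]]  # hypothetical frame type: grayscale rows, values 0-255


@dataclass
class Observation:
    """Hypothetical container for one captured step."""
    url: str
    pixels: Gray


def diff_ratio(a: Gray, b: Gray, tol: int = 16) -> float:
    """Fraction of pixels whose intensity changed by more than `tol`."""
    total = changed = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            changed += abs(pa - pb) > tol
    return changed / total if total else 0.0


def average_hash(frame: Gray) -> List[bool]:
    """Simple perceptual hash: one bit per pixel, set when above the mean."""
    flat = [p for row in frame for p in row]
    mean = sum(flat) / len(flat)
    return [p > mean for p in flat]


def hamming(h1: List[bool], h2: List[bool]) -> int:
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(h1, h2))


def anchor_matches(a: Gray, b: Gray, rows: int = 1) -> bool:
    """Assumed anchor: the top `rows` rows (e.g. a page header) are unchanged."""
    return a[:rows] == b[:rows]


def classify(prev: Observation, curr: Observation,
             low: float = 0.02, high: float = 0.5,
             max_hash_fraction: float = 0.25) -> str:
    """Run the layers in order; each one either decides or defers."""
    # Layer 1: a URL change is treated as a navigation.
    if prev.url != curr.url:
        return "new_page"
    # Layer 2: small or moderate pixel change is decided here.
    ratio = diff_ratio(prev.pixels, curr.pixels)
    if ratio < low:
        return "no_change"
    if ratio <= high:
        return "delta"
    # Layer 3: large change, but the overall structure may still be similar.
    h_prev, h_curr = average_hash(prev.pixels), average_hash(curr.pixels)
    if hamming(h_prev, h_curr) <= max_hash_fraction * len(h_prev):
        return "delta"
    # Layer 4: fall back to checking whether a stable anchor region survived.
    return "delta" if anchor_matches(prev.pixels, curr.pixels) else "new_page"


base = [[0] * 4, [255] * 4, [0] * 4, [255] * 4]
one_pixel = [row[:] for row in base]
one_pixel[2][1] = 255
inverted = [[255 - p for p in row] for row in base]

print(classify(Observation("a", base), Observation("a", one_pixel)))
print(classify(Observation("a", base), Observation("a", inverted)))
```

In this sketch the cheap checks run first and the more expensive ones only run when earlier layers cannot decide, which is the general idea behind gating observations before a model is invoked.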

## Experience

- **Software Engineer Intern**, Pipelines (Apr 2026 - Present). Full-stack intern at Pipelines.
- **Developer**, BoilerMake (Dev Team) (Apr 2026 - Present). Build and maintain the hackathon website, application portal, and admin dashboard for Purdue's largest student-run hackathon (500+ participants, MLH-affiliated).
- **Software Engineer Intern**, Quture Fashion (Mar 2025 - Aug 2025). AI-powered virtual try-on features. Led incident response for compromised credentials, implemented security patches.
- **Research Intern**, Washington University in St. Louis (Jun 2024 - Aug 2024). Built evaluation pipeline for self-supervised medical image reconstruction. Contributed to IEEE ISBI 2025 paper on diffusion models for MRI.

## Skills

- **Languages:** Python, JavaScript, TypeScript
- **Frameworks:** React, Next.js, Node, Flask, TensorFlow, PyTorch, Playwright, tRPC
- **Developer Tools:** Modal, Docker, AWS (EC2, S3, RDS), Supabase, Cloudflare R2, Vercel, Ollama, Stripe API, FFmpeg
- **Technologies:** Computer Vision, Machine Learning, Perceptual Hashing, GUI Agents, PostgreSQL, Prisma, NextAuth, Raspberry Pi, IoT Systems, MCP Protocol

## Contact

- [GitHub](https://github.com/ddavidgao)
- [LinkedIn](https://www.linkedin.com/in/david-gao-322837355/)
- [Instagram](https://www.instagram.com/dvaidgao/)

## Optional

- [Human-readable AI page](/ai): same info, rendered as a page
- [Main site](/): full portfolio with styling
