Data Scientist · Data Analyst · Gen AI / LLM Engineer
I build data systems I'd be comfortable putting my name on — private by default, honest about their limits.
The thread through everything here isn't a single tech stack — it's a way of working:
- 🔒 Privacy by default. My healthcare work ships only de-identified aggregate data; my LLM tooling runs 100% locally — no keys, no data leaving your machine. Sensitive data stays where it belongs.
- 🧪 Honest about limits. I document where a method breaks down rather than hiding it: my DEA (Data Envelopment Analysis) study, for example, openly flags its small-sample limitation instead of overclaiming. Surfacing the caveat is part of the analysis.
- ✅ Tested & reproducible. Real test suites (13 passing tests in CareFlow, 51 in llm-balance-paraphraser), Dockerfiles, and one-command setup, so the work runs for someone other than me.
- 📖 Documented for humans. Every repo explains the why, not just the how.
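To make "only de-identified aggregate data" concrete, here is a minimal sketch of the idea. It is an illustration, not code from any of the repos: the function name `aggregate_counts`, the `MIN_CELL_SIZE` threshold of 10, and the sample rows are all assumptions chosen for the example.

```python
from collections import Counter

MIN_CELL_SIZE = 10  # assumed suppression threshold, not taken from any repo


def aggregate_counts(records, group_key, min_cell_size=MIN_CELL_SIZE):
    """Count records per group and suppress groups smaller than min_cell_size.

    Only the group label and its count leave this function; row-level
    fields such as identifiers are never copied into the output.
    """
    counts = Counter(record[group_key] for record in records)
    return {
        group: (count if count >= min_cell_size else None)  # None = suppressed
        for group, count in counts.items()
    }


# Hypothetical row-level data, for illustration only.
rows = (
    [{"patient_id": i, "unit": "cardiac"} for i in range(12)]
    + [{"patient_id": 100 + i, "unit": "neuro"} for i in range(3)]
)
summary = aggregate_counts(rows, "unit")
print(summary)  # {'cardiac': 12, 'neuro': None}
```

Suppressing small groups matters because a count of 3 in a narrow category can be enough to re-identify someone, even with names and IDs removed.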
I didn't start here. The repos below are a visible growth arc — I keep the old versions around on purpose so the journey shows.
| | Then | Now |
|---|---|---|
| Apps | B.Tech Tkinter desktop tool (2018) | FastAPI services, JWT auth, Docker, deployed (CareFlow) |
| Analytics | Spreadsheets & one-off scripts | Reproducible pipelines + tested solvers (DEA) |
| AI | Classic ML (Random Forest, XGBoost) | Local LLM tooling, RAG, load-balancing (llm-balance-paraphraser) |
| Data viz | Static charts | Interactive D3 drill-downs over millions of rows (ICU explorer) |

| Project | What it proves | Track |
|---|---|---|
| CareFlow — clinic appointment & live-queue API | Progression: a 2018 Tkinter desktop app rebuilt into a FastAPI service with JWT auth, Postgres, Docker, 13 passing tests. The legacy code is kept in-repo so you can see exactly what was upgraded. | Eng · DA |
| llm-balance-paraphraser — local LLM analyzer & router | Gen AI + ethics: token / KV-cache / VRAM analysis, an Ollama paraphrase pipeline, and a weighted load-balancer with health checks. Runs entirely on your machine — no API keys, no data leaving. 51 passing tests. | GenAI |
| Sustainable-Manufacturing-DEA — efficiency benchmarking | Research integrity: a from-scratch DEA solver benchmarking Nestlé/Henkel/P&G/Unilever across 50+ ESG KPIs — with an explicit, written note on the method's small-sample limitation. | DS · DA |
| Pediatric ICU Cohort Explorer (making public soon) | Privacy-first healthcare: a Flask + D3 drill-down over 4.4M+ clinical observations that ships only de-identified aggregate counts. | DS · DA |
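
As a rough illustration of the KV-cache / VRAM analysis mentioned in the table, here is a minimal sketch of the standard back-of-the-envelope estimate. It is not the code from llm-balance-paraphraser: the function name `kv_cache_bytes` and the model shape (32 layers, 32 KV heads, head dimension 128, 16-bit values) are assumptions chosen for the example.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate KV-cache size in bytes for a decoder-only transformer.

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer. bytes_per_value=2 assumes fp16/bf16 storage.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape, for illustration only.
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.2f} GiB")  # 2.00 GiB
```

The estimate grows linearly with context length and batch size, which is why long prompts can exhaust VRAM even when the model weights fit comfortably.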
My interactive portfolio frames each of these through both a Data Scientist and a Data Analyst lens. Link coming once lry.dev is live.
M.S. Data Science · Open to Data Science, Analytics, and Gen AI roles.