Skip to content
View adityonugrohoid's full-sized avatar
🎯
Focusing
🎯
Focusing

Block or report adityonugrohoid

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
adityonugrohoid/README.md

Adityo Nugroho

Platform and ML Infrastructure Engineer building production AI systems, on 18 years of telecom network performance at Huawei

Multi-agent systems on Google Cloud. Scale-to-zero GPU inference on Kubernetes. Terraform on AWS. Fine-tuned specialist models for edge devices.

 

Featured Projects

Four ADK agents turn a customer complaint into a triaged NOC incident ticket in under 30 seconds

A Google ADK SequentialAgent orchestrates four LlmAgent sub-agents on Gemini (Vertex AI), collapsing the manual NOC workflow of correlating network events, call detail records, and ticketing into one natural-language step. Every LLM call routes through a 4-attempt model-failover ladder shown live in the UI, so a rate-limited model is swapped for the next one on screen instead of failing the request.

  • Top 100 of thousands at the Google Cloud Gen AI Academy APAC 2026 (Cohort 1); the BigQuery + AlloyDB pilot now runs zero-idle-cost on bundled SQLite, swappable back through one MCP Toolbox tools.yaml line, live on Cloud Run at netpulse-ui.run.app

Production GPU inference on Kubernetes that costs $0 when idle

Two-layer autoscaling: KEDA watches Redis queue depth for 0-to-N pod scaling while the GKE Cluster Autoscaler provisions and deprovisions GPU VMs on pending pod scheduling. vLLM serves with continuous batching; model weights persist on a PVC and image layers pre-cache via Secondary Boot Disk, cutting cold start about 45% (9m20s to 5m9s).

  • TTFT p95 ~140-200ms warm, 12-panel Grafana dashboard (Prometheus + NVIDIA DCGM), Locust load testing, true scale-to-zero at both pod and node level

A 270M model that beats generalist function-callers by baking tool knowledge into its weights

LoRA (r=128) fine-tuning on Gemma 3 270M moves all 14 tool definitions out of the prompt and into the weights: a ~20-token query goes in, a JSON tool call comes out, regardless of tool count. The result is 99.5% accuracy at 32x fewer prompt tokens than schema-in-prompt baselines, in a 291 MB Q8_0 GGUF that runs on phones, laptops, and Raspberry Pi.

  • Trained on a single consumer GPU (RTX 4060, 8GB VRAM): 418/420 eval accuracy, 14,033 training examples, 14 tools across 2 MCP servers, 153ms avg latency

The public Terraform and Docker deployment behind a framework-free multi-agent RAG system

AgentLens is a streaming pipeline debugger for multi-agent RAG, coordinating four roles (Retrieval Agent, Grader, Quality Judge, Fallback) across a 4-service FastAPI stack. This repo is the infrastructure-as-code only (app source stays private), so an infra reviewer can verify the full production architecture without app-repo access: Terraform provisions VPC, EC2 t3.small, and Elastic IP in us-east-1, behind a 7-container Docker Compose stack, path-based nginx, and Cloudflare edge SSL.

  • One-command deploy and teardown over SSH, Ollama Cloud API (no GPU on EC2), ChromaDB vector store, systemd boot automation, live at agentlens.adityonugroho.com

Four independent detection phases feeding a fault-tolerant orchestrator

Automates construction blueprint takeoffs by composing three independent CV modules (contour-based shape detection, Tesseract OCR with preprocessing, and a YOLOv8n model fine-tuned on synthetic symbols) into a multi-stage orchestrator. Each phase returns a typed failure object instead of aborting, so a first-run user without trained YOLO weights still gets shape and OCR results. Multi-page PDFs go in, structured JSON takeoff reports come out over a FastAPI endpoint.

  • 66 passing tests across 4 phases, mAP@50 0.992 on 5 synthetic construction symbol classes, YOLOv8n fine-tuned 20 epochs, FastAPI POST /analyze plus CLI

Six end-to-end telecom ML projects, each on synthetic data with embedded network physics

Six self-contained projects spanning classification, regression, time-series, and reinforcement learning. Most ML demos treat telecom as generic tabular data; here each project hand-crafts synthetic data with embedded telecom physics (temporal correlation, spatial clustering, equipment failure signatures), then runs the full paradigm end-to-end from data generation through feature engineering, training, evaluation, and business-insight translation. The repo is an index; source lives in six child repos.

  • Churn AUROC 0.86, RCA Acc@1 0.91, Anomaly F1 0.70, QoE RMSE 0.45, Capacity MAPE 14.5%, Network Optimization +61% vs random

 

Connect

Pinned Loading

  1. hackathon-telecom-ops hackathon-telecom-ops Public

    Multi-agent telecom ops assistant on Google ADK and Vertex Gemini over bundled SQLite via MCP Toolbox

    HTML 2

  2. gpu-autoscale-inference gpu-autoscale-inference Public

    Scale-to-zero GPU inference on Kubernetes with KEDA pod autoscaling and Cluster Autoscaler node provisioning

    Shell

  3. edge-mcp-caller edge-mcp-caller Public

    270M specialist baking MCP tool knowledge into weights: 99.5% accuracy, 32x fewer prompt tokens

    Python

  4. agentlens-infrastructure agentlens-infrastructure Public

    AWS infrastructure-as-code for AgentLens: Terraform, Docker Compose, nginx, Cloudflare on EC2

    Shell

  5. cv-pipeline cv-pipeline Public

    Progressive CV pipeline for construction blueprints: shape detection, OCR, YOLO symbols, multi-stage analyzer

    Python 1

  6. telecom-ml-portfolio telecom-ml-portfolio Public

    Six end-to-end ML projects on churn, anomaly detection, QoE, forecasting, and network optimization in telecom