Job applications, automated. You stay on the couch.
CouchHire is a fully agentic job application pipeline. Paste a job description, approve via Telegram, and it handles the rest — tailored resume, cover letter, email draft, and ATS form filling. A self-improving NLP match scorer learns from your outcomes over time.
Demo.mp4
- System Architecture
- How It Works
- Tech Stack
- Project Structure
- Prerequisites
- Getting Your API Keys
- Installation
- Configuration
- Database Setup
- CV Setup
- Running the Project
- Standalone Resume Generation
- Docker
- Self-Improving Loop
- Dashboard
- Contributing
- License
flowchart TD
%% ═══════════════════════════════════════════
%% ONE-TIME SETUP
%% ═══════════════════════════════════════════
INIT["One-Time Setup
cv/embed_cv.py
Parse master CV → embed sections"]
CHROMA[("ChromaDB
cv/chroma_store/
Embeddings + templates")]
INIT --> CHROMA
%% ═══════════════════════════════════════════
%% ENTRY POINTS
%% ═══════════════════════════════════════════
IN1["Paste JD
Raw job description text"]
IN2["Job URL
Scrapes JD from link"]
IN3["Job Search
Indeed · LinkedIn · Google
jobs/job_search.py + job_filter.py"]
TG1["/apply
bot/telegram_bot.py"] --> IN1 & IN2
TG2["/search
bot/telegram_bot.py"] --> IN3
IN1 --> A2
IN2 --> A2
IN3 -->|filtered results| A2
%% ═══════════════════════════════════════════
%% PIPELINE
%% ═══════════════════════════════════════════
A2["Scrape JD
Fetch from URL or pass raw text"]
A2 --> B["JD Parser
agents/jd_parser.py & nlp/ner_model.py
Extract skills + requirements"]
B --> C["CV RAG Retrieval
agents/cv_rag.py
ChromaDB similarity"]
CHROMA -.->|retrieve sections| C
C --> D["Match Scorer
agents/match_scorer.py
NLP score 0–100"]
D --> E{"Score ≥ Threshold?"}
E -->|No| F["Rejected
Notify + update DB"]
E -->|Yes| G["Resume Tailor
agents/resume_tailor.py
Retrieve template + instructions from ChromaDB"]
CHROMA -.->|retrieve template + instructions| G
G --> G1["LLM Selector
agents/llm_selector.py
Pick sections, projects, bullets"]
G1 --> G1B["Skills Assembly
agents/resume_tailor.py
Build skill categories"]
G1B --> G1C["Resume Assembler
agents/resume_assembler.py
LaTeX assembly → pdflatex → PDF
writes resume_content"]
G1C -->|"cover_letter_required?"| G2["Cover Letter
agents/cover_letter.py
Complements resume, never repeats"]
G1C -->|"no cover letter"| G3["Email Drafter
agents/email_drafter.py
Subject + body from resume_content"]
G2 --> G2B["Compile Cover Letter PDF
pdflatex"]
G2B --> G3
G3 --> H["Telegram Gate 1
bot/telegram_bot.py
Approve / Regenerate / Cancel"]
H -->|Regenerate| G
H -->|Cancel| F
H -->|Approve| I["Apply Router
agents/apply_router.py
Detect route"]
%% ═══════════════════════════════════════════
%% APPLY ROUTES
%% ═══════════════════════════════════════════
I --> J1["Email Route
apply/gmail_sender.py
MCP → create Gmail draft"]
I --> J2["Form Route
apply/browser_agent.py
Playwright ATS fill"]
I -->|"no apply method"| J4["Manual Route
Resume + draft created
User applies manually"]
J2 -->|"needs manual help"| J3["Manual Takeover
apply/session_handoff.py
CDP handoff to user"]
J3 -->|"back to auto"| J2
J1 --> H2["Telegram Gate 2
bot/telegram_bot.py
Review draft → Send / Cancel"]
H2 -->|Cancel| F
H2 -->|Send| J1B["Execute Send
apply/gmail_sender.py
MCP → send email"]
J1B --> N["Notify
bot/telegram_bot.py
Telegram confirmation"]
J2 --> N
J3 --> N
J4 --> N
%% ═══════════════════════════════════════════
%% POST-PIPELINE
%% ═══════════════════════════════════════════
N --> K["Log + Save
db/supabase_client.py
Supabase PostgreSQL"]
K --> L["Outcome Feedback
Telegram /outcome label
nlp/retrain.py → self-improving scorer"]
L -.->|retrain| D
K -.->|read| DASH["Dashboard
dashboard/app.py
Streamlit: Tracker · Analytics · Retrain"]
- A job description comes in — pasted, scraped from a URL, or discovered via multi-board job search.
- The JD Parser extracts structured requirements: skills, apply method, cover letter needed, and email instructions.
- CV RAG retrieves the most relevant sections of your master CV from ChromaDB.
- The Match Scorer produces a 0–100 fit score using NLP embeddings. Below the threshold → auto-rejected.
- Resume Tailor generates a targeted LaTeX resume and compiles it to PDF. If a cover letter is required, the Cover Letter agent writes one that complements (never repeats) the resume.
- The Email Drafter writes a human-quality application email referencing specific projects.
- Telegram Gate 1 — you review everything and approve, edit, or cancel.
- The Apply Router detects the application method:
- Email → Gmail draft via MCP → Gate 2 → send
- Form → Playwright browser agent fills the ATS form with LLM-assisted field mapping
- Manual → draft created, you apply yourself
- Everything is logged to Supabase. Label outcomes in Telegram and the scorer retrains on your data.
| Category | Technology |
|---|---|
| Orchestration | LangGraph |
| LLM Gateway | LiteLLM (9-model fallback across 5 providers) |
| Embeddings | ChromaDB + sentence-transformers |
| NLP | spaCy (NER) + custom match scorer |
| Database | Supabase (PostgreSQL) |
| Resume Compilation | pdflatex (texlive) |
| Gmail via MCP (Streamable HTTP) | |
| Form Filling | Playwright + LLM field mapping |
| Notifications | python-telegram-bot |
| Dashboard | Streamlit |
| Job Search | JobSpy (Indeed, LinkedIn, Google, Glassdoor, ZipRecruiter) |
couchhire/
├── agents/ # LangGraph agents
│ ├── jd_parser.py # JD → structured requirements
│ ├── cv_rag.py # ChromaDB retrieval → ranked CV sections
│ ├── match_scorer.py # NLP match scoring (0–100)
│ ├── resume_tailor.py # LaTeX resume tailoring + skills assembly
│ ├── cover_letter.py # Cover letter generation
│ ├── email_drafter.py # Application email generation
│ ├── apply_router.py # Route detection (email | form | manual)
│ ├── llm_selector.py # LLM-driven content selection
│ ├── resume_assembler.py # LaTeX block extraction → PDF compilation
│ └── cv_content_helpers.py # Section → named block extraction
├── apply/ # Application execution
│ ├── gmail_sender.py # MCP client for Gmail
│ ├── browser_agent.py # ATS form filler (Playwright + LLM)
│ └── session_handoff.py # CDP session manager
├── bot/
│ └── telegram_bot.py # Notifications + approval gates
├── jobs/
│ ├── job_search.py # Multi-board concurrent search
│ └── job_filter.py # Score + filter results
├── cv/
│ ├── embed_cv.py # Parse → embed pipeline
│ ├── cv_parser.py # .tex/.pdf/.docx → sections
│ ├── uploads/ # Your CV + optional template (gitignored)
│ ├── chroma_store/ # ChromaDB embeddings (gitignored)
│ ├── defaults/ # Fallback templates + instructions
│ └── output/ # Generated PDFs + .tex (gitignored)
├── db/
│ ├── supabase_client.py # CRUD helpers
│ ├── schema.sql # Table definitions (idempotent)
│ └── create_tables.py # One-time table creation script
├── nlp/
│ ├── ner_model.py # Skill extraction
│ ├── retrain.py # Fine-tune scorer on outcomes
│ └── models/ # Trained model artifacts
├── llm/
│ └── client.py # 9-model fallback chain
├── dashboard/
│ ├── app.py # Streamlit dashboard
│ └── helpers.py # Dashboard utility functions
├── tests/ # Test suite (290 tests)
│ ├── conftest.py # Shared fixtures
│ └── test_*.py # Unit + integration tests
├── pipeline.py # LangGraph orchestrator (19 nodes, 7 conditional edges)
├── generate_resume.py # Standalone resume generator
├── config.py # Environment validation
├── entrypoint.sh # Docker entrypoint script
├── Dockerfile
├── docker-compose.yml
├── .env.example # Environment template
├── LICENSE
└── requirements.txt
- Python 3.11+
- pdflatex —
sudo apt install texlive-latex-full(Linux) or install MacTeX (macOS) - Chromium — installed automatically by Playwright
- At least one LLM provider API key (Groq is free and recommended to start)
- A Supabase project (free tier works)
- A Telegram bot (free, takes 2 minutes)
You need at least one. CouchHire has a 9-model fallback chain — configure as many as you like for resilience.
| Provider | Free Tier | How to Get |
|---|---|---|
| Groq | Yes (generous) | console.groq.com → API Keys → Create |
| Gemini | Yes | aistudio.google.com → Create API Key |
| Mistral | Yes (phone verification) | console.mistral.ai → API Keys → Create |
| OpenRouter | Yes | openrouter.ai → Keys → Create |
| Anthropic | Paid | console.anthropic.com → API Keys → Create |
| OpenAI | Paid | platform.openai.com/api-keys → Create |
- Go to supabase.com and create a new project
- Settings → API → copy Project URL →
SUPABASE_URL - Copy anon / public key →
SUPABASE_KEY
- Open Telegram → search for
@BotFather→ send/newbot - Follow the prompts → copy the token →
TELEGRAM_BOT_TOKEN - Start your bot (send it
/start), then visithttps://api.telegram.org/bot<YOUR_TOKEN>/getUpdates - Find
"chat": {"id": ...}in the response →TELEGRAM_CHAT_ID
CouchHire uses google_workspace_mcp as its Gmail server. CouchHire connects as an MCP client over Streamable HTTP.
Google Cloud setup:
- Go to Google Cloud Console → create a project
- Enable the Gmail API (APIs & Services → Library → Gmail API)
- Configure OAuth consent screen → External → add scopes:
gmail.compose,gmail.modify→ add your email as a test user - Create OAuth credentials → Desktop application → copy Client ID and Client Secret
Start the MCP server:
-
Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh -
Create a launcher script (e.g.
~/start-gmail-mcp.sh):#!/bin/bash export GOOGLE_OAUTH_CLIENT_ID="your-client-id.apps.googleusercontent.com" export GOOGLE_OAUTH_CLIENT_SECRET="your-client-secret" export OAUTHLIB_INSECURE_TRANSPORT=1 export MCP_SINGLE_USER_MODE=true export USER_GOOGLE_EMAIL="your-email@gmail.com" uvx workspace-mcp --tools gmail --transport streamable-http
-
chmod +x ~/start-gmail-mcp.shand run it — server starts athttp://localhost:8000/mcp -
On first run it opens your browser for Google OAuth — sign in and authorize
-
Set
GMAIL_MCP_URL=http://localhost:8000/mcpin your.env
The MCP server runs via
uvx(system-level), separate from your project venv. Start it in a dedicated terminal before running the pipeline.
Job Search (JobSpy)
No API keys required. JobSpy scrapes job boards directly. Each board is scraped individually so a failure on one doesn't discard results from the others.
⚠️ LinkedIn is rate-limit aggressive (~10 pages per IP). Use proxies for heavy LinkedIn scraping.
# Clone
git clone https://github.com/RaySatish/CouchHire.git
cd couchhire
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install Playwright browsers
playwright install chromium
# Install spaCy model
python -m spacy download en_core_web_sm
# Set up environment
cp .env.example .env
# Edit .env with your API keys (see above)All config lives in .env (see .env.example for the full list with comments). You never edit config.py directly.
Required variables:
| Variable | Description |
|---|---|
LLM_PROVIDER |
groq | gemini | mistral | openrouter | anthropic | openai |
| Provider API key | The key for your chosen provider |
SUPABASE_URL |
Supabase project URL |
SUPABASE_KEY |
Supabase anon/public key |
TELEGRAM_BOT_TOKEN |
From @BotFather |
TELEGRAM_CHAT_ID |
Your personal chat ID |
GMAIL_MCP_URL |
Gmail MCP server URL (e.g. http://localhost:8000/mcp) |
APPLICANT_NAME |
Your full name |
APPLICANT_EMAIL |
Your email address |
GITHUB_URL |
Your GitHub profile URL |
Optional variables:
| Variable | Default | Description |
|---|---|---|
MATCH_THRESHOLD |
60 |
Minimum match score (0–100) to proceed |
OUTPUT_BASE_DIR |
cv/output/ |
Where generated PDFs are saved |
APPLICANT_PHONE |
— | For ATS form fields |
APPLICANT_LINKEDIN |
— | For ATS form fields |
BROWSER_HEADLESS |
false |
Run browser agent headless |
CDP_PORT |
9222 |
Chrome DevTools Protocol port |
JOBSPY_SITES |
indeed,linkedin,google |
Job boards to search |
JOBSPY_COUNTRY |
USA |
Country for searches |
MIN_RETRAIN_LABELS |
10 |
Minimum labeled outcomes before retraining |
RETRAIN_EVERY |
10 |
Auto-retrain every N new labels (0 = manual only) |
- In your Supabase project → SQL Editor → New query
- Paste the contents of
db/schema.sqland click Run - Verify:
python -c "from db.supabase_client import get_all_applications; print('OK')"
CouchHire accepts your master CV in LaTeX, PDF, or Word format.
| File | Required | Description |
|---|---|---|
master_cv.tex / .pdf / .docx |
Yes | Your full master CV |
resume_template.tex |
No | Your preferred LaTeX resume layout |
cover_letter_template.tex |
No | Your preferred cover letter layout |
instructions.md |
No | Tailoring preferences (e.g. "always keep to 1 page", "lead with projects for ML roles") |
If you don't provide a template or instructions, CouchHire uses sensible defaults.
python cv/embed_cv.pyThis parses your CV into sections, embeds each one with sentence-transformers, and stores everything in ChromaDB. Re-run whenever you update your CV.
python -c "
import chromadb
client = chromadb.PersistentClient(path='cv/chroma_store')
col = client.get_collection('master_cv')
print(f'Chunks embedded: {col.count()}')
"Run each in a separate terminal. All three need to be running for the full pipeline.
Terminal 1 — Telegram bot:
python bot/telegram_bot.pyTerminal 2 — Dashboard:
streamlit run dashboard/app.pyTerminal 3 — Pipeline:
# Paste a JD
python pipeline.py --jd "Full job description text here"
# Scrape from URL
python pipeline.py --url "https://jobs.example.com/job/12345"
# From a file
python pipeline.py --file path/to/jd.txt
# Search job boards
python pipeline.py --search --query "machine learning engineer" --location "London"After running, check Telegram — you'll get a notification for each job that passes the match threshold.
Generate a tailored resume without the full pipeline (no Telegram, no email, no apply routing):
python generate_resume.py --jd "Full job description text here"
python generate_resume.py --file path/to/jd.txt
python generate_resume.py # uses a built-in default JD for testingDocker handles Python, pdflatex, Playwright, and all dependencies for you.
# Fill in your .env (same as manual setup)
cp .env.example .env
# Start everything
docker compose up --build
# One-off pipeline run
docker compose run app python pipeline.py --jd "Job description here"
# Stop
docker compose downThis starts three containers:
- app — Telegram bot + pipeline
- dashboard — Streamlit on port 8501
- gmail-mcp — Gmail MCP server on port 8000
CouchHire gets better at predicting which jobs are worth applying for the more you use it.
- An application is sent and logged in Supabase
- You hear back from the employer → open Telegram → tap the outcome label
- Labels: No Reply · Screening · Interview · Rejected · Offer
- Every N new labels (configurable), the match scorer fine-tunes on your personal
(JD, resume, outcome)pairs - The fine-tuned model loads automatically on the next pipeline run
The more you label, the more personalised the scoring becomes.
Available at localhost:8501 once running.
| Tab | Description |
|---|---|
| Tracker | All applications, sortable by date / score / status |
| Analytics | Response rates, match score distribution, resume performance |
| Retrain | Outcome labels, manual retrain button, model accuracy |
| Settings | LLM provider, match threshold, ChromaDB status, connection tests |
- Fork the repository
- Create a feature branch:
git checkout -b feature/your-feature-name - Make your changes
- Test:
python pipeline.py --jd "test"should run without errors - Open a pull request
For bugs, open an issue with the error message, your OS, Python version, and LLM provider.
Never commit .env, cv/uploads/, or cv/chroma_store/.
MIT
