mrp2003/AskTheDoc

AskTheDoc: AI-Powered PDF Assistant

AskTheDoc is an AI assistant that lets users chat with one or more PDF documents through a Streamlit web interface or a CLI. It uses a LangChain RAG (Retrieval-Augmented Generation) pipeline with OpenAI LLMs, FAISS vector search, and chunked document embeddings to return answers backed by the source passages they were drawn from.
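
To make the retrieval step of a RAG pipeline concrete, here is a minimal, dependency-free sketch. It is not the project's code: the word-count `embed` function is a toy stand-in for OpenAI embeddings, the sorted list stands in for a FAISS index, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the prompt that would be handed to the language model."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


chunks = [
    "The warranty covers parts for two years.",
    "Shipping takes five business days.",
    "Returns are accepted within thirty days.",
]
top = retrieve("How long is the warranty?", chunks, k=1)
prompt = build_prompt("How long is the warranty?", top)
```

The retrieved chunks double as the "sources" shown next to an answer, which is what makes the response source-backed.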

Features

  • Multi-PDF upload and selection
  • Per-file vector store creation and merging
  • Real-time chat with source chunk display
  • Exportable chat history
  • Stats panel for embedded document analytics
  • Persistent vector store (saved to disk)
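
The per-file store and merge feature can be pictured with a small sketch. This is an illustration under assumptions, not the repository's implementation: `FileStore` and `merge_selected` are hypothetical names, and plain lists stand in for FAISS indexes. LangChain's FAISS wrapper does provide a `merge_from` method for combining indexes, but whether AskTheDoc uses it is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FileStore:
    """Hypothetical per-file store: chunk texts tagged with their source file."""
    filename: str
    chunks: list[str] = field(default_factory=list)

    def entries(self) -> list[tuple[str, str]]:
        """Return (filename, chunk) pairs so each chunk keeps its origin."""
        return [(self.filename, chunk) for chunk in self.chunks]


def merge_selected(stores: dict[str, FileStore], selected: list[str]) -> list[tuple[str, str]]:
    """Combine the entries of the selected files into one searchable list."""
    merged: list[tuple[str, str]] = []
    for name in selected:
        if name in stores:  # skip files that have not been embedded yet
            merged.extend(stores[name].entries())
    return merged


stores = {
    "a.pdf": FileStore("a.pdf", ["intro", "methods"]),
    "b.pdf": FileStore("b.pdf", ["pricing"]),
    "c.pdf": FileStore("c.pdf", ["appendix"]),
}
merged = merge_selected(stores, ["a.pdf", "c.pdf"])
```

Keeping one store per file means a newly uploaded PDF only needs its own chunks embedded, and the query-time index is rebuilt from whichever files are currently selected.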

Tech Stack

  • LangChain: Document loading, splitting, QA chains
  • PyPDFLoader: PDF parsing
  • FAISS: Vector index for nearest-neighbor similarity search
  • OpenAIEmbeddings: Embedding generation
  • ChatOpenAI: Language model (GPT-style)
  • Streamlit: Interactive web UI
  • dotenv: Environment variable management
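
The splitting step can be sketched as a fixed-size character splitter with overlap. This is a simplified illustration, not LangChain's splitter: the function name `split_text` and the default sizes are assumptions chosen for the example.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


pieces = split_text("abcdefghij", chunk_size=4, overlap=1)
```

Each chunk is then embedded separately, so chunk size trades retrieval precision (small chunks) against context per hit (large chunks).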

Installation

  1. Clone the repository
git clone https://github.com/mrp2003/askthedoc
cd askthedoc
  2. Install dependencies
pip install -r requirements.txt
  3. Set your OpenAI key by creating a .env file:
OPENAI_API_KEY=your_openai_key
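
As a rough picture of what happens to that file at startup, the sketch below parses a .env file and fails early if the key is missing. It uses only the standard library as a stand-in for the dotenv package listed in the tech stack; `read_env_file` and `require_key` are hypothetical helpers, not functions from the project.

```python
import os
import tempfile


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values: dict[str, str], name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the parsed file or the environment, or fail early."""
    key = values.get(name) or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


# Demo against a throwaway .env file so nothing real is read or written.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as fh:
        fh.write("# local secrets\nOPENAI_API_KEY=your_openai_key\n")
    loaded = read_env_file(env_path)
```

Keep the .env file out of version control so the key is never committed.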

Usage

streamlit run app.py
  • Upload one or more PDFs.
  • Select the PDFs you want to query.
  • Ask questions and see source-backed answers.
  • View stats, reset session, and export chat logs.
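
Exporting a chat log could look roughly like the sketch below. The record layout (`question`, `answer`, `sources`) and the function name `export_chat` are assumptions made for illustration; the actual export format used by the app is not documented here.

```python
import json
from datetime import datetime, timezone


def export_chat(history: list[dict], title: str = "AskTheDoc chat") -> str:
    """Serialize chat turns to a JSON string that can be offered as a download."""
    payload = {
        "title": title,
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "turns": [
            {
                "question": turn["question"],
                "answer": turn["answer"],
                "sources": turn.get("sources", []),  # chunks shown with the answer
            }
            for turn in history
        ],
    }
    return json.dumps(payload, indent=2, ensure_ascii=False)


history = [
    {"question": "What is covered?", "answer": "Parts for two years.", "sources": ["a.pdf p.3"]},
    {"question": "How do I return an item?", "answer": "Within thirty days."},
]
exported = export_chat(history)
```

In a Streamlit app, a string like this could be passed to `st.download_button` so the user can save the log from the browser.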


Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change or improve. If you found this helpful, give it a ⭐ on GitHub!
