GoogleDrive-Agent is a document question-answering system that uses RAG (Retrieval-Augmented Generation) to process documents from your Google Drive. Built on LangChain, Google Generative AI, and the Qdrant vector database, it lets you ask questions about your documents in a multi-turn conversation.
- 📁 Google Drive Integration - Direct access to your Drive folders
- 🤖 AI-Powered Q&A - Ask questions about your documents
- 📚 RAG Architecture - Retrieval-augmented generation for accurate answers
- 🔍 Vector Search - Semantic similarity search on documents
- 💬 Conversational Memory - Context-aware multi-turn conversations
- 🌐 Multi-Format Support - PDF, DOCX, TXT, and more
- ⚡ Fast Processing - Real-time document indexing
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Create a project in the Google Cloud Console
- Enable "Google Drive API" and "Generative Language API"
- Create OAuth 2.0 credentials (Desktop app)
- Download the credentials and save them as credentials.json
cp .env.example .env
# Edit .env with your credentials
streamlit run app.py
Visit http://localhost:8501
# Google APIs
GOOGLE_API_KEY=your_api_key_here
GOOGLE_CREDENTIALS_PATH=credentials.json
GOOGLE_DRIVE_FOLDER_ID=folder_id_here
# Vector Database (Qdrant)
QDRANT_URL=http://localhost:6333
# LLM Settings
LLM_MODEL=gemini-1.5-flash
LLM_TEMPERATURE=0.7
# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
- Open Google Drive
- Right-click desired folder → Share
- Copy folder ID from URL:
https://drive.google.com/drive/folders/FOLDER_ID
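As a rough illustration of how the settings above could be read in Python, here is a minimal sketch. It assumes the values are already in the process environment (for example, loaded from .env by a dotenv loader). The `Settings` class and the `load_settings` and `extract_folder_id` helpers are hypothetical names, not part of the project; the fallbacks mirror the sample values shown above. The folder helper accepts either a bare ID or a full folder URL.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping

# Matches the ID segment of a Drive folder URL such as
# https://drive.google.com/drive/folders/FOLDER_ID
_FOLDER_URL = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(value: str) -> str:
    """Return the ID from a Drive folder URL, or the value itself if it is already an ID."""
    match = _FOLDER_URL.search(value)
    return match.group(1) if match else value.strip()


@dataclass(frozen=True)
class Settings:  # hypothetical container, not taken from the project
    folder_id: str
    qdrant_url: str
    llm_model: str
    llm_temperature: float
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the sample values."""
    settings = Settings(
        folder_id=extract_folder_id(env.get("GOOGLE_DRIVE_FOLDER_ID", "")),
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        llm_model=env.get("LLM_MODEL", "gemini-1.5-flash"),
        llm_temperature=float(env.get("LLM_TEMPERATURE", "0.7")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
    )
    # An overlap as large as the chunk itself would make splitting meaningless.
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to try without touching real configuration.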
# Start all services (app + Qdrant)
docker-compose up
# Access at http://localhost:8501
# PostgreSQL not required - uses embedded vector DB
# Install dependencies
pip install -r requirements.txt
# Run Qdrant (if using local)
# Download from: https://qdrant.tech/documentation/quick-start/
# Run application
streamlit run app.py
- Click "Data Synchronization" in sidebar
- Click "Download and Process Files"
- Wait for documents to be processed
- Indexed files appear in the UI
- Type your question in the chat box
- System searches indexed documents
- AI generates answer based on found content
- View source documents in expandable sections
- View number of indexed documents
- Clear and rebuild vector store if needed
- Monitor processing status
Google Drive
↓
[OAuth Authentication]
↓
Document Downloader (drive_loader.py)
↓
Document Processor (processor.py)
↓
Text Splitter + Embeddings
↓
Vector Store (vector_store.py)
↓
Qdrant Database
↓
RAG Agent (agent.py)
↓
LangChain + Google GenAI
↓
Streamlit UI (app.py)
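The retrieve-then-generate part of the flow above can be sketched in a few lines. This is a toy illustration only: word overlap stands in for the embedding search that Qdrant performs, no LLM is called, and `Chunk`, `retrieve`, and `build_prompt` are hypothetical names rather than functions from agent.py or vector_store.py.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # file name the text came from
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the words and strip surrounding punctuation."""
    return {w.strip(".,?!:;\"'()").lower() for w in text.split()} - {""}


def retrieve(question: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    """Rank chunks by word overlap with the question (stand-in for vector search)."""
    q = tokenize(question)
    scored = [(len(q & tokenize(c.text)), c) for c in chunks]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [c for score, c in ranked[:k] if score > 0]


def build_prompt(question: str, context: list[Chunk]) -> str:
    """Assemble the text that would be sent to the LLM, with sources attached."""
    sources = "\n".join(f"[{c.source}] {c.text}" for c in context)
    return f"Answer using only this context:\n{sources}\n\nQuestion: {question}"


# Made-up sample data standing in for indexed Drive documents.
index = [
    Chunk("policy.pdf", "Employees accrue 20 vacation days per year."),
    Chunk("handbook.docx", "The office is closed on public holidays."),
    Chunk("notes.txt", "Quarterly planning starts in March."),
]
question = "How many vacation days do employees get?"
prompt = build_prompt(question, retrieve(question, index, k=1))
print(prompt)
```

Keeping the source name next to each chunk is what allows the UI to show which documents an answer was drawn from.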
├── app.py # Main Streamlit application
├── requirements.txt # Python dependencies
├── credentials.json # Google OAuth (not committed)
├── .env.example # Configuration template
├── .gitignore # Git ignore rules
└── src/
    ├── __init__.py
    ├── drive_loader.py # Google Drive integration
    ├── processor.py # Document processing
    ├── vector_store.py # Vector database management
    └── agent.py # RAG agent and LLM integration
Problem: "Invalid credentials"
Solution:
- Delete token.json if it exists
- Re-authenticate with your Google account
- Ensure the OAuth consent screen is configured
Problem: "Failed to connect to Qdrant"
Solution:
# Check if Qdrant is running
# If using Docker: docker-compose up -d qdrant
# If local: ensure port 6333 is accessible
Problem: "Out of memory" when processing large files
Solution:
- Reduce CHUNK_SIZE in .env
- Process fewer documents at once
- Increase available system RAM
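To see how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a simplified character-window splitter. It is a sketch for intuition, not the splitter the project uses, and `split_text` is a hypothetical name. Note that a smaller chunk size shrinks each piece but produces more pieces for the same document.

```python
def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


document = "x" * 2500  # placeholder for extracted document text
print(len(split_text(document)))            # default 1000 / 200
print(len(split_text(document, 500, 100)))  # smaller chunks, more of them
```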
Problem: Questions taking >30 seconds
Solution:
- Verify internet connection
- Check Qdrant database performance
- Reduce number of retrieved documents (k parameter)
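For the Qdrant checks in the troubleshooting entries above, a quick way to tell whether the database is reachable at all is to time a plain TCP connection to the address in QDRANT_URL. The sketch below uses only the standard library; `qdrant_endpoint` and `connect_time` are hypothetical helpers, and the fallback ports (443 for https, 6333 otherwise) are assumptions. It measures network reachability only, not query performance.

```python
import socket
import time
from typing import Optional
from urllib.parse import urlparse


def qdrant_endpoint(url: str) -> tuple[str, int]:
    """Split a QDRANT_URL value into host and port."""
    parsed = urlparse(url)
    # Assumed fallbacks when the URL carries no explicit port.
    default_port = 443 if parsed.scheme == "https" else 6333
    return parsed.hostname or "localhost", parsed.port or default_port


def connect_time(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None if it fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


if __name__ == "__main__":
    host, port = qdrant_endpoint("http://localhost:6333")
    elapsed = connect_time(host, port)
    if elapsed is None:
        print(f"Cannot reach {host}:{port}")
    else:
        print(f"Connected to {host}:{port} in {elapsed * 1000:.1f} ms")
```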
EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2
QDRANT_URL=https://your-qdrant-instance.com
QDRANT_API_KEY=your_api_key
# Google models
LLM_MODEL=gemini-1.5-pro
LLM_MODEL=gemini-1.5-flash # Faster, cheaper
# Others via Groq
# Requires langchain-groq package
- 🔐 Never commit credentials.json or .env
- 🔐 Use service accounts for production
- 🔐 Implement authentication layer
- 🔐 Use HTTPS in production
- 🔐 Rotate API keys regularly
- Batch Processing - Upload multiple documents together
- Chunk Optimization - Adjust CHUNK_SIZE based on document length
- Vector Search - Use similarity threshold to filter irrelevant results
- Caching - Enable Streamlit caching for repeated queries
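The similarity-threshold tip can be combined with a cap on the number of results. The sketch below assumes search results arrive as text with a score where higher means more similar; `Hit` and `select_hits` are hypothetical names, and the default values for `k` and `min_score` are placeholders to tune, not project settings.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    text: str
    score: float  # similarity score, higher means more similar


def select_hits(hits: list[Hit], k: int = 4, min_score: float = 0.5) -> list[Hit]:
    """Keep at most k hits whose score meets the threshold, best first."""
    relevant = [h for h in hits if h.score >= min_score]
    return sorted(relevant, key=lambda h: h.score, reverse=True)[:k]


# Made-up scores for illustration.
results = [Hit("a", 0.9), Hit("b", 0.2), Hit("c", 0.7), Hit("d", 0.55)]
for hit in select_hits(results, k=2):
    print(hit.text, hit.score)
```

Filtering before truncating means a low-scoring chunk never reaches the prompt just because fewer than `k` good matches exist.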
# Setup development environment
pip install -r requirements.txt
pip install pytest black flake8
# Format code
black .
# Check quality
flake8 .
MIT License - See LICENSE file
Q: Does it work offline? A: No, requires Google API and internet connection
Q: What's the maximum file size? A: 100MB per file; adjust MAX_FILE_SIZE_MB in .env
Q: Can I use private folders? A: Yes, authenticate with account that has access
Q: Is conversation history saved? A: Only during session; cleared on refresh
Q: How accurate are answers? A: Depends on document quality and chunk size
Aviasole is an AI development company specializing in cutting-edge artificial intelligence solutions. We create innovative POCs and production-ready applications that demonstrate the power and potential of modern AI technologies.
GoogleDrive-Agent is one of our showcase projects demonstrating RAG (Retrieval-Augmented Generation) patterns and intelligent document processing using LangChain and Google Generative AI.
Learn more: https://aviasole.com
This project is proudly developed and maintained by Aviasole, a leading AI development company focused on creating innovative AI solutions for document processing and enterprise knowledge management.
Last Updated: March 2026
Documentation Version: 1.0