GoogleDrive-Agent

Overview

GoogleDrive-Agent is a document question-answering system that uses RAG (Retrieval-Augmented Generation) to process documents from your Google Drive. Built on LangChain, Google Generative AI, and the Qdrant vector database, it lets you hold conversations with your documents.

Key Features

📁 Google Drive Integration - Direct access to your Drive folders
🤖 AI-Powered Q&A - Ask questions about your documents
📚 RAG Architecture - Retrieval-augmented generation for accurate answers
🔍 Vector Search - Semantic similarity search on documents
💬 Conversational Memory - Context-aware multi-turn conversations
🌐 Multi-Format Support - PDF, DOCX, TXT, and more
⚡ Fast Processing - Real-time document indexing

Quick Start

1. Install Dependencies

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Setup Google Cloud

Create a project in the Google Cloud Console
Enable "Google Drive API" and "Generative Language API"
Create OAuth 2.0 credentials (Desktop app)
Download the credentials file and save it as credentials.json in the project root

3. Configure Environment

cp .env.example .env
# Edit .env with your credentials

4. Run Application

streamlit run app.py

Visit http://localhost:8501

Configuration

Required Environment Variables

# Google APIs
GOOGLE_API_KEY=your_api_key_here
GOOGLE_CREDENTIALS_PATH=credentials.json
GOOGLE_DRIVE_FOLDER_ID=folder_id_here

# Vector Database (Qdrant)
QDRANT_URL=http://localhost:6333

# LLM Settings
LLM_MODEL=gemini-1.5-flash
LLM_TEMPERATURE=0.7

# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
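
As an illustration of how the variables above might be read and validated, here is a minimal sketch using only the standard library. The Settings class and load_settings function are hypothetical names, not part of the project, and using the example values as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    llm_model: str
    llm_temperature: float
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables; defaults mirror the example values (an assumption)."""
    settings = Settings(
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        llm_model=env.get("LLM_MODEL", "gemini-1.5-flash"),
        llm_temperature=float(env.get("LLM_TEMPERATURE", "0.7")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
    )
    # An overlap equal to or larger than the chunk size would never advance.
    if settings.chunk_size <= 0:
        raise ValueError("CHUNK_SIZE must be positive")
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return settings
```

Passing a plain dictionary instead of os.environ makes the function easy to test without touching the real environment.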

Setting Up Google Drive Folder ID

Open Google Drive
Open the desired folder (or right-click it → Share → Copy link)
Copy the folder ID from the URL: https://drive.google.com/drive/folders/FOLDER_ID
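
If you prefer to extract the ID programmatically, the following sketch pulls the path segment after "folders" out of a URL of the form shown above. The function name extract_folder_id is hypothetical and the code assumes the URL layout given in this README.

```python
from urllib.parse import urlparse


def extract_folder_id(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"not a Drive folder URL: {url}")
    index = segments.index("folders")
    if index + 1 >= len(segments):
        raise ValueError(f"folder ID missing in URL: {url}")
    return segments[index + 1]
```

Query parameters such as ?usp=sharing are ignored because only the URL path is inspected.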

Installation Methods

Docker Compose

# Start all services (app + Qdrant)
docker-compose up

# Access at http://localhost:8501

Manual Setup

# No PostgreSQL or other relational database is required - documents are indexed in the Qdrant vector database

# Install dependencies
pip install -r requirements.txt

# Run Qdrant (if using local)
# Download from: https://qdrant.tech/documentation/quick-start/

# Run application
streamlit run app.py

Usage Workflow

1. Sync Documents from Google Drive

Click "Data Synchronization" in sidebar
Click "Download and Process Files"
Wait for documents to be processed
Indexed files appear in the UI

2. Ask Questions

Type your question in the chat box
System searches indexed documents
AI generates answer based on found content
View source documents in expandable sections
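
The question-answering steps above (search, then generate from the found content, then show sources) can be pictured with a small prompt-assembly sketch. This is not the project's agent code: build_prompt is a hypothetical helper, and the prompt wording is an assumption chosen for illustration.

```python
from typing import List, Tuple


def build_prompt(question: str, chunks: List[Tuple[str, str]]) -> str:
    """Combine retrieved (source_name, text) pairs and the question into one prompt."""
    context_lines = [
        f"[{number}] ({source}) {text}"
        for number, (source, text) in enumerate(chunks, start=1)
    ]
    # Fall back to an explicit marker so the model is not given an empty context.
    context = "\n".join(context_lines) if context_lines else "(no matching content found)"
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering each chunk and keeping its source name is what makes it possible to show the supporting documents next to the answer.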

3. Manage Vector Store

View number of indexed documents
Clear and rebuild vector store if needed
Monitor processing status

Architecture

Google Drive
      ↓
[OAuth Authentication]
      ↓
Document Downloader (drive_loader.py)
      ↓
Document Processor (processor.py)
      ↓
Text Splitter + Embeddings
      ↓
Vector Store (vector_store.py)
      ↓
Qdrant Database
      ↓
RAG Agent (agent.py)
      ↓
LangChain + Google GenAI
      ↓
Streamlit UI (app.py)
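
To make the embedding and vector-search stages of this pipeline concrete, here is a toy retrieval sketch that runs without any external service. It substitutes simple word counts for a real embedding model and an in-memory list for Qdrant, so it shows the idea of similarity search only; embed, cosine, and top_k are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

In the real pipeline the vectors come from an embedding model and the nearest-neighbour search is delegated to Qdrant, but the ranking principle is the same.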

File Structure

├── app.py                    # Main Streamlit application
├── requirements.txt          # Python dependencies
├── credentials.json          # Google OAuth (not committed)
├── .env.example             # Configuration template
├── .gitignore               # Git ignore rules
└── src/
    ├── __init__.py
    ├── drive_loader.py      # Google Drive integration
    ├── processor.py         # Document processing
    ├── vector_store.py      # Vector database management
    └── agent.py             # RAG agent and LLM integration

Troubleshooting

Google OAuth Issues

Problem: "Invalid credentials"
Solution:

Delete token.json if it exists
Re-authenticate with your Google account
Ensure the OAuth consent screen is configured
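
The first step can also be scripted. The sketch below removes a cached token file so that the next run triggers a fresh OAuth flow; reset_oauth_token is a hypothetical helper, and the default location (token.json in the working directory) is an assumption.

```python
from pathlib import Path


def reset_oauth_token(token_path: str = "token.json") -> bool:
    """Delete the cached OAuth token; return True if a file was removed."""
    path = Path(token_path)
    if path.is_file():
        path.unlink()
        return True
    return False
```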

Vector Database Connection

Problem: "Failed to connect to Qdrant"
Solution:

# Check if Qdrant is running
# If using Docker: docker-compose up -d qdrant
# If local: ensure port 6333 is accessible
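
A quick way to check reachability from Python is a plain TCP connection test against the host and port in QDRANT_URL. This sketch only confirms that something is listening; it does not verify that the service is Qdrant or that it is healthy. The function name and the port fallback (443 for https, 6333 otherwise) are assumptions.

```python
import socket
from urllib.parse import urlparse


def qdrant_reachable(url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    # Assumed fallback when the URL carries no explicit port.
    port = parsed.port or (443 if parsed.scheme == "https" else 6333)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result points to a stopped container, a firewall rule, or a wrong QDRANT_URL rather than to a problem inside the application.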

Memory Issues

Problem: "Out of memory" when processing large files
Solution:

Reduce CHUNK_SIZE in .env
Process fewer documents at once
Increase available system RAM
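
To see how CHUNK_SIZE and CHUNK_OVERLAP interact, the sketch below splits text into overlapping character windows. The project's actual splitter is not shown in this README, so treat this as a simplified illustration; split_text is a hypothetical name.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks
```

With the default values, a 2,600-character text yields three chunks. A smaller CHUNK_SIZE produces more chunks, each of which is smaller, so each individual embedding request handles less text.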

Slow Responses

Problem: Questions taking >30 seconds
Solution:

Verify internet connection
Check Qdrant database performance
Reduce number of retrieved documents (k parameter)

Advanced Configuration

Custom Embedding Model

EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2

Remote Qdrant Instance

QDRANT_URL=https://your-qdrant-instance.com
QDRANT_API_KEY=your_api_key

Different LLM Models

# Google models (pick one; if both lines are present, the last one typically wins)
LLM_MODEL=gemini-1.5-pro
LLM_MODEL=gemini-1.5-flash  # Faster, cheaper

# Others via Groq
# Requires langchain-groq package

Security

🔐 Never commit credentials.json or .env
🔐 Use service accounts for production
🔐 Implement authentication layer
🔐 Use HTTPS in production
🔐 Rotate API keys regularly
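
As a small safeguard for the first point, the following sketch reports which sensitive file names are missing from a .gitignore. It is deliberately simple: it looks for exact-match lines only and does not interpret glob patterns. The helper name and the inclusion of token.json in the list are assumptions.

```python
from typing import Iterable, List

# Files that should never be committed; token.json is an assumed addition.
SENSITIVE_FILES = ("credentials.json", ".env", "token.json")


def missing_ignore_entries(gitignore_text: str,
                           required: Iterable[str] = SENSITIVE_FILES) -> List[str]:
    """Return the required names that have no exact-match line in .gitignore."""
    entries = {
        line.strip().lstrip("/")
        for line in gitignore_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    }
    return [name for name in required if name not in entries]
```

An empty result means every listed file has a matching entry; anything returned should be added to .gitignore before the next commit.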

Performance Tips

Batch Processing - Upload multiple documents together
Chunk Optimization - Adjust CHUNK_SIZE based on document length
Vector Search - Use similarity threshold to filter irrelevant results
Caching - Enable Streamlit caching for repeated queries
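
The similarity-threshold tip can be sketched as a post-filter on search results. The example assumes results arrive as (score, text) pairs where a higher score means a closer match; filter_by_score is a hypothetical helper, and the 0.5 default is arbitrary, since a sensible cutoff depends on the embedding model and distance metric in use.

```python
from typing import List, Tuple


def filter_by_score(results: List[Tuple[float, str]],
                    threshold: float = 0.5) -> List[Tuple[float, str]]:
    """Keep (score, text) pairs whose similarity score is at least the threshold."""
    return [(score, text) for score, text in results if score >= threshold]
```

Dropping weak matches before they reach the LLM shortens the prompt and reduces the chance that unrelated text influences the answer.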

Contributing

# Setup development environment
pip install -r requirements.txt
pip install pytest black flake8

# Format code
black .

# Check quality
flake8 .

License

MIT License - See LICENSE file

FAQ

Q: Does it work offline?
A: No. It requires the Google APIs and an internet connection.

Q: What's the maximum file size?
A: 100 MB per file; adjust MAX_FILE_SIZE_MB in .env to change it.

Q: Can I use private folders?
A: Yes. Authenticate with an account that has access to them.

Q: Is conversation history saved?
A: Only for the current session; it is cleared on refresh.

Q: How accurate are answers?
A: It depends on document quality and chunk size.
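
The session-only conversation history mentioned in the FAQ can be pictured as a bounded in-memory buffer. This is an illustrative sketch, not the project's implementation; SessionMemory and its max_turns parameter are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple


class SessionMemory:
    """Keeps the most recent (role, message) turns in memory only."""

    def __init__(self, max_turns: int = 10) -> None:
        # deque with maxlen silently discards the oldest turn when full.
        self._turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, role: str, message: str) -> None:
        self._turns.append((role, message))

    def as_text(self) -> str:
        """Render the stored turns as plain text, oldest first."""
        return "\n".join(f"{role}: {message}" for role, message in self._turns)

    def clear(self) -> None:
        self._turns.clear()
```

Because nothing is written to disk, the history disappears when the object is discarded, which matches the behaviour of a page refresh ending the session.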

About Aviasole

Aviasole is an AI development company specializing in cutting-edge artificial intelligence solutions. We create innovative POCs and production-ready applications that demonstrate the power and potential of modern AI technologies.

GoogleDrive-Agent is one of our showcase projects demonstrating RAG (Retrieval-Augmented Generation) patterns and intelligent document processing using LangChain and Google Generative AI.

Learn more: https://aviasole.com

Company

Aviasole - AI Development & Innovation
Website: https://aviasole.com

This project is proudly developed and maintained by Aviasole, a leading AI development company focused on creating innovative AI solutions for document processing and enterprise knowledge management.

For more AI projects and solutions, visit aviasole.com

Last Updated: March 2026
Documentation Version: 1.0
