aviasoletechnologies/GoogleDrive-Agent

GoogleDrive-Agent

License: MIT | Python 3.10+ | Streamlit App | Made by Aviasole

Overview

GoogleDrive-Agent is an intelligent document question-answering system that leverages RAG (Retrieval-Augmented Generation) to process documents from your Google Drive. Using LangChain, Google Generative AI, and Qdrant vector database, it enables conversational interactions with your documents.

Key Features

  • 📁 Google Drive Integration - Direct access to your Drive folders
  • 🤖 AI-Powered Q&A - Ask questions about your documents
  • 📚 RAG Architecture - Retrieval-augmented generation for accurate answers
  • 🔍 Vector Search - Semantic similarity search on documents
  • 💬 Conversational Memory - Context-aware multi-turn conversations
  • 🌐 Multi-Format Support - PDF, DOCX, TXT, and more
  • Fast Processing - Real-time document indexing

Quick Start

1. Install Dependencies

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Setup Google Cloud

  1. Create a project in the Google Cloud Console
  2. Enable the "Google Drive API" and the "Generative Language API"
  3. Create OAuth 2.0 credentials (application type: Desktop app)
  4. Download the credentials file and save it as credentials.json in the project root

3. Configure Environment

cp .env.example .env
# Edit .env with your credentials

4. Run Application

streamlit run app.py

Visit http://localhost:8501


Configuration

Required Environment Variables

# Google APIs
GOOGLE_API_KEY=your_api_key_here
GOOGLE_CREDENTIALS_PATH=credentials.json
GOOGLE_DRIVE_FOLDER_ID=folder_id_here

# Vector Database (Qdrant)
QDRANT_URL=http://localhost:6333

# LLM Settings
LLM_MODEL=gemini-1.5-flash
LLM_TEMPERATURE=0.7

# Document Processing
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
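
As an illustration of how these variables might be read and validated at startup, here is a minimal sketch. It is not the project's actual loader: `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the example values above, and it assumes the variables are already present in the process environment (for example, loaded from `.env` by a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed above."""
    google_api_key: str
    google_credentials_path: str
    google_drive_folder_id: str
    qdrant_url: str
    llm_model: str
    llm_temperature: float
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read and validate settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Variables without a sensible default must be present and non-empty.
    missing = [k for k in ("GOOGLE_API_KEY", "GOOGLE_DRIVE_FOLDER_ID") if not env.get(k)]
    if missing:
        raise ValueError(f"Missing required variables: {', '.join(missing)}")

    settings = Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        google_credentials_path=env.get("GOOGLE_CREDENTIALS_PATH", "credentials.json"),
        google_drive_folder_id=env["GOOGLE_DRIVE_FOLDER_ID"],
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        llm_model=env.get("LLM_MODEL", "gemini-1.5-flash"),
        llm_temperature=float(env.get("LLM_TEMPERATURE", "0.7")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
    )

    # An overlap as large as the chunk itself would never advance through the text.
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("CHUNK_OVERLAP must be smaller than CHUNK_SIZE")
    return settings
```

Failing fast on a missing key or an inconsistent chunk configuration tends to produce clearer errors than letting the problem surface later during document processing.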

Setting Up Google Drive Folder ID

  1. Open Google Drive
  2. Open the desired folder, or right-click it → Share → Copy link
  3. Copy the folder ID from the URL. It is the part after /folders/: https://drive.google.com/drive/folders/FOLDER_ID
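
If you prefer to paste the whole URL rather than pick the ID out by hand, a small helper can extract it. This is a sketch only: `extract_folder_id` is a hypothetical function, not part of this repository, and it assumes the ID consists of letters, digits, hyphens, and underscores.

```python
import re
from typing import Optional

# Matches the path segment that follows "/folders/" in a Drive folder URL.
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(url: str) -> Optional[str]:
    """Return the folder ID from a Google Drive folder URL, or None if absent."""
    match = _FOLDER_RE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Query strings such as "?usp=sharing" are ignored by the pattern.
    print(extract_folder_id("https://drive.google.com/drive/folders/FOLDER_ID?usp=sharing"))
```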

Installation Methods

Docker Compose

# Start all services (app + Qdrant)
docker-compose up

# Access at http://localhost:8501

Manual Setup

# PostgreSQL not required - uses embedded vector DB

# Install dependencies
pip install -r requirements.txt

# Run Qdrant (if using local)
# Download from: https://qdrant.tech/documentation/quick-start/

# Run application
streamlit run app.py

Usage Workflow

1. Sync Documents from Google Drive

  • Click "Data Synchronization" in sidebar
  • Click "Download and Process Files"
  • Wait for documents to be processed
  • Indexed files appear in the UI

2. Ask Questions

  • Type your question in the chat box
  • System searches indexed documents
  • AI generates answer based on found content
  • View source documents in expandable sections
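
The question-answering steps above follow the usual RAG pattern: retrieved chunks are placed in the prompt alongside the question so the model can ground its answer in them. The sketch below illustrates that assembly step only. `build_prompt` and the prompt wording are assumptions made for illustration and are not taken from `agent.py`.

```python
from typing import List, Tuple


def build_prompt(question: str, chunks: List[Tuple[str, str]]) -> str:
    """Combine retrieved (source, text) chunks and the question into one prompt."""
    # Number each chunk so the answer can refer back to its source.
    context_lines = [
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(chunks, start=1)
    ]
    context = "\n".join(context_lines) if context_lines else "(no matching documents)"
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("What is the refund policy?", [("policy.pdf", "Refunds within 30 days.")]))
```

Keeping the source name next to each chunk is what makes it possible to show source documents alongside the answer.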

3. Manage Vector Store

  • View number of indexed documents
  • Clear and rebuild vector store if needed
  • Monitor processing status

Architecture

Google Drive
      ↓
[OAuth Authentication]
      ↓
Document Downloader (drive_loader.py)
      ↓
Document Processor (processor.py)
      ↓
Text Splitter + Embeddings
      ↓
Vector Store (vector_store.py)
      ↓
Qdrant Database
      ↓
RAG Agent (agent.py)
      ↓
LangChain + Google GenAI
      ↓
Streamlit UI (app.py)
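
To make the "Text Splitter" stage concrete, here is a minimal sketch of fixed-size chunking with overlap, using the same meaning for `CHUNK_SIZE` and `CHUNK_OVERLAP` as the configuration section. It is a simplified stand-in: the project presumably relies on a LangChain text splitter, which also respects separators such as paragraphs, and `split_text` is a hypothetical name.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into character chunks where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, chunk_overlap=2):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.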

File Structure

├── app.py                    # Main Streamlit application
├── requirements.txt          # Python dependencies
├── credentials.json          # Google OAuth (not committed)
├── .env.example             # Configuration template
├── .gitignore               # Git ignore rules
└── src/
    ├── __init__.py
    ├── drive_loader.py      # Google Drive integration
    ├── processor.py         # Document processing
    ├── vector_store.py      # Vector database management
    └── agent.py             # RAG agent and LLM integration

Troubleshooting

Google OAuth Issues

Problem: "Invalid credentials"
Solution:

  1. Delete token.json if exists
  2. Re-authenticate with Google account
  3. Ensure OAuth consent screen is configured
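
The first step can be scripted. The sketch below assumes the cached token lives in `token.json` in the working directory, as the step above implies. `reset_oauth_token` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def reset_oauth_token(token_path: str = "token.json") -> bool:
    """Delete the cached OAuth token; return True if a file was removed."""
    path = Path(token_path)
    if path.is_file():
        path.unlink()
        return True
    return False


if __name__ == "__main__":
    if reset_oauth_token():
        print("Removed cached token; the next run will prompt for sign-in.")
    else:
        print("No cached token found.")
```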

Vector Database Connection

Problem: "Failed to connect to Qdrant"
Solution:

# Check if Qdrant is running
# If using Docker: docker-compose up -d qdrant
# If local: ensure port 6333 is accessible
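
One quick way to check whether the port is accessible is a plain TCP connection attempt against the host and port in `QDRANT_URL`. This is a minimal sketch using only the standard library. `is_reachable` is a hypothetical helper, and a successful connection only shows that something is listening, not that it is a healthy Qdrant instance.

```python
import socket
from urllib.parse import urlparse


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(is_reachable("http://localhost:6333"))
```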

Memory Issues

Problem: "Out of memory" when processing large files
Solution:

  1. Reduce CHUNK_SIZE in .env
  2. Process fewer documents at once
  3. Increase available system RAM

Slow Responses

Problem: Questions take more than 30 seconds to answer
Solution:

  1. Verify internet connection
  2. Check Qdrant database performance
  3. Reduce number of retrieved documents (k parameter)

Advanced Configuration

Custom Embedding Model

EMBEDDING_MODEL=sentence-transformers/all-mpnet-base-v2

Remote Qdrant Instance

QDRANT_URL=https://your-qdrant-instance.com
QDRANT_API_KEY=your_api_key

Different LLM Models

# Google models (set exactly one; a later LLM_MODEL line overrides an earlier one)
LLM_MODEL=gemini-1.5-flash  # Faster, cheaper
# LLM_MODEL=gemini-1.5-pro

# Other models via Groq
# Requires the langchain-groq package

Security

  • 🔐 Never commit credentials.json or .env
  • 🔐 Use service accounts for production
  • 🔐 Implement authentication layer
  • 🔐 Use HTTPS in production
  • 🔐 Rotate API keys regularly

Performance Tips

  1. Batch Processing - Upload multiple documents together
  2. Chunk Optimization - Adjust CHUNK_SIZE based on document length
  3. Vector Search - Use similarity threshold to filter irrelevant results
  4. Caching - Enable Streamlit caching for repeated queries
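
The effect of the `k` parameter and a similarity threshold (tip 3) can be shown with a small pure-Python sketch. In the application the search runs inside Qdrant, so this is an illustration of the idea rather than the project's code. The function names and the default values `k=4` and `score_threshold=0.5` are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int = 4,
    score_threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return up to k (index, score) pairs scoring at least score_threshold."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    # Drop weak matches first, then keep the best k of what remains.
    kept = [(i, s) for i, s in scored if s >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(top_k([1.0, 0.0], docs, k=2))
```

A lower `k` sends less context to the LLM, which shortens prompts and usually speeds up responses. The threshold keeps unrelated chunks out even when fewer than `k` good matches exist.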

Contributing

# Setup development environment
pip install -r requirements.txt
pip install pytest black flake8

# Format code
black .

# Check quality
flake8 .

License

MIT License - See LICENSE file


FAQ

Q: Does it work offline?
A: No. It requires the Google APIs and an internet connection.

Q: What's the maximum file size?
A: 100 MB per file. Adjust MAX_FILE_SIZE_MB in .env to change it.

Q: Can I use private folders?
A: Yes. Authenticate with an account that has access to the folder.

Q: Is conversation history saved?
A: Only during the session. It is cleared when the page is refreshed.

Q: How accurate are answers?
A: Accuracy depends on document quality and chunk size.


About Aviasole

Aviasole is an AI development company specializing in cutting-edge artificial intelligence solutions. We create innovative POCs and production-ready applications that demonstrate the power and potential of modern AI technologies.

GoogleDrive-Agent is one of our showcase projects demonstrating RAG (Retrieval-Augmented Generation) patterns and intelligent document processing using LangChain and Google Generative AI.

Learn more: https://aviasole.com


Company

Aviasole - AI Development & Innovation
Website: https://aviasole.com

This project is proudly developed and maintained by Aviasole, a leading AI development company focused on creating innovative AI solutions for document processing and enterprise knowledge management.

For more AI projects and solutions, visit aviasole.com


Last Updated: March 2026
Documentation Version: 1.0
