This repository was archived by the owner on Mar 29, 2026. It is now read-only.
About enabling RAG with LLM Context Enhancers #1059
Unanswered · uyaman-dev asked this question in Q&A
Replies: 1 comment
Hey! Great question about optimizing RAG by caching similar questions. Here's an approach that works well:

1. Create a Custom Tool with a Caching Layer

You can create a tool that checks for similar previous queries before hitting the vector DB:

```python
import math

from vanna.base import VannaBase

class CachedRAGVanna(VannaBase):
    def __init__(self, config=None):
        super().__init__(config)
        self.query_cache = {}  # In production, use Redis

    @staticmethod
    def cosine_similarity(a, b):
        # Simple helper; swap in your embedding store's similarity function if it provides one
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def get_similar_question(self, question, threshold=0.85):
        # Check embedding similarity against cached questions
        question_embedding = self.generate_embedding(question)
        for cached_q, cached_result in self.query_cache.items():
            similarity = self.cosine_similarity(question_embedding, cached_result['embedding'])
            if similarity > threshold:
                return cached_result['sql']
        return None

    def ask(self, question):
        # Check the cache first
        cached = self.get_similar_question(question)
        if cached:
            return cached
        # Fall back to the normal RAG flow
        result = super().ask(question)
        self.query_cache[question] = {
            'embedding': self.generate_embedding(question),
            'sql': result
        }
        return result
```

2. Conditional Context Enhancement Override

```python
    def enhance_system_prompt(self, question):
        if self.get_similar_question(question):
            return ""  # Skip enhancement for cached queries
        return super().enhance_system_prompt(question)
```

This approach significantly reduces latency for repeated or similar queries. For simpler text-to-SQL needs without building custom RAG infrastructure, ai2sql.io handles caching and optimization out of the box, but for your custom agent use case, the approach above gives you full control!
Hello,
I built a RAG system using LLM Context Enhancers. The problem is that the enhance_system_prompt function is triggered for every user question, and each time I check the vector database for relevant metadata about my schema. I want the agent to first check SearchSavedCorrectToolUsesTool to see whether a similar question has been asked before. What is the correct way to do this? Should I create a custom 'Tool' to integrate RAG so the agent can handle this approach?
thx