An enterprise-grade AI agent system I architected to let employees query company data in natural language.
The system combines AWS Bedrock (Claude), Retrieval-Augmented Generation (RAG), and intelligent query building to transform conversational questions
into executable SQL against our AWS Redshift data warehouse.
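At a high level, the service exposes an HTTP endpoint that accepts a question and returns the generated SQL together with formatted results. The sketch below is illustrative only: the endpoint path, model classes, and `answer_question` helper are assumed names, not the production API.

```python
# Illustrative request surface only: the endpoint path, model classes, and
# answer_question helper are assumed names, not the production API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str                        # e.g. "Top 10 customers by revenue last quarter?"
    conversation_id: str | None = None   # links follow-up questions to prior turns

class QueryResponse(BaseModel):
    sql: str            # the SQL generated for the question
    rows: list[dict]    # formatted Redshift results

def answer_question(question: str, conversation_id: str | None) -> tuple[str, list[dict]]:
    # Placeholder for the RAG + Claude + Redshift pipeline sketched in the sections below.
    raise NotImplementedError

@app.post("/query", response_model=QueryResponse)
def query(request: QueryRequest) -> QueryResponse:
    sql, rows = answer_question(request.question, request.conversation_id)
    return QueryResponse(sql=sql, rows=rows)
```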
Key Features
- Natural language to SQL query conversion using Claude LLM (see the Bedrock sketch after this list)
- RAG-based context retrieval for accurate query building
- Real-time query execution against AWS Redshift
- Conversation history that supports multi-turn, back-and-forth data analysis
- Serverless architecture for cost-effective scaling
- Enterprise security with proper access controls
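The NL-to-SQL step calls Claude through Bedrock. The sketch below is a minimal version assuming the Bedrock Converse API and an example Claude model ID; the production prompt also injects retrieved schema context (see RAG Implementation below).

```python
# Hedged sketch: prompt wording and model ID are illustrative; the real prompt
# is built from schema context retrieved via RAG (see RAG Implementation).
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def generate_sql(question: str, schema_context: str) -> str:
    """Ask Claude (via Bedrock) to turn a natural-language question into Redshift SQL."""
    system_prompt = (
        "You are a SQL assistant for an Amazon Redshift warehouse. "
        "Use only the tables and columns described in the provided schema context. "
        "Return a single SELECT statement and nothing else."
    )
    response = bedrock.converse(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # example model ID
        system=[{"text": system_prompt}],
        messages=[{
            "role": "user",
            "content": [{"text": f"Schema context:\n{schema_context}\n\nQuestion: {question}"}],
        }],
        inferenceConfig={"maxTokens": 512, "temperature": 0.0},
    )
    return response["output"]["message"]["content"][0]["text"].strip()
```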
Technical Architecture
- AWS Lambda serverless deployment with FastAPI framework
- AWS Bedrock integration with Claude LLM for natural language processing
- Amazon Titan embeddings for vector representations of the database schema (indexing sketched after this list)
- MongoDB vector storage for schema embeddings and metadata
- AWS Redshift integration for data warehouse queries
- Comprehensive schema documentation with column descriptions and constraints
- LangChain agent framework for orchestrating the query pipeline
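On the indexing side, each column of the warehouse schema is turned into a small document, embedded with Titan, and stored in MongoDB alongside its metadata. The sketch below uses assumed collection names, field names, and connection details, not the real schema.

```python
# Indexing-side sketch: collection, field names, and connection string are
# assumptions for illustration, not the production configuration.
import json
import boto3
from pymongo import MongoClient

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
collection = MongoClient("mongodb://localhost:27017")["warehouse"]["schema_docs"]  # placeholder URI

def embed(text: str) -> list[float]:
    """Get a Titan text embedding for one schema document."""
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # example model ID
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def index_column(table: str, column: str, dtype: str, description: str, values: list[str]) -> None:
    """Embed one column's documentation and store it with its metadata."""
    doc_text = (
        f"Table {table}, column {column} ({dtype}): {description}. "
        f"Possible values: {', '.join(values) if values else 'n/a'}."
    )
    collection.insert_one({
        "table": table,
        "column": column,
        "type": dtype,
        "description": description,
        "possible_values": values,
        "text": doc_text,
        "embedding": embed(doc_text),
    })
```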
RAG Implementation
- Database schema embedded as contextual documents in MongoDB
- Each document contains column metadata: name, description, type, possible values, table context
- Intelligent keyword extraction from user queries
- Vector similarity search to retrieve relevant schema context (see the sketch after this list)
- Dynamic SQL query generation based on retrieved context
- Query validation and execution with result formatting
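At query time, the pipeline embeds the user's question with the same Titan model used for indexing, runs a vector similarity search over the schema documents, hands the retrieved context to Claude for SQL generation (see the Bedrock sketch above), and executes the result on Redshift. The sketch below assumes MongoDB Atlas Vector Search (an index named `schema_vectors`) and the Redshift Data API with example workgroup and database names.

```python
# Retrieval-and-execution sketch: the Atlas Vector Search index, workgroup,
# and database names are assumptions, not the production configuration.
import time
import boto3
from pymongo import MongoClient

collection = MongoClient("mongodb://localhost:27017")["warehouse"]["schema_docs"]  # placeholder URI
redshift = boto3.client("redshift-data", region_name="us-east-1")

def retrieve_schema_context(question_embedding: list[float], k: int = 8) -> str:
    """Vector-similarity search over the embedded schema documents."""
    hits = collection.aggregate([
        {"$vectorSearch": {
            "index": "schema_vectors",       # assumed Atlas Vector Search index name
            "path": "embedding",
            "queryVector": question_embedding,
            "numCandidates": 100,
            "limit": k,
        }},
        {"$project": {"text": 1, "_id": 0}},
    ])
    return "\n".join(doc["text"] for doc in hits)

def run_on_redshift(sql: str) -> list[dict]:
    """Execute the generated SQL via the Redshift Data API and return rows as dicts."""
    stmt = redshift.execute_statement(
        WorkgroupName="analytics",           # assumed Redshift Serverless workgroup
        Database="dw",                       # assumed database name
        Sql=sql,
    )
    status = "SUBMITTED"
    while status not in ("FINISHED", "FAILED", "ABORTED"):
        time.sleep(0.5)
        status = redshift.describe_statement(Id=stmt["Id"])["Status"]
    if status != "FINISHED":
        raise RuntimeError(f"Query {status.lower()}: {sql}")
    result = redshift.get_statement_result(Id=stmt["Id"])
    columns = [c["name"] for c in result["ColumnMetadata"]]
    return [
        {col: list(field.values())[0] for col, field in zip(columns, record)}
        for record in result["Records"]
    ]
```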