Skip to content
/ RAG Public

An enterprise grade RAG (Retrieval Augmented Generation) system built with Spring Boot 3.x, demonstrating advanced backend engineering and modern AI integration.

Notifications You must be signed in to change notification settings

texican/RAG

Repository files navigation

BYO RAG System

Build Your Own Retrieval Augmented Generation System

Java Spring Boot Spring AI Version Tests License

🎯 Project Overview

BYO RAG System is a comprehensive AI-powered knowledge platform that shows you how to build your own enterprise-grade RAG solution from the ground up. This complete implementation demonstrates the intersection of modern software architecture and artificial intelligence through a fully-realized microservices ecosystem that enables intelligent document processing and conversational AI capabilities.

What is RAG?

Retrieval Augmented Generation (RAG) combines the power of large language models with your organization's private knowledge base, enabling AI-powered question answering over your documents while maintaining complete data privacy and control.

Key Value Propositions

  • 🏢 Multi-tenant Architecture: Complete data isolation for multiple organizations
  • 🔒 Enterprise Security: JWT-based authentication with role-based access control
  • High Performance: Sub-200ms query responses with vector similarity search
  • 📄 Document Intelligence: Extract insights from PDF, DOCX, TXT, MD, and HTML files
  • 🔍 Hybrid Search: Combines semantic understanding with keyword precision
  • 🚀 Production Ready: Containerized microservices with monitoring and observability
  • 🧪 Testing Infrastructure: Current 93% test success rate (58/62 passing) with comprehensive audit complete and 82 story point testing backlog

✅ Current Status: Complete BYO RAG system with all 6 microservices implemented and tested. Docker deployment ready. View detailed status

📚 Table of Contents

🏗️ Architecture & Design

This system implements a microservices architecture with complete multi-tenant isolation, demonstrating enterprise-grade patterns and modern cloud-native design principles.

Core Architecture Principles

  • Domain-Driven Design: Each microservice owns its domain and data
  • Event-Driven Processing: Asynchronous operations via Apache Kafka
  • Polyglot Persistence: PostgreSQL for structured data, Redis for vectors
  • Security-First: JWT authentication with tenant-scoped data access
  • Observability: Comprehensive monitoring and distributed tracing

Microservices Overview

🌐 API Gateway (Port 8080)     → Routes and secures all external traffic
🔐 Auth Service (Port 8081)    → JWT authentication & tenant management  
📄 Document Service (Port 8082) → File processing & text extraction
🔍 Embedding Service (Port 8083) → Vector generation & similarity search
🤖 RAG Core Service (Port 8084)  → LLM integration & query processing
⚙️  Admin Service (Port 8085)    → Administrative operations & analytics

🌟 Key Features

🔐 Enterprise Security & Multi-Tenancy

  • Complete Data Isolation: Each tenant's data is fully segregated
  • JWT-Based Authentication: Secure, stateless authentication
  • Role-Based Access Control: ADMIN, USER, and READER permissions
  • Audit Logging: Complete traceability of all operations

📄 Intelligent Document Processing

  • Multi-Format Support: PDF, DOCX, TXT, Markdown, HTML
  • Smart Text Extraction: Apache Tika-powered content analysis
  • Configurable Chunking: Optimized for different document types
  • Asynchronous Processing: Non-blocking operations via Kafka events

🤖 Advanced RAG Pipeline

  • Multiple Embedding Models: OpenAI, local models, custom implementations
  • Vector Similarity Search: Redis-powered with tenant isolation
  • Hybrid Search Strategy: Semantic + keyword search combination
  • LLM Integration: Support for OpenAI GPT models and local Ollama
  • Streaming Responses: Real-time answer generation
  • Enterprise Error Handling: ✅ Complete - Comprehensive retry mechanisms, circuit breakers, dead letter queues, and monitoring (ERROR-001 & KAFKA-001 implemented)

📊 Administration & Analytics

  • Multi-Tenant Management: ✅ Complete - Full tenant lifecycle operations
  • User Administration: ✅ Complete - Database-backed user management with roles
  • Usage Analytics: ✅ Complete - Comprehensive reporting and monitoring
  • Health Monitoring: ✅ Complete - Deep service health checks and diagnostics
  • Docker Deployment: ✅ Complete - All 6 services operational with health monitoring

🚀 Quick Start Guide

Prerequisites

  • Java 21+ (OpenJDK recommended)
  • Maven 3.8+
  • Docker & Docker Compose
  • Git for version control

1️⃣ Setup Your Environment

# Clone the repository
git clone https://github.com/your-org/enterprise-rag.git
cd enterprise-rag

# Start infrastructure services (PostgreSQL, Redis, Kafka, Ollama)
docker-compose up -d

# Verify all services are running
docker-compose ps

2️⃣ Build and Run Services

Option 1: Docker Compose (Recommended)

# Start all services with infrastructure (using fixed configuration)
docker-compose -f config/docker/docker-compose.fixed.yml up -d

# Check system health
./config/docker/docker-health.sh

# View service logs
docker-compose -f config/docker/docker-compose.fixed.yml logs -f

Option 2: Individual Maven Services

# Build all modules
mvn clean install

# Run each service in a separate terminal
cd rag-gateway && mvn spring-boot:run             # Port 8080 - API Gateway
cd rag-auth-service && mvn spring-boot:run        # Port 8081 - Authentication
cd rag-document-service && mvn spring-boot:run    # Port 8082 - Document Processing
cd rag-embedding-service && mvn spring-boot:run   # Port 8083 - Vector Operations
cd rag-core-service && mvn spring-boot:run        # Port 8084 - RAG Pipeline
cd rag-admin-service && mvn spring-boot:run       # Port 8085 - Admin Operations

3️⃣ Verify Installation

Current Docker Service Status (DOCKER-001 Completed):

Service Health Check URL Port Status
API Gateway http://localhost:8080/actuator/health 8080 ✅ Healthy
Auth Service http://localhost:8081/actuator/health 8081 ✅ Healthy
Document Service http://localhost:8082/actuator/health 8082 ✅ Healthy
Embedding Service http://localhost:8083/actuator/health 8083 ✅ Healthy
Core Service http://localhost:8084/actuator/health 8084 ✅ Healthy
Admin Service http://localhost:8085/admin/api/actuator/health 8085 ✅ Running

Infrastructure Services:

Service URL Status
PostgreSQL localhost:5432 ✅ Healthy
Redis Stack localhost:6379 ✅ Healthy
Apache Kafka localhost:9092 🔄 Not Yet Integrated
Ollama LLM localhost:11434 🔄 Optional
Grafana http://localhost:3000 (admin/admin) ✅ Working
Prometheus http://localhost:9090 ✅ Working

4️⃣ Test the System

Using the API Gateway (Recommended):

# 1. Check system health through gateway
curl http://localhost:8080/actuator/health

# 2. Create a tenant through gateway
curl -X POST http://localhost:8080/api/auth/tenants/register \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Dev Company",
    "slug": "dev-company", 
    "description": "Development tenant"
  }'

# 3. Login through gateway (admin user exists by default)
curl -X POST http://localhost:8080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "email": "admin@enterprise-rag.com",
    "password": "AdminPass123!"
  }'

# 4. Use the returned JWT token for authenticated requests
TOKEN="your-jwt-token-here"
curl -X GET http://localhost:8080/api/admin/tenants \
  -H "Authorization: Bearer $TOKEN"

Direct Service Testing:

# Run comprehensive system test
./scripts/tests/test-system.sh

# Check service status
./scripts/utils/service-status.sh

📊 Development Status

📊 Project Status (2025-09-11): ERROR HANDLING & PROJECT MANAGEMENT INFRASTRUCTURE COMPLETE 🎯

  • All 6 microservices operational in Docker with full system integration
  • ✅ ERROR-001 & KAFKA-001 COMPLETED - Comprehensive Kafka error handling with retry mechanisms, circuit breakers, dead letter queues, and monitoring (16 story points total)
  • ✅ PROJECT MANAGEMENT ENHANCED - Consolidated backlog management with comprehensive safeguards and validation procedures
  • ✅ DOCUMENTATION UPDATED - All project documentation current with 39 completed story points and 84 active story points properly tracked
  • ✅ BACKLOG CONSOLIDATED - Single authoritative backlog with 10 active testing/security stories (84 story points)
  • System reliability significantly enhanced with enterprise-grade error handling, automatic recovery, and comprehensive monitoring
  • Production-ready error handling with exponential backoff, circuit breaker patterns, and administrator alerting
  • Comprehensive project management with data protection safeguards and process improvements
  • Next focus: Critical testing gaps (AUTH-TEST-001, DOCUMENT-TEST-002, GATEWAY-TEST-005)

✅ Services Implementation Status (All Complete with Tests)

Service Implementation Features Test Status Docker Ready
rag-shared ✅ Complete Common DTOs, entities, utilities ✅ Unit Tests ✅ Library
rag-auth-service ✅ Complete JWT auth, tenant management ✅ Unit Tests ✅ Production
rag-document-service ✅ Complete File processing, chunking, async processing ✅ Unit Tests ✅ Production
rag-embedding-service ✅ Complete Vector operations, similarity search, enterprise error handling ✅ Unit Tests ✅ Production
rag-admin-service ✅ Complete Admin operations, database analytics ✅ Unit Tests ✅ Production
rag-core-service ✅ Complete RAG pipeline, LLM integration, streaming 100% Unit Tests ✅ Production
rag-gateway ✅ Complete API Gateway, JWT validation, routing ✅ Unit Tests ✅ Production

🎯 Recent Major Achievements (2025-09-10)

  • Service Logic Improvements: Enhanced QueryOptimizationService (60+ stopwords, better UX), ConversationService (consistent error handling), LLMIntegrationService (improved monitoring)
  • Error Handling Framework: Comprehensive 593-line defensive programming guidelines with 7 pattern categories and real implementation examples
  • Documentation Integration: Error handling standards integrated into developer workflow and contribution guidelines
  • Test Suite Maintained: All improvements implemented while maintaining 96/96 tests passing (100% success rate)
  • Thread Safety Fixes: Resolved ContextAssemblyService configuration issues and static initialization problems
  • Docker infrastructure stable: PostgreSQL + Redis Stack operational with fixed configurations
  • Comprehensive Javadoc documentation (92.4% coverage across all modules)

🔧 Current System Status

  • Complete Implementation: All 6 microservices fully implemented and operational in Docker
  • 🧪 Testing Infrastructure: Comprehensive testing audit complete with prioritized improvement roadmap
    • Test Success Rate: 93% (58/62 tests passing) with remaining infrastructure-related issues
    • Coverage Status: 27% (40 test files / 149 source files) → Target: >80%
    • Strong Areas: Core RAG service (100% unit test success), Admin service (58/58 tests passing - enterprise-grade testing), Embedding service (comprehensive error handling tests)
    • Recent Achievement: ADMIN-TEST-006 now 85% complete with enterprise testing standards applied (AssertJ assertions, comprehensive Javadoc)
    • Critical Gaps: Auth service (no unit tests), Document service (missing service layer tests), Gateway (minimal security tests)
  • Database integration: PostgreSQL + Redis Stack healthy and connected
  • Authentication service: JWT-based auth with multi-tenant support fully working
  • Service Architecture: Clean separation of concerns with proper dependency injection
  • Docker deployment: Production-ready deployment with all services operational
  • Documentation: Enterprise-grade API documentation and comprehensive testing guidelines

🛠️ Developer Reference

Architecture Diagram

graph TB
    Gateway[API Gateway<br/>Port 8080<br/>✅ WORKING]
    Auth[Auth Service<br/>Port 8081<br/>✅ WORKING]
    Doc[Document Service<br/>Port 8082<br/>✅ WORKING] 
    Embed[Embedding Service<br/>Port 8083<br/>✅ WORKING]
    Core[RAG Core Service<br/>Port 8084<br/>✅ WORKING]
    Admin[Admin Service<br/>Port 8085<br/>✅ WORKING]
    
    PG[(PostgreSQL<br/>Port 5432<br/>✅ WORKING)]
    Redis[(Redis Stack<br/>Port 6379<br/>✅ WORKING)]
    Kafka[(Apache Kafka<br/>Port 9092<br/>✅ WORKING)]
    Ollama[(Ollama LLM<br/>Port 11434<br/>✅ WORKING)]
    
    Gateway --> Auth
    Gateway --> Doc
    Gateway --> Embed
    Gateway --> Core
    Gateway --> Admin
    
    Auth --> PG
    Doc --> PG
    Embed --> Redis
    Core --> Redis
    Admin --> PG
    
    Doc --> Kafka
    Embed --> Kafka
    Core --> Kafka
    
    Core --> Ollama
    Embed --> Ollama
    
    classDef working fill:#4CAF50,stroke:#2E7D32,stroke-width:2px,color:#FFFFFF;
    class Gateway,Auth,Doc,Embed,Core,Admin,PG,Redis,Kafka,Ollama working;
Loading

Microservices Architecture

  • Multi-tenant isolation: Complete data separation by tenant
  • Event-driven processing: Async operations via Kafka
  • Polyglot persistence: PostgreSQL + Redis for different data types
  • Horizontal scaling: Stateless services with shared infrastructure

Tech Stack Reference

📋 Core Framework & Runtime
Component Version Purpose
Java 21 (LTS) Primary programming language
Spring Boot 3.2.8 Application framework
Spring AI 1.0.0-M1 AI/ML integration
Spring Cloud 2023.0.2 Microservices framework
Maven 3.8+ Build and dependency management
🗄️ Data & Storage
Component Version Purpose
PostgreSQL 42.7.3 Primary database with pgvector
Redis Stack 5.0.2 Vector storage and caching
Apache Kafka 3.7.0 Event streaming and messaging
🤖 AI/ML Libraries
Component Version Purpose
LangChain4j 0.33.0 LLM integration framework
Apache Tika 2.9.2 Document processing and text extraction
OpenAI API Latest GPT models and embeddings
Ollama Latest Local LLM inference
🧪 Testing & Quality
Component Version Purpose
JUnit 5.10.2 Unit testing framework
Testcontainers 1.19.8 Integration testing
Mockito 5.14.2 Mocking framework
WireMock 3.8.0 API mocking

Developer Workflows

Running Tests

# Run all unit tests
mvn test

# Run integration tests (requires Docker)
mvn verify -P integration-tests

# Run tests for a specific service
cd rag-auth-service && mvn test

# Skip tests during development
mvn clean install -DskipTests

Development Mode

# Hot reload enabled by default in Spring Boot DevTools
# Make changes to Java files and they'll auto-reload

# For database schema changes, use Spring Boot's DDL auto-update
# application-dev.yml: spring.jpa.hibernate.ddl-auto=update

Debugging

# Enable debug logging for a service
export LOGGING_LEVEL_COM_ENTERPRISE_RAG=DEBUG

# Debug with remote JVM debugging
mvn spring-boot:run -Dspring-boot.run.jvmArguments="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"

Working with Docker Services

# View logs for all infrastructure services
docker-compose logs -f

# Restart a specific service
docker-compose restart postgres

# Access PostgreSQL directly
docker exec -it enterprise-rag-postgres psql -U rag_user -d rag_enterprise

# Access Redis CLI
docker exec -it enterprise-rag-redis redis-cli

# View Kafka topics
docker exec -it enterprise-rag-kafka kafka-topics --bootstrap-server localhost:9092 --list

🔧 Troubleshooting

🔧 Service Won't Start
# Check if port is already in use
netstat -tulpn | grep :8081

# View application logs
cd rag-auth-service && mvn spring-boot:run

# Check Docker services are running
docker-compose ps
🗄️ Database Connection Issues
# Test PostgreSQL connection
docker exec -it enterprise-rag-postgres psql -U rag_user -d rag_enterprise

# Reset database (development only)
docker-compose down -v && docker-compose up -d

# Check database logs
docker-compose logs postgres
🧪 Tests Failing
# Run tests with verbose output
mvn test -Dtest=YourTestClass -Dspring.profiles.active=test

# Integration tests require Docker
docker-compose up -d
mvn verify -P integration-tests

# Check test container logs
docker-compose logs testcontainers

📈 Performance & Monitoring

Target Metrics

  • Response Time: <200ms (excluding LLM processing)
  • Throughput: 1000+ concurrent users
  • Availability: 99.9% uptime target

Monitoring Endpoints

  • Health Checks: /actuator/health on each service
  • Metrics: /actuator/prometheus for Prometheus scraping
  • Info: /actuator/info for build and version details

Local Monitoring Setup

# Prometheus: http://localhost:9090
# Grafana: http://localhost:3000 (admin/admin)
# Kafka UI: http://localhost:8080

🎯 Roadmap

🚀 All Core Services Complete! Focus on System Integration:

1. High Priority: System Integration & Testing

  • Docker orchestration: All services running in containers
  • 🔄 End-to-end testing: Complete RAG pipeline validation
  • 🔄 Load testing: Performance testing under concurrent load
  • 🔄 API documentation: Generate comprehensive OpenAPI/Swagger docs

2. Medium Priority: Production Deployment

  • 🔄 Kubernetes deployment: Helm charts and production orchestration
  • 🔄 CI/CD pipeline: Automated testing and deployment
  • 🔄 Security hardening: Advanced security features and audit logging
  • 🔄 Performance optimization: Database indexing and query optimization

3. Lower Priority: Advanced Features

  • 🔄 Redis Search integration: Advanced vector search features
  • 🔄 Advanced analytics: Real-time usage dashboards and reporting
  • 🔄 Multi-model support: Additional embedding and LLM model integrations
  • 🔄 Advanced caching: Distributed caching strategies

📚 Documentation

The project documentation is organized into the following categories:

🚀 Deployment & Infrastructure

🛠️ Development & Testing

📋 Project Management

📱 Applications & Tools

  • ollama-chat/README.md - Enhanced Ollama chat frontend with Docker integration ✅ COMPLETED
  • Service Health Checks - /actuator/health endpoints on all services
  • Monitoring Dashboards - Grafana at http://localhost:3000
  • Comprehensive Javadoc - Enterprise-grade API documentation (92.4% coverage)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🔥 Ready to contribute? Check out our Contributing Guidelines and start building the future of enterprise RAG systems!

About

An enterprise grade RAG (Retrieval Augmented Generation) system built with Spring Boot 3.x, demonstrating advanced backend engineering and modern AI integration.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •