SANDI Solr: Search and Index API with AI-Powered Search

SANDI Solr is a hybrid search system that combines the Apache Solr search engine with AI components: dense vector embeddings for semantic understanding, LLM-based query enhancement (spell correction, query expansion, and RAG answers), and cross-encoder reranking for improved relevance. It uses large language models to deliver context-aware search that understands natural language and goes beyond traditional keyword matching.

The Enterprise Search Challenge

The Search Problem in Software Development

Almost every system that serves end users needs good search functionality. Whether it's an internal knowledge base, customer support portal, e-commerce site, or document management system, users expect to find information quickly and easily.

This is why companies look to integrate search engines into their software. But modern users have changed what they expect from search.

What Users Want Today

Users don't want to learn special commands or complex query syntax. They want to ask a question in plain language and get a direct answer. This means search needs to go beyond simple keyword matching: it requires AI integration to understand natural language and provide intelligent answers.

Why This Is Hard

Building AI-powered search is not a simple task. It requires skills from multiple fields:

Search engine knowledge (Solr, Elasticsearch, indexing strategies)
Natural language processing (understanding queries, extracting entities)
Machine learning (training models, embeddings, vectors)
AI integration (LLMs, semantic understanding, answer generation)
And all of this on top of understanding your specific business needs

Each of these fields requires different skills and different technologies. You need:

Java or Python for search engine integration
Python for AI and machine learning
Understanding of vector databases and embeddings
DevOps skills for deploying and scaling AI services
Knowledge of hardware for GPU-based model deployment

This means that before you even start building your actual business application, you need to build a complex search system. This takes significant resources that not every company has, and it costs a lot of money.

When facing this challenge, companies usually choose one of two paths:

Route A: Build It Yourself

This means learning almost everything from scratch.

Learn search engine technology
Learn AI and machine learning
Figure out how to integrate everything
Debug and optimize the system
Maintain and update it over time

This takes a lot of time, resources, and money. The outcome is uncertain, and many home-grown search systems fail to deliver the expected results.

Route B: Hire a Search Company

You can hire a company that specializes in search to build and integrate a solution for you. This usually produces better results than building from scratch. However, it will cost a significant amount. Licensing, development and customization costs for proprietary search software can easily run into hundreds of thousands of dollars. For complex enterprise systems, costs can reach millions.

And you'll always be dependent on that vendor. Need a change? Pay for it. Need support? Pay for it. Want to switch vendors? Start over.

The Precision Problem

Enterprise search has another challenge: it needs both semantic understanding AND precision.

You need semantic understanding so that searches for "car" also find "automobile" and "vehicle". But you also need precision for company-specific terms, abbreviations, and technical identifiers.

A search for "Project Phoenix Q4 deliverables" needs to understand that this is about a specific project, in a specific quarter, and find exact matches for those terms - not just documents generally about projects or quarterly reports.

The SANDI-Solr Solution

This is where SANDI-Solr helps. It addresses both semantic understanding and precision by orchestrating a full AI-powered search pipeline on your behalf:

Client System → SANDI → Query NLP → Query Dense Vector Embeddings → Solr Hybrid Search → Cross-Encoder Reranker → RAG

Every step in this pipeline is already built, integrated, and configured for you: understanding the query with NLP, converting it to dense vectors for semantic matching, executing hybrid search in Solr, reranking results with a cross-encoder, and generating a natural language answer via RAG. All of it is available through a single installation command:

docker-compose up -d

No months of development. No hiring specialized consultants. No expensive licensing fees.

You get:

Working search engine (Apache Solr)
AI integration (embeddings, LLMs, NLP)
Hybrid search (combining keywords and semantics)
Answer generation (RAG)
Multi-tenant support
All pre-configured and ready to use

Then you can configure and modify it to address your specific company needs:

Add your own synonyms
Configure field mappings
Adjust search weights
Replace language models with your own
Integrate with your existing systems
Deploy on your own infrastructure

Data Security and Privacy:

One of the biggest advantages of SANDI-Solr is that, in the default setup, all AI processing happens on your own infrastructure. Your sensitive data does not leave your servers, and you don't need to send documents to external providers such as OpenAI, Google, or other third-party AI services for processing. Everything, from document indexing to embedding generation to answer generation, runs locally on your machines. This keeps your data under your control and makes it easier to comply with privacy regulations.

You own it. You control it. You can modify it. No licensing fees. No vendor lock-in. Your data stays secure.

Overview

SANDI Solr is a search and indexing API that combines Apache Solr with AI language models to provide semantic search, natural language processing, and answer generation. Built with Spring Boot 3 and Solr 9, it provides a complete containerized solution for AI-enhanced search.

Simple Deployment with Docker Compose

The entire platform can be started with one command:

/opt/sandi-solr$ docker-compose up -d

This starts all services, including the Solr cluster, ZooKeeper, embedding services, language models, and the NLP engine. No complex setup is needed.

SANDI Solr includes multiple AI services:

Embedding Services:

Qwen3 Embeddings: Converts text to vectors for semantic search
GPU-accelerated for better performance
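To illustrate how vectors enable semantic matching, here is a minimal sketch of cosine similarity between a query vector and two document vectors. The three-dimensional vectors are toy values invented for the example (real embedding vectors have far more dimensions), and cosine similarity is an assumption: the source does not state which similarity function SANDI Solr configures.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical values, not produced by any real model).
query_vec = [0.9, 0.1, 0.0]
doc_car = [0.8, 0.2, 0.1]
doc_cooking = [0.0, 0.1, 0.9]

print(cosine_similarity(query_vec, doc_car))
print(cosine_similarity(query_vec, doc_cooking))
```

The document whose vector points in nearly the same direction as the query scores higher, even if it shares no keywords with it.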

Language Models:

Qwen3 4B: Provided by default; can be replaced with a different model
Generates answers from search results
GPU-accelerated for better performance

Natural Language Processing:

SpaCy NLP: Extracts entities and analyzes text
Analyzes user queries to help build the Solr search query

Re-Ranking:

Qwen3 Re-Ranker: Reorders search results by semantic relevance
GPU-accelerated for better performance

Multi-Tenant Support

The multi-tenant architecture allows a single SANDI Solr deployment to serve multiple applications simultaneously, each with its own isolated search configuration. For each tenant (client), an administrator maintains independent settings for collections, field mappings, synonym sets, and AI service endpoints, so that different applications do not interfere with each other. By consolidating multiple search workloads into one deployment, multi-tenancy reduces operational overhead, hardware costs, and maintenance complexity compared to running a separate search instance for each application.

SANDI supports multiple clients with separate configurations:

Each client may have its own Solr collection or reuse an existing collection
Custom field mappings per client
Client-specific synonyms
Indexing behavior configured per client
Each client can have its own AI service endpoints
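The per-client settings listed above can be pictured as one configuration record per tenant. The sketch below is illustrative only: the class names, field names, and sample values are hypothetical and do not reflect SANDI's actual Admin API or storage format. The default endpoint ports are taken from the service port list in this document, and `localhost` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ClientConfig:
    """Hypothetical per-tenant settings; names are illustrative only."""
    client_id: str
    collection: str                                       # own or shared Solr collection
    field_mappings: dict = field(default_factory=dict)    # source field -> Solr field
    synonyms: dict = field(default_factory=dict)          # term -> list of synonyms
    embedding_url: str = "http://localhost:8085"          # assumed host
    llm_url: str = "http://localhost:8087"                # assumed host


class ClientRegistry:
    """Keeps each client's configuration isolated under its own key."""

    def __init__(self):
        self._clients = {}

    def register(self, config):
        if config.client_id in self._clients:
            raise ValueError(f"client already registered: {config.client_id}")
        self._clients[config.client_id] = config

    def get(self, client_id):
        try:
            return self._clients[client_id]
        except KeyError:
            raise KeyError(f"unknown client: {client_id}") from None


registry = ClientRegistry()
registry.register(ClientConfig(
    client_id="shop",
    collection="products",
    field_mappings={"title": "name_txt"},
    synonyms={"tv": ["television"]},
))
registry.register(ClientConfig(client_id="support", collection="kb_articles"))

print(registry.get("shop").collection)
print(registry.get("support").synonyms)
```

Using `default_factory` gives every client its own dictionaries, so editing one tenant's synonyms cannot leak into another's.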

High Availability Out of the Box

The Docker setup includes:

2 Solr nodes
3-node ZooKeeper cluster for coordination
Automatic failover
Configurable memory per node

APIs

Search API: Handles search queries combining vector and keyword search
Index API: Processes and indexes documents
Admin API: Client administration and configuration

Service Ports

8081 - Search API
8082 - Index API (Admin API)
8083 - Client Search API
8084 - Client Index API
8085 - Embedding Service
8086 - NLP Service
8087 - Language Model Service
8088 - Re-Ranking Service
8981-8982 - Solr Nodes
2181-2183 - ZooKeeper Ensemble
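After starting the stack, it can be useful to confirm that each port accepts connections. The following sketch checks only TCP reachability using the port numbers listed above. It assumes the services are published on `localhost`, and it does not call any SANDI endpoint, since endpoint paths are not described here.

```python
import socket

# Port numbers come from the list above; the labels for individual
# Solr and ZooKeeper nodes are illustrative.
SERVICE_PORTS = {
    "Search API": 8081,
    "Index API (Admin API)": 8082,
    "Client Search API": 8083,
    "Client Index API": 8084,
    "Embedding Service": 8085,
    "NLP Service": 8086,
    "Language Model Service": 8087,
    "Re-Ranking Service": 8088,
    "Solr node 1": 8981,
    "Solr node 2": 8982,
    "ZooKeeper 1": 2181,
    "ZooKeeper 2": 2182,
    "ZooKeeper 3": 2183,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost", ports=SERVICE_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<24} {'open' if up else 'closed'}")
```

An open port only shows that something is listening; it does not prove the service behind it is healthy.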

Search Features

Hybrid Search

SANDI combines keyword search with semantic vector search.

This enables:

Legacy Search: Traditional keyword matching (BM25)
Vector Search: Finds similar meaning using embeddings
Weighted Fusion: Combines both approaches with configurable weights
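As a rough illustration of weighted fusion, the sketch below normalizes each score list to the range 0 to 1 and then combines them linearly. The exact fusion formula SANDI uses is not specified in this document, so min-max normalization, the function names, and the sample scores are all assumptions.

```python
def min_max_normalize(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(keyword_scores, vector_scores,
                    keyword_weight=0.5, vector_weight=0.5):
    """Combine two score lists; a document missing from one list scores 0 there."""
    kw = min_max_normalize(keyword_scores)
    vec = min_max_normalize(vector_scores)
    fused = {}
    for doc_id in set(kw) | set(vec):
        fused[doc_id] = (keyword_weight * kw.get(doc_id, 0.0)
                         + vector_weight * vec.get(doc_id, 0.0))
    # Highest fused score first; ties broken by document id for stable output.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores for illustration.
bm25 = {"doc1": 12.0, "doc2": 7.0, "doc3": 2.0}
cosine = {"doc2": 0.91, "doc3": 0.88, "doc4": 0.40}

for doc_id, score in weighted_fusion(bm25, cosine,
                                     keyword_weight=0.3, vector_weight=0.7):
    print(doc_id, round(score, 3))
```

Normalizing first matters because BM25 scores are unbounded while similarity scores usually fall in a narrow range; without it, the weights would not mean what they appear to mean.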

Answer Generation (RAG)

SANDI has built-in support for generating answers from search results.

The system:

Finds relevant documents based on similarity
Filters by minimum score
Sends context to LLM to generate answers
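The three steps above can be sketched as follows. This is a simplified illustration, not SANDI's implementation: the threshold value, the prompt wording, the hit structure, and the function names are assumptions, and the call to the LLM service itself is left out because its interface is not described here.

```python
def select_context(hits, min_score=0.5, max_docs=3):
    """Keep hits at or above min_score, best first, capped at max_docs."""
    kept = [h for h in hits if h["score"] >= min_score]
    kept.sort(key=lambda h: h["score"], reverse=True)
    return kept[:max_docs]


def build_prompt(question, context_docs):
    """Assemble an LLM prompt, or return None if no context passed the filter."""
    if not context_docs:
        return None
    context = "\n\n".join(
        f"[{i}] {doc['text']}" for i, doc in enumerate(context_docs, start=1)
    )
    return (
        "Answer the question using only the numbered context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up search hits for illustration.
hits = [
    {"id": "a", "score": 0.82, "text": "Solr 9 supports dense vector fields."},
    {"id": "b", "score": 0.31, "text": "Unrelated release note."},
    {"id": "c", "score": 0.67, "text": "Hybrid search combines BM25 and vectors."},
]

prompt = build_prompt("Does Solr support vector search?", select_context(hits))
print(prompt)
```

Returning `None` when nothing passes the score filter lets the caller skip answer generation instead of asking the model to answer without evidence.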

Re-Ranking

After the initial search, results can be reordered by semantic relevance to the query.

This two-stage approach (search, then rerank) improves accuracy.
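The second stage can be sketched as a function that rescores the candidates returned by the first stage and keeps the best ones. In the sketch below, `token_overlap_score` is a deliberately simple stand-in for a cross-encoder model: a real re-ranker scores each query and document pair with a neural network. All names and sample texts are hypothetical.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Rescore first-stage candidates with score_fn and return the top_k."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def token_overlap_score(query, doc):
    """Stand-in scorer: Jaccard overlap of lowercase tokens (not a real model)."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


# Candidates as they might come back from the first-stage search.
candidates = [
    "quarterly report template",
    "project phoenix q4 deliverables checklist",
    "project planning guide",
]

top = rerank("project phoenix q4 deliverables", candidates,
             token_overlap_score, top_k=2)
print(top)
```

Because the expensive scorer only sees the short candidate list, the first stage can stay fast and broad while the second stage focuses on ordering.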

Document Processing

The indexing module handles multiple formats:

Apache Tika: Extracts text from PDFs, Word docs, HTML, and 1000+ formats
JSONL: Batch import of JSON documents
XML Sitemaps: Automated website indexing
Scheduled Jobs: Background processing with status tracking
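For the JSONL path, a batch import typically reads one JSON object per line, skips bad lines, and sends the rest in chunks. The sketch below shows that reading and batching step only. The required `id` field and the batch size are assumptions, since the document schema and the Index API request format are not described here.

```python
import json


def read_jsonl(lines, required_fields=("id",)):
    """Parse JSONL lines into documents, collecting (line_no, reason) errors."""
    docs, errors = [], []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        try:
            doc = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((line_no, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(doc, dict):
            errors.append((line_no, "not a JSON object"))
            continue
        missing = [name for name in required_fields if name not in doc]
        if missing:
            errors.append((line_no, f"missing fields: {', '.join(missing)}"))
            continue
        docs.append(doc)
    return docs, errors


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Sample input; in practice, pass an open file object instead of a list.
sample = [
    '{"id": "1", "title": "Install guide"}',
    '',
    '{"title": "No id"}',
    'not json',
    '{"id": "2", "title": "FAQ"}',
]

docs, errors = read_jsonl(sample)
print(len(docs), "valid;", len(errors), "rejected")
for batch in batches(docs, size=100):
    print("would index", len(batch), "documents")
```

Reporting line numbers with each rejected record makes it easier to fix the source file and rerun the import.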

Technology Stack

Core Technologies

Java 17 with Spring Boot 3.0.13
Apache Solr 9.8.1 with SolrCloud
ZooKeeper 3.9.2 for distributed coordination
Apache Tika 3.2.2 for document parsing
Tomcat 10.1.44 as web server

AI/ML Stack

PyTorch for embeddings and LLM services
Gunicorn for Python service deployment
NVIDIA GPU support for faster inference
Flask microservice architecture
Optional external OpenAI services (no GPU required)

Containerization

Docker Compose for orchestration
Official images: Solr, ZooKeeper, Tomcat
Custom Python services with GPU support
Bridge networking for service communication

Production Features

Resilience: Distributed architecture
Security: Configurable Solr authentication
Graceful Shutdown: Configurable shutdown periods
Memory Limits: Docker resource constraints
Restart Policies: Auto-recovery from failures

Getting Started

Prerequisites

Docker and Docker Compose
NVIDIA Docker runtime (for GPU services)
At least 32 GB of RAM recommended
CUDA GPU (optional, for embedding/LLM services)

Basic Deployment

# Navigate to deployment directory and unzip the downloaded zip file:
unzip sandi-solr-X.X.X.zip
cd sandi-solr-X.X.X
# Start core services:
docker-compose up -d
# Check service status:
docker-compose ps
# View logs:
docker-compose logs -f sandi_search1

Production Deployment

The single-machine setup is intended for learning, evaluation, and development.

For production environments, SANDI-Solr should be deployed across multiple machines for optimal performance, scalability, and high availability.

Typical production deployment architecture:

Solr Cluster: Deploy Solr nodes across multiple machines for distributed search and indexing
ZooKeeper Ensemble: Run ZooKeeper nodes on separate machines for coordination (minimum 3 nodes recommended)
Search API Servers: Deploy search API instances on dedicated machines for handling query load
Indexing API Servers: Run indexing API on separate machines to handle document processing
AI Services: Deploy embedding, LLM, NLP, and re-ranking services on GPU-equipped machines

These services can be distributed using Docker Swarm, Kubernetes, or manual deployment, depending on your scalability, fault tolerance, and resource requirements.

This distributed architecture allows you to scale individual components based on your workload - for example, adding more search API servers during peak query times or more GPU nodes for faster embedding generation.

Use Cases

SANDI Solr works well for:

Enterprise Search: Semantic search across internal documents with AI-generated answers
E-commerce: Combines exact matches with semantic understanding
Multi-tenant Platforms: Publishing platforms with isolated client data
Research: Search scientific papers with semantic similarity
Customer Support: Generate answers from knowledge base articles

Maven Project Structure

The SANDI Java project has three modules:

base: Shared functions library
search: WAR module for search service
index: WAR module for document indexing

Summary

SANDI Solr combines traditional search with AI to provide semantic search and answer generation. It provides a ready-to-use foundation for building AI-powered search applications. The Docker Compose deployment makes it easy to set up and run. Get started with one command.