OceanBase Releases seekdb: Single-Node AI Native Hybrid Search for RAG and Agents
seekdb unifies vector, full-text, and relational search in a single MySQL-compatible engine, so RAG and agent workflows can run hybrid retrieval and in-database AI functions with one SQL query.
What is seekdb?
seekdb is an open source, AI-native search database from OceanBase made for AI applications that need to combine many data types. It runs as a single-node engine that can operate in embedded, client, or server mode, remains MySQL-compatible, and supports SQL syntax familiar to developers. Unlike full OceanBase, seekdb is not distributed; it targets local, edge, and embedded AI workloads.
Unified multi-model data storage
seekdb stores relational tables, vectors, full text, JSON metadata, and GIS data in the same storage and indexing layer. That means documents, embeddings, and associated metadata can live together and be queried without moving data between different systems.
Supported data types include:
- Relational data with standard SQL
- Dense and sparse vectors
- Full text indexes with advanced tokenization
- JSON documents and JSON indexes
- Spatial GIS indexes
Hybrid search as a first-class capability
Hybrid search is the primary feature. seekdb lets you combine semantic vector search, keyword full text search, and scalar relational filters in a single query and ranking step. OceanBase exposes this functionality via the DBMS_HYBRID_SEARCH package with two entry points:
- DBMS_HYBRID_SEARCH.SEARCH, which returns JSON results sorted by relevance
- DBMS_HYBRID_SEARCH.GET_SQL, which returns the SQL string used for execution
Hybrid queries can run pure vector, pure full text, or combined flows, while pushing filters and joins down into storage. The engine supports reranking strategies such as weighted scoring, reciprocal rank fusion, and integration with LLM-based re-rankers.
For RAG and agent memory, this enables one SQL query to do semantic matching on embeddings, exact matching on product codes or names, and relational filtering by user, tenant, or other metadata.
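Reciprocal rank fusion, one of the reranking strategies named above, can be sketched in a few lines of Python. This is a generic illustration of the technique, not seekdb's internal implementation: the function name, the sample document ids, and the constant k=60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a seekdb setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a vector search and a full-text search.
vector_hits = ["doc3", "doc1", "doc2"]
fulltext_hits = ["doc1", "doc4", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why this kind of fusion is a common choice when vector and keyword scores are not directly comparable.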
Vector and full text engine details
seekdb ships its vector and full-text engines together. For vectors, it supports both dense and sparse representations and several distance metrics: Manhattan, Euclidean, inner product, and cosine. Index types include in-memory HNSW variants and disk-based IVF and PQ variants. seekdb can also manage hybrid vector indexes and auto-generate embeddings from raw text, reducing the need for separate preprocessing pipelines.
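For readers less familiar with the four metrics listed above, here is a minimal Python sketch of their standard definitions. The function names and sample vectors are illustrative assumptions and say nothing about how seekdb computes or names these metrics internally.

```python
import math


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance: straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product: larger means more similar for comparable-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector magnitude."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm


# Hypothetical two-dimensional embeddings.
query = [1.0, 0.0]
doc = [3.0, 4.0]

print(manhattan(query, doc))
print(euclidean(query, doc))
print(inner_product(query, doc))
print(cosine_similarity(query, doc))
```

Manhattan and Euclidean are distances (smaller is closer), while inner product and cosine are similarities (larger is closer); a cosine distance is commonly derived as 1 minus the cosine similarity.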
Full text capabilities include keyword, phrase, and Boolean queries, BM25 ranking, and several tokenizer modes. Both vector and text indexes are integrated into the same query planner alongside scalar and GIS indexes, which simplifies hybrid retrieval without external orchestration.
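To make the BM25 ranking mentioned above concrete, the following Python sketch scores a tiny tokenized corpus with the common Okapi BM25 formula. The parameter values k1=1.2 and b=0.75, the whitespace tokenization, and the sample documents are assumptions for illustration; seekdb's own tokenizers and BM25 parameters may differ.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization penalizes documents longer than average.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical corpus, tokenized by whitespace.
docs = [
    "hybrid search combines vector and keyword search".split(),
    "relational filters narrow the result set".split(),
    "keyword search uses an inverted index".split(),
]

scores = bm25_scores(["keyword", "search"], docs)
print(scores)
```

The first document scores highest because it repeats a query term, the third matches each term once, and the second matches nothing and scores zero.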
AI functions inside the database
seekdb exposes AI functions callable from SQL so applications can invoke models without a separate service for every request. Key functions are:
- AI_EMBED to generate embeddings from text
- AI_COMPLETE for text generation via chat or completion models
- AI_RERANK to rerank candidate lists
- AI_PROMPT to build prompt templates and dynamic inputs for AI_COMPLETE
Model configuration and endpoints are managed with DBMS_AI_SERVICE, letting DBAs register providers, set URLs, and configure keys on the database side.
Multimodal workloads and indexing
seekdb is designed for multimodal workloads: vectors, text, JSON, and GIS are indexable and queryable within the same node. You can run queries that:
- Find semantically similar documents by embedding search
- Filter results by JSON metadata such as tenant, region, or category
- Constrain results by spatial range or polygon
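The three query patterns above can be combined in a single retrieval step. The Python sketch below mimics that combination over in-memory rows: it filters by a metadata field and a bounding box, then ranks the survivors by cosine similarity. Everything here is hypothetical (the row layout, field names, coordinates, and the `search` helper); in seekdb the equivalent work would be expressed in SQL and served by its indexes.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def in_bbox(point, bbox):
    """True if (lon, lat) lies inside (min_lon, min_lat, max_lon, max_lat)."""
    lon, lat = point
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def search(rows, query_vec, tenant, bbox, top_k=2):
    """Filter by tenant and spatial range, then rank by similarity."""
    candidates = [
        r for r in rows
        if r["meta"]["tenant"] == tenant and in_bbox(r["location"], bbox)
    ]
    candidates.sort(key=lambda r: cosine(r["embedding"], query_vec), reverse=True)
    return candidates[:top_k]


# Hypothetical rows: embedding, JSON-like metadata, and a (lon, lat) point.
rows = [
    {"id": 1, "embedding": [0.9, 0.1], "meta": {"tenant": "acme"}, "location": (13.4, 52.5)},
    {"id": 2, "embedding": [0.1, 0.9], "meta": {"tenant": "acme"}, "location": (13.3, 52.4)},
    {"id": 3, "embedding": [1.0, 0.0], "meta": {"tenant": "other"}, "location": (13.4, 52.5)},
    {"id": 4, "embedding": [1.0, 0.0], "meta": {"tenant": "acme"}, "location": (2.35, 48.85)},
]

results = search(rows, query_vec=[1.0, 0.0], tenant="acme", bbox=(13.0, 52.0, 14.0, 53.0))
print([r["id"] for r in results])
```

Rows 3 and 4 match the query vector exactly but are excluded by the tenant and spatial filters, which shows why applying the filters inside the same query matters for correctness as well as speed.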
Because seekdb is built on the OceanBase engine, it inherits ACID transactions, hybrid row/column storage, and vectorized execution. For large-scale workloads that need a distributed deployment, full OceanBase remains the intended product.
Who should consider seekdb?
seekdb fits teams building RAG systems, AI agents, code assistants, or local/edge AI services that benefit from a unified data plane. Its MySQL compatibility and embedded single-node design make it practical for projects that want integrated hybrid search and database-native AI capabilities without assembling separate vector stores, search engines, and OLTP databases.