OceanBase Releases seekdb: Single-Node AI Native Hybrid Search for RAG and Agents

What is seekdb?

seekdb is an open-source, AI-native search database from OceanBase, built for AI applications that need to combine several data types. It runs as a single-node engine in embedded, client, or server mode, is MySQL-compatible, and supports familiar SQL syntax. Unlike full OceanBase, seekdb is not distributed; it targets local, edge, and embedded AI workloads.

Unified multi-model data storage

seekdb stores relational tables, vectors, full text, JSON metadata, and GIS data in the same storage and indexing layer. That means documents, embeddings, and associated metadata can live together and be queried without moving data between different systems.

Supported data types include:

Relational data with standard SQL
Dense and sparse vectors
Full text indexes with advanced tokenization
JSON documents and JSON indexes
Spatial GIS indexes

Hybrid search as a first-class capability

Hybrid search is the primary feature. seekdb lets you combine semantic vector search, keyword full text search, and scalar relational filters in a single query and ranking step. OceanBase exposes this functionality via the DBMS_HYBRID_SEARCH package with two entry points:

DBMS_HYBRID_SEARCH.SEARCH, which returns JSON results sorted by relevance
DBMS_HYBRID_SEARCH.GET_SQL, which returns the SQL string used for execution

A hybrid query can be pure vector, pure full text, or a combination of the two, and filters and joins are pushed down into storage. The engine supports reranking strategies such as weighted scoring, reciprocal rank fusion, and integration with LLM-based re-rankers.
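To make the two score-fusion strategies concrete, here is a minimal Python sketch of reciprocal rank fusion and weighted scoring. It illustrates the general techniques only and is not seekdb's internal implementation; the function names, the constant k=60, the min-max normalization, and the default weight are all assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score.

    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


def _min_max(scores):
    """Rescale raw scores to [0, 1] so different scorers become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 1.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}


def weighted_fusion(vector_scores, text_scores, vector_weight=0.5):
    """Combine normalized vector and text scores with a linear weight."""
    v, t = _min_max(vector_scores), _min_max(text_scores)
    combined = {
        d: vector_weight * v.get(d, 0.0) + (1.0 - vector_weight) * t.get(d, 0.0)
        for d in set(v) | set(t)
    }
    return sorted(combined, key=lambda d: (-combined[d], d))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # hypothetical IDs ranked by vector similarity
    text_hits = ["b", "d", "a"]     # hypothetical IDs ranked by keyword relevance
    print(reciprocal_rank_fusion([vector_hits, text_hits]))
```

Rank fusion needs only the order of each result list, which makes it a reasonable choice when vector distances and text relevance scores are on incompatible scales; weighted scoring gives finer control but depends on how the scores are normalized.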

For RAG and agent memory, this enables one SQL query to do semantic matching on embeddings, exact matching on product codes or names, and relational filtering by user, tenant, or other metadata.

Vector and full text engine details

seekdb builds vector and full-text search into the same engine. For vectors it supports dense and sparse representations and multiple distance metrics, including Manhattan, Euclidean, inner product, and cosine. Index types include in-memory HNSW variants and disk-based IVF and PQ variants. seekdb can also manage hybrid vector indexes and auto-generate embeddings for raw text, reducing the need for separate preprocessing pipelines.
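For readers less familiar with the four metrics named above, the following Python sketch gives their textbook definitions. The function names are illustrative and do not correspond to seekdb SQL functions; a real engine computes these inside the index rather than in application code.

```python
import math


def _check(a, b):
    """Both vectors must have the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    _check(a, b)
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance: straight-line distance between the two points."""
    _check(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product: larger means more similar (it is not a distance)."""
    _check(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product of the vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    u, v = [1.0, 0.0], [0.0, 1.0]
    print(manhattan(u, v), euclidean(u, v), inner_product(u, v), cosine_similarity(u, v))
```

Manhattan and Euclidean are distances, where smaller means closer, while inner product and cosine are similarities, where larger means closer, so the sort direction depends on the metric chosen for the index.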

Full text capabilities include keyword, phrase, and Boolean queries, BM25 ranking, and several tokenizer modes. Both vector and text indexes are integrated into the same query planner alongside scalar and GIS indexes, which simplifies hybrid retrieval without external orchestration.
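As a rough illustration of BM25 ranking, the sketch below scores a small set of documents against a query using the textbook Okapi BM25 formula. The whitespace tokenizer, the parameters k1=1.2 and b=0.75, and the smoothed IDF variant are assumptions made for the example; seekdb's own tokenizers and scoring details may differ.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query string."""
    if not docs:
        return []
    # Stand-in tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            tf = term_freq[term]
            denom = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = ["hybrid search for rag", "vector index tuning", "full text search"]
    print(bm25_scores("hybrid search", corpus))
```

Rare terms contribute more than common ones, and repeated occurrences of a term give diminishing returns, which is why BM25 tends to complement embedding similarity on exact identifiers such as product codes.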

AI functions inside the database

seekdb exposes AI functions callable from SQL, so applications can invoke models without routing each request through a separate service. Key functions are:

AI_EMBED to generate embeddings from text
AI_COMPLETE for text generation via chat or completion models
AI_RERANK to rerank candidate lists
AI_PROMPT to build prompt templates and dynamic inputs for AI_COMPLETE

Model configuration and endpoints are managed with DBMS_AI_SERVICE, letting DBAs register providers, set URLs, and configure keys on the database side.

Multimodal workloads and indexing

seekdb is designed for multimodal workloads: vectors, text, JSON, and GIS are indexable and queryable within the same node. You can run queries that:

Find semantically similar documents by embedding search
Filter results by JSON metadata such as tenant, region, or category
Constrain results by spatial range or polygon
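The three steps above can be sketched in plain Python to show how they compose: filter on metadata, filter on a spatial range, then rank the survivors by embedding similarity. This is an in-memory illustration only, not how seekdb executes such a query; the record layout, field names, and the bounding-box form of the spatial constraint are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multimodal_search(records, query_vec, metadata_filter, bbox, top_k=3):
    """Return IDs of the top_k records that pass both filters.

    records: dicts with hypothetical keys "id", "embedding", "meta", "location".
    bbox: (min_lon, min_lat, max_lon, max_lat), a simple rectangular range.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    candidates = []
    for rec in records:
        # Scalar filter on JSON-like metadata (for example tenant or region).
        if any(rec["meta"].get(key) != value for key, value in metadata_filter.items()):
            continue
        # Spatial filter: keep only points inside the bounding box.
        lon, lat = rec["location"]
        if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
            continue
        candidates.append((cosine_similarity(query_vec, rec["embedding"]), rec["id"]))
    # Most similar first; ties broken by ID for deterministic output.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in candidates[:top_k]]


if __name__ == "__main__":
    docs = [
        {"id": 1, "embedding": [1.0, 0.0], "meta": {"tenant": "acme"}, "location": (116.4, 39.9)},
        {"id": 2, "embedding": [0.9, 0.1], "meta": {"tenant": "acme"}, "location": (121.5, 31.2)},
        {"id": 3, "embedding": [0.0, 1.0], "meta": {"tenant": "acme"}, "location": (116.3, 39.8)},
    ]
    print(multimodal_search(docs, [1.0, 0.0], {"tenant": "acme"}, (116.0, 39.0, 117.0, 41.0)))
```

In a database that indexes all three data types, the planner can decide which filter to apply first; the fixed filter-then-rank order here is a simplification.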

Because seekdb is built on the OceanBase engine, it inherits ACID transactions, hybrid row/column storage, and vectorized execution. For workloads that need distributed scale, full OceanBase remains the intended product.

Who should consider seekdb?

seekdb fits teams building RAG systems, AI agents, code assistants, or local/edge AI services that benefit from a unified data plane. Its MySQL compatibility and embedded single-node design make it practical for projects that want integrated hybrid search and database-native AI capabilities without assembling separate vector stores, search engines, and OLTP databases.