Vector search as SPARQL operators, not a separate API.
Loka extends SPARQL 1.1 with two operators that integrate vector similarity search directly into graph query patterns:
VECTOR_SIMILAR filters results to those whose similarity to a query vector is above a threshold:
# Find documents similar to a query vector that mention a :Person entity
SELECT ?doc ?entity WHERE {
  ?entity rdf:type :Person .
  ?doc :mentions ?entity .
  VECTOR_SIMILAR(?doc :hasEmbedding "0.23 -0.11 ..."^^loka:f32vec, 0.85)
}
The key: VECTOR_SIMILAR participates in the same query plan as regular triple patterns. The planner decides whether to run the vector search first or the graph pattern first based on cost.
VECTOR_SCORE returns the similarity score for use in ORDER BY and projections:
SELECT ?doc ?score WHERE {
  ?doc :hasEmbedding ?emb .
  BIND(VECTOR_SCORE(?doc :hasEmbedding "0.23 -0.11 ..."^^loka:f32vec) AS ?score)
}
ORDER BY DESC(?score)
LIMIT 10
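To make the semantics of the two operators concrete, here is a minimal brute-force sketch in Python. It is not how Loka evaluates them (the hints described in this document indicate an HNSW index), and it rests on assumptions that are not confirmed here: cosine similarity as the metric, and a strict "greater than" comparison against the threshold. The function names and sample data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def vector_similar(embeddings, query, threshold):
    """Toy VECTOR_SIMILAR: keep subjects whose score is above the threshold."""
    return {s for s, v in embeddings.items() if cosine(v, query) > threshold}

def vector_score_top(embeddings, query, limit):
    """Toy VECTOR_SCORE combined with ORDER BY DESC(?score) and LIMIT."""
    scored = [(s, cosine(v, query)) for s, v in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Hypothetical two-dimensional embeddings keyed by document.
EMBEDDINGS = {
    "doc1": [1.0, 0.0],
    "doc2": [0.8, 0.6],
    "doc3": [0.0, 1.0],
}
QUERY = [1.0, 0.0]

similar = vector_similar(EMBEDDINGS, QUERY, 0.85)
ranked = vector_score_top(EMBEDDINGS, QUERY, 2)
```

With this data only `doc1` passes the 0.85 threshold, while the ranking returns `doc1` followed by `doc2`. The difference mirrors the two operators: one restricts the result set, the other exposes a value to sort or project.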
Both operators accept optional hints for fine-tuning HNSW search behavior:
# Explicit ef_search for higher recall
VECTOR_SIMILAR(?doc :hasEmbedding "..."^^loka:f32vec, 0.85, ef:=200)
# Explicit k limit
VECTOR_SIMILAR(?doc :hasEmbedding "..."^^loka:f32vec, 0.80, k:=50)
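When the query vector comes from application code, the clause has to be assembled as text. The following Python sketch builds a VECTOR_SIMILAR clause in the syntax shown above. The helper names are hypothetical, the space-separated lexical form of `loka:f32vec` is inferred from the examples, and passing both hints in a single call is an assumption, since the examples show each hint only on its own.

```python
def f32vec_literal(vector):
    """Format floats as a space-separated loka:f32vec literal (form assumed from the examples)."""
    body = " ".join(repr(float(x)) for x in vector)
    return f'"{body}"^^loka:f32vec'

def vector_similar_clause(subject, predicate, vector, threshold, ef=None, k=None):
    """Build a VECTOR_SIMILAR clause with optional ef and k hints."""
    args = [f"{subject} {predicate} {f32vec_literal(vector)}", str(threshold)]
    if ef is not None:
        args.append(f"ef:={int(ef)}")  # explicit ef_search for higher recall
    if k is not None:
        args.append(f"k:={int(k)}")    # explicit k limit
    return f"VECTOR_SIMILAR({', '.join(args)})"

clause = vector_similar_clause("?doc", ":hasEmbedding", [0.23, -0.11], 0.85, ef=200)
```

Here `clause` holds a string that can be placed inside a WHERE block. Real code should also validate the subject and predicate before interpolating them into a query.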
The query planner treats VECTOR_SIMILAR as another pattern with a cost estimate. The planning rules:
| Situation | Strategy |
|---|---|
| Subject unbound before VECTOR_SIMILAR | Run vector search first (returns top-k candidates), then evaluate graph patterns over candidates |
| Subject bound before VECTOR_SIMILAR | Run graph pattern first (binds subjects), then filter by vector similarity |
This means that for a query like "find papers similar to X that are written by authors from Tokyo", the planner automatically chooses the more efficient execution order based on the data distribution.
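The two rules in the table can be sketched as a small decision function. This Python sketch encodes only the bound/unbound rule and does not model the planner's cost estimates. The triple representation and the strategy names are hypothetical.

```python
def choose_strategy(vector_subject, patterns_before):
    """Apply the table's rule: is the subject bound before VECTOR_SIMILAR?

    patterns_before is a list of (subject, predicate, object) triples that
    precede the VECTOR_SIMILAR pattern; terms starting with "?" are variables.
    """
    bound = set()
    for s, _p, o in patterns_before:
        bound.update(t for t in (s, o) if t.startswith("?"))
    return "graph_first" if vector_subject in bound else "vector_first"

# ?doc is bound by the :mentions pattern, so the graph pattern runs first.
bound_case = choose_strategy("?doc", [
    ("?entity", "rdf:type", ":Person"),
    ("?doc", ":mentions", "?entity"),
])

# Nothing binds ?doc beforehand, so the vector search runs first.
unbound_case = choose_strategy("?doc", [])
```

In the first case the vector comparison only needs to run over the subjects the graph pattern already produced. In the second the top-k candidates from the vector search become the input to the remaining graph patterns.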
A separate vector search API (like Qdrant's REST API) requires the application to run the vector search and the graph query as separate requests and combine the results itself.
With VECTOR_SIMILAR in SPARQL, it is one query, one round trip, one result set. The planner handles the interleaving internally.