Design decisions for deletion, entry points, persistence, and concurrency.
When a vector is deleted from the HNSW index, the node is not removed. It is flagged as deleted (tombstoned) but remains in the graph structure. Searches can still traverse through deleted nodes to reach their live neighbors.
Why: removing a node from an HNSW graph requires rewiring all its neighbors' connections, which is expensive and can fragment the graph. Tombstoning preserves graph connectivity at the cost of wasted traversal steps.
When the tombstone ratio exceeds a threshold (configurable, default 30%), the background maintenance cycle rebuilds the entire HNSW graph from scratch, producing a fresh graph with no tombstones.
Standard HNSW uses a single entry point at the top layer. This can cause searches to get stuck in local optima on large, clustered vector spaces — if the entry point is far from the query region, greedy descent may converge to a suboptimal cluster.
Loka maintains multiple entry points with geometric probability. Searches begin from the entry point closest to the query vector, reducing the number of hops needed to reach the target region.
The HNSW graph is not persisted to the .sdb file. On startup, it is rebuilt from the stored vector triples (which are persisted in SPO/POS/OSP). This guarantees the HNSW graph is always consistent with the stored data.
For faster cold start on large indexes, optional snapshots can be saved and loaded. If the snapshot is stale (doesn't match stored triples), it is discarded and the graph is rebuilt from scratch.
HNSW search takes &self (shared reference), meaning any number of searches can execute concurrently without blocking each other. Each search maintains its own visited set (thread-local allocation), following the Qdrant pattern.
Insert and delete take &mut self (exclusive reference). In practice, writes are batched and applied between search rounds, or the maintenance cycle handles bulk rebuilds during low-usage periods.
| Parameter | Default | Range | Effect |
|---|---|---|---|
M | 16 | 8–64 | Max connections per node per layer. Higher = better recall, more memory. |
ef_construction | 200 | 50–500 | Beam width during build. Higher = better graph quality, slower construction. |
ef_search | Tunable per query | 10–1000 | Beam width during search. Higher = better recall, slower search. |
dimensions | Declared per predicate | Any | Fixed at schema time, enforced on insert. Mismatch = hard error. |