Semantic Search
Free-text search over the record log. Embeds the query through a configured provider, ranks records by cosine similarity, and returns the top K. Envelope filters (thread, actor, kind, after_clock) are applied after ranking, so every returned hit is one of the K globally closest records.
Overview
Semantic search answers the question "give me records that mean something like X", where X is free-text and "means like" is cosine similarity in an embedding space. It complements Query: query is the right tool when you know the shape of what you want (body.kind, body.priority, exact thread); search is the right tool when you only know the idea and want records nearby that idea regardless of surface form.
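To make "cosine similarity in an embedding space" concrete, here is a minimal sketch using toy two-dimensional vectors. It is not the engine's implementation; the function names cosine_similarity and top_k and the record ids are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention of this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, embedded_records, k):
    """Return the k (score, record_id) pairs with the highest similarity."""
    scored = [
        (cosine_similarity(query_vec, vec), record_id)
        for record_id, vec in embedded_records.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings"; real models produce hundreds of dimensions.
records = {
    "r_a": [1.0, 0.0],
    "r_b": [0.6, 0.8],
    "r_c": [0.0, 1.0],
}
ranked = top_k([1.0, 0.0], records, k=2)
for score, record_id in ranked:
    print(f"{score:.3f} {record_id}")
```

The record pointing in the same direction as the query ranks first, and the orthogonal record (similarity 0) falls outside the top 2.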
Three entry points:
- CLI: spl search "query"
- HTTP: POST /v1/records/search with a JSON body
- SDK: client.search({ query }) (TypeScript) or await client.search(query) (Python)
All three talk to the same embedding provider — configured once per daemon, used by every search. The Ollama provider (local, zero-cost) ships today; additional hosted providers land in later releases.
Prerequisites — configure an embedding provider
Semantic search is off by default. Before any search works, configure a provider:
# One-time setup — installs nothing; requires an Ollama daemon already running.
spl config embedding-provider set ollama \
--endpoint http://localhost:11434 \
--model nomic-embed-text
This writes an embedding_provider LEARN record on th_engine_config. The daemon rebuilds its embedder on the next broadcast of that record, so no restart is needed. Verify:
spl config embedding-provider show
When no provider is configured the daemon returns HTTP 503 on search requests. Both SDKs surface this distinctly (disabled: true on the result) so a UI can render a "set up Ollama" hint instead of treating it as an error.
To disable search without clearing other config:
spl config embedding-provider clear
Quick Start
# Top 5 records semantically near "authentication failure logs".
spl search "authentication failure logs" -k 5
#1 0.847 r_3f4a... KNOW th_incident_42 ...
#2 0.812 r_8b21... KNOW th_incident_42 ...
#3 0.794 r_c0de... DO th_session_93 ...
...
From TypeScript:
import { Client, Identity } from "@syncropel/sdk";
const client = new Client({
endpoint: "http://localhost:9100",
identity: Identity.static("did:sync:user:alice"),
});
const result = await client.search({
query: "authentication failure logs",
k: 5,
});
if (result.disabled) {
console.log("Semantic search not configured — run `spl config embedding-provider set ollama`");
} else {
for (const hit of result.hits) {
console.log(hit.score.toFixed(3), hit.record.id, hit.record.body);
}
}
From Python:
import asyncio
from syncropel import Client, Identity

async def main() -> None:
    client = Client(
        endpoint="http://localhost:9100",
        identity=Identity.static("did:sync:user:alice"),
    )
    # search() is awaited, so it has to run inside an async function.
    result = await client.search("authentication failure logs", k=5)
    if result.get("disabled"):
        print("Semantic search not configured. Run `spl config embedding-provider set ollama`")
    else:
        for hit in result["hits"]:
            print(f"{hit['score']:.3f} {hit['record']['id']} {hit['record']['body']}")

asyncio.run(main())
Request Shape
{
"query": "free-text",
"k": 10,
"thread": "th_optional",
"actor": "did:sync:agent:dev",
"kind": "core.task.record",
"after_clock": 12345
}
- query (required): the free-text string. Embedded verbatim through the configured provider.
- k: top-K to return. Clamped to [1, 100]. Default 10.
- thread: restrict to one thread id.
- actor: restrict to records emitted by a specific actor DID.
- kind: restrict to records whose body.kind matches.
- after_clock: restrict to records with clock >= after_clock. Useful for time-windowed searches.
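The field list above can be turned into a small request builder. This is a sketch under assumptions: build_search_request is a hypothetical helper, not part of either SDK. The daemon is documented to clamp k itself, so mirroring the clamp client-side is a choice made here, as is treating a whitespace-only query as empty.

```python
import json

K_MIN, K_MAX, K_DEFAULT = 1, 100, 10  # values taken from the field list above


def build_search_request(query, k=K_DEFAULT, thread=None, actor=None,
                         kind=None, after_clock=None):
    """Assemble a search request body, omitting filters that were not supplied."""
    if not query or not query.strip():
        # Assumption of this sketch: whitespace-only counts as empty.
        raise ValueError("query must be a non-empty string")
    body = {"query": query, "k": max(K_MIN, min(K_MAX, k))}
    optional = {
        "thread": thread,
        "actor": actor,
        "kind": kind,
        "after_clock": after_clock,
    }
    body.update({name: value for name, value in optional.items() if value is not None})
    return body


payload = build_search_request(
    "authentication failure logs", k=500, thread="th_incident_42"
)
print(json.dumps(payload, indent=2))
```

The resulting dict can be sent as the JSON body of POST /v1/records/search with any HTTP client.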
Filters are post-ranking
This is the important invariant: envelope filters narrow the top-K after cosine ranking, not before. The engine embeds the query, scores every record that has an embedding, picks the top K globally, then applies your thread / actor / kind / after_clock predicates.
Why: pre-filtering ranks only the records inside the filter, so it fills the K slots with whatever that slice contains, including weak matches when nothing in the slice is close to the query. Post-filtering preserves rank quality: every hit you get is among the K globally closest records and also satisfies your constraints. The trade-off is that you can receive fewer than K hits, or none, when the closest matches sit outside the filter.
If you need strict pre-filtering (e.g. "only look inside this thread"), combine with Query instead: search broadly, then query each hit's thread. In practice the post-filter behavior is what you want most of the time.
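The difference between the two orderings is easiest to see on a toy ranking. The sketch below uses made-up scores and hypothetical function names; it only illustrates the order of operations, not the engine's code.

```python
def post_filter_search(scored, k, predicate):
    """Documented order: take the global top k, then drop hits failing the filter."""
    top = sorted(scored, key=lambda hit: hit["score"], reverse=True)[:k]
    return [hit for hit in top if predicate(hit)]


def pre_filter_search(scored, k, predicate):
    """Alternative order, for contrast: filter first, then take the top k of the rest."""
    kept = [hit for hit in scored if predicate(hit)]
    return sorted(kept, key=lambda hit: hit["score"], reverse=True)[:k]


def in_thread_b(hit):
    return hit["thread"] == "th_b"


scored = [
    {"id": "r_1", "thread": "th_a", "score": 0.91},
    {"id": "r_2", "thread": "th_b", "score": 0.88},
    {"id": "r_3", "thread": "th_a", "score": 0.85},
    {"id": "r_4", "thread": "th_b", "score": 0.31},
    {"id": "r_5", "thread": "th_b", "score": 0.12},
]

post = post_filter_search(scored, k=3, predicate=in_thread_b)
pre = pre_filter_search(scored, k=3, predicate=in_thread_b)
print([hit["id"] for hit in post])  # ['r_2']
print([hit["id"] for hit in pre])   # ['r_2', 'r_4', 'r_5']
```

With k=3, the post-filter order returns a single strong hit, while the pre-filter order returns three hits, two of them with low scores.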
Response Shape
{
"embedder": "ollama/nomic-embed-text",
"k": 5,
"hits": [
{
"score": 0.847,
"record": { /* full Record envelope */ }
},
...
]
}
- embedder: provider identifier. null when search is disabled.
- k: echoes the requested K (after clamp).
- hits: ordered list, highest cosine similarity first.
  - score: cosine similarity in [-1, 1]; in practice it usually falls in [0, 1] for a healthy embedding model.
  - record: the full Record, including id, thread, actor, act, body, clock, and all other envelope fields.
When search is disabled or the provider is unreachable (the daemon returns 503 in both cases), the SDK returns:
{
"embedder": null,
"k": 5,
"hits": [],
"disabled": true
}
Render this as a setup hint, not an error.
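One way to act on the two response shapes above is a small rendering helper. This is a sketch: render_search_result and its messages are illustrative assumptions, and the dicts simply mirror the documented JSON.

```python
def render_search_result(result):
    """Turn a search result dict into lines of text for display."""
    if result.get("disabled"):
        # Disabled is a setup state, so show guidance instead of an error.
        return [
            "Semantic search is not set up.",
            "Run: spl config embedding-provider set ollama",
        ]
    if not result["hits"]:
        return ["No similar records found."]
    return [f"{hit['score']:.3f}  {hit['record']['id']}" for hit in result["hits"]]


disabled_result = {"embedder": None, "k": 5, "hits": [], "disabled": True}
ok_result = {
    "embedder": "ollama/nomic-embed-text",
    "k": 5,
    "hits": [{"score": 0.847, "record": {"id": "r_example"}}],
}

for line in render_search_result(disabled_result) + render_search_result(ok_result):
    print(line)
```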
Patterns
"What have we said about X recently?"
spl search "migration concerns with the new auth layer" \
--after-clock $(($(date +%s) - 7 * 86400)) \
-k 20
Combines free-text relevance with a 7-day time window. The arithmetic assumes clock values are Unix timestamps in seconds.
"Search this thread only"
spl search "race condition" --thread th_incident_42 -k 10
Useful for long-running threads where you want to find references to a concept inside the conversation.
"Search this actor's output"
spl search "feature flag rollout" --actor did:sync:agent:dev -k 10
Finds records written by a specific actor that match an idea. Handy for "what has the dev agent said about X?" or "what has this team member promised?"
"Search a specific kind"
spl search "login" --kind core.task.record -k 10
Narrows to task records whose body is semantically near "login."
Combining with query
Search is often the top of a funnel: find candidates by meaning, then use Query to drill into the structure. Example — find incident records semantically near "slow queries" and then query for their resolution status:
# 1. Find candidates
HIT_THREADS=$(spl search "slow query performance" --kind incident.report -k 5 --json \
| jq -r '.hits[].record.thread')
# 2. For each thread, find its KNOW-with-resolved body
for thread in $HIT_THREADS; do
curl -s -X POST http://localhost:9100/v1/records/query \
-H 'content-type: application/json' \
-d "{\"filter\":{\"thread\":\"$thread\",\"act\":\"KNOW\",\"body.resolved\":true}}"
done
How Embeddings Enter the Record
Records get embeddings at ingest time, before the response to POST /v1/records returns. The engine:
- Extracts text from body (currently: top-level string fields, concatenated).
- POSTs to the configured provider's embedding endpoint.
- Stores the returned vector in the sqlite-vec shadow table, keyed by record id.
- Returns the record (the embedding store is transparent to the caller).
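This means every new record is searchable as soon as the ingest call returns. There is no background indexer and no delay between ingest and searchability, provided the embedding provider was reachable at ingest time (see Limits & Failure Modes).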
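The text-extraction step in the list above can be sketched as follows. This is a guess at the behavior from the one-line description, not the engine's code: the helper name, the newline separator, and the use of field insertion order are all assumptions.

```python
def extract_embedding_text(body, separator="\n"):
    """Concatenate the top-level string fields of a record body.

    Nested objects, lists, and non-string values are skipped. The separator
    and the reliance on dict insertion order are assumptions of this sketch.
    """
    parts = [value for value in body.values() if isinstance(value, str)]
    return separator.join(parts)


body = {
    "kind": "incident.report",
    "title": "Login failures after deploy",
    "priority": 2,                      # not a string: skipped
    "details": {"region": "eu-west"},   # nested: skipped
    "summary": "Auth service rejected valid tokens",
}
text = extract_embedding_text(body)
print(text)
```

One practical consequence, if extraction works this way: text that lives only in nested fields would not influence a record's embedding.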
Records that existed before you configured a provider are not back-filled automatically. To embed existing records, the easiest path today is to re-ingest from a backup (see Backup & recovery). A dedicated spl embed backfill command is on the roadmap.
Providers
One provider ships today:
| Provider | Default model | Runs where | Cost |
|---|---|---|---|
| ollama | nomic-embed-text | Local (your host) | $0 |
nomic-embed-text is a 137M-param open-weights model producing 768-dim vectors. Pull it once:
ollama pull nomic-embed-text
Hosted provider kinds land in a subsequent release. The spl config embedding-provider set CLI will accept additional <KIND> values; the request/response shape for search stays identical.
Limits & Failure Modes
- Embedding provider reachability: if Ollama isn't running, the daemon logs an embedder error at ingest time (the record ingests fine, the embedding is skipped) and returns 503 on search requests. Restart the provider and re-ingest if you care about the skipped records.
- k clamp: k > 100 is silently clamped to 100. For very large result sets, combine search (for relevance) with Query (for structural pagination).
- Empty query: both SDKs raise / throw on an empty query. This is a programmer error and never fails open.
- Filter + k interaction: if k=10 and your filters reject 9 of the top 10, you'll get a single hit; the engine does not pull in lower-ranked records to refill the list. For restrictive filters where you need a minimum number of hits, request a larger k upfront (up to the clamp of 100).
- Fail-open SDK transport: both SDKs return empty hits on network or other 5xx failures. Check result.hits.length: zero with disabled: false is ambiguous between "nothing similar" and "transport failed." The CLI raises explicitly.
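The "request a larger k upfront" advice can also be automated by over-fetching. The sketch below is one possible approach, not an SDK feature: search_with_min_hits is a hypothetical helper, the search argument is any callable with the shape shown, and fake_search is a stand-in for a real client. It is written synchronously for brevity; with the async Python client the call would be awaited.

```python
K_MAX = 100  # documented upper bound for k


def search_with_min_hits(search, query, wanted, **filters):
    """Grow k until `wanted` hits survive the filters or k reaches K_MAX.

    `search` is any callable taking (query, k=..., **filters) and returning
    a result dict with a "hits" list.
    """
    k = max(1, min(K_MAX, wanted))
    while True:
        result = search(query, k=k, **filters)
        hits = result["hits"]
        if result.get("disabled") or len(hits) >= wanted or k >= K_MAX:
            return {**result, "hits": hits[:wanted]}
        k = min(K_MAX, k * 2)


def fake_search(query, k, thread=None):
    """Stand-in client: 40 ranked records, every fourth one in th_a."""
    ranked = [
        {"score": 1.0 - i / 100,
         "record": {"id": f"r_{i}", "thread": "th_a" if i % 4 == 0 else "th_b"}}
        for i in range(40)
    ]
    top = ranked[:k]  # global top k first, filters applied afterwards
    kept = [h for h in top if thread is None or h["record"]["thread"] == thread]
    return {"hits": kept, "disabled": False}


result = search_with_min_hits(fake_search, "race condition", wanted=5, thread="th_a")
print([hit["record"]["id"] for hit in result["hits"]])
```

Each retry costs another round trip and another query embedding, so for filters known to be restrictive it is usually cheaper to start with a large k.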
Next Steps
- Query — the structural complement. Use search to find candidates, query to drill into their structure.
- TypeScript SDK, Python SDK — full client reference including transport details and identity configuration.
- Body-Kind Manifests — for when you want rich-query filters on body fields to stay fast at scale.