Glossary
Key terms and concepts in GNO.
Core Concepts
Collection
A named group of documents from a single directory. Collections define:
- Path to source files
- Glob patterns for matching
- Include/exclude rules
- Optional language hint
gno collection add ~/notes --name notes --pattern "**/*.md"
Context
Semantic hint attached to a scope to improve search relevance. Contexts provide additional meaning beyond the raw text.
Scope types:
- Global (/): Applies to all documents
- Collection (notes:): Applies to a collection
- Prefix (gno://notes/projects): Applies to a path prefix
Document
A single indexed file. Each document has:
- docid: Unique identifier (8-char hash prefix)
- sourceHash: SHA-256 of original file content
- mirrorHash: SHA-256 of canonical markdown
Virtual URI
GNO’s internal document identifier format:
gno://collection/relative/path/to/file.md
Used in search results and resource access.
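A virtual URI can be split into its collection and relative path with plain string handling. The following is a minimal Python sketch; the function name parse_gno_uri is hypothetical and not part of GNO, and the real parser may validate more strictly.

```python
def parse_gno_uri(uri: str) -> tuple[str, str]:
    """Split gno://collection/relative/path into (collection, relative_path)."""
    scheme = "gno://"
    if not uri.startswith(scheme):
        raise ValueError(f"not a gno:// URI: {uri!r}")
    # Everything up to the first slash is the collection name;
    # the remainder is the path relative to the collection root.
    collection, _, rel_path = uri[len(scheme):].partition("/")
    if not collection:
        raise ValueError(f"missing collection in {uri!r}")
    return collection, rel_path


collection, rel_path = parse_gno_uri("gno://notes/projects/plan.md")
print(collection, rel_path)
```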
Search Terms
BM25
Best Matching 25 - a ranking function for full-text search. Matches keywords based on term frequency and document length. Fast and works without models.
GNO uses document-level BM25: entire documents are indexed, not individual chunks. This means a query for “authentication JWT” finds documents where these terms appear anywhere, even in different sections.
gno search "keyword match"
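To make the roles of term frequency and document length concrete, here is a small Python sketch of a BM25 scorer over whole documents. It is an illustration only, not GNO's or FTS5's implementation: the function name bm25_score is hypothetical, the k1=1.2 and b=0.75 constants are common defaults, and the IDF uses the "+1" variant so scores stay non-negative.

```python
import math


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query.

    corpus is a list of tokenized documents (lists of strings) and is used
    for document frequencies and the average document length.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # documents containing term
        if df == 0:
            continue
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc_tokens.count(term)  # term frequency in this document
        # Longer-than-average documents are penalized through b.
        length_norm = 1 - b + b * len(doc_tokens) / avg_len
        score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return score


corpus = [
    ["jwt", "authentication", "guide"],
    ["cooking", "recipes"],
    ["jwt", "tokens"],
]
query = ["authentication", "jwt"]
scores = [bm25_score(query, doc, corpus) for doc in corpus]
print(scores)
```

Because scoring is done per document, the first document matches both terms regardless of where they appear in it, which mirrors the document-level behavior described above.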
Strong Signal Detection
Optimization that skips expensive query expansion when BM25 already has a confident match. Triggered when the top result’s normalized score is ≥ 0.84 AND the gap to #2 is ≥ 0.14. Saves 1-3 seconds per query.
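The two-part threshold can be sketched as a small predicate. This is a hedged illustration in Python: the function name is_strong_signal is hypothetical, the scores are assumed to be already normalized to [0, 1] and sorted best-first, and treating a missing runner-up as 0.0 is an assumption.

```python
def is_strong_signal(normalized_scores, min_top=0.84, min_gap=0.14):
    """Return True when BM25 alone looks confident enough to skip expansion.

    normalized_scores: BM25 scores scaled to [0, 1], sorted best-first.
    """
    if not normalized_scores:
        return False
    top = normalized_scores[0]
    # Assumption: with a single result, the runner-up counts as 0.0.
    second = normalized_scores[1] if len(normalized_scores) > 1 else 0.0
    # Both conditions must hold: a high top score AND a clear lead over #2.
    return top >= min_top and (top - second) >= min_gap


print(is_strong_signal([0.9, 0.7]))  # high score, clear lead
print(is_strong_signal([0.9, 0.8]))  # high score, but #2 is too close
```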
Vector Search
Semantic similarity search using embeddings. Finds conceptually similar content even without exact keyword matches.
gno vsearch "concept to find"
Hybrid Search
Combines BM25 and vector search using Reciprocal Rank Fusion (RRF). Best of both approaches.
gno query "semantic plus keywords"
Reranking
Cross-encoder model that rescores results for better relevance. More accurate but slower. Enabled by default with gno query.
gno query "topic" # reranking enabled by default
gno query "topic" --no-rerank # disable for speed
RRF (Reciprocal Rank Fusion)
Algorithm for combining multiple ranked lists. Score = Σ(weight / (k + rank)) where k=60.
GNO applies 2× weight to original query results to prevent dilution by LLM-generated variants.
See How Search Works for detailed explanation.
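The weighted formula above can be sketched in a few lines of Python. This is a minimal illustration, not GNO's code: the function name rrf_fuse and the (weight, ranking) input shape are hypothetical, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists with weighted Reciprocal Rank Fusion.

    ranked_lists: iterable of (weight, [doc_id, ...]) pairs, best-first.
    Returns (doc_id, score) pairs sorted by fused score, best-first.
    """
    scores = defaultdict(float)
    for weight, docs in ranked_lists:
        for rank, doc_id in enumerate(docs, start=1):  # assumed 1-based ranks
            scores[doc_id] += weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = rrf_fuse([
    (2.0, ["a", "b", "c"]),  # original query results, 2x weight
    (1.0, ["b", "d"]),       # results for an LLM-generated variant
])
print([doc_id for doc_id, _ in fused])  # order: b, a, c, d
```

Document b wins because it appears in both lists, while the 2× weight keeps a and c (found only by the original query) ahead of d (found only by the variant).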
Tiered Top-Rank Bonus
Score boost applied to top-ranked documents before reranking: +0.05 for rank #1, +0.02 for ranks #2-3. Preserves strong initial retrieval signals through the pipeline.
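As a rough Python sketch of the tiers, assuming the bonus is simply added to whatever score a document carries at that stage (the function names top_rank_bonus and apply_bonus are hypothetical):

```python
def top_rank_bonus(rank: int) -> float:
    """Bonus for a 1-based rank: +0.05 for #1, +0.02 for #2-3, otherwise 0."""
    if rank == 1:
        return 0.05
    if rank in (2, 3):
        return 0.02
    return 0.0


def apply_bonus(ranked):
    """ranked: list of (doc_id, score) pairs sorted best-first.

    Assumption: the bonus is additive on the incoming score.
    """
    return [
        (doc_id, score + top_rank_bonus(rank))
        for rank, (doc_id, score) in enumerate(ranked, start=1)
    ]


boosted = apply_bonus([("a", 0.50), ("b", 0.40), ("c", 0.30), ("d", 0.20)])
print(boosted)
```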
Query Expansion
LLM-powered technique that generates query variants to improve recall. Creates lexical variants (for BM25), semantic variants (for vectors), and HyDE passages.
See How Search Works for details.
HyDE (Hypothetical Document Embeddings)
Technique where an LLM generates a hypothetical document answering the query. The embedding of this synthetic document often better matches real answer documents than the original question embedding.
See How Search Works for details.
Link Terms
Wiki Link
Internal document link using [[double bracket]] syntax. GNO supports:
- [[Target]] - basic link
- [[Target|Display]] - link with custom display text
- [[Target#Heading]] - link to section anchor
- [[collection:Target]] - cross-collection link
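The four wiki link forms can be pulled apart with a small parser. The Python sketch below is illustrative only: the WikiLink type and parse_wiki_links function are hypothetical names, and a real parser would likely also handle escapes, code spans, and titles that contain colons.

```python
import re
from typing import NamedTuple, Optional


class WikiLink(NamedTuple):
    collection: Optional[str]
    target: str
    heading: Optional[str]
    display: Optional[str]


# Matches [[...]] with no nested brackets inside.
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def parse_wiki_links(text: str) -> list:
    links = []
    for match in WIKI_LINK.finditer(text):
        body = match.group(1)
        body, _, display = body.partition("|")   # [[Target|Display]]
        body, _, heading = body.partition("#")   # [[Target#Heading]]
        collection, sep, target = body.partition(":")  # [[collection:Target]]
        if not sep:  # no collection prefix present
            collection, target = "", body
        links.append(WikiLink(
            collection.strip() or None,
            target.strip(),
            heading.strip() or None,
            display.strip() or None,
        ))
    return links


links = parse_wiki_links("See [[notes:Plan#Goals|the plan]] and [[Target]].")
print(links)
```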
Markdown Link
Standard markdown [text](path.md) links to other documents. Only internal links are tracked—external URLs are ignored.
Backlink
A document that links TO a given document. If “Note A” contains [[Note B]], then “Note A” is a backlink of “Note B”. Enables Zettelkasten-style bidirectional navigation.
Outgoing Link
A link FROM a document to another document. The inverse of backlink.
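Since backlinks are the inverse of outgoing links, they can be derived by inverting a map of outgoing links. A minimal Python sketch, assuming link targets are already resolved to document names (the function name build_backlinks is hypothetical):

```python
from collections import defaultdict


def build_backlinks(outgoing):
    """Invert {source: [targets]} into {target: [sources]}."""
    backlinks = defaultdict(list)
    for source, targets in outgoing.items():
        for target in targets:
            # Record each linking document once, even if it links repeatedly.
            if source not in backlinks[target]:
                backlinks[target].append(source)
    return dict(backlinks)


outgoing = {
    "Note A": ["Note B"],
    "Note C": ["Note B", "Note A"],
}
print(build_backlinks(outgoing))
```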
Similar Documents
Documents that are semantically related based on vector similarity. Found using the hybrid search pipeline on document content.
Link Resolution
Process of matching link targets to actual documents. Wiki links match normalized titles with path-style fallbacks (basename/rel_path, optional .md); markdown links use resolved paths. Resolution happens at query time, not during indexing.
Cross-Collection Link
Link between documents in different collections using [[collection:Target]] syntax.
Storage Terms
Source
Original file on disk. Tracked by absolute path and sourceHash.
Mirror
Canonical markdown representation of source content. Identified by mirrorHash.
Multiple sources can share the same mirror (content deduplication).
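The deduplication idea can be illustrated with a tiny content-addressed store in Python. This is a sketch under stated assumptions, not GNO's storage layer: the MirrorStore class and its fields are hypothetical, and the canonical markdown is assumed to be hashed as UTF-8 bytes.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class MirrorStore:
    def __init__(self):
        self.mirrors = {}  # mirrorHash -> canonical markdown
        self.sources = {}  # absolute path -> (sourceHash, mirrorHash)

    def add(self, path: str, raw: bytes, canonical_markdown: str) -> str:
        mirror_hash = sha256_hex(canonical_markdown.encode("utf-8"))
        # Identical canonical content is stored once, keyed by its hash.
        self.mirrors.setdefault(mirror_hash, canonical_markdown)
        self.sources[path] = (sha256_hex(raw), mirror_hash)
        return mirror_hash


store = MirrorStore()
# Two different source files that convert to the same canonical markdown.
h1 = store.add("/docs/a.md", b"# Title\r\n", "# Title\n")
h2 = store.add("/docs/b.md", b"# Title\n", "# Title\n")
print(h1 == h2, len(store.mirrors), len(store.sources))
```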
Chunk
Text segment (~800 tokens) created during indexing. Each chunk is:
- Part of document-level FTS5 index
- Optionally embedded for vector search with contextual prefix
Contextual Chunking
Technique where each chunk is embedded with its document title prepended: title: My Doc | text: chunk content.... Helps the embedding model understand context. A chunk about “configuration” in a React doc is semantically different from one in a database doc. Based on Anthropic’s contextual retrieval research.
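The prefix format quoted above amounts to a one-line string template. A minimal Python sketch (the function name with_context is hypothetical; only the "title: ... | text: ..." layout comes from the description above):

```python
def with_context(title: str, chunk: str) -> str:
    """Build the text that is sent to the embedding model for one chunk."""
    return f"title: {title} | text: {chunk}"


# The same chunk text yields different embedding inputs per document.
print(with_context("React Guide", "configuration options"))
print(with_context("Postgres Manual", "configuration options"))
```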
Embedding
Vector representation of a chunk: a 1024-dimensional float array produced by the bge-m3 model from the chunk text with its contextual title prefix.
mirrorHash
SHA-256 hash of canonical markdown. Used for content-addressed storage and deduplication.
Model Terms
Embed Model
Neural network that converts text to vectors. Default: bge-m3 (multilingual, 1024 dims).
Rerank Model
Cross-encoder that scores query-document pairs. Default: Qwen3-Reranker-0.6B. GNO passes the best chunk per document (up to 4K chars) to the reranker for efficiency.
Gen Model
Language model for answer generation. Options:
- Qwen3-1.7B (slim preset)
- SmolLM3-3B (balanced preset)
- Qwen3-4B (quality preset)
GGUF
Quantized model format for efficient inference. Used by llama.cpp.
Model Preset
Predefined model configuration. Available presets: slim, balanced, quality.
Database Terms
FTS5
SQLite’s full-text search extension. Provides BM25 ranking.
sqlite-vec
SQLite extension for vector storage and KNN search. Required for vector search.
Tokenizer
Text segmentation method for FTS5:
- snowball english: Snowball stemmer (default, 20+ languages supported)
- unicode61: Unicode-aware, no stemming
- porter: English-only stemming (legacy)
- trigram: Substring matching
Snowball Stemmer
Multilingual stemming algorithm for FTS5. Reduces words to their root form: “running” → “run”, “scored” → “score”. Supports 20+ languages including English, German, French, Spanish, and more. GNO uses Snowball English by default.
MCP Terms
MCP (Model Context Protocol)
Protocol for AI assistants to access external tools and resources. GNO runs as an MCP server.
Tool
MCP function that AI can invoke. GNO provides: gno_search, gno_vsearch, gno_query, gno_get, gno_multi_get, gno_status.
Resource
MCP content accessible by URI. Format: gno://collection/path
Exit Codes
| Code | Name | Meaning |
|---|---|---|
| 0 | SUCCESS | Command completed |
| 1 | VALIDATION | Bad input or arguments |
| 2 | RUNTIME | System or IO error |
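
Scripts that wrap the CLI can map these codes to names. A small Python sketch, assuming only the three codes in the table; the ExitCode enum and describe helper are hypothetical and not shipped with GNO:

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0     # command completed
    VALIDATION = 1  # bad input or arguments
    RUNTIME = 2     # system or IO error


def describe(code: int) -> str:
    """Return the symbolic name for an exit code, or UNKNOWN if unlisted."""
    try:
        return ExitCode(code).name
    except ValueError:
        return "UNKNOWN"


print(describe(1))
```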
Abbreviations
| Term | Meaning |
|---|---|
| BM25 | Best Matching 25 (ranking algorithm) |
| FTS | Full-Text Search |
| KNN | K-Nearest Neighbors |
| RAG | Retrieval-Augmented Generation |
| RRF | Reciprocal Rank Fusion |