Train Local, Ship Portable
Fine-tune retrieval models locally, publish portable GGUF artifacts, benchmark them against the shipped baseline, and use them in GNO with HF-backed custom presets.
Key Benefits
- Local MLX LoRA training on Apple Silicon
- Automatic checkpoint selection
- Portable GGUF export
- Real benchmark-based promotion
- Custom preset install snippets
Example Commands
gno models use slim-tuned
gno models pull --gen
gno query 'ECONNREFUSED 127.0.0.1:5432' --thorough
Why This Matters
Fine-tuning is only useful if the resulting model can be exported, benchmarked, and installed cleanly. GNO’s fine-tuning workflow is built around that full loop:
- train locally
- select the best checkpoint
- export a portable GGUF
- benchmark the exported artifact
- publish the promoted artifact
- install it in GNO with an hf: preset
Current Promoted Model
The current promoted slim retrieval model is slim-retrieval-v1, produced from the auto-entity-lock-default-mix-lr95 training run.
- repeated benchmark median nDCG@10: 0.925
- repeated benchmark median schema success: 1.0
- repeated benchmark median p95: 4775.99 ms
- HF repo: guiltylemon/gno-expansion-slim-retrieval-v1
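The promoted numbers above are medians over repeated benchmark runs, not single-run scores. A minimal sketch of that aggregation (the run shape and field names here are assumptions, not GNO's actual benchmark schema):

```typescript
// Sketch: summarize repeated benchmark runs by median, the statistic
// the promotion metrics above report. Field names are assumptions.
interface BenchRun {
  ndcgAt10: number;
  schemaSuccess: number;
  p95Ms: number;
}

function median(xs: number[]): number {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

function summarize(runs: BenchRun[]) {
  return {
    ndcgAt10: median(runs.map((r) => r.ndcgAt10)),
    schemaSuccess: median(runs.map((r) => r.schemaSuccess)),
    p95Ms: median(runs.map((r) => r.p95Ms)),
  };
}
```

Using the median makes a single lucky or unlucky run unable to flip a promotion decision.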
Local Training, Portable Artifact
The current local training backend is MLX LoRA on Apple Silicon. That is a training implementation detail, not a deployment requirement.
The deployable artifact is a GGUF file that GNO can load via an hf: or file: URI.
models:
  activePreset: slim-tuned
  presets:
    - id: slim-tuned
      name: GNO Slim Retrieval v1
      embed: hf:gpustack/bge-m3-GGUF/bge-m3-Q4_K_M.gguf
      rerank: hf:ggml-org/Qwen3-Reranker-0.6B-Q8_0-GGUF/qwen3-reranker-0.6b-q8_0.gguf
      gen: hf:guiltylemon/gno-expansion-slim-retrieval-v1/gno-expansion-auto-entity-lock-default-mix-lr95-f16.gguf
Use file: only for private or unpublished models.
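The two URI schemes differ only in where the GGUF comes from. A sketch of how such a URI could be resolved to a local path (the hf: cache layout and function name are illustrative assumptions, not GNO internals):

```typescript
// Sketch: resolve a preset model URI to a local GGUF path.
// The hf: cache layout here is an assumption, not GNO's implementation.
function resolveModelUri(uri: string, cacheDir = "~/.cache/gno/models"): string {
  if (uri.startsWith("file:")) {
    // file: points directly at a local GGUF (private / unpublished models)
    return uri.slice("file:".length);
  }
  if (uri.startsWith("hf:")) {
    // hf:<org>/<repo>/<file>.gguf -> fetched into a local cache
    const repoPath = uri.slice("hf:".length);
    return `${cacheDir}/${repoPath}`;
  }
  throw new Error(`unsupported model URI: ${uri}`);
}
```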
Promotion Flow
One command drives the full promotion path:
bun run research:finetune:finalize slim-retrieval-v1 auto-entity-lock-default-mix-lr95
That command:
- materializes the canonical release bundle
- writes a public model card and install snippet
- stages the HF upload bundle
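The three steps above can be sketched as one small pipeline. Function name, bundle layout, and file names here are illustrative assumptions, not what the finalize script actually writes:

```typescript
// Sketch of the finalize steps: bundle, model card, staged upload dir.
// Layout and names are assumptions, not GNO's release format.
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

function finalizeRelease(modelId: string, runId: string, outDir = "release"): string {
  const bundleDir = join(outDir, modelId);
  mkdirSync(bundleDir, { recursive: true });

  // 1. materialize the canonical release bundle (metadata alongside the GGUF)
  writeFileSync(
    join(bundleDir, "release.json"),
    JSON.stringify({ modelId, sourceRun: runId }, null, 2),
  );

  // 2. write a public model card
  writeFileSync(
    join(bundleDir, "README.md"),
    `# ${modelId}\n\nPromoted from run ${runId}.\n`,
  );

  // 3. the bundle dir is what gets staged for the HF upload
  return bundleDir;
}
```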
For the full training-to-promotion flow:
bun run research:finetune:promote <run>
Why Benchmark The Exported Model
Training loss is not enough.
The model that looks best during training can still perform worse after export on real retrieval tasks. GNO promotes based on exported-model benchmark results, not loss alone.
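A promotion gate on the exported artifact can be sketched as a comparison of benchmark summaries against the shipped baseline. The specific thresholds below (no retrieval or schema regression, bounded latency cost) are assumptions for illustration, not GNO's actual promotion criteria:

```typescript
// Sketch: promote only when the *exported* GGUF beats the shipped baseline
// on real retrieval metrics. Thresholds are illustrative assumptions.
interface BenchSummary {
  ndcgAt10: number;      // median over repeated runs
  schemaSuccess: number; // fraction of well-formed outputs
  p95Ms: number;         // median p95 latency
}

function shouldPromote(candidate: BenchSummary, baseline: BenchSummary): boolean {
  return (
    candidate.ndcgAt10 >= baseline.ndcgAt10 &&           // no retrieval regression
    candidate.schemaSuccess >= baseline.schemaSuccess && // outputs stay parseable
    candidate.p95Ms <= baseline.p95Ms * 1.25             // allow modest latency cost
  );
}
```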