Information Retrieval & RAG¶
Hybrid Retrieval¶
Spikuit combines multiple retrieval signals into a single score:
- Keyword similarity: BM25-style text matching
- Semantic similarity: sqlite-vec KNN search when an embedder is configured
- Retrievability: FSRS-based memory strength (concepts you know well rank higher)
- Centrality: graph-structural importance
- Pressure: LIF-based urgency from neighbor reviews
- Feedback boost: accumulated through QABotSession accept/reject signals
Retrieval-Augmented Generation (RAG)¶
Traditional RAG pipelines require significant preprocessing: document chunking,
metadata extraction, embedding pipeline setup. Spikuit replaces this with
conversational curation — the agent handles chunking, tagging, and connecting
through dialogue via /spkt-ingest.
References¶
- Robertson, S. & Zaragoza, H. (2009). The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval, 3(4), 333–389.
- Lewis, P. et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.