Retrieval-Augmented Generation
- Search for information relevant to the query, ranked by similarity (e.g. Jaccard
similarity)
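As a minimal sketch of similarity-based search, the snippet below ranks documents by the Jaccard similarity of their word sets against the query. The function names `jaccard` and `search`, and the whitespace tokenization, are assumptions made for illustration and do not come from any particular library.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word sets of two strings: |A & B| / |A | B|."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty inputs
    return len(set_a & set_b) / len(set_a | set_b)


def search(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest Jaccard similarity to the query."""
    return sorted(docs, key=lambda d: jaccard(query, d), reverse=True)[:k]
```

For example, `search("cat food", ["dog food recipes", "cat food brands", "weather report"], k=1)` returns `["cat food brands"]`, since that document shares two of its three distinct words with the query.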
Lexical Retrieval
- Jaccard similarity: size of the intersection of two word sets divided by the
size of their union
- Stop-word removal and stemming
- Downside: common words and rarer words that carry more specific meaning are
given the same weight
- Upside: easy to implement and runs fast
- Term frequency-inverse document frequency (TF*IDF) and BM25 take word
importance into account.
- Lexical retrieval still powers most online search
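The following is a rough sketch of how stop-word removal, stemming, and TF*IDF weighting might fit together. The stop-word list is a tiny illustrative one, the suffix stripping is a crude stand-in for a real stemmer, and the smoothed IDF formula is one common variant among several; `tokenize` and `tfidf_scores` are hypothetical names.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "an", "the", "is", "of", "to", "and", "in"}  # tiny illustrative list


def tokenize(text: str) -> list[str]:
    """Lowercase, drop stop words, and strip a few suffixes (crude stemming)."""
    tokens = []
    for word in text.lower().split():
        word = word.strip(".,;:!?()\"'")
        if not word or word in STOP_WORDS:
            continue
        for suffix in ("ing", "es", "s"):
            # Only strip when a reasonably long stem remains.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


def tfidf_scores(query: str, docs: list[str]) -> list[float]:
    """Score each document against the query by summed TF*IDF of shared terms."""
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in doc_tokens:
        df.update(set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in doc_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term in tf:
                # Smoothed IDF (an assumed variant): rarer terms get a larger weight.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1
                score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores
```

Unlike plain Jaccard similarity, a query term that appears in only one document contributes more to the score than a term that appears in most of them. BM25 refines this idea further with term-frequency saturation and document-length normalization, which this sketch does not attempt.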
Neural Retrieval
- Snippetize documents (by paragraphs, or by sliding windows)
- Run the snippets through an embedding model, and store the resulting vectors in
a vector database.
- Embed the query and perform a nearest-neighbor vector lookup to find relevant
snippets.
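The steps above can be sketched end to end with the standard library alone. Note the assumptions: `embed` is a toy hashed bag-of-words vector standing in for a real embedding model (it captures no semantics), and `VectorStore` is a hypothetical in-memory list with brute-force search standing in for a real vector database.

```python
import hashlib
import math

DIM = 256  # toy embedding size (assumption)


def snippetize(text: str, window: int = 20, stride: int = 10) -> list[str]:
    """Split text into overlapping sliding windows of `window` words."""
    words = text.split()
    if len(words) <= window:
        return [" ".join(words)] if words else []
    starts = range(0, len(words) - window + stride, stride)
    return [" ".join(words[s:s + window]) for s in starts]


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed word counts, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], str]] = []

    def add(self, snippet: str) -> None:
        self._items.append((embed(snippet), snippet))

    def search(self, query: str, k: int = 3) -> list[tuple[float, str]]:
        """Return the k snippets with the highest cosine similarity to the query."""
        q = embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), text)
                  for vec, text in self._items]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

In a real system the `embed` function would call an embedding model and the store would use an approximate nearest-neighbor index rather than scanning every vector; the overall shape (snippetize, embed, store, look up) stays the same.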
Pipeline
- Obtain dynamic and static context, along with user instructions.
- Search for additional relevant context (preferably with neural retrieval, through
a vector database)
- Assemble the prompt.
- Feed the prompt to the LLM.
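One possible way to wire the pipeline together is sketched below. Everything here is illustrative: `retrieve` and `llm` are hypothetical callables supplied by the caller (for example, a vector-database lookup and a model client), and the section titles in the prompt layout are an assumption, not a prescribed format.

```python
from typing import Callable


def assemble_prompt(static_context: str, dynamic_context: str,
                    retrieved: list[str], instruction: str) -> str:
    """Join the context sources and the user instruction into one prompt."""
    sections = [
        ("Static context", static_context),
        ("Dynamic context", dynamic_context),
        ("Retrieved context", "\n".join(f"- {s}" for s in retrieved)),
        ("User instruction", instruction),
    ]
    # Skip empty sections so the prompt carries no blank headings.
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections if body)


def answer(instruction: str, static_context: str, dynamic_context: str,
           retrieve: Callable[[str, int], list[str]],
           llm: Callable[[str], str], k: int = 3) -> str:
    """Run the pipeline: retrieve context, assemble the prompt, call the LLM."""
    retrieved = retrieve(instruction, k)  # hypothetical retrieval hook
    prompt = assemble_prompt(static_context, dynamic_context,
                             retrieved, instruction)
    return llm(prompt)  # hypothetical model call
```

Passing the retriever and the model in as plain callables keeps the sketch independent of any specific vector database or LLM API, and makes each stage easy to replace with a stub when testing.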