Getting Started With Hybrid Search

rw-book-cover

Metadata

Highlights

  • The capabilities of vector search are impressive, but it isn’t a perfect technology. In fact, without big domain-specific datasets to fine-tune models on, a traditional search still has some advantages. (View Highlight)
  • We repeatedly see that vector search unlocks incredible and intelligent retrieval but struggles to adapt to new domains. Whereas traditional search can cope with new domains but is fundamentally limited to a set performance level. (View Highlight)
  • Vector search or dense retrieval has been shown to significantly outperform traditional methods when the embedding models have been fine-tuned on the target domain. However, this changes when we try using these models for “out-of-domain” tasks. (View Highlight)
  • That means if we have a large amount of data covering a specific domain like “Medical question-answering”, we can fine-tune an embedding model. With that embedding model, we can create dense vectors and get outstanding vector search performance. (View Highlight)
  • Combining dense and sparse search takes work. In the past, engineering teams needed to run different solutions for dense and sparse search engines and another system to combine results in a meaningful way. (View Highlight)
  • The Pinecone approach to hybrid search uses a single hybrid index. It enables search across any modality; text, audio, images, etc. Finally, the weighting of dense vs. sparse can be chosen via the alpha parameter, making it easy to adjust. (View Highlight)