This is interesting. I recently built a search tool that needed to locate docume...

pmc00 · on April 5, 2024

It depends on the scenario. For concept-seeking queries, vectors tend to do better, since the query and the content are less likely to share words. For keyword searches (a product name, a serial number, project codenames, etc.), BM25 keyword search does much better. If your workload is all concept-seeking queries, it's reasonable that keywords don't add much.

If you look at the table in section "3. Hybrid Retrieval brings out the best of Keyword and Vector Search" of that article, we show there how much the metrics vary across query types.
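
To make the hybrid idea concrete, here is a minimal sketch of one common way to merge a keyword ranking with a vector ranking: Reciprocal Rank Fusion (RRF). This is an illustration only, not necessarily what the article's system does. The function name, the document ids, and the two hit lists are made up for the example, and k=60 is just the constant commonly used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by either retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for a query that names a serial number.
keyword_hits = ["sku-4471", "doc-a", "doc-b"]  # e.g. from BM25
vector_hits = ["doc-b", "sku-4471", "doc-c"]   # e.g. from embeddings

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF only looks at ranks, it does not need BM25 scores and vector similarities to be on the same scale, which is one reason it is a popular default for this kind of merge.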