BM25S: Orders of magnitude faster lexical search via eager sparse scoring

Xing Han Lù

Abstract

We introduce BM25S, an efficient Python-based implementation of BM25 that only depends on Numpy and Scipy. BM25S achieves up to a 500x speedup compared to the most popular Python-based framework by eagerly computing BM25 scores during indexing and storing them into sparse matrices. It also achieves considerable speedups compared to highly optimized Java-based implementations, which are used by popular commercial products. Finally, BM25S reproduces the exact implementation of five BM25 variants based on Kamphuis et al. (2020) by extending eager scoring to non-sparse variants using a novel score shifting method. The code can be found at https://github.com/xhluca/bm25s
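
To illustrate the idea of eager scoring, the following is a minimal sketch and not the BM25S implementation. It assumes whitespace tokenization, one common BM25 formulation with a smoothed logarithmic IDF, and the parameter defaults `k1=1.5` and `b=0.75`. The function names `build_index` and `search` are hypothetical, and plain dictionaries stand in for the SciPy sparse matrices mentioned in the abstract. The score of every (token, document) pair is computed once at indexing time, so a query only has to look up and sum stored values.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus, k1=1.5, b=0.75):
    """Eagerly compute a BM25 score for every (token, document) pair.

    Hypothetical helper: the formula and defaults are assumptions,
    not taken from the BM25S source code.
    """
    docs = [text.lower().split() for text in corpus]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for d in docs for tok in set(d))

    # Sparse storage: token -> {doc_id: precomputed score}.
    # Only pairs with a non-zero term frequency are stored.
    index = defaultdict(dict)
    for doc_id, d in enumerate(docs):
        for tok, freq in Counter(d).items():
            idf = math.log(1 + (n_docs - df[tok] + 0.5) / (df[tok] + 0.5))
            norm = freq + k1 * (1 - b + b * len(d) / avg_len)
            index[tok][doc_id] = idf * freq * (k1 + 1) / norm
    return index, n_docs


def search(index, n_docs, query, top_k=2):
    """Sum the precomputed scores of the query tokens and rank documents."""
    scores = [0.0] * n_docs
    for tok in query.lower().split():
        for doc_id, score in index.get(tok, {}).items():
            scores[doc_id] += score
    ranked = sorted(range(n_docs), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:top_k]]


corpus = ["a cat is a feline", "a dog is a canine", "birds fly high"]
index, n_docs = build_index(corpus)
print(search(index, n_docs, "cat feline"))
```

Because all arithmetic happens in `build_index`, the cost of a query is proportional to the number of stored entries for its tokens. Variants that assign a non-zero score to tokens absent from a document would not fit this sparse layout directly; the abstract attributes the handling of those cases to its score shifting method, which this sketch does not attempt to reproduce.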

