HyperAI

Blocking

Entity resolution (ER) is the task of identifying records that refer to the same real-world entity across different data sources. Blocking is a crucial step in the ER process: it uses computationally inexpensive methods to generate a set of candidate record pairs, which significantly reduces the workload of the matcher and improves the efficiency and scalability of entity resolution. The goal of blocking is to filter out record pairs that are unlikely to match, so that the subsequent, more expensive matching step can focus on high-potential candidate pairs.
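
To make the idea concrete, here is a minimal sketch of one common form of blocking, key-based (standard) blocking. It is an illustration under stated assumptions, not a reference implementation: the record layout, the `blocking_key` function (first three letters of the name), and the sample data are all hypothetical, and for brevity the records sit in a single list rather than in separate sources.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    """Hypothetical cheap key: first three letters of the lowercased name."""
    return record["name"].strip().lower()[:3]


def candidate_pairs(records, key_fn=blocking_key):
    """Group records by blocking key and pair only records that share a key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key_fn(rec)].append(rec["id"])

    pairs = set()
    for ids in blocks.values():
        # Only records inside the same block become candidate pairs.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Made-up sample records, for illustration only.
records = [
    {"id": 1, "name": "Jonathan Smith"},
    {"id": 2, "name": "Jon Smith"},
    {"id": 3, "name": "Maria Garcia"},
    {"id": 4, "name": "Mariah Garcia"},
    {"id": 5, "name": "Wei Chen"},
]

pairs = candidate_pairs(records)
total = len(records) * (len(records) - 1) // 2  # all possible pairs

print(f"candidate pairs: {sorted(pairs)}")
print(f"comparisons kept: {len(pairs)} of {total}")
```

In this toy example the matcher would only need to compare the pairs that share a key instead of every possible pair. The trade-off is that a key that is too strict can separate true matches into different blocks, so the choice of key is a design decision that depends on the data.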