Distributed Query-Aware Quantization for High-Dimensional Similarity Searches
NorCal DB Day 2018
Redwood City, CA
In this project we design a Query-dependent Equi-Depth (QED) on-the-fly quantization method to improve high-dimensional similarity searches. Quantization is performed for each dimension at query time: localized scores are generated for the closest p fraction of the points, and a constant penalty is applied to the remaining points. QED not only improves the quality of the distance metric but also improves query time by filtering out non-relevant data. We propose a distributed indexing and query algorithm to compute QED efficiently. Our experimental results show improvements in classification accuracy, as well as query performance up to one order of magnitude faster than Manhattan-based sequential-scan nearest-neighbor queries, over datasets with hundreds of dimensions.
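To make the per-dimension idea concrete, the following is a minimal single-machine sketch in Python, not the distributed algorithm from the talk. The abstract does not define the localized score or the penalty value, so the function name `qed_scores`, the choice of a Manhattan difference scaled by the per-dimension cut, and the `penalty` parameter are all assumptions made for illustration.

```python
import numpy as np

def qed_scores(data, query, p=0.25, penalty=1.0):
    """Hypothetical QED-style dissimilarity; lower means more similar.

    Assumed scoring (not specified in the abstract): in each dimension,
    the closest p fraction of points get their absolute difference to the
    query divided by the cut distance; all other points get `penalty`.
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    n = data.shape[0]

    # Per-dimension absolute differences (the Manhattan components).
    diff = np.abs(data - query)

    # How many points count as "close" in each dimension.
    k = max(1, int(np.ceil(p * n)))

    # The k-th smallest difference per dimension is the query-time cut.
    cut = np.partition(diff, k - 1, axis=0)[k - 1]

    # Points at or inside the cut are "close" (ties may admit more than k).
    near = diff <= cut

    # Scale close differences into [0, 1]; guard against a zero cut.
    scale = np.where(cut > 0, cut, 1.0)
    local = diff / scale

    # Localized score for close points, constant penalty for the rest.
    per_dim = np.where(near, local, penalty)
    return per_dim.sum(axis=1)

points = np.array([[0, 0], [1, 1], [2, 2], [10, 10]])
print(qed_scores(points, [0, 0], p=0.5, penalty=2.0))
```

In this toy run the points (2, 2) and (10, 10) receive the same score, because each dimension contributes only the constant penalty once a point falls outside the closest p fraction. Under these assumptions, that bounded contribution is what keeps a large gap in a few dimensions from dominating the total.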
Gheorghi Guzun. "Distributed Query-Aware Quantization for High-Dimensional Similarity Searches" NorCal DB Day 2018 (2018).