Multi-model fusion and re-ranking for spatial image retrieval
Abstract
Image retrieval has become a central component of large-scale visual understanding systems, particularly as real-world datasets grow in volume, diversity, and semantic complexity, and numerous methods have been proposed to improve retrieval accuracy across diverse scenarios [1]. However, the performance of individual models often varies significantly depending on the characteristics of real-world datasets, making it challenging for a single technique to consistently achieve robust results. To address this limitation, we introduce a fusion-based retrieval framework that leverages the complementary strengths of three state-of-the-art models: SALAD [2], CliqueMining [3], and MegaLoc [4]. Each model independently generates an initial ranked list, capturing different visual cues and retrieval patterns. To further enhance reliability and reduce model-specific biases, we apply a re-ranking stage using the Distribution-based Score Fusion method [5], an aggregation technique designed to normalize heterogeneous score distributions and emphasize consistent cross-model evidence. Our proposed approach provides a unified and efficient strategy for improving retrieval accuracy without requiring additional training or architectural modifications. Experimental evaluations demonstrate that the combined system consistently outperforms individual models, offering improved robustness and more stable performance across varying image domains.
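To make the fusion stage concrete, the sketch below illustrates one way the three models' candidate scores could be normalized per model and merged into a single re-ranked list. It is a minimal illustration, not the paper's implementation: the function names, the 3-sigma scaling bounds, and the max-based aggregation rule are assumptions for exposition; the exact normalization and combination rule follows Distribution-based Score Fusion as described in [5].

```python
import numpy as np

def dbsf_normalize(scores: np.ndarray) -> np.ndarray:
    """Scale one model's scores using 3-sigma bounds of its own score distribution
    (an assumed variant of distribution-based normalization)."""
    lower = scores.mean() - 3 * scores.std()
    upper = scores.mean() + 3 * scores.std()
    return (scores - lower) / (upper - lower + 1e-12)

def fuse_rankings(score_lists: dict[str, dict[str, float]], top_k: int = 10) -> list[str]:
    """Fuse per-model similarity scores into one re-ranked candidate list.

    `score_lists` maps a model name (e.g. "salad", "cliquemining", "megaloc")
    to a {candidate_id: similarity} dict from that model's initial search.
    Both the keys and the aggregation rule here are illustrative assumptions.
    """
    fused: dict[str, float] = {}
    for model, candidate_scores in score_lists.items():
        ids = list(candidate_scores.keys())
        normed = dbsf_normalize(np.array([candidate_scores[i] for i in ids]))
        for cid, s in zip(ids, normed):
            # Keep the strongest normalized evidence observed for each candidate.
            fused[cid] = max(fused.get(cid, 0.0), float(s))
    return sorted(fused, key=fused.get, reverse=True)[:top_k]
```

Because each model's scores are rescaled against its own distribution before merging, no single model's raw score range dominates the fused ranking, which is the property the re-ranking stage relies on.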
Authors

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.