8 months ago

Abstract

In text-based person search, data generation has emerged as a prevailing practice, addressing concerns over privacy preservation and the arduous task of manual annotation. Although the amount of synthesized data can be infinite in theory, an open scientific question is how much generated data optimally fuels subsequent model training. We observe that only a subset of the data in these constructed datasets plays a decisive role. Therefore, we introduce a new Filtering-WoRA paradigm, which contains a filtering algorithm to identify this crucial data subset and a WoRA (Weighted Low-Rank Adaptation) learning strategy for light fine-tuning. The filtering algorithm uses cross-modality relevance to remove the many coarsely matched synthetic pairs. As the amount of data decreases, we no longer need to fine-tune the entire model, so we propose the WoRA learning strategy to efficiently update a minimal portion of the model parameters. WoRA streamlines the learning process, enabling the model to extract knowledge more efficiently from fewer, yet more informative, data instances. Extensive experiments validate the efficacy of pretraining, with our model achieving strong and efficient retrieval performance on challenging real-world benchmarks. Notably, on the CUHK-PEDES dataset, we achieve a competitive mAP of 67.02% while reducing model training time by 19.82%.
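
The abstract does not state how cross-modality relevance is scored or where the cut-off lies. As a rough illustration only, the sketch below assumes each synthetic image-text pair already has precomputed embedding vectors, scores a pair by cosine similarity, and keeps the pairs at or above a threshold. The function names `cosine_similarity` and `filter_pairs`, the toy embeddings, and the threshold value are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_pairs(image_embeddings, text_embeddings, threshold=0.5):
    """Return the indices of image-text pairs whose similarity is >= threshold.

    Assumption: relevance is measured by cosine similarity of precomputed
    embeddings; the paper's actual scoring function may differ.
    """
    if len(image_embeddings) != len(text_embeddings):
        raise ValueError("image and text embedding lists must have the same length")
    kept = []
    for index, (img, txt) in enumerate(zip(image_embeddings, text_embeddings)):
        if cosine_similarity(img, txt) >= threshold:
            kept.append(index)
    return kept


# Toy example: three pairs with hand-made 2-D embeddings (hypothetical values).
images = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
texts = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
kept_indices = filter_pairs(images, texts, threshold=0.5)
```

In this toy run the second pair is orthogonal (similarity 0) and is dropped, while the first and third pairs are kept. In practice the threshold would be tuned against downstream retrieval quality rather than fixed in advance.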
