Authors:
Rösch, Philipp J.; Oswald, Norbert; Geierhos, Michaela; Libovický, Jindřich 
Document type:
Conference Paper
Title:
Enhancing Conceptual Understanding in Multimodal Contrastive Learning through Hard Negative Samples 
Collection editors:
Gu, Jing; Fu, Tsu-Jui (Ray); Hudson, Drew; Celikyilmaz, Asli; Wang, William 
Title of conference publication:
Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR) 
Conference title:
Workshop on Advances in Language and Vision Research (3rd, 2024, Bangkok)
Venue:
Bangkok, Thailand 
Year of conference:
2024 
Date of conference beginning:
16.08.2024 
Publishing institution:
Association for Computational Linguistics 
Year:
2024 
Pages from - to:
102–115 
Language:
English
Abstract:
Current vision-language models leveraging contrastive learning often face limitations in developing fine-grained conceptual understanding. This is due to random negative samples during pretraining, causing almost exclusively very dissimilar concepts to be compared in the loss function. Consequently, the models struggle with fine-grained semantic differences. To address this problem, we introduce a novel pretraining method incorporating synthetic hard negative text examples. The hard negatives re...
 
Department:
Fakultät für Informatik; Fakultät für Elektrotechnik und Technische Informatik 
Institute:
INF 7 - Institut für Datensicherheit; ETTI 2 - Institut für Verteilte Intelligente Systeme 
Chair:
Geierhos, Michaela; Oswald, Norbert 
Research Hub UniBw M:
CODE 
Open Access:
Yes
Type of OA license:
CC BY 4.0