AtheneForschung - Information Portal of UniBw M

Authors:

Rösch, Philipp J.; Oswald, Norbert; Geierhos, Michaela; Libovický, Jindřich

Document type:

Conference Paper

Title:

Enhancing Conceptual Understanding in Multimodal Contrastive Learning through Hard Negative Samples

Collection editors:

Gu, Jing; Fu, Tsu-Jui (Ray); Hudson, Drew; Celikyilmaz, Asli; Wang, William

Conference publication title:

Proceedings of the 3rd Workshop on Advances in Language and Vision Research (ALVR)

Conference title:

Workshop on Advances in Language and Vision Research (3rd, 2024, Bangkok)

Conference location:

Bangkok, Thailand

Year of conference:

2024

Conference start date:

16 August 2024

Publisher:

Association for Computational Linguistics

Year:

2024

Pages:

102–115

Language:

English

Abstract:

Current vision-language models leveraging contrastive learning often face limitations in developing fine-grained conceptual understanding. This is due to random negative samples during pretraining, causing almost exclusively very dissimilar concepts to be compared in the loss function. Consequently, the models struggle with fine-grained semantic differences. To address this problem, we introduce a novel pretraining method incorporating synthetic hard negative text examples. The hard negatives re...

URL to content:

https://aclanthology.org/2024.alvr-1.9.pdf

URL to preprint:

https://doi.org/10.48550/arXiv.2403.02875

Faculty:

Fakultät für Informatik; Fakultät für Elektrotechnik und Technische Informatik

Institute:

INF 7 - Institut für Datensicherheit; ETTI 2 - Institut für Verteilte Intelligente Systeme

Professorship:

Geierhos, Michaela; Oswald, Norbert

(Research) institution at UniBw M:

CODE