Unifying Event Detection and Captioning as Sequence Generation via Pre-training (English)
Zhang, Qi / Song, Yuqing / Jin, Qin
In: Computer vision – ECCV 2022 ; Part 36 ; 363-379 ; 2022
Conference paper / Print
- Title: Unifying Event Detection and Captioning as Sequence Generation via Pre-training
- Contributors: Zhang, Qi / Song, Yuqing / Jin, Qin
- Conference: ECCV ; 17. ; 2022 ; Tel Aviv; Online
- Published in: Computer vision – ECCV 2022 ; Part 36 ; 363-379
- Publisher: Springer
- Place of publication: Cham
- Publication date: 2022
- Type of media: Conference paper
- Type of material: Print
- Language: English
- Classification: BKL 54.74 Machine vision (Maschinelles Sehen)
The tables of contents are generated automatically and are based on the data records of the individual contributions available in the index of the TIB portal. The display of the Tables of Contents may therefore be incomplete.
- 1: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing | Boecking, Benedikt / Usuyama, Naoto / Bannur, Shruthi / Castro, Daniel C. / Schwaighofer, Anton / Hyland, Stephanie / Wetscherek, Maria / Naumann, Tristan / Nori, Aditya / Alvarez-Valle, Javier et al. | 2022
- 22: Generative Negative Text Replay for Continual Vision-Language Pretraining | Yan, Shipeng / Hong, Lanqing / Xu, Hang / Han, Jianhua / Tuytelaars, Tinne / Li, Zhenguo / He, Xuming et al. | 2022
- 39: Video Graph Transformer for Video Question Answering | Xiao, Junbin / Zhou, Pan / Chua, Tat-Seng / Yan, Shuicheng et al. | 2022
- 59: Trace Controlled Text to Image Generation | Yan, Kun / Ji, Lei / Wu, Chenfei / Bao, Jianmin / Zhou, Ming / Duan, Nan / Ma, Shuai et al. | 2022
- 76: Video Question Answering with Iterative Video-Text Co-tokenization | Piergiovanni, AJ / Morton, Kairo / Kuo, Weicheng / Ryoo, Michael S. / Angelova, Anelia et al. | 2022
- 95: Rethinking Data Augmentation for Robust Visual Question Answering | Chen, Long / Zheng, Yuhang / Xiao, Jun et al. | 2022
- 113: Explicit Image Caption Editing | Wang, Zhen / Chen, Long / Ma, Wenbo / Han, Guangxing / Niu, Yulei / Shao, Jian / Xiao, Jun et al. | 2022
- 130: Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding | Hao, Jiachang / Sun, Haifeng / Ren, Pengfei / Wang, Jingyu / Qi, Qi / Liao, Jianxin et al. | 2022
- 148: Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly | Whitehead, Spencer / Petryk, Suzanne / Shakib, Vedaad / Gonzalez, Joseph / Darrell, Trevor / Rohrbach, Anna / Rohrbach, Marcus et al. | 2022
- 167: GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features | Nguyen, Van-Quang / Suganuma, Masanori / Okatani, Takayuki et al. | 2022
- 185: Selective Query-Guided Debiasing for Video Corpus Moment Retrieval | Yoon, Sunjae / Hong, Ji Woo / Yoon, Eunseop / Kim, Dahyun / Kim, Junyeong / Yoon, Hee Suk / Yoo, Chang D. et al. | 2022
- 201: Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding | Shi, Cheng / Yang, Sibei et al. | 2022
- 219: Object-Centric Unsupervised Image Captioning | Meng, Zihang / Yang, David / Cao, Xuefei / Shah, Ashish / Lim, Ser-Nam et al. | 2022
- 236: Contrastive Vision-Language Pre-training with Limited Resources | Cui, Quan / Zhou, Boyan / Guo, Yu / Yin, Weidong / Wu, Hao / Yoshie, Osamu / Chen, Yubo et al. | 2022
- 254: Learning Linguistic Association Towards Efficient Text-Video Retrieval | Fang, Sheng / Wang, Shuhui / Zhuo, Junbao / Han, Xinzhe / Huang, Qingming et al. | 2022
- 271: ASSISTER: Assistive Navigation via Conditional Instruction Generation | Huang, Zanming / Shangguan, Zhongkai / Zhang, Jimuyang / Bar, Gilad / Boyd, Matthew / Ohn-Bar, Eshed et al. | 2022
- 290: X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks | Cai, Zhaowei / Kwon, Gukyeong / Ravichandran, Avinash / Bas, Erhan / Tu, Zhuowen / Bhotika, Rahul / Soatto, Stefano et al. | 2022
- 309: Learning Disentanglement with Decoupled Labels for Vision-Language Navigation | Cheng, Wenhao / Dong, Xingping / Khan, Salman / Shen, Jianbing et al. | 2022
- 330: Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | Guo, Qingpei / Yao, Kaisheng / Chu, Wei et al. | 2022
- 347: Word-Level Fine-Grained Story Visualization | Li, Bowen et al. | 2022
- 363: Unifying Event Detection and Captioning as Sequence Generation via Pre-training | Zhang, Qi / Song, Yuqing / Jin, Qin et al. | 2022
- 380: Multimodal Transformer with Variable-Length Memory for Vision-and-Language Navigation | Lin, Chuang / Jiang, Yi / Cai, Jianfei / Qu, Lizhen / Haffari, Gholamreza / Yuan, Zehuan et al. | 2022
- 398: Fine-Grained Visual Entailment | Thomas, Christopher / Zhang, Yipeng / Chang, Shih-Fu et al. | 2022
- 417: Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds | Jain, Ayush / Gkanatsios, Nikolaos / Mediratta, Ishita / Fragkiadaki, Katerina et al. | 2022
- 434: New Datasets and Models for Contextual Reasoning in Visual Dialog | Zhang, Yifeng / Jiang, Ming / Zhao, Qi et al. | 2022
- 452: VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection | Hong, Joanna / Kim, Minsu / Ro, Yong Man et al. | 2022
- 469: Classification-Regression for Chart Comprehension | Levy, Matan / Ben-Ari, Rami / Lischinski, Dani et al. | 2022
- 485: AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant | Wong, Benita / Chen, Joya / Wu, You / Lei, Stan Weixian / Mao, Dongxing / Gao, Difei / Shou, Mike Zheng et al. | 2022
- 502: FindIt: Generalized Localization with Natural Language Queries | Kuo, Weicheng / Bertsch, Fred / Li, Wei / Piergiovanni, A. J. / Saffar, Mohammad / Angelova, Anelia et al. | 2022
- 521: UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling | Yang, Zhengyuan / Gan, Zhe / Wang, Jianfeng / Hu, Xiaowei / Ahmed, Faisal / Liu, Zicheng / Lu, Yumao / Wang, Lijuan et al. | 2022
- 540: Scaling Open-Vocabulary Image Segmentation with Image-Level Labels | Ghiasi, Golnaz / Gu, Xiuye / Cui, Yin / Lin, Tsung-Yi et al. | 2022
- 558: The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning | Hessel, Jack / Hwang, Jena D. / Park, Jae Sung / Zellers, Rowan / Bhagavatula, Chandra / Rohrbach, Anna / Saenko, Kate / Choi, Yejin et al. | 2022
- 576: Speaker-Adaptive Lip Reading with User-Dependent Padding | Kim, Minsu / Kim, Hyunjun / Ro, Yong Man et al. | 2022
- 594: TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation | Dinh, Tan M. / Nguyen, Rang / Hua, Binh-Son et al. | 2022
- 610: SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding | Heisler, Morgan / Banitalebi-Dehkordi, Amin / Zhang, Yong et al. | 2022
- 627: Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance | Choi, Myungsub et al. | 2022
- 644: NewsStories: Illustrating Articles with Visual Summaries | Tan, Reuben / Plummer, Bryan A. / Saenko, Kate / Lewis, JP / Sud, Avneesh / Leung, Thomas et al. | 2022
- 662: Webly Supervised Concept Expansion for General Purpose Vision Models | Kamath, Amita / Clark, Christopher / Gupta, Tanmay / Kolve, Eric / Hoiem, Derek / Kembhavi, Aniruddha et al. | 2022
- 682: FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation | Zhou, Kaiwen / Wang, Xin Eric et al. | 2022
- 700: CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval | Wang, Haoran / He, Dongliang / Wu, Wenhao / Xia, Boyang / Yang, Min / Li, Fu / Yu, Yunlong / Ji, Zhong / Ding, Errui / Wang, Jingdong et al. | 2022
- 717: Language-Driven Artistic Style Transfer | Fu, Tsu-Jui / Wang, Xin Eric / Wang, William Yang et al. | 2022
- 735: Single-Stream Multi-level Alignment for Vision-Language Pretraining | Khan, Zaid / Vijay Kumar, B. G. / Yu, Xiang / Schulter, Samuel / Chandraker, Manmohan / Fu, Yun et al. | 2022