Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding (English)
- Shi, Fengyuan: https://orcid.org/0000-0003-3239-9389
- Gao, Ruopeng: https://orcid.org/0009-0002-5522-5488
- Huang, Weilin: https://orcid.org/0000-0002-1520-4140
- Wang, Limin: https://orcid.org/0000-0002-3674-7718
In: IEEE Transactions on Pattern Analysis and Machine Intelligence ; 46, 2 ; 1181-1198 ; 2024
- Article (Journal) / Electronic Resource
- Title: Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
- Contributors: Shi, Fengyuan (author) / Gao, Ruopeng (author) / Huang, Weilin (author) / Wang, Limin (author)
- Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence ; 46, 2 ; 1181-1198
- Publisher: IEEE
- Publication date: 2024-02-01
- Size: 11806254 bytes
- ISSN:
- DOI:
- Type of media: Article (Journal)
- Type of material: Electronic Resource
- Language: English
- Source:
Table of contents – Volume 46, Issue 2
The tables of contents are generated automatically and are based on the data records of the individual contributions available in the index of the TIB portal. The display of the Tables of Contents may therefore be incomplete.
- 667 | Diffusion Mechanism in Residual Neural Network: Theory and Applications | Wang, Tangjun / Dou, Zehao / Bao, Chenglong / Shi, Zuoqiang et al. | 2024
- 681 | OPAL: Occlusion Pattern Aware Loss for Unsupervised Light Field Disparity Estimation | Li, Peng / Zhao, Jiayin / Wu, Jingyao / Deng, Chao / Han, Yuqi / Wang, Haoqian / Yu, Tao et al. | 2024
- 695 | An Asynchronous Linear Filter Architecture for Hybrid Event-Frame Cameras | Wang, Ziwei / Ng, Yonhon / Scheerlinck, Cedric / Mahony, Robert et al. | 2024
- 712 | Generalizable Heterogeneous Federated Cross-Correlation and Instance Similarity Learning | Huang, Wenke / Ye, Mang / Shi, Zekun / Du, Bo et al. | 2024
- 729 | Blockchain Data Mining With Graph Learning: A Survey | Qi, Yuxin / Wu, Jun / Xu, Hansong / Guizani, Mohsen et al. | 2024
- 749 | Understanding and Accelerating Neural Architecture Search With Training-Free and Theory-Grounded Metrics | Chen, Wuyang / Gong, Xinyu / Wu, Junru / Wei, Yunchao / Shi, Humphrey / Yan, Zhicheng / Yang, Yi / Wang, Zhangyang et al. | 2024
- 764 | Image Captioning With Controllable and Adaptive Length Levels | Ding, Ning / Deng, Chaorui / Tan, Mingkui / Du, Qing / Ge, Zhiwei / Wu, Qi et al. | 2024
- 780 | Tessellating the Latent Space for Non-Adversarial Generative Auto-Encoders | Gai, Kuo / Zhang, Shihua et al. | 2024
- 793 | Progressive Learning of 3D Reconstruction Network From 2D GAN Data | Dundar, Aysegul / Gao, Jun / Tao, Andrew / Catanzaro, Bryan et al. | 2024
- 805 | COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition | Tellamekala, Mani Kumar / Amiriparian, Shahin / Schuller, Bjorn W. / Andre, Elisabeth / Giesbrecht, Timo / Valstar, Michel et al. | 2024
- 823 | Video Frame Interpolation With Many-to-Many Splatting and Spatial Selective Refinement | Hu, Ping / Niklaus, Simon / Zhang, Lu / Sclaroff, Stan / Saenko, Kate et al. | 2024
- 837 | Non-Fluent Synthetic Target-Language Data Improve Neural Machine Translation | Sanchez-Cartagena, Victor M. / Espla-Gomis, Miquel / Perez-Ortiz, Juan Antonio / Sanchez-Martinez, Felipe et al. | 2024
- 851 | Explanatory Object Part Aggregation for Zero-Shot Learning | Chen, Xin / Deng, Xiaoling / Lan, Yubin / Long, Yongbing / Weng, Jian / Liu, Zhiquan / Tian, Qi et al. | 2024
- 869 | Cost Function Unrolling in Unsupervised Optical Flow | Lifshitz, Gal / Raviv, Dan et al. | 2024
- 881 | Deep Image Matting With Sparse User Interactions | Wei, Tianyi / Chen, Dongdong / Zhou, Wenbo / Liao, Jing / Zhao, Hanqing / Zhang, Weiming / Hua, Gang / Yu, Nenghai et al. | 2024
- 896 | MetaFormer Baselines for Vision | Yu, Weihao / Si, Chenyang / Zhou, Pan / Luo, Mi / Zhou, Yichen / Feng, Jiashi / Yan, Shuicheng / Wang, Xinchao et al. | 2024
- 913 | CGOF++: Controllable 3D Face Synthesis With Conditional Generative Occupancy Fields | Sun, Keqiang / Wu, Shangzhe / Zhang, Ning / Huang, Zhaoyang / Wang, Quan / Li, Hongsheng et al. | 2024
- 927 | Reliable Event Generation With Invertible Conditional Normalizing Flow | Gu, Daxin / Li, Jia / Zhu, Lin / Zhang, Yu / Ren, Jimmy S. et al. | 2024
- 944 | WOOD: Wasserstein-Based Out-of-Distribution Detection | Wang, Yinan / Sun, Wenbo / Jin, Jionghua / Kong, Zhenyu / Yue, Xiaowei et al. | 2024
- 957 | Mitigating Confounding Bias in Practical Recommender Systems With Partially Inaccessible Exposure Status | Cao, Tianwei / Xu, Qianqian / Yang, Zhiyong / Huang, Qingming et al. | 2024
- 975 | 3-D Point Cloud Attribute Compression With p-Laplacian Embedding Graph Dictionary Learning | Li, Xin / Dai, Wenrui / Li, Shaohui / Li, Chenglin / Zou, Junni / Xiong, Hongkai et al. | 2024
- 994 | Room-Object Entity Prompting and Reasoning for Embodied Referring Expression | Gao, Chen / Liu, Si / Chen, Jinyu / Wang, Luting / Wu, Qi / Li, Bo / Tian, Qi et al. | 2024
- 1011 | Temporal Action Segmentation: An Analysis of Modern Techniques | Ding, Guodong / Sener, Fadime / Yao, Angela et al. | 2024
- 1031 | Variance Reduced Domain Randomization for Reinforcement Learning With Policy Gradient | Jiang, Yuankun / Li, Chenglin / Dai, Wenrui / Zou, Junni / Xiong, Hongkai et al. | 2024
- 1049 | Learning Hierarchical Modular Networks for Video Captioning | Li, Guorong / Ye, Hanhua / Qi, Yuankai / Wang, Shuhui / Qing, Laiyun / Huang, Qingming / Yang, Ming-Hsuan et al. | 2024
- 1065 | A Theoretical Analysis of DeepWalk and Node2vec for Exact Recovery of Community Structures in Stochastic Blockmodels | Zhang, Yichi / Tang, Minh et al. | 2024
- 1079 | SPLiT: Single Portrait Lighting Estimation via a Tetrad of Face Intrinsics | Fei, Fan / Cheng, Yean / Zhu, Yongjie / Zheng, Qian / Li, Si / Pan, Gang / Shi, Boxin et al. | 2024
- 1093 | Image Restoration via Frequency Selection | Cui, Yuning / Ren, Wenqi / Cao, Xiaochun / Knoll, Alois et al. | 2024
- 1109 | A Theoretical Analysis of Density Peaks Clustering and the Component-Wise Peak-Finding Algorithm | Tobin, Joshua / Zhang, Mimi et al. | 2024
- 1121 | Learning Interpretable Rules for Scalable Data Representation and Classification | Wang, Zhuo / Zhang, Wei / Liu, Ning / Wang, Jianyong et al. | 2024
- 1134 | Optimal Composite Likelihood Estimation and Prediction for Distributed Gaussian Process Modeling | Li, Yongxiang / Zhou, Qiang / Jiang, Wei / Tsui, Kwok-Leung et al. | 2024
- 1148 | Differentiable Image Data Augmentation and Its Applications: A Survey | Shi, Jian / Ghazzai, Hakim / Massoud, Yehia et al. | 2024
- 1165 | Back to Reality: Learning Data-Efficient 3D Object Detector With Shape Guidance | Xu, Xiuwei / Wang, Ziwei / Zhou, Jie / Lu, Jiwen et al. | 2024
- 1181 | Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding | Shi, Fengyuan / Gao, Ruopeng / Huang, Weilin / Wang, Limin et al. | 2024
- 1199 | False Correlation Reduction for Offline Reinforcement Learning | Deng, Zhihong / Fu, Zuyue / Wang, Lingxiao / Yang, Zhuoran / Bai, Chenjia / Zhou, Tianyi / Wang, Zhaoran / Jiang, Jing et al. | 2024
- 1212 | ViTPose++: Vision Transformer for Generic Body Pose Estimation | Xu, Yufei / Zhang, Jing / Zhang, Qiming / Tao, Dacheng et al. | 2024
- 1231 | Importance Weighted Structure Learning for Scene Graph Generation | Liu, Daqi / Bober, Miroslaw / Kittler, Josef et al. | 2024
- 1243 | Multi-Stage Asynchronous Federated Learning With Adaptive Differential Privacy | Li, Yanan / Yang, Shusen / Ren, Xuebin / Shi, Liang / Zhao, Cong et al. | 2024
- 1257 | LayerNet: High-Resolution Semantic 3D Reconstruction of Clothed People | Corona, Enric / Alenya, Guillem / Pons-Moll, Gerard / Moreno-Noguer, Francesc et al. | 2024
- 1273 | PFENet++: Boosting Few-Shot Semantic Segmentation With the Noise-Filtered Context-Aware Prior Mask | Luo, Xiaoliu / Tian, Zhuotao / Zhang, Taiping / Yu, Bei / Tang, Yuan Yan / Jia, Jiaya et al. | 2024
- 1290 | Tobias: A Random CNN Sees Objects | Cao, Yun-Hao / Wu, Jianxin et al. | 2024
- 1305 | Inequality-Constrained 3D Morphable Face Model Fitting | Sariyanidi, Evangelos / Zampella, Casey J. / Schultz, Robert T. / Tunc, Birkan et al. | 2024