D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network Using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement (English)

Zhao, Shengkui / Ma, Bin

In: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; 1-5 ; 2023

ISBN:

978-1-7281-6327-7

ISSN:

2379-190X

Conference paper / Electronic Resource


Monaural speech enhancement has been widely studied using real-valued networks in the time-frequency (TF) domain. However, because the input and the target are naturally complex-valued in the TF domain, a fully complex network is highly desirable for effectively learning the feature representation and modelling the sequence in the complex domain. Moreover, phase, an important factor for the perceptual quality of speech, has been shown to be learnable together with magnitude from noisy speech using complex masking or complex spectral mapping. Many recent studies focus on either complex masking or complex spectral mapping, ignoring their performance boundaries. To address the above issues, we propose a fully complex dual-path dual-decoder conformer network (D2Former) using joint complex masking and complex spectral mapping for monaural speech enhancement. In D2Former, we extend the conformer network into the complex domain and form a dual-path complex TF self-attention architecture for effectively modelling the complex-valued TF sequence. We further boost the TF feature representation in the encoder and the decoders using a dual-path learning structure that exploits complex dilated convolutions for time dependency and complex feedforward sequential memory networks (CFSMN) for frequency recurrence. In addition, we improve the performance boundaries of complex masking and complex spectral mapping by combining the strengths of the two training targets in a joint-learning framework. As a consequence, D2Former takes full advantage of the complex-valued operations, the dual-path processing, and the joint-training targets. Compared to previous models, D2Former achieves state-of-the-art results on the VoiceBank+Demand benchmark with the smallest model size of 0.87M parameters.

Title:

D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network Using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement
Contributors:

Zhao, Shengkui ( author ) / Ma, Bin ( author )
Published in:

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; 1-5
Publisher:

IEEE

Publication date:

2023-06-04
Size:

1908555 byte
ISBN:

978-1-7281-6327-7
ISSN:

2379-190X
DOI:

https://doi.org/10.1109/ICASSP49357.2023.10096259
Type of media:

Conference paper
Type of material:

Electronic Resource
Language:

English
Source:

IEEE

Table of contents conference proceedings

The tables of contents are generated automatically and are based on the data records of the individual contributions available in the index of the TIB portal. The display of the Tables of Contents may therefore be incomplete.

1: Diagonal State Space Augmented Transformers for Speech Recognition
Saon, George / Gupta, Ankit / Cui, Xiaodong et al. | 2023
1: Unrestricted Anchor Graph Based GCN for Incomplete Multi-View Clustering
Zhao, Liang / Wang, Zihao / Yuan, Yukun / Ding, Feng et al. | 2023
1: Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis
Kaneko, Takuhiro / Kameoka, Hirokazu / Tanaka, Kou / Seki, Shogo et al. | 2023
1: High-Dimensional Confidence Regions in Sparse MRI
Hoppe, Frederik / Krahmer, Felix / Mayrink Verdun, Claudio / Menzel, Marion I. / Rauhut, Holger et al. | 2023
1: Towards Efficient and Optimal Joint Beamforming and Antenna Selection: A Machine Learning Approach
Shrestha, Sagar / Fu, Xiao / Hong, Mingyi et al. | 2023
1: Quantum Graph Transformers
Kollias, Georgios / Kalantzis, Vassilis / Salonidis, Theodoros / Ubaru, Shashanka et al. | 2023
1: Deep3DSketch: 3D Modeling from Free-Hand Sketches with View- and Structural-Aware Adversarial Training
Chen, Tianrun / Fu, Chenglong / Zhu, Lanyun / Mao, Papa / Zhang, Jia / Zang, Ying / Sun, Lingyun et al. | 2023
1: PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping
Lee, Junhyeok / Han, Seungu / Cho, Hyunjae / Jung, Wonbin et al. | 2023
1: A Method of Constructing and Automatically Labeling Radio Frequency Signal Training Dataset for UAV
Liu, Chao / Ma, Ruipeng / Si, Zheng / Chi, Mingmin et al. | 2023
1: An Online Algorithm for Contrastive Principal Component Analysis
Golkar, Siavash / Lipshutz, David / Tesileanu, Tiberiu / Chklovskii, Dmitri B. et al. | 2023
1: Small-Footprint Slimmable Networks for Keyword Spotting
Akhtar, Zuhaib / Khursheed, Mohammad Omar / Du, Dongsu / Liu, Yuzong et al. | 2023
1: UFO2: A Unified Pre-Training Framework for Online and Offline Speech Recognition
Fu, Li / Li, Siqi / Li, Qingtao / Deng, Liping / Li, Fangzhu / Fan, Lu / Chen, Meng / He, Xiaodong et al. | 2023
1: Audio Coding With Unified Noise Shaping And Phase Contrast Control
Jo, Byeongho / Beack, Seungkwon / Lee, Taejin et al. | 2023
1: Learning To Locate Visual Answer In Video Corpus Using Question
Li, Bin / Weng, Yixuan / Sun, Bin / Li, Shutao et al. | 2023
1: ECG Artifact Removal from Single-Channel Surface EMG Using Fully Convolutional Networks
Wang, Kuan-Chen / Liu, Kai-Chun / Peng, Sheng-Yu / Tsao, Yu et al. | 2023
1: K²NN: Self-Supervised Learning with Hierarchical Nearest Neighbors for Remote Sensing
Yuan, Jianlong / Xu, Yuanhong / Wang, Zhibin et al. | 2023
1: Approximation Error Back-Propagation for Q-Function in Scalable Reinforcement Learning with Tree Dependence Structure
Yan, Yuzi / Dong, Yu / Ma, Kai / Shen, Yuan et al. | 2023
1: Multi-Resolution Sequence Aggregation and Model-Agnostic Framework for Time-Series Forecasting
Lyu, Juhyun / Yang, Jinseok / Kim, Junghee / Lim, Woohyung / Ahn, Wonbin / Kang, Dongwan / Kim, Minjae / Kim, Nam Soo et al. | 2023
1: DMSA: Dynamic Multi-Scale Unsupervised Semantic Segmentation Based On Adaptive Affinity
Yang, Kun / Lu, Jun et al. | 2023
1: A Discriminative Multi-Channel Noise Feature Representation Model for Image Manipulation Localization
Zhou, Yang / Wang, Hongxia / Zeng, Qiang / Zhang, Rui / Meng, Sijiang et al. | 2023
1: Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing audio-visual Speech Enhancement
Zhang, Chen-Yue / Chen, Hang / Du, Jun / Yin, Bao-Cai / Pan, Jia / Lee, Chin-Hui et al. | 2023
1: Equivalence of Aperture Reduction in Element Space and Constrained Combination of DFT Beams in Beamspace
Rakhimov, Damir / Haardt, Martin et al. | 2023
1: Contrastive Learning at the Relation and Event Level for Rumor Detection
Xu, Yingrui / Hu, Jingyuan / Ge, Jingguo / Wu, Yulei / Li, Tong / Li, Hui et al. | 2023
1: Beamforming Optimization in RIS-Aided Mimo Systems Under Multiple-Reflection Effects
Wijekoon, Dilki / Mezghani, Amine / Hossain, Ekram et al. | 2023
1: EEG2IMAGE: Image Reconstruction from EEG Brain Signals
Singh, Prajwal / Pandey, Pankaj / Miyapuram, Krishna / Raman, Shanmuganathan et al. | 2023
1: Dual Meta Calibration Mix for Improving Generalization in Meta-Learning
Mi, Ze-Yu / Yang, Yu-Bin et al. | 2023
1: Implicit Bayes Adaptation: A Collaborative Transport Approach
Jiang, Bo / Krim, Hamid / Wu, Tianfu / Cansever, Derya et al. | 2023
1: Blind Source Counting and Separation with Relative Harmonic Coefficients
Sun, Huiyuan / Samarasinghe, Prasanga / Abhayapala, Thushara et al. | 2023
1: YOLOX-B: A Better Yolox Model for Real-Time Driver Behavior Detection
Guo, Xu / Ma, Ming / Zhang, Jiaqiang / Li, Shaojie et al. | 2023
1: Active Noise Control over 3D Space: A Realistic Error Microphone Geometry Design
Sun, Huiyuan / Samarasinghe, Prasanga / Abhayapala, Thushara et al. | 2023
1: A Multi-Stage Hierarchical Relational Graph Neural Network for Multimodal Sentiment Analysis
Gong, Peizhu / Liu, Jin / Zhang, Xiliang / Li, Xingye et al. | 2023
1: Single-Sample Direction-of-Arrival Estimation for Fast and Robust 3D Localization With Real Measurements from a Massive MIMO System
Mazokha, Stepan / Naderi, Sanaz / Orfanidis, Georgios I. / Sklivanitis, George / Pados, Dimitris A. / Hallstrom, Jason O. et al. | 2023
1: Low in Resolution, High in Precision: UAV Detection with Super-Resolution and Motion Information Extraction
Wang, Hanzhuo / Wang, Xingjian / Zhou, Chengwei / Meng, Wenchao / Shi, Zhiguo et al. | 2023
1: Continuous Descriptor-Based Control for Deep Audio Synthesis
Devis, Ninon / Demerle, Nils / Nabi, Sarah / Genova, David / Esling, Philippe et al. | 2023
1: SSGD: A Smartphone Screen Glass Dataset for Defect Detection
Han, Haonan / Yang, Rui / Li, Shuyan / Hu, Runze / Li, Xiu et al. | 2023
1: Leveraging Phone-Level Linguistic-Acoustic Similarity For Utterance-Level Pronunciation Scoring
Liu, Wei / Fu, Kaiqi / Tian, Xiaohai / Shi, Shuju / Li, Wei / Ma, Zejun / Lee, Tan et al. | 2023
1: Learning Unbiased Rewards with Mutual Information in Adversarial Imitation Learning
Zhang, Lihua / Liu, Quan / Huang, Zhigang / Wu, Lan et al. | 2023
1: Parasympathetic-Sympathetic Causal Interactions and Perceived Workload for Varying Difficulty Affective Computing Tasks
Lavanuru, Pravallika / Pratiher, Sawon / Sahoo, Karuna P. / Acharya, Mrinal / S, Sreejith / Ghosh, Nirmalya / Patra, Amit et al. | 2023
1: Picking the Underused Heads: A Network Pruning Perspective of Attention Head Selection for Fusing Dialogue Coreference Information
Liu, Zhengyuan / Chen, Nancy F. et al. | 2023
1: Deep Plug-and-Play for Tensor Robust Principal Component Analysis
Tan, Hao / Wang, Jianjun / Kong, Weichao et al. | 2023
1: Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages
Durand, Simon / Stoller, Daniel / Ewert, Sebastian et al. | 2023
1: Robust Knowledge Distillation from RNN-T Models with Noisy Training Labels Using Full-Sum Loss
Zeineldeen, Mohammad / Audhkhasi, Kartik / Baskar, Murali Karthick / Ramabhadran, Bhuvana et al. | 2023
1: Hiding Speaker’s Sex in Speech Using Zero-Evidence Speaker Representation in an Analysis/Synthesis Pipeline
Noe, Paul-Gauthier / Miao, Xiaoxiao / Wang, Xin / Yamagishi, Junichi / Bonastre, Jean-Francois / Matrouf, Driss et al. | 2023
1: ICEL: Learning with Inconsistent Explanations
Liu, Biao / Wu, Xiaoyu / Yuan, Bo et al. | 2023
1: Facial Texure Perceiver: Towards High-Fidelity Facial Texture Recovery with Input-Level Inductive Biased Perceiver IO
Lee, Seungeun et al. | 2023
1: Single-Shot Domain Adaptation via Target-Aware Generative Augmentations
Subramanyam, Rakshith / Thopalli, Kowshik / Berman, Spring / Turaga, Pavan / Thiagarajan, Jayaraman J. et al. | 2023
1: Distance-Based Weight Transfer for Fine-Tuning From Near-Field to Far-Field Speaker Verification
Zhang, Li / Wang, Qing / Wang, Hongji / Li, Yue / Rao, Wei / Wang, Yannan / Xie, Lei et al. | 2023
1: Efficient and Effective Multi-Camera Pose Estimation with Weighted M-Estimate Sample Consensus
Lin, Xinyu / Zhou, Yingjie / Zhang, Xun / Liu, Yipeng / Zhu, Ce et al. | 2023
1: Paaploss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement
Yang, Muqiao / Konan, Joseph / Bick, David / Zeng, Yunyang / Han, Shuo / Kumar, Anurag / Watanabe, Shinji / Raj, Bhiksha et al. | 2023
1: A Novel Extrapolation Technique to Accelerate WMMSE
Zhou, Kaiwen / Chen, Zhilin / Liu, Guochen / Chen, Zhitang et al. | 2023
1: Improving Non-Autoregressive Speech Recognition with Autoregressive Pretraining
Li, Yanjia / Samarakoon, Lahiru / Fung, Ivan et al. | 2023
1: CORSD: Class-Oriented Relational Self Distillation
Yu, Muzhou / Tan, Sia Huat / Wu, Kailu / Dong, Runpei / Zhang, Linfeng / Ma, Karsheng et al. | 2023
1: Short-Segment Speaker Verification Using ECAPA-TDNN with Multi-Resolution Encoder
Han, Sangwook / Ahn, Youngdo / Kang, Kyeongmuk / Shin, Jong Won et al. | 2023
1: Prefix Tuning for Automated Audio Captioning
Kim, Minkyu / Sung-Bin, Kim / Oh, Tae-Hyun et al. | 2023
1: Real-Time Multichannel Speech Separation and Enhancement Using a Beamspace-Domain-Based Lightweight CNN
Olivieri, Marco / Comanducci, Luca / Pezzoli, Mirco / Balsarri, Davide / Menescardi, Luca / Buccoli, Michele / Pecorino, Simone / Grosso, Antonio / Antonacci, Fabio / Sarti, Augusto et al. | 2023
1: LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer
Gong, Xun / Wu, Yu / Li, Jinyu / Liu, Shujie / Zhao, Rui / Chen, Xie / Qian, Yanmin et al. | 2023
1: WIFI-Based Robust Child Presence Detection for Smart Cars
Jayaweera, Sakila S. / Wang, Beibei / Zeng, Xiaolu / Wang, Wei-Hsiang / Ray Liu, K. J. et al. | 2023
1: CANDY: Category-Kernelized Dynamic Convolution for Instance Segmentation
Lu, Yao / Chen, Zhiyi / Chen, Zehui / Hu, Jie / Cao, Liujuan / Zhang, Shengchuan et al. | 2023
1: Distance-Based Online Label Inference Attacks Against Split Learning
Liu, Junlin / Lyu, Xinchen et al. | 2023
1: Combining the Silhouette and Skeleton Data for Gait Recognition
Wang, Likai / Han, Ruize / Feng, Wei et al. | 2023
1: Comparing Decentralized Gradient Descent Approaches and Guarantees
Moothedath, Shana / Vaswani, Namrata et al. | 2023
1: Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
Landini, Federico / Diez, Mireia / Lozano-Diez, Alicia / Burget, Lukas et al. | 2023
1: D-CONFORMER: Deformable Sparse Transformer Augmented Convolution for Voxel-Based 3D Object Detection
Zhao, Xiao / Su, Liuzhen / Zhang, Xukun / Yang, Dingkang / Sun, Mingyang / Wang, Shunli / Zhai, Peng / Zhang, Lihua et al. | 2023
1: Spatial Inference Using Censored Multiple Testing with Fdr Control
Golz, Martin / Zoubir, Abdelhak M. / Koivunen, Visa et al. | 2023
1: Runtime Prediction of Machine Learning Algorithms in Automl Systems
Dube, Parijat / Salonidis, Theodoros / Ram, Parikshit / Verma, Ashish et al. | 2023
1: Transformer-Based Bioacoustic Sound Event Detection on Few-Shot Learning Tasks
You, Liwen / Coyotl, Erika Pelaez / Gunturu, Suren / Van Segbroeck, Maarten et al. | 2023
1: Unlimited Sampling in Phase Space
Zhang, Peiyu / Bhandari, Ayush et al. | 2023
1: Integrated Sensing and Full-Duplex Communication: Joint Transceiver Beamforming and Power Allocation
He, Zhenyao / Xu, Wei / Shen, Hong / Kwan Ng, Derrick Wing / Eldar, Yonina C. / You, Xiaohu et al. | 2023
1: Online Model Compression for Federated Learning with Large Models
Yang, Tien-Ju / Xiao, Yonghui / Motta, Giovanni / Beaufays, Francoise / Mathews, Rajiv / Chen, Mingqing et al. | 2023
1: Active Beam Tracking with Reconfigurable Intelligent Surface
Han, Han / Jiang, Tao / Yu, Wei et al. | 2023
1: A Magnetic Framelet-Based Convolutional Neural Network for Directed Graphs
Lin, Lequan / Gao, Junbin et al. | 2023
1: An Edge Alignment-Based Orientation Selection Method for Neutron Tomography
Yang, Diyu / Tang, Shimin / Venkatakrishnan, Singanallur V. / Chowdhury, Mohammad S. N. / Zhang, Yuxuan / Bilheux, Hassina Z. / Buzzard, Gregery T. / Bouman, Charles A. et al. | 2023
1: SMUG: Towards Robust Mri Reconstruction by Smoothed Unrolling
Li, Hui / Jia, Jinghan / Liang, Shijun / Yao, Yuguang / Ravishankar, Saiprasad / Liu, Sijia et al. | 2023
1: Weavspeech: Data Augmentation Strategy For Automatic Speech Recognition Via Semantic-Aware Weaving
Seo, Kyusung / Park, Joonhyung / Song, Jaeyun / Yang, Eunho et al. | 2023
1: CTTSR: A Hybrid CNN-Transformer Network for Scene Text Image Super-Resolution
Dai, Kaiwei / Kang, Nan / Kuang, Li et al. | 2023
1: M22: Rate-Distortion Inspired Gradient Compression
Liu, Yangyi / Salehkalaibar, Sadaf / Rini, Stefano / Chen, Jun et al. | 2023
1: Joint Training of Hierarchical GANs and Semantic Segmentation for Expression Translation
Bodur, Rumeysa / Bhattarai, Binod / Kim, Tae-Kyun et al. | 2023
1: Performance Comparison of TTS Models for Brazilian Portuguese to Establish a Baseline
Lobato, Wilmer / Farias, Felipe / Cruz, William / Amadeus, Marcellus et al. | 2023
1: On Adversarial Robustness of Audio Classifiers
Lu, Kangkang / Nguyen, Manh Cuong / Xu, Xun / Foo, Chuan Sheng et al. | 2023
1: Audio-Driven High Definetion and Lip-Synchronized Talking Face Generation Based on Face Reenactment
Wang, Xianyu / Zhang, Yuhan / He, Weihua / Wang, Yaoyuan / Li, Minglei / Wang, Yuchen / Zhang, Jingyi / Zhou, Shunbo / Zhang, Ziyang et al. | 2023
1: Text-To-Speech Synthesis Based on Latent Variable Conversion Using Diffusion Probabilistic Model and Variational Autoencoder
Yasuda, Yusuke / Toda, Tomoki et al. | 2023
1: Representation Learning of Clinical Multivariate Time Series with Random Filter Banks
Keshavarzian, Alireza / Salehinejad, Hojjat / Valaee, Shahrokh et al. | 2023
1: Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation
Lam, Tsz Kin / Schamoni, Shigehiko / Riezler, Stefan et al. | 2023
1: SDRNet: Shape Decoupled Regression Network for 3d face Reconstruction
Zhang, Shikun / Song, Fengyi / Song, Ge / Yang, Ming et al. | 2023
1: IR-ECG: Invertible Reconstruction of ECG
Wang, Peng / Huang, Xi / Cui, Li et al. | 2023
1: Data Leakage in Cross-Modal Retrieval Training: A Case Study
Weck, Benno / Serra, Xavier et al. | 2023
1: EfficientSpeech: An On-Device Text to Speech Model
Atienza, Rowel et al. | 2023
1: Subband Dependency Modeling for Sound Event Detection
Guan, Yadong / Zheng, Guibin / Han, Jiqing / Wang, Huanliang et al. | 2023
1: Tracking Targets in Hyper-Scale Cameras Using Movement Predication
Yu, Jiaping / Zhou, Tongqing / Cai, Zhiping / Kuang, Wenyuan et al. | 2023
1: Revisit Out-Of-Vocabulary Problem For Slot Filling: A Unified Contrastive Framework With Multi-Level Data Augmentations
Guo, Daichi / Dong, Guanting / Fu, Dayuan / Wu, Yuxiang / Zeng, Chen / Hui, Tingfeng / Wang, Liwen / Li, Xuefeng / Wang, Zechen / He, Keqing et al. | 2023
1: End-to-End Amp Modeling: from Data to Controllable Guitar Amplifier Models
Juvela, Lauri / Damskagg, Eero-Pekka / Peussa, Aleksi / Makinen, Jaakko / Sherson, Thomas / Mimilakis, Stylianos I. / Rauhanen, Kimmo / Gotsopoulos, Athanasios et al. | 2023
1: TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement
Zeng, Yunyang / Konan, Joseph / Han, Shuo / Bick, David / Yang, Muqiao / Kumar, Anurag / Watanabe, Shinji / Raj, Bhiksha et al. | 2023
1: Decaying Contrast for Fine-Grained Video Representation Learning
Zhang, Heng / Su, Bing et al. | 2023
1: EMCLR: Expectation Maximization Contrastive Learning Representations
Liu, Meng / Yi, Ran / Ma, Lizhuang et al. | 2023
1: Difference Guided VHR Remote Sensing Image Change Detection
Sun, Jiukai / Liu, Ganchao / Li, Xuelong / Yuan, Yuan et al. | 2023
1: Topology Uncertainty Modeling For Imbalanced Node Classification on Graphs
Gao, Jiayi / Li, Jiaxing / Zhang, Ke / Kong, Youyong et al. | 2023
1: SSI-Net: A Multi-Stage Speech Signal Improvement System for ICASSP 2023 SSI Challenge
Zhu, Weixin / Wang, Zilin / Lin, Jiuxin / Zeng, Chang / Yu, Tao et al. | 2023
1: Blind Acoustic Room Parameter Estimation Using Phase Features
Ick, Christopher / Mehrabi, Adib / Jin, Wenyu et al. | 2023
1: Exploiting Speaker Embeddings for Improved Microphone Clustering and Speech Separation in ad-hoc Microphone Arrays
Kindt, Stijn / Thienpondt, Jenthe / Madhu, Nilesh et al. | 2023
1: Classification of the Cervical Vertebrae Maturation (CVM) Stages Using the Tripod Network
Atici, Salih / Pan, Hongyi / Elnagar, Mohammed H. / Allareddy, Veerasathpurush / Suhaym, Omar / Ansari, Rashid / Cetin, Ahmet Enis et al. | 2023
1: A Deep Fusion Rule for Infrared and Visible Image Fusion: Feature Communication for Importance Assessment
Lv, Xuran / Cheng, Jinyong / Lv, Guohua / Wei, Zhonghe et al. | 2023
1: On the Role of Visual Context in Enriching Music Representations
Avramidis, Kleanthis / Stewart, Shanti / Narayanan, Shrikanth et al. | 2023
1: Designing A 3d-Aware Stylenerf Encoder for Face Editing
Yang, Songlin / Wang, Wei / Peng, Bo / Dong, Jing et al. | 2023
1: Sensor Selection for Angle of Arrival Estimation Based on the Two-Target Cramér-Rao Bound
Kokke, Costas A. / Coutino, Mario / Anitori, Laura / Heusdens, Richard / Leus, Geert et al. | 2023
1: A Meta-Gnn Approach to Personalized Seizure Detection and Classification
Rahmani, Abdellah / Venkitaraman, Arun / Frossard, Pascal et al. | 2023
1: Does a Quieter City Mean Fewer Complaints? The Sounds of New York City During Covid-19 Lockdown
Cartwright, Mark / Fuentes, Magdalena / Mydlarz, Charlie / Miranda, Fabio / Bello, Juan Pablo et al. | 2023
1: ECGT2T: Towards Synthesizing Twelve-Lead Electrocardiograms from Two Asynchronous Leads
Jo, Yong-Yeon / Choi, Young Sang / Jang, Jong-Hwan / Kwon, Joon-Myoung et al. | 2023
1: Once-for-All Sequence Compression for Self-Supervised Speech Models
Chen, Hsuan-Jui / Meng, Yen / Lee, Hung-yi et al. | 2023
1: UX-Net: Filter-and-Process-Based Improved U-Net for real-time time-domain audio Separation
Patel, Kashyap / Kovalyov, Anton / Panahi, Issa et al. | 2023
1: Dasformer: Deep Alternating Spectrogram Transformer For Multi/Single-Channel Speech Separation
Wang, Shuo / Kong, Xiangyu / Peng, Xiulian / Movassagh, Hesam / Prakash, Vinod / Lu, Yan et al. | 2023
1: Audio Barlow Twins: Self-Supervised Audio Representation Learning
Anton, Jonah / Coppock, Harry / Shukla, Pancham / Schuller, Bjorn W. et al. | 2023
1: Confidence-Based Event-Centric Online Video Question Answering on a Newly Constructed ATBS Dataset
Kong, Weikai / Ye, Shuhong / Yao, Chenglin / Ren, Jianfeng et al. | 2023
1: Mcrood: Multi-Class Radar Out-Of-Distribution Detection
Kahya, Sabri Mustafa / Sami Yavuz, Muhammet / Steinbach, Eckehard et al. | 2023
1: Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity
Alonso-Jimenez, Pablo / Favory, Xavier / Foroughmand, Hadrien / Bourdalas, Grigoris / Serra, Xavier / Lidy, Thomas / Bogdanov, Dmitry et al. | 2023
1: Multimodal Dyadic Impression Recognition via Listener Adaptive Cross-Domain Fusion
Li, Yuanchao / Bell, Peter / Lai, Catherine et al. | 2023
1: Forensics for Adversarial Machine Learning Through Attack Mapping Identification
Yan, Allen / Kim, Jinsub / Raich, Raviv et al. | 2023
1: Sketch Less Face Image Retrieval: A New Challenge
Dai, Dawei / Li, Yutang / Wang, Liang / Fu, Shiyu / Xia, Shuyin / Wang, Guoyin et al. | 2023
1: Sample-Adapt Fusion Network for RGB-D Hand Detection in the Wild
Liu, Xingyu / Ren, Pengfei / Chen, Yuchen / Liu, Cong / Wang, Jing / Sun, Haifeng / Qi, Qi / Wang, Jingyu et al. | 2023
1: Semantic Preserving Learning for Task-Oriented Point Cloud Downsampling
Xiong, Jianyu / Dai, Tao / Zha, Yaohua / Wang, Xin / Xia, Shu-Tao et al. | 2023
1: Subgradient Descent Learning with Over-the-Air Computation
Gez, Tamir L. S. / Cohen, Kobi et al. | 2023
1: Rigid-Body Sound Synthesis with Differentiable Modal Resonators
Diaz, Rodrigo / Hayes, Ben / Saitis, Charalampos / Fazekas, Gyorgy / Sandler, Mark et al. | 2023
1: Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV
Torcoli, Matteo / Habets, Emanuel A. P. et al. | 2023
1: An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification
Zhong, Zhi / Hirano, Masato / Shimada, Kazuki / Tateishi, Kazuya / Takahashi, Shusuke / Mitsufuji, Yuki et al. | 2023
1: Hadamard Layer to Improve Semantic Segmentation
Hoyos, Angello / Rivera, Mariano et al. | 2023
1: Decoding Musical Pitch from Human Brain Activity with Automatic Voxel-Wise Whole-Brain FMRI Feature Selection
Cheung, Vincent K.M. / Peng, Yueh-Po / Lin, Jing-Hua / Su, Li et al. | 2023
1: Graph Wavelet-Based Point Cloud Geometric Denoising with Surface-Consistent Non-Negative Kernel Regression
Watanabe, Ryosuke / Nonaka, Keisuke / Pavez, Eduardo / Kobayashi, Tatsuya / Ortega, Antonio et al. | 2023
1: Semi-Swinderain: Semi-Supervised Image Deraining Network Using SWIN Transformer
Ren, Chun / Yan, Danfeng / Cai, Yuanqiang / Li, Yangchun et al. | 2023
1: Hierarchical Multi-Task Learning for Fabric Component Analysis Based on NIR Spectral Signals
Kim, Joseph / Wu, Dong / Chi, Mingmin / Xu, Gaoqi et al. | 2023
1: Transferring Quantified Emotion Knowledge for the Detection of Depression in Alzheimer’s Disease Using Forestnets
Perez-Toro, P. A. / Rodriguez-Salas, D. / Arias-Vergara, T. / Bayerl, S. P. / Klumpp, P. / Riedhammer, K. / Schuster, M. / Noth, E. / Maier, A. / Orozco-Arroyave, J. R. et al. | 2023
1: End-to-End Classification of Cell-Cycle Stages with Center-Cell Focus Tracker Using Recurrent Neural Networks
Jose, Abin / Roy, Rijo / Eschweiler, Dennis / Laube, Ina / Azad, Reza / Moreno-Andres, Daniel / Stegmaier, Johannes et al. | 2023
1: Client Selection for Generalization in Accelerated Federated Learning: A Bandit Approach
Ami, Dan Ben / Cohen, Kobi / Zhao, Qing et al. | 2023
1: Efficient Speech Translation with Dynamic Latent Perceivers
Tsiamas, Ioannis / Gallego, Gerard I. / Fonollosa, Jose A. R. / Costa-jussa, Marta R. et al. | 2023
1: Towards Privacy and Utility in Tourette TIC Detection Through Pretraining Based on Publicly Available Video Data of Healthy Subjects
Sophie Brugge, Nele / Mohammadi, Esfandiar / Munchau, Alexander / Baumer, Tobias / Frings, Christian / Beste, Christian / Roessner, Veit / Handels, Heinz et al. | 2023
1: Mixer: DNN Watermarking using Image Mixup
Kallas, Kassem / Furon, Teddy et al. | 2023
1: Targeted Adversarial Attacks Against Neural Machine Translation
Sadrizadeh, Sahar / Aghdam, AmirHossein Dabiri / Dolamic, Ljiljana / Frossard, Pascal et al. | 2023
1: Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization
Singh, Prachi / Kaul, Amrit / Ganapathy, Sriram et al. | 2023
1: FindAdaptNet: Find and Insert Adapters by Learned Layer Importance
Huang, Junwei / Ganesan, Karthik / Maiti, Soumi / Min Kim, Young / Chang, Xuankai / Liang, Paul / Watanabe, Shinji et al. | 2023
1: An Effective Anomalous Sound Detection Method Based on Representation Learning with Simulated Anomalies
Chen, Han / Song, Yan / Zhuo, Zhu / Zhou, Yu / Li, Yu-Hong / Xue, Hui / McLoughlin, Ian et al. | 2023
1: Batch Normalization Damages Federated Learning on NON-IID Data: Analysis and Remedy
Wang, Yanmeng / Shi, Qingjiang / Chang, Tsung-Hui et al. | 2023
1: Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification
Li, Jingyu / Tian, Yusheng / Lee, Tan et al. | 2023
1: Learning Properties of Holomorphic Neural Networks of Dual Variables
Kozlov, Dmitry / Bakulin, Mikhail / Pavlov, Stanislav / Zuev, Aleksandr / Krylova, Mariya / Kharchikov, Igor et al. | 2023
1: Recursive/Iterative Unique Projection-Aggregation Decoding of Reed-Muller Codes
Hashemipour-Nazari, Marzieh / Debets, Renate / Goossens, Kees / Balatsoukas-Stimming, Alexios et al. | 2023
1: Improved Deep Speaker Localization and Tracking: Revised Training Paradigm and Controlled Latency
Bohlender, Alexander / Roelens, Liesbeth / Madhu, Nilesh et al. | 2023
1: Static-Scene Constrained Optimization for Matrix/Tensor-Decomposition-free Foreground-Background Separation
Naganuma, Kazuki / Ono, Shunsuke et al. | 2023
1: Image Inpainting with Semantic-Aware Transformer
Chen, Shiyu / Yu, Wenxin / Wang, Qi / Gong, Jun / Chen, Peng et al. | 2023
1: MCNET: Fuse Multiple Cues for Multichannel Speech Enhancement
Yang, Yujie / Quan, Changsheng / Li, Xiaofei et al. | 2023
1: Co-Operative CNN for Visual Saliency Prediction on WCE Images
Dimas, George / Koulaouzidis, Anastasios / Iakovidis, Dimitris K. et al. | 2023
1: ISmallNet: Densely Nested Network with Label Decoupling for Infrared Small Target Detection
Hu, Zhiheng / Wang, Yongzhen / Li, Peng / Qin, Jie / Xie, Haoran / Wei, Mingqiang et al. | 2023
1: Improved Projection Learning for Lower Dimensional Feature Maps
Price, Ilan / Tanner, Jared et al. | 2023
1: Wordreg: Mitigating the Gap between Training and Inference with Worst-Case Drop Regularization
Xia, Jun / Wang, Ge / Hu, Bozhen / Tan, Cheng / Zheng, Jiangbin / Xu, Yongjie / Li, Stan Z. et al. | 2023
1: The NIO System for Audio-Visual Diarization and Recognition in MISP Challenge 2022
Xu, Gaopeng / Wang, Xianliang / Wang, Sang / Yuan, Junfeng / Guo, Wei / Li, Wei / Gao, Jie et al. | 2023
1: Vision Transformer with Progressive Tokenization for CT Metal Artifact Reduction
Zheng, Songwei / Zhang, Dong / Yu, Chunyan / Zhu, Danhong / Zhu, Longlong / Liu, Hao / Huang, Zhongzheng et al. | 2023
1: A Critical Look at Recent Trends in Compression of Channel State Information
Ornhag, Marcus Valtonen / Adalbjornsson, Stefan / Guler, Puren / Mahdavi, Mojtaba et al. | 2023
1: Speech Emotion Recognition Via Two-Stream Pooling Attention With Discriminative Channel Weighting
Liu, Ke / Wang, Dekui / Wu, Dongya / Feng, Jun et al. | 2023
1: DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks
Wang, Heming / Qian, Yao / Yang, Hemin / Kanda, Nauyuki / Wang, Peidong / Yoshioka, Takuya / Wang, Xiaofei / Wang, Yiming / Liu, Shujie / Chen, Zhuo et al. | 2023
1: Spherical Vector Quantization for Spatial Direction Coding
Ragot, Stephane / Vasilache, Adriana et al. | 2023
1: Perceive and Predict: Self-Supervised Speech Representation Based Loss Functions for Speech Enhancement
Close, George / Ravenscroft, William / Hain, Thomas / Goetze, Stefan et al. | 2023
1: DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions
Hwang, Geumbyeol / Hong, Sunwon / Lee, Seunghyun / Park, Sungwoo / Chae, Gyeongsu et al. | 2023
1: Naturalistic Head Motion Generation from Speech
Mittal, Trisha / Aldeneh, Zakaria / Fedzechkina, Masha / Ranjan, Anurag / Theobald, Barry-John et al. | 2023
1: Bayesian Methods for Optical Flow Estimation Using a Variational Approximation, with Applications to Ultrasound
Dorazil, Jan / Fleury, Bernard H. / Hlawatsch, Franz et al. | 2023
1: Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling
Wang, Zhong-Qiu / Cornell, Samuele / Choi, Shukjae / Lee, Younglo / Kim, Byeong-Yeol / Watanabe, Shinji et al. | 2023
1: LSTM-Based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls
Mittag, Gabriel / Naderi, Babak / Gopal, Vishak / Cutler, Ross et al. | 2023
1: Applying Independent Vector Analysis on EEG-Based Motor Imagery Classification
Moraes, Caroline P. A. / Aristimunha, Bruno / Dos Santos, Lucas Heck / Pinaya, Walter Hugo Lopez / de Camargo, Raphael Yokoingawa / Fantinato, Denis G. / Neves, Aline et al. | 2023
1: Hierarchical Pronunciation Assessment with Multi-Aspect Attention
Do, Heejin / Kim, Yunsu / Lee, Gary Geunbae et al. | 2023
1: Zero-Shot Anomalous Sound Detection in Domestic Environments Using Large-Scale Pretrained Audio Pattern Recognition Models
Ilic Mezza, Alessandro / Zanetti, Giulio / Cobos, Maximo / Antonacci, Fabio et al. | 2023
1: Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information
Li, Jicun / Li, Xingjian / Wang, Tianyang / Wang, Shi / Cao, Yanan / Xu, Chengzhong / Dou, Dejing et al. | 2023
1: A Model-Based Hearing Compensation Method Using a Self-Supervised Framework
Niu, Yadong / Li, Nan / Wu, Xihong / Chen, Jing et al. | 2023
1: Structured Pruning of Self-Supervised Pre-Trained Models for Speech Recognition and Understanding
Peng, Yifan / Kim, Kwangyoun / Wu, Felix / Sridhar, Prashant / Watanabe, Shinji et al. | 2023
1: Contrastive Domain Adaptation Via Delimitation Discriminator
Wei, Xing / Wen, Bin / Chen, Lei / Liu, Yujie / Zhao, Chong / Lu, Yang et al. | 2023
1: Efficient Siamese Network for UAV Tracking
Zhang, Xiaohan / Wang, Dong / Ma, Xiaohong et al. | 2023
1: Counterfactual Explanation for Multivariate Times Series Using A Contrastive Variational Autoencoder
Todo, William / Selmani, Merwann / Laurent, Beatrice / Loubes, Jean-Michel et al. | 2023
1: Long-Term Synchronization of Wireless Acoustic Sensor Networks with Nonpersistent Acoustic Activity Using Coherence State
Chinaev, Aleksej / Knaepper, Niklas / Enzner, Gerald et al. | 2023
1: CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis
Chen, Chen / Wang, Dong / Zheng, Thomas Fang et al. | 2023
1: Real-Time Speech Enhancement with Dynamic Attention Span
Zheng, Chengyu / Zhou, Yuan / Peng, Xiulian / Zhang, Yuan / Lu, Yan et al. | 2023
1: Neurally Augmented State Space Model for Simultaneous Communication and Tracking with Low Complexity Receivers
Pedraza, Fernando / Caire, Giuseppe et al. | 2023
1: Cosmopolite Sound Monitoring (CoSMo): A Study of Urban Sound Event Detection Systems Generalizing to Multiple Cities
Angulo, Florian / Essid, Slim / Peeters, Geoffroy / Mietlicki, Christophe et al. | 2023
1: NC-WAMKD: Neighborhood Correction Weight-Adaptive Multi-Teacher Knowledge Distillation for Graph-Based Semi-Supervised Node Classification
Liu, Jiahao / Guo, Pengcheng / Song, Yonghong et al. | 2023
1: F-PABEE: Flexible-Patience-Based Early Exiting For Single-Label and Multi-Label Text Classification Tasks
Gao, Xiangxiang / Zhu, Wei / Gao, Jiasheng / Yin, Congrui et al. | 2023
1: Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition
Lu, Haoyu / Li, Nan / Song, Tongtong / Wang, Longbiao / Dang, Jianwu / Wang, Xiaobao / Zhang, Shiliang et al. | 2023
1: Streaming Stroke Classification of Online Handwriting
Liu, Jing-Yu / Zhang, Yan-Ming / Yin, Fei / Liu, Cheng-Lin et al. | 2023
1: Reducing Language Confusion for Code-Switching Speech Recognition with Token-Level Language Diarization
Liu, Hexin / Xu, Haihua / Garcia, Leibny Paola / Khong, Andy W. H. / He, Yi / Khudanpur, Sanjeev et al. | 2023
1: Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification
Liu, Meng / Lee, Kong Aik / Wang, Longbiao / Zhang, Hanyi / Zeng, Chang / Dang, Jianwu et al. | 2023
1: Egocentric Audio-Visual Noise Suppression
Sharma, Roshan / He, Weipeng / Lin, Ju / Lakomkin, Egor / Liu, Yang / Kalgaonkar, Kaustubh et al. | 2023
1: Sparse Graph Learning with Spectrum Prior for Deep Graph Convolutional Networks
Zeng, Jin / Liu, Yang / Cheung, Gene / Hu, Wei et al. | 2023
1: A Game of Snakes and Gans
Asokan, Siddarth / Mohammed, Fatwir Sheikh / Sekhar Seelamantula, Chandra et al. | 2023
1: Enabling Large-Scale Image Search with Co-Attention Mechanism
Hu, Zechao / Bors, Adrian G. et al. | 2023
1: Deep Manifold Graph Auto-Encoder For Attributed Graph Embedding
Hu, Bozhen / Zang, Zelin / Xia, Jun / Wu, Lirong / Tan, Cheng / Li, Stan Z. et al. | 2023
1: Learning Expressive And Generalizable Motion Features For Face Forgery Detection
Zhang, Jingyi / Zhang, Peng / Wang, Jingjing / Xie, Di / Pu, Shiliang et al. | 2023
1: Self-Supervised Speech Representation Learning for Keyword-Spotting With Light-Weight Transformers
Gao, Chenyang / Gu, Yue / Caliva, Francesco / Liu, Yuzong et al. | 2023
1: UPGLADE: Unplugged Plug-and-Play Audio Declipper Based on Consensus Equilibrium of DNN and Sparse Optimization
Tanaka, Tomoro / Yatabe, Kohei / Oikawa, Yasuhiro et al. | 2023
1: Spatio-Temporal Structure Consistency for Semi-Supervised Medical Image Classification
Lei, Wentao / Liu, Lei / Liu, Li et al. | 2023
1: A Bandit Online Convex Optimization Approach To Distributed Energy Management In Networked Systems
Tsetis, Ioannis / Cheng, Xiaotong / Maghsudi, Setareh et al. | 2023
1: Efficiently Fusing Sparse Lidar for Enhanced Self-Supervised Monocular Depth Estimation
Wang, Yue / Gong, Mingrong / Xia, Lei / Zhang, Qieshi / Cheng, Jun et al. | 2023
1: Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer’s Disease Detection
Wang, Yi / Deng, Jiajun / Wang, Tianzi / Zheng, Bo / Hu, Shoukang / Liu, Xunying / Meng, Helen et al. | 2023
1: Sparse Bayesian Learning Assisted Decision Fusion in Millimeter Wave Massive MIMO Sensor Networks
Chawla, Apoorva / Ciuonzo, Domenico / Rossi, Pierluigi Salvo et al. | 2023
1: FedVMR: A New Federated Learning Method for Video Moment Retrieval
Wang, Yan / Luo, Xin / Chen, Zhen-Duo / Zhang, Peng-Fei / Liu, Meng / Xu, Xin-Shun et al. | 2023
1: Context-Aware Face Clustering with Graph Convolutional Networks
Zhang, Dafeng / Guo, Jiangbo / Jin, Zhezhu et al. | 2023
1: Constrained non-negative PARAFAC2 for electromyogram separation
Magbonde, Abile / Quaine, Franck / Rivet, Bertrand et al. | 2023
1: Continuous Learning for Blind Image Quality Assessment with Contrastive Transformer
Yang, Jifan / Wang, Zhongyuan / Huang, Baojin / Deng, Lianbing et al. | 2023
1: Surface-Sampling Based Objective Quality Assessment Metrics for Meshes
Fu, Chunyang / Zhang, Xiang / Nguyen-Canh, Thuong / Xu, Xiaozhong / Li, Ge / Liu, Shan et al. | 2023
1: Exploration Into Translation-Equivariant Image Quantization
Shin, Woncheol / Lee, Gyubok / Lee, Jiyoung / Lyou, Eunyi / Lee, Joonseok / Choi, Edward et al. | 2023
1: Deep Subband Network for Joint Suppression of Echo, Noise and Reverberation in Real-Time Fullband Speech Communication
Xiong, Feifei / Dong, Minya / Zhou, Kechenying / Zhu, Houwei / Feng, Jinwei et al. | 2023
1: More Speaking or More Speakers?
Berrebbi, Dan / Collobert, Ronan / Jaitly, Navdeep / Likhomanenko, Tatiana et al. | 2023
1: Neighborhood Information-Based Label Refinement for Person Re-Identification with Label Noise
Zhong, Xian / Su, Shuaipeng / Liu, Wenxuan / Jia, Xuemei / Huang, Wenxin / Wang, Mengdie et al. | 2023
1: Universal Speaker Recognition Encoders for Different Speech Segments Duration
Novoselov, Sergey / Volokhov, Vladimir / Lavrentyeva, Galina et al. | 2023
1: Joint Neural Representation for Multiple Light Fields
Guludec, Guillaume Le / Guillemot, Christine et al. | 2023
1: Semi-Supervised Speech Enhancement Based On Speech Purity
Cui, Zihao / Zhang, Shilei / Chen, Yanan / Gao, Yingying / Deng, Chao / Feng, Junlan et al. | 2023
1: Continuous Interaction with A Smart Speaker via Low-Dimensional Embeddings of Dynamic Hand Pose
Xu, Songpei / Kaul, Chaitanya / Ge, Xuri / Murray-Smith, Roderick et al. | 2023
1: Analyzing Acoustic Word Embeddings from Pre-Trained Self-Supervised Speech Models
Sanabria, Ramon / Tang, Hao / Goldwater, Sharon et al. | 2023
1: Scalable Weight Reparametrization for Efficient Transfer Learning
Kim, Byeonggeun / Lee, Jun-Tae / Yang, Seunghan / Chang, Simyung et al. | 2023
1: Efficient Large-Scale Audio Tagging Via Transformer-to-CNN Knowledge Distillation
Schmid, Florian / Koutini, Khaled / Widmer, Gerhard et al. | 2023
1: Weight-Sharing Supernet for Searching Specialized Acoustic Event Classification Networks Across Device Constraints
Lin, Guan-Ting / Tang, Qingming / Kao, Chieh-Chi / Rozgic, Viktor / Wang, Chao et al. | 2023
1: Building Change Detection Using Cross-Temporal Feature Interaction Network
Feng, Yuchao / Jiang, Jiawei / Xu, Honghui / Zheng, Jianwei et al. | 2023
1: RCDPT: Radar-Camera Fusion Dense Prediction Transformer
Lo, Chen-Chou / Vandewalle, Patrick et al. | 2023
1: Global HRTF Interpolation Via Learned Affine Transformation of Hyper-Conditioned Features
Lee, Jin Woo / Lee, Sungho / Lee, Kyogu et al. | 2023
1: Wireless Power Transfer Using Chirp Waveforms
Roy, Arijit / Psomas, Constantinos / Krikidis, Ioannis et al. | 2023
1: Analysing the Masked Predictive Coding Training Criterion for Pre-Training a Speech Representation Model
Yadav, Hemant / Sitaram, Sunayana / Shah, Rajiv Ratn et al. | 2023
1: Less Is More: A Unified Architecture for Device-Directed Speech Detection with Multiple Invocation Types
Rudovic, Oggi / Chang, Wonil / Garg, Vineet / Dighe, Pranay / Simha, Pramod / Berkowitz, Jack / Abdelaziz, Ahmed H. / Kajarekar, Sachin / Marchi, Erik / Adya, Saurabh et al. | 2023
1: MRML: Multimodal Rumor Detection by Deep Metric Learning
Peng, Liwen / Jian, Songlei / Li, Dongsheng / Shen, Siqi et al. | 2023
1: Face Recognition on Point Cloud with Cgan-Top for Denoising
Liu, Junyu / Ren, Jianfeng / Sun, Hongliang / Jiang, Xudong et al. | 2023
1: Any-to-Any Voice Conversion with F0 and Timbre Disentanglement and Novel Timbre Conditioning
Kovela, Sudheer / Valle, Rafael / Dantrey, Ambrish / Catanzaro, Bryan et al. | 2023
1: Inverse Reinforcement Learning with Graph Neural Networks for IoT Resource Allocation
Wang, Guangchen / Cheng, Peng / Chen, Zhuo / Xiang, Wei / Vucetic, Branka / Li, Yonghui et al. | 2023
1: NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit
Yamamoto, Ryuichi / Yoneyama, Reo / Toda, Tomoki et al. | 2023
1: Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality
Marinoni, Christian / Gramaccioni, Riccardo F. / Chen, Changan / Uncini, Aurelio / Comminiello, Danilo et al. | 2023
1: Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG)
Zhang, Qinglin / Deng, Chong / Liu, Jiaqing / Yu, Hai / Chen, Qian / Wang, Wen / Yan, Zhijie / Liu, Jinglin / Ren, Yi / Zhao, Zhou et al. | 2023
1: Multilingual Alzheimer’s Dementia Recognition through Spontaneous Speech: A Signal Processing Grand Challenge
Luz, Saturnino / Haider, Fasih / Fromm, Davida / Lazarou, Ioulietta / Kompatsiaris, Ioannis / MacWhinney, Brian et al. | 2023
1: Divcon: Learning Concept Sequences for Semantically Diverse Image Captioning
Zheng, Yue / Li, Ya-Li / Wang, Shengjin et al. | 2023
1: Exploiting Virtual Array Diversity for Accurate Radar Detection
Guan, Junfeng / Madani, Sohrab / Ahmed, Waleed / Hussein, Samah / Gupta, Saurabh / Hassanieh, Haitham et al. | 2023
1: Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks
Chen, Yiyue / Hashemi, Abolfazl / Vikalo, Haris et al. | 2023
1: SAN: A Robust End-to-End ASR Model Architecture
Min, Zeping / Ge, Qian / Huang, Guanhua et al. | 2023
1: Resource Allocation for UAV-Enabled Integrated Sensing and Communication (ISAC) via Multi-Objective Optimization
Rezaei, Omid / Naghsh, Mohammad Mahdi / Karbasi, Seyed Mohammad / Nayebi, Mohammad Mahdi et al. | 2023
1: Removing Radio Frequency Interference From Auroral Kilometric Radiation With Stacked Autoencoders
Chang, Allen / Knapp, Mary / LaBelle, James / Swoboda, John / Volz, Ryan / Erickson, Philip J. et al. | 2023
1: Soft Label Coding for end-to-end Sound Source Localization with ad-hoc Microphone Arrays
Feng, Linfeng / Gong, Yijun / Zhang, Xiao-Lei et al. | 2023
1: Study And Design Of Robust Personal Sound Zones With Vast Using Low Rank Rirs
Bhattacharjee, Sankha Subhra / Shi, Liming / Ping, Guoli / Shen, Xiaoxiang / Christensen, Mads Grasboll et al. | 2023
1: ROI-Based Deep Image Compression with Swin Transformers
Li, Binglin / Liang, Jie / Fu, Haisheng / Han, Jingning et al. | 2023
1: Event-Based Visual Microphone
Howard, Matthew / Hirakawa, Keigo et al. | 2023
1: Named Entity Detection and Injection for Direct Speech Translation
Gaido, Marco / Tang, Yun / Kulikov, Ilia / Huang, Rongqing / Gong, Hongyu / Inaguma, Hirofumi et al. | 2023
1: Efficient Stuttering Event Detection Using Siamese Networks
Mohapatra, Payal / Islam, Bashima / Islam, Md Tamzeed / Jiao, Ruochen / Zhu, Qi et al. | 2023
1: BadRes: Reveal the Backdoors Through Residual Connection
He, Mingrui / Chen, Tianyu / Zhou, Haoyi / Zhang, Shanghang / Li, Jianxin et al. | 2023
1: End-to-End Unsupervised Sketch to Image Generation
Lv, Xingming / Wu, Lei / Cheng, Zhenwei / Meng, Xiangxu et al. | 2023
1: Trinet: Stabilizing Self-Supervised Learning From Complete or Slow Collapse
Cao, Lixin / Wang, Jun / Yang, Ben / Su, Dan / Yu, Dong et al. | 2023
1: ERBNet: An Effective Representation Based Network for Unbiased Scene Graph Generation
Ma, Wenxi / Hou, Tianxiang / Di, Qianji / Qi, Zhongang / Shan, Ying / Wang, Hanzi et al. | 2023
1: Deformable Cross Attention for Learning Optical Flow
Abdein, Rokia / Xiang, Xuezhi / Lv, Ning / Saddik, Abdulmotaleb El et al. | 2023
1: Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection
Ma, Haozhao / Yang, Chuang / Yuan, Yuan / Wang, Qi et al. | 2023
1: SVMV: Spatiotemporal Variance-Supervised Motion Volume for Video Frame Interpolation
Luo, Yao / Pan, Jinshan / Tang, Jinhui et al. | 2023
1: Cumulative Attention Based Streaming Transformer ASR with Internal Language Model Joint Training and Rescoring
Li, Mohan / Do, Cong-Thanh / Doddipatla, Rama et al. | 2023
1: Two-Stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge
Liu, Mingshuai / Lv, Shubo / Zhang, Zihan / Han, Runduo / Hao, Xiang / Xia, Xianjun / Chen, Li / Xiao, Yijian / Xie, Lei et al. | 2023
1: The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition
Wang, Zhe / Wu, Shilong / Chen, Hang / He, Mao-Kui / Du, Jun / Lee, Chin-Hui / Chen, Jingdong / Watanabe, Shinji / Siniscalchi, Sabato / Scharenborg, Odette et al. | 2023
1: Implicit Vehicle Positioning with Cooperative Lidar Sensing
Barbieri, Luca / Tedeschini, Bernardo Camajori / Brambilla, Mattia / Nicoli, Monica et al. | 2023
1: Self-Supervised Guided Hypergraph Feature Propagation for Semi-Supervised Classification with Missing Node Features
Lei, Chengxiang / Fu, Sichao / Wang, Yuetian / Qiu, Wenhao / Hu, Yachen / Peng, Qinmu / You, Xinge et al. | 2023
1: Differential Analysis for Networks Obeying Conservation Laws
Rayas, Anirudh / Anguluri, Rajasekhar / Cheng, Jiajun / Dasarathy, Gautam et al. | 2023
1: Hardware-Limited Non-Uniform Task-Based Quantizers
Bernardo, Neil Irwin / Zhu, Jingge / Eldar, Yonina C. / Evans, Jamie et al. | 2023
1: Adaptive Noise Canceller Algorithm with SNR-Based Stepsize and Data-Dependent Averaging
Sugiyama, Akihiko et al. | 2023
1: Signal Processing And Quantum State Tomography on Noisy Devices
Shi, Wenbo / Malaney, Robert et al. | 2023
1: In-Sensor & Neuromorphic Computing Are all You Need for Energy Efficient Computer Vision
Datta, Gourav / Liu, Zeyu / Kaiser, Md Abdullah-Al / Kundu, Souvik / Mathai, Joe / Yin, Zihan / Jacob, Ajey P. / Jaiswal, Akhilesh R. / Beerel, Peter A. et al. | 2023
1: Adversarial Contrastive Distillation with Adaptive Denoising
Wang, Yuzheng / Chen, Zhaoyu / Yang, Dingkang / Liu, Yang / Liu, Siao / Zhang, Wenqiang / Qi, Lizhe et al. | 2023
1: On Designing Light-Weight Object Trackers Through Network Pruning: Use CNNS or Transformers?
Aggarwal, Saksham / Gupta, Taneesh / Sahu, Pawan K. / Chavan, Arnav / Tiwari, Rishabh / Prasad, Dilip K. / Gupta, Deepak K. et al. | 2023
1: Variational Inference Aided Estimation of Time Varying Channels
Bock, Benedikt / Baur, Michael / Rizzello, Valentina / Utschick, Wolfgang et al. | 2023
1: Class-Incremental Learning on Multivariate Time Series Via Shape-Aligned Temporal Distillation
Qiao, Zhongzheng / Hu, Minghui / Jiang, Xudong / Suganthan, Ponnuthurai Nagaratnam / Savitha, Ramasamy et al. | 2023
1: Inv-Senet: Invariant Self Expression Network for Clustering Under Biased Data
Singh, Ashutosh / Singh, Ashish / Masoomi, Aria / Imbiriba, Tales / Learned-Miller, Erik / Erdogmus, Deniz et al. | 2023
1: Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding
Sunder, Vishal / Thomas, Samuel / Kuo, Hong-Kwang J. / Kingsbury, Brian / Fosler-Lussier, Eric et al. | 2023
1: Training Neural Networks for Sequential Change-Point Detection
Lee, Junghwan / Xie, Yao / Cheng, Xiuyuan et al. | 2023
1: High-Resolution Neural Network Processing of LFM Radar Pulses
Akhtar, Jabran et al. | 2023
1: MLCGAN: Multi-Lead ECG Synthesis with Multi Label Conditional Generative Adversarial Network
Wu, Jian / Wang, Liping / Pan, Hailin / Wang, Binyu et al. | 2023
1: NRTSI: Non-Recurrent Time Series Imputation
Shan, Siyuan / Li, Yang / Oliva, Junier B. et al. | 2023
1: The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR
Sanabria, Ramon / Bogoychev, Nikolay / Markl, Nina / Carmantini, Andrea / Klejch, Ondrej / Bell, Peter et al. | 2023
1: Centralized Cascade Multi-Channel Noise Reduction and Acoustic Feedback Cancellation in a Wireless Acoustic Sensor And Actuator Network
Ruiz, Santiago / van Waterschoot, Toon / Moonen, Marc et al. | 2023
1: Intent Does Matter! Propagating High-Order Relations for Exploring Interest Preferences
Zheng, Xiangping / Liang, Xun / Wu, Bo / Feng, Junlan / Guo, Yuhui / Zhang, Sensen et al. | 2023
1: Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach
Wu, Shih-Lun / Yang, Yi-Hsuan et al. | 2023
1: Input-Dependent Dynamical Channel Association For Knowledge Distillation
Tang, Qiankun / Zhang, Yuan / Xu, Xiaogang / Wang, Jun / Guo, Yimin et al. | 2023
1: Robust Adaptive Beamforming with Proximal Method
Li, Ruifu / Cabric, Danijela et al. | 2023
1: Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio
Zhang, Yang / Puvvada, Krishna C. / Lavrukhin, Vitaly / Ginsburg, Boris et al. | 2023
1: An Isotropy Analysis for Self-Supervised Acoustic Unit Embeddings on the Zero Resource Speech Challenge 2021 Framework
Chen, Jianan / Sakti, Sakriani et al. | 2023
1: Bimodal Fusion Network for Basic Taste Sensation Recognition from Electroencephalography and Electromyography
Gao, Han / Zhao, Shuo / Li, Huiyan / Liu, Li / Wang, You / Hu, Ruifen / Zhang, Jin / Li, Guang et al. | 2023
1: Papez: Resource-Efficient Speech Separation with Auditory Working Memory
Oh, Hyunseok / Yi, Juheon / Lee, Youngki et al. | 2023
1: Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks
Villatoro-Tello, Esau / Madikeri, Srikanth / Zuluaga-Gomez, Juan / Sharma, Bidisha / Saeed Sarfjoo, Seyyed / Nigmatulina, Iuliia / Motlicek, Petr / Ivanov, Alexei V. / Ganapathiraju, Aravind et al. | 2023
1: Search for Efficient Deep Visual-Inertial Odometry Through Neural Architecture Search
Chen, Yu / Yang, Mingyu / Kim, Hun-Seok et al. | 2023
1: Prune Then Distill: Dataset Distillation with Importance Sampling
Sundar, Anirudh S / Keskin, Gokce / Chandak, Chander / Chen, I-Fan / Ghahremani, Pegah / Ghosh, Shalini et al. | 2023
1: CF-VTON: Multi-Pose Virtual Try-on with Cross-Domain Fusion
Du, Chenghu / Xiong, Shengwu et al. | 2023
1: LQGNET: Hybrid Model-Based and Data-Driven Linear Quadratic Stochastic Control
Casspi, Solomon Goldgraber / Husser, Oliver / Revach, Guy / Shlezinger, Nir et al. | 2023
1: Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-Trained Representations
Shen, Siyuan / Liu, Feng / Zhou, Aimin et al. | 2023
1: GTN-Bailando: Genre Consistent long-Term 3D Dance Generation Based on Pre-Trained Genre Token Network
Zhuang, Haolin / Lei, Shun / Xiao, Long / Li, Weiqin / Chen, Liyang / Yang, Sicheng / Wu, Zhiyong / Kang, Shiyin / Meng, Helen et al. | 2023
1: Streaming Multi-Channel Speech Separation with Online Time-Domain Generalized Wiener Filter
Luo, Yi et al. | 2023
1: String-Based Molecule Generation Via Multi-Decoder VAE
Kwon, Kisoo / Jeong, Kuhwan / Park, Junghyun / Na, Hwidong / Shin, Jinwoo et al. | 2023
1: Robust Spatiotemporal Fusion of Satellite Images via Convex Optimization
Isono, Ryosuke / Naganuma, Kazuki / Ono, Shunsuke et al. | 2023
1: A Sidecar Separator Can Convert A Single-Talker Speech Recognition System to A Multi-Talker One
Meng, Lingwei / Kang, Jiawen / Cui, Mingyu / Wang, Yuejiao / Wu, Xixin / Meng, Helen et al. | 2023
1: N2MVSNet: Non-Local Neighbors Aware Multi-View Stereo Network
Zhang, Zhe / Gao, Huachen / Hu, Yuxi / Wang, Ronggang et al. | 2023
1: Windowed Fourier Analysis for Signal Processing on Graph Bundles
Roddenberry, T. Mitchell / Segarra, Santiago et al. | 2023
1: Diffusion-Based Generative Speech Source Separation
Scheibler, Robin / Ji, Youna / Chung, Soo-Whan / Byun, Jaeuk / Choe, Soyeon / Choi, Min-Seok et al. | 2023
1: Shuffled Autoregression for Motion Interpolation
Huang, Shuo / Jia, Jia / Yang, Zongxin / Wang, Wei / Wu, Haozhe / Yang, Yi / Xing, Junliang et al. | 2023
1: Joint Estimation of DOA and Distance in Noisy Reverberant Conditions
Bu, Suliang / Zhao, Tuo / Zhao, Yunxin et al. | 2023
1: Change Point Detection with Neural Online Density-Ratio Estimator
Wang, Xiuheng / Borsoi, Ricardo Augusto / Richard, Cedric / Chen, Jie et al. | 2023
1: Towards Low-Power Heart Rate Estimation Based on User’s Demographics and Activity Level For Wearables
Pacheco, Andre G. C. / Cabello, Frank A. C. / Fonoff, Adriana M. O. / Rodrigues, Paula G. / Penatti, Otavio A. B. / Pinto, Paula R. et al. | 2023
1: ifUNet++: Iterative Feedback UNet++ for Infrared Small Target Detection
Weng, Zhangying / Li, Peng / Zhuang, Xin / Yan, Xuefeng / Gong, Lina / Xie, Haoran / Wei, Mingqiang et al. | 2023
1: Vararray Meets T-Sot: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition
Kanda, Naoyuki / Wu, Jian / Wang, Xiaofei / Chen, Zhuo / Li, Jinyu / Yoshioka, Takuya et al. | 2023
1: Binary Image Fast Perfect Recovery from Sparse 2D-DFT Coefficients
Pei, Soo-Chang / Chang, Kuo-Wei et al. | 2023
1: Time-Aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering
Liu, Yonghao / Liang, Di / Fang, Fang / Wang, Sirui / Wu, Wei / Jiang, Rui et al. | 2023
1: Exploiting Interactivity and Heterogeneity for Sleep Stage Classification Via Heterogeneous Graph Neural Network
Jia, Ziyu / Lin, Youfang / Zhou, Yuhan / Cai, Xiyang / Zheng, Peng / Li, Qiang / Wang, Jing et al. | 2023
1: When is Mimo Massive in Radar?
Shah, Jaimin / Cardone, Martina / Dytso, Alex / Rush, Cynthia et al. | 2023
1: Detecting Malicious Migration on Edge to Prevent Running Data Leakage
Wong, Yuchen / Shen, Qingni / Li, Cong / Liu, Cunzhan / Ai, Tianxiang et al. | 2023
1: PI-Trans: Parallel-Convmlp and Implicit-Transformation Based Gan for Cross-View Image Translation
Ren, Bin / Tang, Hao / Wang, Yiming / Li, Xia / Wang, Wei / Sebe, Nicu et al. | 2023
1: Interpolation of Spatial Room Impulse Responses Using Partial Optimal Transport
Geldert, Aaron / Meyer-Kahlen, Nils / Schlecht, Sebastian J. et al. | 2023
1: Knowledge-Augmented Frame Semantic Parsing with Hybrid Prompt-Tuning
Zhang, Rui / Sun, Yajing / Yang, Jingyuan / Peng, Wei et al. | 2023
1: HappyQuokka System for ICASSP 2023 Auditory EEG Challenge
Piao, Zhenyu / Kim, Miseul / Yoon, Hyungchan / Kang, Hong-Goo et al. | 2023
1: Deep Unfolded Tensor Robust PCA With Self-Supervised Learning
Dong, Harry / Shah, Megna / Donegan, Sean / Chi, Yuejie et al. | 2023
1: Continual Learning for On-Device Speech Recognition Using Disentangled Conformers
Diwan, Anuj / Yeh, Ching-Feng / Hsu, Wei-Ning / Tomasello, Paden / Choi, Eunsol / Harwath, David / Mohamed, Abdelrahman et al. | 2023
1: Robust Online Multiband Drift Estimation in Electrophysiology Data
Windolf, Charlie / Paulk, Angelique C. / Kfir, Yoav / Trautmann, Eric / Meszena, Domokos / Munoz, William / Caprara, Irene / Jamali, Mohsen / Boussard, Julien / Williams, Ziv M. et al. | 2023
1: Progressive Refinement Learning Based on Feature Cross Perception for Residential Areas Semantic Segmentation
Lyu, Xinran / Zhang, Libao et al. | 2023
1: Improving Adversarial Robustness with Hypersphere Embedding and Angular-Based Regularizations
Fakorede, Olukorede / Nirala, Ashutosh / Atsague, Modeste / Tian, Jin et al. | 2023
1: Graph Contrastive Learning with Learnable Graph Augmentation
Pu, Xinyan / Zhang, Ke / Shu, Huazhong / Coatrieux, Jean Louis / Kong, Youyong et al. | 2023
1: To Regularize or Not to Regularize: The Role of Positivity in Sparse Array Interpolation with a Single Snapshot
Hucumenoglu, Mehmet Can / Sarangi, Pulak / Rajamaki, Robin / Pal, Piya et al. | 2023
1: TeAw: Text-Aware Few-Shot Remote Sensing Image Scene Classification
Cheng, Kaihui / Yang, Chule / Fan, Zunlin / Wu, Dayan / Guan, Naiyang et al. | 2023
1: RIS Reflection and Placement Optimisation for Underlay D2D Communications in Cognitive Cellular Networks
Ghose, Sarbani / Mishra, Deepak / Maity, Santi P. / Alexandropoulos, George C. et al. | 2023
1: Not All Classes are Equal: Adaptively Focus-Aware Confidence for Semi-Supervised Object Detection
Zhu, Hui / Lu, Yongchun / Zhao, Hongyu / Zhao, Guoqing / Zhao, Xiaofang et al. | 2023
1: Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition
Jin, Zengrui / Xie, Xurong / Geng, Mengzhe / Wang, Tianzi / Hu, Shujie / Deng, Jiajun / Li, Guinan / Liu, Xunying et al. | 2023
1: Multi-Blank Transducers for Speech Recognition
Xu, Hainan / Jia, Fei / Majumdar, Somshubra / Watanabe, Shinji / Ginsburg, Boris et al. | 2023
1: End-to-End Word-Level Disfluency Detection and Classification in Children’s Reading Assessment
Venkatasubramaniam, Lavanya / Sunder, Vishal / Fosler-Lussier, Eric et al. | 2023
1: Speech Emotion Recognition via Heterogeneous Feature Learning
Liu, Ke / Wu, DongYa / Wang, Dekui / Feng, Jun et al. | 2023
1: A Study on Bias and Fairness in Deep Speaker Recognition
Hajavi, Amirhossein / Etemad, Ali et al. | 2023
1: Retinal Biomarkers for Detecting Diabetic Retinopathy Using Smartphone-Based Deep Learning Frameworks
Karakaya, Mahmut / Aygun, Ramazan S. et al. | 2023
1: Hierarchical Interactive Reconstruction Network for Video Compressive Sensing
Zhang, Tong / Cui, Wenxue / Hui, Chen / Jiang, Feng et al. | 2023
1: A Unified Uncertainty-Aware Exploration: Combining Epistemic and Aleatory Uncertainty
Malekzadeh, Parvin / Hou, Ming / Plataniotis, Konstantinos N. et al. | 2023
1: FedSD: A New Federated Learning Structure Used in Non-iid Data
Yi, Minmin / Ning, Houchun / Liu, Peng et al. | 2023
1: Towards Dialogue Modeling Beyond Text
Wu, Tongzi / Zhou, Yuhao / Ling, Wang / Yang, Hojin / Veloso, Joana / Sun, Lin / Huang, Ruixin / Guimaraes, Norberto / Sanner, Scott et al. | 2023
1: DPP-Based Client Selection for Federated Learning with NON-IID DATA
Zhang, Yuxuan / Xu, Chao / Yang, Howard H. / Wang, Xijun / Quek, Tony Q. S. et al. | 2023
1: Learning Robust Self-Attention Features for Speech Emotion Recognition with Label-Adaptive Mixup
Kang, Lei / Zhang, Lichao / Jiang, Dazhi et al. | 2023
1: Adaptive Eccm for Mitigating Smart Jammers
Jain, Shashwat / Pattanayak, Kunal / Krishnamurthy, Vikram / Berry, Christopher et al. | 2023
1: IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation
Chen, Junhao / Liu, Sheng / Chen, Ruixiang / Guo, Bingnan / Zhang, Feng et al. | 2023
1: Exploring the Role of Fricatives in Classifying Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis and Parkinson’s Disease
Bhattacharjee, Tanuka / Belur, Yamini / Nalini, Atchayaram / Yadav, Ravi / Ghosh, Prasanta Kumar et al. | 2023
1: Stay In The Middle: A Semi-Supervised Model for CT Metal Artifact Reduction
Wang, Tao / Yu, Hui / Lu, Zexin / Zhang, Zhongzhou / Zhou, Jiliu / Zhang, Yi et al. | 2023
1: Neural Fourier Shift for Binaural Speech Rendering
Woo Lee, Jin / Lee, Kyogu et al. | 2023
digital version
1: Semi-Supervised Contrastive Learning with Soft Mask Attention for Facial Action Unit Detection
Liu, Zhongling / Liu, Rujie / Shi, Ziqiang / Liu, Liu / Mi, Xiaoyu / Murase, Kentaro et al. | 2023
digital version
1: Recursive Estimation of User Intent From Noninvasive Electroencephalography Using Discriminative Models
Smedemark-Margulies, Niklas / Celik, Basak / Imbiriba, Tales / Kocanaogullari, Aziz / Erdogmus, Deniz et al. | 2023
digital version
1: Diabetic Retinopathy Grading with Weakly-Supervised Lesion Priors
Hou, Junlin / Xiao, Fan / Xu, Jilan / Feng, Rui / Zhang, Yuejie / Zou, Haidong / Lu, Lina / Xue, Wenwen et al. | 2023
digital version
1: Prompt-Distiller: Few-Shot Knowledge Distillation for Prompt-Based Language Learners with Dual Contrastive Learning
Hou, Boyu / Wang, Chengyu / Chen, Xiaoqing / Qiu, Minghui / Feng, Liang / Huang, Jun et al. | 2023
digital version
1: Contextually-Rich Human Affect Perception Using Multimodal Scene Information
Bose, Digbalay / Hebbar, Rajat / Somandepalli, Krishna / Narayanan, Shrikanth et al. | 2023
digital version
1: Stabilising and Accelerating Light Gated Recurrent Units for Automatic Speech Recognition
Moumen, Adel / Parcollet, Titouan et al. | 2023
digital version
1: Sampling Order-Limited Signals on the Sphere
Khan, Muhammad Salaar Arif / Nadeem, Salman / Khalid, Zubair et al. | 2023
digital version
1: Sequence-Based Device-Free Gesture Recognition Framework for Multi-Channel Acoustic Signals
Yang, Zhizheng / Wang, Xun / Xia, Dongyu / Wang, Wei / Dai, Haipeng et al. | 2023
digital version
1: Using Adapters to Overcome Catastrophic Forgetting in End-to-End Automatic Speech Recognition
Eeckt, Steven Vander / Van Hamme, Hugo et al. | 2023
digital version
1: Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?
Shi, Xuan / Cooper, Erica / Wang, Xin / Yamagishi, Junichi / Narayanan, Shrikanth et al. | 2023
digital version
1: MGAT: Multi-Granularity Attention Based Transformers for Multi-Modal Emotion Recognition
Fan, Weiquan / Xing, Xiaofen / Cai, Bolun / Xu, Xiangmin et al. | 2023
digital version
1: HPFTN: Hierarchical Progressive Fusion Transformer Network for Video Denoising
Zhang, Shuaitao / Zhang, Yuan / Zhao, Zheng / Xie, Di / Pu, Shiliang et al. | 2023
digital version
1: Soft 2D-to-3D Delivery Using Deep Graph Neural Networks for Holographic-Type Communication
Fujihashi, Takuya / Koike-Akino, Toshiaki / Watanabe, Takashi et al. | 2023
digital version
1: CLAP Learning Audio Concepts from Natural Language Supervision
Elizalde, Benjamin / Deshmukh, Soham / Ismail, Mahmoud Al / Wang, Huaming et al. | 2023
digital version
1: Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond
Krause, Michael / Weis, Christof / Muller, Meinard et al. | 2023
digital version
1: SPECTRANET-SO(3): Learning Satellite Orientation from Optical Spectra by Implicitly Modeling Mutually Exclusive Probability Distributions on The Rotation Manifold
Phelps, Matthew / Swindle, Thomas / Gazak, J. Zachary / Vandenberg, Andrew / Fletcher, Justin et al. | 2023
digital version
1: Channel Estimation in Massive MIMO with Heavy-Tailed Noise: Gaussian-Mixture Versus Cauchy Models
Gulgun, Ziya / Larsson, Erik G. et al. | 2023
digital version
1: Speech Intelligibility Classifiers from 550k Disordered Speech Samples
Venugopalan, Subhashini / Tobin, Jimmy / Yang, Samuel J. / Seaver, Katie / Cave, Richard J.N. / Jiang, Pan-Pan / Zeghidour, Neil / Heywood, Rus / Green, Jordan / Brenner, Michael P. et al. | 2023
digital version
1: Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss
Zhao, Zhiyuan / Wu, Lijun / Tang, Chuanxin / Yin, Dacheng / Zhao, Yucheng / Luo, Chong et al. | 2023
digital version
1: Modular Conformer Training for Flexible End-to-End ASR
Audhkhasi, Kartik / Farris, Brian / Ramabhadran, Bhuvana / Moreno, Pedro J. et al. | 2023
digital version
1: Untargeted Backdoor Attack Against Object Detection
Luo, Chengxiao / Li, Yiming / Jiang, Yong / Xia, Shu-Tao et al. | 2023
digital version
1: Cross-Modality Depth Estimation via Unsupervised Stereo RGB-to-Infrared Translation
Tang, Shi / Ye, Xinchen / Xue, Fei / Xu, Rui et al. | 2023
digital version
1: A Dynamic Cross-Scale Transformer with Dual-Compound Representation for 3D Medical Image Segmentation
Zhang, Ruixia / Wang, Zhiqiong / Wang, Zhongyang / Xin, Junchang et al. | 2023
digital version
1: Generic Dependency Modeling for Multi-Party Conversation
Shen, Weizhou / Quan, Xiaojun / Yang, Ke et al. | 2023
digital version
1: WL-MSR: Watch and Listen for Multimodal Subtitle Recognition
Liu, Jiawei / Wang, Hao / Wang, Weining / He, Xingjian / Liu, Jing et al. | 2023
digital version
1: Residual Hybrid Attention Network for Compression Artifact Reduction
Luo, Bingchun / Yu, Wei et al. | 2023
digital version
1: Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition
Sahai, Saumya Y. / Liu, Jing / Muniyappa, Thejaswi / Sathyendra, Kanthashree M. / Alexandridis, Anastasios / Strimel, Grant P. / McGowan, Ross / Rastrow, Ariya / Chang, Feng-Ju / Mouchtaris, Athanasios et al. | 2023
digital version
1: Look and Think: Intrinsic Unification of Self-Attention and Convolution for Spatial-Channel Specificity
Gao, Xiang / Lin, Honghui / Li, Yu / Fang, Ruiyan / Zhang, Xin et al. | 2023
digital version
1: Higher-Order Link Prediction Via Learnable Maximum Mean Discrepancy
Karanikolas, Georgios V. / Pages-Zamora, Alba / Giannakis, Georgios B. et al. | 2023
digital version
1: EI²SR: Learning an Enhanced Intra-Instance Semantic Relationship for Arbitrary-Shaped Scene Text Detection
Shu, Yan / Liu, Shaohui / Zhou, Yu / Xu, Honglei / Jiang, Feng et al. | 2023
digital version
1: Towards Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments
Neri, Julian / Braun, Sebastian et al. | 2023
digital version
1: Comparative Layer-Wise Analysis of Self-Supervised Speech Models
Pasad, Ankita / Shi, Bowen / Livescu, Karen et al. | 2023
digital version
1: Maximum Likelihood Distillation for Robust Modulation Classification
Maroto, Javier / Bovet, Gerome / Frossard, Pascal et al. | 2023
digital version
1: Stochastic Optimization of Vector Quantization Methods in Application to Speech and Image Processing
Vali, Mohammad Hassan / Backstrom, Tom et al. | 2023
digital version
1: Deep Fusion of Multi-Object Densities Using Transformer
Li, Lechi / Dai, Chen / Xia, Yuxuan / Svensson, Lennart et al. | 2023
digital version
1: Core: Transferable Long-Range Time Series Forecasting Enhanced by Covariates-Guided Representation
Li, Xin-Yi / Zhong, Pei-Nan / Chen, Di / Yang, Yu-Bin et al. | 2023
digital version
1: Toward Privacy-Enhancing Ambulatory-Based Well-Being Monitoring: Investigating User Re-Identification Risk in Multimodal Data
Pranjal, Ravi / Seshadri, Ranjana / Kumar Sanath Kumar Kadaba, Rakesh / Feng, Tiantian / Narayanan, Shrikanth S. / Chaspari, Theodora et al. | 2023
digital version
1: Mutually Guided Few-Shot Learning For Relational Triple Extraction
Yang, Chengmei / Jiang, Shuai / He, Bowei / Ma, Chen / He, Lianghua et al. | 2023
digital version
1: Guide and Select: A Transformer-Based Multimodal Fusion Method for Points of Interest Description Generation
Liu, Hanqing / Wang, Wei / Hu, Niu / Zheng, Hai-Tao / Xie, Rui / Wu, Wei / Bai, Yang et al. | 2023
digital version
1: Interpretation of Neural Networks is Susceptible to Universal Adversarial Perturbations
Oskouie, Haniyeh Ehsani / Farnia, Farzan et al. | 2023
digital version
1: High-Resolution Embedding Extractor for Speaker Diarisation
Heo, Hee-Soo / Kwon, Youngki / Lee, Bong-Jin / Kim, You Jin / Jung, Jee-Weon et al. | 2023
digital version
1: Prosody-Controllable Spontaneous TTS with Neural HMMs
Lameris, Harm / Mehta, Shivam / Henter, Gustav Eje / Gustafson, Joakim / Szekely, Eva et al. | 2023
digital version
1: Faster Than Fast: Accelerating the Griffin-Lim Algorithm
Nenov, Rossen / Nguyen, Dang-Khoa / Balazs, Peter et al. | 2023
digital version
1: Scalable and Secure Federated XGBoost
Nguyen, Quang Minh / Khanh Le, Nhan / Nguyen, Lam M. et al. | 2023
digital version
1: A Generalized Subspace Distribution Adaptation Framework for Cross-Corpus Speech Emotion Recognition
Li, Shaokai / Song, Peng / Ji, Liang / Jin, Yun / Zheng, Wenming et al. | 2023
digital version
1: ClassA Entropy for the Analysis of Structural Complexity of Physiological Signals
Xiao, Hongjian / Li, Ling / Mandic, Danilo P. et al. | 2023
digital version
1: Improving Disfluency Detection with Multi-Scale Self Attention and Contrastive Learning
Wang, Peiying / Duan, Chaoqun / Chen, Meng / He, Xiaodong et al. | 2023
digital version
1: Time-Resolved fMRI Shared Response Model Using Gaussian Process Factor Analysis
Ebrahimi, MohammadReza / Calarco, Navona / Hawco, Colin / Voineskos, Aristotle / Khisti, Ashish et al. | 2023
digital version
1: Dynamic TF-TDNN: Dynamic Time Delay Neural Network Based on Temporal-Frequency Attention for Dialect Recognition
Liao, Chao / Huang, Jinwen / Yuan, Huan / Yao, Peng / Tan, Jianchao / Zhang, Dawei / Deng, Feng / Wang, Xiaorui / Song, Chengru et al. | 2023
digital version
1: Contrastive Learning of Functionality-Aware Code Embeddings
Li, Yiyang / Wu, Hongqiu / Zhao, Hai et al. | 2023
digital version
1: Ultrasound Image Quality Control Using Speech-Assisted Switchable CycleGAN
Huh, Jaeyoung / Khan, Shujaat / Sun Lee, Eun / Chul Ye, Jong et al. | 2023
digital version
1: Super Dilated Nested Arrays with Ideal Critical Weights and Increased Degrees of Freedom
Shaalan, Ahmed M. A. / Du, Jun et al. | 2023
digital version
1: Transient Dictionary Learning for Compressed Time-of-Flight Imaging
Conde, Miguel Heredia et al. | 2023
digital version
1: Does Your Model Think Like an Engineer? Explainable AI for Bearing Fault Detection with Deep Learning
Decker, Thomas / Lebacher, Michael / Tresp, Volker et al. | 2023
digital version
1: FAPM: Fast Adaptive Patch Memory for Real-Time Industrial Anomaly Detection
Kim, Donghyeong / Park, Chaewon / Cho, Suhwan / Lee, Sangyoun et al. | 2023
digital version
1: A Distributed Adaptive Algorithm for Non-Smooth Spatial Filtering Problems
Hovine, Charles / Bertrand, Alexander et al. | 2023
digital version
1: Graph Learning from Gaussian and Stationary Graph Signals
Buciulea, Andrei / Marques, Antonio G. et al. | 2023
digital version
1: Spatio-Temporal Attention in Multi-Granular Brain Chronnectomes For Detection of Autism Spectrum Disorder
Orme-Rogers, James / Srivastava, Ajitesh et al. | 2023
digital version
1: Priv-Aug-Shap-ECGResNet: Privacy Preserving Shapley-Value Attributed Augmented Resnet for Practical Single-Lead Electrocardiogram Classification
Ukil, Arijit / Marin, Leandro / Jara, Antonio J. et al. | 2023
digital version
1: Efficient Online Convolutional Dictionary Learning Using Approximate Sparse Components
Veshki, Farshad G. / Vorobyov, Sergiy A. et al. | 2023
digital version
1: Low-Latency Electrolaryngeal Speech Enhancement Based on Fastspeech2-Based Voice Conversion and Self-Supervised Speech Representation
Kobayashi, Kazuhiro / Hayashi, Tomoki / Toda, Tomoki et al. | 2023
digital version
1: Zero-Shot Personalized Lip-To-Speech Synthesis with Face Image Based Voice Control
Sheng, Zheng-Yan / Ai, Yang / Ling, Zhen-Hua et al. | 2023
digital version
1: mmWave Wi-Fi Trajectory Estimation with Continuous-Time Neural Dynamic Learning
Vaca-Rubio, Cristian J. / Wang, Pu / Koike-Akino, Toshiaki / Wang, Ye / Boufounos, Petros / Popovski, Petar et al. | 2023
digital version
1: Efficient Intelligibility Evaluation Using Keyword Spotting: A Study on Audio-Visual Speech Enhancement
Valentini-Botinhao, Cassia / Aldana Blanco, Andrea Lorena / Klejch, Ondrej / Bell, Peter et al. | 2023
digital version
1: D-3DLD: Depth-Aware Voxel Space Mapping for Monocular 3D Lane Detection with Uncertainty
Kim, Nayeon / Byeon, Moonsub / Ji, Daehyun / Oh, Dokwan et al. | 2023
digital version
1: Finer-Grained Decomposition for Parallel Quantum MIMO Processing
Kim, Minsung / Jamieson, Kyle et al. | 2023
digital version
1: Deep Root MUSIC Algorithm for Data-Driven DoA Estimation
Shmuel, Dor H. / Merkofer, Julian P. / Revach, Guy / van Sloun, Ruud J. G. / Shlezinger, Nir et al. | 2023
digital version
1: Police: Provably Optimal Linear Constraint Enforcement For Deep Neural Networks
Balestriero, Randall / LeCun, Yann et al. | 2023
digital version
1: A Novel Metric For Evaluating Audio Caption Similarity
Bhosale, Swapnil / Chakraborty, Rupayan / Kopparapu, Sunil Kumar et al. | 2023
digital version
1: Generalized Two-Stage Particle Filter for High Dimensions
Iloska, Marija / Bugallo, Monica F. et al. | 2023
digital version
1: Mitigating Unintended Memorization in Language Models Via Alternating Teaching
Liu, Zhe / Zhang, Xuedong / Peng, Fuchun et al. | 2023
digital version
1: Adaptive Multi-Corpora Language Model Training for Speech Recognition
Ma, Yingyi / Liu, Zhe / Zhang, Xuedong et al. | 2023
digital version
1: Domain Adaptation without Catastrophic Forgetting on a Small-Scale Partially-Labeled Corpus for Speech Emotion Recognition
Zhu, Zhi / Sato, Yoshinao et al. | 2023
digital version
1: SingNet: A Real-Time Singing Voice Beat and Downbeat Tracking System
Heydari, Mojtaba / Wang, Ju-Chiang / Duan, Zhiyao et al. | 2023
digital version
1: PCQA-Graphpoint: Efficient Deep-Based Graph Metric for Point Cloud Quality Assessment
Tliba, Marouane / Chetouani, Aladine / Valenzise, Giuseppe / Dufaux, Frederic et al. | 2023
digital version
1: Adaptive Step-Size Methods for Compressed SGD
Subramaniam, Adarsh M. / Magesh, Akshayaa / Veeravalli, Venugopal V. et al. | 2023
digital version
1: Leveraging Multiple Sources in Automatic African American English Dialect Detection for Adults and Children
Johnson, Alexander / Shetty, Vishwas M. / Ostendorf, Mari / Alwan, Abeer et al. | 2023
digital version
1: Adaptive Simulated Annealing Through Alternating Rényi Divergence Minimization
Guilmeau, Thomas / Chouzenoux, Emilie / Elvira, Victor et al. | 2023
digital version
1: NAS-DYMC: NAS-Based Dynamic Multi-Scale Convolutional Neural Network for Sound Event Detection
Wang, Jun / Yao, Peng / Deng, Feng / Tan, Jianchao / Song, Chengru / Wang, Xiaorui et al. | 2023
digital version
1: Wespeaker: A Research and Production Oriented Speaker Embedding Learning Toolkit
Wang, Hongji / Liang, Chengdong / Wang, Shuai / Chen, Zhengyang / Zhang, Binbin / Xiang, Xu / Deng, Yanlei / Qian, Yanmin et al. | 2023
digital version
1: Privacy Preserving Face Recognition with Lensless Camera
Henry, Chris / Asif, M. Salman / Li, Zhu et al. | 2023
digital version
1: Exploiting CCTV Cameras for Hand Hygiene Recognition in ICU
Huang, Weijun / Huang, Jia / Wang, Guowei / Lu, Hongzhou / He, Min / Wang, Wenjin et al. | 2023
digital version
1: Learning Sparse Auto-Encoders for Green AI Image Coding
Gille, Cyprien / Guyard, Frederic / Antonini, Marc / Barlaud, Michel et al. | 2023
digital version
1: 3D Audio Signal Processing Systems for Speech Enhancement and Sound Localization and Detection
Bai, Jisheng / Huang, Siwei / Yin, Han / Jia, Yafei / Wang, Mou / Chen, Jianfeng et al. | 2023
digital version
1: Quantum Variational Bayes on Manifolds
Lopatnikova, Anna / Tran, Minh-Ngoc et al. | 2023
digital version
1: Exploring Complementary Features in Multi-Modal Speech Emotion Recognition
Wang, Suzhen / Ma, Yifeng / Ding, Yu et al. | 2023
digital version
1: Deep Spatio-Temporal Multiplex Graph Learning for Cardiac Imaging Classification
Banus, Jaume / Ogier, Augustin / Hullin, Roger / Meyer, Philippe / van Heeswijk, Ruud B. / Richiardi, Jonas et al. | 2023
digital version
1: Sign Language Recognition via Deformable 3D Convolutions and Modulated Graph Convolutional Networks
Papadimitriou, Katerina / Potamianos, Gerasimos et al. | 2023
digital version
1: Unsupervised Word Segmentation Based on Word Influence
Yan, Ruohao / Zhang, Huaping / Silamu, Wushour / Hamdulla, Askar et al. | 2023
digital version
1: TAPE: An End-to-End Timbre-Aware Pitch Estimator
Tamer, Nazif Can / Ozer, Yigitcan / Muller, Meinard / Serra, Xavier et al. | 2023
digital version
1: Text Classification In The Wild: A Large-Scale Long-Tailed Name Normalization Dataset
Qi, Jiexing / Li, Shuhao / Guo, Zhixin / Huang, Yusheng / Zhou, Chenghu / Zhang, Weinan / Wang, Xinbing / Lin, Zhouhan et al. | 2023
digital version
1: Designing and Evaluating Speech Emotion Recognition Systems: A Reality Check Case Study with IEMOCAP
Antoniou, Nikolaos / Katsamanis, Athanasios / Giannakopoulos, Theodoros / Narayanan, Shrikanth et al. | 2023
digital version
1: TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS-Challenge
Ju, Yukai / Chen, Jun / Zhang, Shimin / He, Shulin / Rao, Wei / Zhu, Weixin / Wang, Yannan / Yu, Tao / Shang, Shidong et al. | 2023
digital version
1: General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition
Tan, Chao / Cao, Yang / Li, Sheng / Yoshikawa, Masatoshi et al. | 2023
digital version
1: AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer
Li, Kang / Song, Yan / Dai, Li-Rong / McLoughlin, Ian / Fang, Xin / Liu, Lin et al. | 2023
digital version
1: Nested Attention Network with Graph Filtering for Visual Question and Answering
Lu, Jing / Wu, Chunlei / Wang, Leiquan / Yuan, Shaozu / Wu, Jie et al. | 2023
digital version
1: Defending Against Universal Patch Attacks by Restricting Token Attention in Vision Transformers
Yu, Hongwei / Chen, Jiansheng / Ma, Huimin / Yu, Cheng / Ding, Xinlong et al. | 2023
digital version
1: M²-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis
Xue, Jinlong / Deng, Yayue / Wang, Fengping / Li, Ya / Gao, Yingming / Tao, Jianhua / Sun, Jianqing / Liang, Jiaen et al. | 2023
digital version
1: Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages
Bhogale, Kaushal / Raman, Abhigyan / Javed, Tahir / Doddapaneni, Sumanth / Kunchukuttan, Anoop / Kumar, Pratyush / Khapra, Mitesh M. et al. | 2023
digital version
1: Effectiveness of Inter- and Intra-Subarray Spatial Features for Acoustic Scene Classification
Kawamura, Takao / Kinoshita, Yuma / Ono, Nobutaka / Scheibler, Robin et al. | 2023
digital version
1: Bayesian Network Modeling and Prediction of Transitions Within the Homelessness System
Rahman, Khandker Sadia / Zois, Daphney-Stavroula / Chelmis, Charalampos et al. | 2023
digital version
1: Adaptive Knowledge Distillation Between Text and Speech Pre-Trained Models
Ni, Jinjie / Ma, Yukun / Wang, Wen / Chen, Qian / Ng, Dianwen / Lei, Han / Nguyen, Trung Hieu / Zhang, Chong / Ma, Bin / Cambria, Erik et al. | 2023
digital version
1: Tell Model Where to Attend: Improving Interpretability of Aspect-Based Sentiment Classification via Small Explanation Annotations
Cheng, Zhenxiao / Zhou, Jie / Wu, Wen / Chen, Qin / He, Liang et al. | 2023
digital version
1: Comparative Study of IRS Assisted Opportunistic Communications Over i.i.d. and LoS Channels
Yashvanth, L. / Murthy, Chandra R. et al. | 2023
digital version
1: Multi-Head Attention and GRU for Improved Match-Mismatch Classification of Speech Stimulus and EEG Response
Borsdorf, Marvin / Pahuja, Saurav / Ivucic, Gabriel / Cai, Siqi / Li, Haizhou / Schultz, Tanja et al. | 2023
digital version
1: DTTR: Detecting Text with Transformers
Yang, Jing / You, Zhiqiang / Zhong, Zhiwei / Liu, Peng / Mei, Langqi / Huang, Shenguang et al. | 2023
digital version
1: DST: Deformable Speech Transformer for Emotion Recognition
Chen, Weidong / Xing, Xiaofen / Xu, Xiangmin / Pang, Jianxin / Du, Lan et al. | 2023
digital version
1: Cross-Training: A Semi-Supervised Training Scheme for Speech Recognition
Khorram, Soheil / Tripathi, Anshuman / Kim, Jaeyoung / Lu, Han / Zhang, Qian / Prabhavalkar, Rohit / Sak, Hasim et al. | 2023
digital version
1: Wav2Seq: Pre-Training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages
Wu, Felix / Kim, Kwangyoun / Watanabe, Shinji / Han, Kyu J. / McDonald, Ryan / Weinberger, Kilian Q. / Artzi, Yoav et al. | 2023
digital version
1: MLP-GAN for Brain Vessel Image Segmentation
Xie, Bin / Tang, Hao / Duan, Bin / Cai, Dawen / Yan, Yan et al. | 2023
digital version
1: Stacking-Based Attention Temporal Convolutional Network for Action Segmentation
Yang, Liu / Jiang, Yu / Hong, Junkun / Wu, Zhenjie / Yang, Zhan / Long, Jun et al. | 2023
digital version
1: Probabilistic Back-ends for Online Speaker Recognition and Clustering
Sholokhov, Alexey / Kuzmin, Nikita / Lee, Kong Aik / Chng, Eng Siong et al. | 2023
digital version
1: Information Extraction from Pill Bottle Images via Text Stitching
Gupta, Rahul Kumar / Roy, Shilka / Jos, Sujit / S., Unni V. / Lavoie, Lauren / Medous, Frederic / Smith, Walter et al. | 2023
digital version
1: Semi-Supervised Remote Sensing Image Change Detection Using Mean Teacher Model for Constructing Pseudo-Labels
Mao, Zan / Tong, Xinyu / Luo, Ze et al. | 2023
digital version
1: Analysing Discrete Self Supervised Speech Representation For Spoken Language Modeling
Sicherman, Amitay / Adi, Yossi et al. | 2023
digital version
1: Flowpose: Conditional Normalizing Flows for 3D Human Pose and Shape Estimation from Monocular Videos
Du, Yaoyao / Zhang, Zixiao / Li, Zhihao / Wei, Peng / Liao, Qingmin / Yang, Wenming et al. | 2023
digital version
1: Glacier: Glass-Box Transformer for Interpretable Dynamic Neuroimaging
Mahmood, Usman / Fu, Zening / Calhoun, Vince / Plis, Sergey et al. | 2023
digital version
1: NBA-OMP: Near-Field Beam-Split-Aware Orthogonal Matching Pursuit for Wideband THz Channel Estimation
Elbir, Ahmet M. / Vijay Mishra, Kumar / Chatzinotas, Symeon et al. | 2023
digital version
1: MUG: A General Meeting Understanding and Generation Benchmark
Zhang, Qinglin / Deng, Chong / Liu, Jiaqing / Yu, Hai / Chen, Qian / Wang, Wen / Yan, Zhijie / Liu, Jinglin / Ren, Yi / Zhao, Zhou et al. | 2023
digital version
1: Automatic Classification of Vocal Intensity Category from Speech
Kodali, Manila / Kadiri, Sudarsana Reddy / Laaksonen, Laura / Alku, Paavo et al. | 2023
digital version
1: A Template Matching Approach for Reference Picture Padding in Video Coding
Horst, Nicolas / Das, Priyanka / Wien, Mathias et al. | 2023
digital version
1: An Efficient Relay Selection Scheme for Relay-assisted HARQ
Ding, Weihang / Shikh-Bahaei, Mohammad et al. | 2023
digital version
1: Sora: Scalable Black-Box Reachability Analyser on Neural Networks
Xu, Peipei / Wang, Fu / Ruan, Wenjie / Zhang, Chi / Huang, Xiaowei et al. | 2023
digital version
1: The First Pathloss Radio Map Prediction Challenge
Yapar, Cagkan / Jaensch, Fabian / Levie, Ron / Kutyniok, Gitta / Caire, Giuseppe et al. | 2023
digital version
1: U-Shiftformer: Brain Tumor Segmentation Using A Shifted Attention Mechanism
Lin, Chih-Wei / Chen, Zhongsheng et al. | 2023
digital version
1: Does Human Speech Follow Benford’s Law?
Hsu, Leo / Berisha, Visar et al. | 2023
digital version
1: Conversation-Oriented ASR with Multi-Look-Ahead CBS Architecture
Zhao, Huaibo / Fujie, Shinya / Ogawa, Tetsuji / Sakuma, Jin / Kida, Yusuke / Kobayashi, Tetsunori et al. | 2023
digital version
1: Towards a Unified Training for Levenshtein Transformer
Zheng, Kangjie / Wang, Longyue / Wang, Zhihao / Chen, Binqi / Zhang, Ming / Tu, Zhaopeng et al. | 2023
digital version
1: A Principled Approach to Model Validation in Domain Generalization
Lyu, Boyang / Nguyen, Thuan / Scheutz, Matthias / Ishwar, Prakash / Aeron, Shuchin et al. | 2023
digital version
1: Neural Networks with Quantization Constraints
Hounie, Ignacio / Elenter, Juan / Ribeiro, Alejandro et al. | 2023
digital version
1: Direct Position Determination with One-Bit Signal for Multiple Targets
Ni, Lihua / Zhang, Di / Xing, Tianyi / Ran, Maoyan / Liu, Ning / Wan, Qun et al. | 2023
digital version
1: Learning to Balance the Global Coherence and Informativeness in Knowledge-Grounded Dialogue Generation
Niu, Chenxu / Hu, Yue / Peng, Wei / Xie, Yuqiang et al. | 2023
digital version
1: Backdoor Attack Against Automatic Speaker Verification Models in Federated Learning
Meng, Dan / Wang, Xue / Wang, Jun et al. | 2023
digital version
1: Wireless Deep Speech Semantic Transmission
Xiao, Zixuan / Yao, Shengshi / Dai, Jincheng / Wang, Sixian / Niu, Kai / Zhang, Ping et al. | 2023
digital version
1: Context-Aware Fine-Tuning of Self-Supervised Speech Models
Shon, Suwon / Wu, Felix / Kim, Kwangyoun / Sridhar, Prashant / Livescu, Karen / Watanabe, Shinji et al. | 2023
digital version
1: Improved Acoustic-to-Articulatory Inversion Using Representations from Pretrained Self-Supervised Learning Models
Udupa, Sathvik / C, Siddarth / Ghosh, Prasanta Kumar et al. | 2023
digital version
1: Lightweight Annotation and Class Weight Training for Automatic Estimation of Alarm Audibility in Noise
Effa, Francois / Serizel, Romain / Arz, Jean-Pierre / Grimault, Nicolas et al. | 2023
digital version
1: Disentangled Training with Adversarial Examples for Robust Small-Footprint Keyword Spotting
Wang, Zhenyu / Wan, Li / Zhang, Biqiao / Huang, Yiteng / Li, Shang-Wen / Sun, Ming / Lei, Xin / Yang, Zhaojun et al. | 2023
digital version
1: Numerical Semantic Modeling for Implicit Discourse Relation Recognition
Wang, Chenxu / Jian, Ping / Wang, Hai et al. | 2023
digital version
1: Stereoscopic Video Retargeting Based on Camera Motion Classification
Cai, Linghui / Tang, Zhenhua et al. | 2023
digital version
1: Spoofed Training Data for Speech Spoofing Countermeasure Can Be Efficiently Created Using Neural Vocoders
Wang, Xin / Yamagishi, Junichi et al. | 2023
digital version
1: Massively Multilingual Shallow Fusion with Large Language Models
Hu, Ke / Sainath, Tara N. / Li, Bo / Du, Nan / Huang, Yanping / Dai, Andrew M. / Zhang, Yu / Cabrera, Rodrigo / Chen, Zhifeng / Strohman, Trevor et al. | 2023
digital version
1: SDTN: Speaker Dynamics Tracking Network for Emotion Recognition in Conversation
Chen, Jiawei / Huang, Peijie / Huang, Guotai / Li, Qianer / Xu, Yuhong et al. | 2023
digital version
1: Improving CTC-Based ASR Models With Gated Interlayer Collaboration
Yang, Yuting / Li, Yuke / Du, Binbin et al. | 2023
digital version
1: Restoration of Time-Varying Graph Signals using Deep Algorithm Unrolling
Kojima, Hayate / Noguchi, Hikari / Yamada, Koki / Tanaka, Yuichi et al. | 2023
digital version
1: A Dual-Path Transformer Network for Scene Text Detection
Lin, Jingyu / Yan, Yan / Wang, Hanzi et al. | 2023
digital version
1: Audio-Visual Speech Enhancement with a Deep Kalman Filter Generative Model
Golmakani, Ali / Sadeghi, Mostafa / Serizel, Romain et al. | 2023
digital version
1: Ideal: Improved Dense Local Contrastive Learning For Semi-Supervised Medical Image Segmentation
Basak, Hritam / Chattopadhyay, Soumitri / Kundu, Rohit / Nag, Sayan / Mallipeddi, Rammohan et al. | 2023
digital version
1: Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System
Yoshimura, Takenori / Takaki, Shinji / Nakamura, Kazuhiro / Oura, Keiichiro / Hono, Yukiya / Hashimoto, Kei / Nankaku, Yoshihiko / Tokuda, Keiichi et al. | 2023
digital version
1: Symbol Level Precoding in the RF Domain for Low Hardware Complexity RIS-Assisted MU-MISO Systems
Tsinos, Christos G. / Tsiftsis, Theodoros A. / Schober, Robert et al. | 2023
digital version
1: CTCBERT: Advancing Hidden-Unit BERT with CTC Objectives
Fan, Ruchao / Wang, Yiming / Gaur, Yashesh / Li, Jinyu et al. | 2023
digital version
1: Sine: Similarity-Regularized Intra-Class Exploitation for Cross-Granularity Few-Shot Learning
Yang, Jinhai / Yang, Hua et al. | 2023
digital version
1: Topological Signal Processing Over Weighted Simplicial Complexes
Battiloro, Claudio / Sardellitti, Stefania / Barbarossa, Sergio / Lorenzo, Paolo Di et al. | 2023
digital version
1: Neural Mode Estimation
Sun, Peng / Wen, Zhenyu / Zhou, Yejian / Hong, Zhen / Lin, Tao et al. | 2023
digital version
1: Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition
Wang, Qiulin / Hu, Wenxuan / Li, Lin / Hong, Qingyang et al. | 2023
digital version
1: An Auto-Encoder Based Method for Camera Fingerprint Compression
Zhang, Kaixuan / Liu, Zihan / Hu, Jiashang / Wang, Shilin et al. | 2023
digital version
1: A Transformer-Based E2E SLU Model for Improved Semantic Parsing
Istaiteh, Othman / Kussad, Yasmeen / Daqour, Yahya / Habib, Maria / Habash, Mohammad / Gowda, Dhananjaya et al. | 2023
digital version
1: Procontext: Exploring Progressive Context Transformer for Tracking
Lan, Jin-Peng / Cheng, Zhi-Qi / He, Jun-Yan / Li, Chenyang / Luo, Bin / Bao, Xu / Xiang, Wangmeng / Geng, Yifeng / Xie, Xuansong et al. | 2023
digital version
1: Achieving Fair Speech Emotion Recognition via Perceptual Fairness
Chien, Woan-Shiuan / Lee, Chi-Chun et al. | 2023
digital version
1: Unsupervised Pre-Training for Data-Efficient Text-to-Speech on Low Resource Languages
Park, Seongyeon / Song, Myungseo / Kim, Bohyung / Oh, Tae-Hyun et al. | 2023
digital version
1: Image Sharing Chain Detection via Sequence-to-Sequence Model
You, Jiaxiang / Li, Yuanman / Liang, Rongqin / Tan, Yuxuan / Zhou, Jiantao / Li, Xia et al. | 2023
digital version
1: NCL: Textual Backdoor Defense Using Noise-Augmented Contrastive Learning
Zhai, Shengfang / Shen, Qingni / Chen, Xiaoyi / Wang, Weilong / Li, Cong / Fang, Yuejian / Wu, Zhonghai et al. | 2023
digital version
1: Higher-Order Spatio-Temporal Neural Networks for Covid-19 Forecasting
Chen, Yuzhou / Batsakis, Sotiris / Poor, H. Vincent et al. | 2023
digital version
1: Regression to Classification: Waveform Encoding for Neural Field-Based Audio Signal Representation
Kim, TaeSoo / Rho, Daniel / Lee, Gahui / Park, JaeHan / Ko, Jong Hwan et al. | 2023
digital version
1: Visual Answer Localization with Cross-Modal Mutual Knowledge Transfer
Weng, Yixuan / Li, Bin et al. | 2023
digital version
1: An Empirical Study and Improvement for Speech Emotion Recognition
Wu, Zhen / Lu, Yizhe / Dai, Xinyu et al. | 2023
digital version
1: A Study of Audio Mixing Methods for Piano Transcription in Violin-Piano Ensembles
Kim, Hyemi / Park, Jiyun / Kwon, Taegyun / Jeong, Dasaem / Nam, Juhan et al. | 2023
digital version
1: Interaction-Assisted Multi-Modal Representation Learning for Recommendation
Wu, Hao / Wang, Jiajie / Zu, Zhonglin et al. | 2023
digital version
1: Exploiting One-Class Classification Optimization Objectives for Increasing Adversarial Robustness
Mygdalis, Vasileios / Pitas, Ioannis et al. | 2023
digital version
