Efficient infusion of self-supervised representations in Automatic Speech Recognition Paper • 2404.12628 • Published Apr 19, 2024 • 9
LASPA: Language Agnostic Speaker Disentanglement with Prefix-Tuned Cross-Attention Paper • 2506.02083 • Published Jun 2, 2025 • 8
Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization Paper • 2412.19785 • Published Dec 27, 2024 • 8
Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion Paper • 2506.01365 • Published Jun 2, 2025 • 8
Nonparallel Emotional Voice Conversion For Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing Paper • 2302.10536 • Published Feb 21, 2023 • 6
Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning Paper • 2403.15469 • Published Mar 20, 2024 • 9
VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech Paper • 2406.08076 • Published Jun 12, 2024 • 6
REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion Paper • 2505.20756 • Published May 27, 2025 • 8
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing Paper • 2406.08802 • Published Jun 13, 2024 • 8
EmoReg: Directional Latent Vector Modeling for Emotional Intensity Regularization in Diffusion-based Voice Conversion Paper • 2412.20359 • Published Dec 29, 2024 • 8
Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection Paper • 2412.20455 • Published Dec 29, 2024 • 9
Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance Paper • 2503.00147 • Published Feb 28, 2025 • 8
DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic Paper • 2506.21260 • Published Jun 26, 2025 • 8
Fiducial Focus Augmentation for Facial Landmark Detection Paper • 2402.15044 • Published Feb 23, 2024 • 8
M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation Paper • 2206.02187 • Published Jun 5, 2022 • 9