Meetween's Research Papers - a meetween Collection

Meetween's Research Papers

updated 7 days ago

Research papers published within the MEETWEEN project

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

Paper • 2402.12025 • Published Feb 19, 2024 • 2
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection

Paper • 2406.06097 • Published Jun 10, 2024 • 2
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation

Paper • 2406.14177 • Published Jun 20, 2024 • 1
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Paper • 2410.01036 • Published Oct 1, 2024 • 16
What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study

Paper • 2410.00545 • Published Oct 1, 2024 • 5
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?

Paper • 2412.18495 • Published Dec 24, 2024 • 9
How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not

Paper • 2409.17044 • Published Sep 25, 2024 • 3
NUTSHELL: A Dataset for Abstract Generation from Scientific Talks

Paper • 2502.16942 • Published Feb 24 • 1
MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks

Paper • 2507.19634 • Published Jul 25 • 9
Cross-Attention is Half Explanation in Speech-to-Text Models

Paper • 2509.18010 • Published 9 days ago • 5
Better Late Than Never: Evaluation of Latency Metrics for Simultaneous Speech-to-Text Translation

Paper • 2509.17349 • Published 9 days ago • 2
KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025

Paper • 2505.13036 • Published May 19
How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations

Paper • 2411.17666 • Published Nov 26, 2024
Early-Exit and Instant Confidence Translation Quality Estimation

Paper • 2502.14429 • Published Feb 20 • 4
Are Generative Models Underconfident? An Embarrassingly Simple Quality Estimation Approach

Paper • 2502.11115 • Published Feb 16
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

Paper • 2411.18152 • Published Nov 27, 2024 • 1
Cocktail-Party Audio-Visual Speech Recognition

Paper • 2506.02178 • Published Jun 2
Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation

Paper • 2405.04327 • Published May 7, 2024
Audio-driven Talking Face Generation with Stabilized Synchronization Loss

Paper • 2307.09368 • Published Jul 18, 2023
Summarizing Speech: A Comprehensive Survey

Paper • 2504.08024 • Published Apr 10
Contrastive Learning for Task-Independent SpeechLLM-Pretraining

Paper • 2412.15712 • Published Dec 20, 2024
Quality Estimation with k-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation

Paper • 2404.18031 • Published Apr 27, 2024
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

Paper • 2406.16777 • Published Jun 24, 2024
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach

Paper • 2409.09009 • Published Sep 13, 2024
COMET-poly: Machine Translation Metric Grounded in Other Candidates

Paper • 2508.18549 • Published Aug 25
PIER: A Novel Metric for Evaluating What Matters in Code-Switching

Paper • 2501.09512 • Published Jan 16
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages

Paper • 2408.02290 • Published Aug 5, 2024
Continuously Learning New Words in Automatic Speech Recognition

Paper • 2401.04482 • Published Jan 9, 2024
Weight Factorization and Centralization for Continual Learning in Speech Recognition

Paper • 2506.16574 • Published Jun 19
Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion

Paper • 2506.04013 • Published Jun 4
Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement

Paper • 2506.16580 • Published Jun 19