Spaces: fdaudens/colqwen-omni-demo
fdaudens (HF Staff) committed on Jul 17

Commit 27e0438 · verified · 1 Parent(s): 3c3337a

Update app.py

Files changed (1)

app.py CHANGED

@@ -184,12 +184,13 @@ def process_audio_rag(audio_file_path, query, chunk_length=30, use_openai=False,
 with gr.Blocks(title="AudioRAG Demo") as demo:
     gr.Markdown("# AudioRAG Demo - Semantic Audio Search")
     gr.Markdown("""
-    This demo builds on the work from the ColQwen team, expanding retrieval capabilities beyond images to include audio and video. Inspired by the Qwen-Omni series, ColQwen-Omni (3B) pushes the boundaries of multimodal search — embedding and retrieving almost any type of content.
-    **What’s new?**
-    Unlike traditional methods, this model searches directly through raw audio without converting it to text. It understands semantic meaning in sound, speech, and audio patterns — making "AudioRAG" a real possibility.
-    📖 [Blog post](https://huggingface.co/blog/manu/colqwen-omni-omnimodal-retrieval) | 🤗 [Model on Hugging Face](https://huggingface.co/vidore/colqwen-omni-v0.1)
     """)
     with gr.Row():
@@ -211,7 +212,7 @@ with gr.Blocks(title="AudioRAG Demo") as demo:
     gr.Examples(
         examples=[
-            ["test.m4a", "Who's the guest of the podcast?", 30],
         ],
         inputs=[audio_input, query_input, chunk_length]
     )

 with gr.Blocks(title="AudioRAG Demo") as demo:
     gr.Markdown("# AudioRAG Demo - Semantic Audio Search")
     gr.Markdown("""
+    This demo builds on the work from the ColQwen team, expanding retrieval capabilities beyond images to include audio and video.
+    Unlike traditional methods, this model searches directly through raw audio without converting it to text. It understands semantic meaning in sound, speech, and audio patterns, making "AudioRAG" a real possibility.
+    📖 [Blog post](https://huggingface.co/blog/manu/colqwen-omni-omnimodal-retrieval) | 🤗 [Model on Hugging Face](https://huggingface.co/vidore/colqwen-omni-v0.1) | 📓 [Colab Notebook](https://colab.research.google.com/drive/1YOlTWfLbiyQqfq1SlqHA2iME1R-nH4aS#scrollTo=w7UyXtEcK0lA)
+    🎙️ Sample come from [Newsroom Robots](https://www.newsroomrobots.com/p/how-open-source-ai-puts-newsrooms)
     """)
     with gr.Row():
     gr.Examples(
         examples=[
+            ["test.m4a", "Who’s the podcast host?", 30],
         ],
         inputs=[audio_input, query_input, chunk_length]
     )
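
Note on the `chunk_length` example value: the body of `process_audio_rag` is not part of this diff, so how it splits the audio is not visible here. As a minimal sketch, assuming `chunk_length` means a fixed window size in seconds (the default of 30 in the signature suggests this), the file could be cut into consecutive windows as below. The helper name `chunk_bounds` is hypothetical and does not appear in app.py.

```python
def chunk_bounds(total_seconds: float, chunk_length: float = 30.0) -> list[tuple[float, float]]:
    """Return (start, end) times in seconds for consecutive fixed-length chunks.

    Hypothetical helper: assumes chunk_length is a window size in seconds.
    The last chunk is shorter when the duration is not an exact multiple.
    """
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")

    bounds = []
    start = 0.0
    while start < total_seconds:
        # Clamp the final window to the end of the recording.
        end = min(start + chunk_length, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 75-second clip with the demo's example value of 30 seconds per chunk.
    for start, end in chunk_bounds(75.0, 30.0):
        print(f"{start:.0f}s to {end:.0f}s")
```

Under that assumption, each window would be embedded on its own and the query matched against the per-chunk embeddings, so a smaller `chunk_length` gives finer-grained hits at the cost of more chunks to embed.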