Thinh Le PRO

thinhlpg

AI & ML interests

anime stuffs

Recent Activity

reacted to burtenshaw's post with 🧠 about 6 hours ago

Qwen 3 fine-tuning >> MoE. Updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model. The goal is to make a low-latency, non-thinking model for daily-driver coding, so 3 billion active parameters should be perfect. ✔️ training running ✔️ evals running ⏭️ improve dataset The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI ). So I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow. https://huggingface.co/burtenshaw/Qwen3-Code-Lite/discussions/1

reacted to burtenshaw's post with 👍 about 6 hours ago

Qwen 3 fine-tuning >> MoE. Updated the experiment thread to include the config and script for fine-tuning the Qwen3-30B-A3B model. The goal is to make a low-latency, non-thinking model for daily-driver coding, so 3 billion active parameters should be perfect. ✔️ training running ✔️ evals running ⏭️ improve dataset The MoE isn't going to fit into Colab's A100 even with quantization (🙏 @UnslothAI ). So I've been working on HF Spaces' H100s for this. Everything is available in the thread and I'll share more tomorrow. https://huggingface.co/burtenshaw/Qwen3-Code-Lite/discussions/1

liked a model about 6 hours ago

mistralai/Mixtral-8x7B-v0.1

thinhlpg's activity

upvoted a collection 3 days ago

Chronos Models & Datasets

Collection of artifacts related to Chronos pretrained models for time series forecasting. • 12 items • Updated Nov 26, 2024 • 41

upvoted a collection 11 days ago

Awesome Computer Use Agents

https://github.com/ranpox/awesome-computer-use • 25 items • Updated Dec 18, 2024 • 13

upvoted 4 papers 13 days ago

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published 19 days ago • 42

Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Paper • 2408.17175 • Published Aug 30, 2024 • 4

SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

Paper • 2405.00233 • Published Apr 30, 2024 • 18

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

Paper • 2411.19842 • Published Nov 29, 2024 • 12

upvoted an article 17 days ago

Article

Evaluating Audio Reasoning with Big Bench Audio

Dec 20, 2024

• 21

upvoted a collection 18 days ago

Step-Audio

Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 3 items • Updated Feb 17 • 31

upvoted an article 18 days ago

Article

FastRTC: The Real-Time Communication Library for Python

Feb 25

• 161

upvoted an article 19 days ago

Article

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖

22 days ago

• 42

upvoted 2 papers 25 days ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published Mar 24 • 30

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published 26 days ago • 73

upvoted a collection 25 days ago

Vietnamese speech dataset

for any speech-related tasks including but not limited to: speech-to-text & text-to-speech, speech classification, speaker verification, etc. • 31 items • Updated 12 days ago • 21

upvoted a collection about 1 month ago

MoshiVis v0.1

MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated Mar 21 • 22

upvoted an article about 1 month ago

Article

Faster fine-tuning using TRL & Unsloth

Jan 10, 2024

• 59

upvoted an article about 2 months ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 849

upvoted an article 2 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

Feb 20

• 242