Jiang Jiwen

jjw0126

AI & ML interests

RL, LLM

Recent Activity

updated a model 20 days ago

PLM-Team/plm_internvl_ola_audio_proj

published a model 20 days ago

PLM-Team/plm_internvl_ola_audio_proj

updated a model 20 days ago

PLM-Team/plm_internvl_ola_code

upvoted a collection 3 months ago

Phi-4

Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10 • 183

upvoted a collection 5 months ago

Reasoning, Thinking, RL and Test-Time Scaling

260 items • Updated 11 days ago • 13

upvoted 2 articles 8 months ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28 • 880

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7 • 228

upvoted 3 collections 8 months ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 173

Reasoning Datasets

Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 61

DeepSeek R1 (All Versions)

DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 6 days ago • 259

upvoted 7 collections 9 months ago

Thinking/Reasoning Datasets

16 items • Updated Mar 28 • 2

gemini-2.0-flash-thinking-exp-1219 Datasets

Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn. • 15 items • Updated Mar 28 • 6

gemini-exp-1206 Datasets

Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn. • 3 items • Updated Mar 28 • 1

story writing favourites

Models I personally liked for generating stories in the past. Not a recommendation, most of these are outdated. • 24 items • Updated Jun 11 • 69

long-cot-dataset

16 items • Updated Dec 22, 2024 • 14

Reasoning Models

If this really helps, please upvote for the researchers' hard work • 14 items • Updated Jan 21 • 1

CoT Datasets

If this really helps, please upvote for the researchers' hard work • 15 items • Updated Jan 20 • 1

upvoted 2 collections 10 months ago

DCLM

DCLM Models + Datasets • 6 items • Updated Aug 25 • 27

OLMo 2

Artifacts for the OLMo 2 release. • 35 items • Updated May 1 • 140

upvoted a collection about 1 year ago

small language models

under 7b 🐁 • 65 items • Updated Apr 9 • 39

upvoted a collection over 1 year ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 248