Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 17 items • Updated Jul 10 • 183
Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 880
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 228
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 173
DeepSeek R1 (All Versions) Collection DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 37 items • Updated 6 days ago • 259
gemini-2.0-flash-thinking-exp-1219 Datasets Collection Existing datasets with responses regenerated using gemini-2.0-flash-thinking-exp-1219. Currently only single-turn. • 15 items • Updated Mar 28 • 6
gemini-exp-1206 Datasets Collection Existing datasets with responses regenerated using gemini-exp-1206. Currently only single-turn. • 3 items • Updated Mar 28 • 1
story writing favourites Collection Models I personally liked for generating stories in the past. Not a recommendation, most of these are outdated. • 24 items • Updated Jun 11 • 69
Reasoning Models Collection If this really helps, please upvote for the researchers' hard work • 14 items • Updated Jan 21 • 1
CoT Datasets Collection If this really helps, please upvote for the researchers' hard work • 15 items • Updated Jan 20 • 1
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 248