Update README.md
README.md
CHANGED
@@ -1,17 +1,18 @@
 ---
 license: mit
 ---
-**SWE-Dev-9B is trained from [glm-4-9B-chat](https://huggingface.co/THUDM/glm-4-9b-chat/)**
 
-🚀 SWE-Dev,
+🚀 SWE-Dev, an open-source Agent for Software Engineering tasks!
 
-
+💡 We develop a comprehensive pipeline for creating developer-oriented datasets from GitHub repositories, including issue tracking, code localization, test case generation, and evaluation.
 
-🔧
+🔧 Based on open-source frameworks (OpenHands) and models, SWE-Dev-7B and 32B achieved solve rates of 23.4% and 36.6% on SWE-bench-Verified, respectively, even approaching the performance of GPT-4o.
 
-
+📚 We find that training data scaling and inference scaling can both effectively boost the performance of models on SWE-bench. Moreover, higher data quality further improves this trend when combined with reinforcement fine-tuning (RFT). For inference scaling specifically, the solve rate on SWE-Dev increased from 34.0% at 30 rounds to 36.6% at 75 rounds.
+
 
-💡 We further explored the scaling laws between data size, interaction rounds, and model performance, demonstrating that smaller, high-quality datasets are sufficient to support top-tier performance.
+SWE-Dev-9B is trained from [glm-4-9B-chat](https://huggingface.co/THUDM/glm-4-9b-chat/)
 
+
 Notion Link: https://ubecwang.notion.site/1bc32cf963e080b2a01df2895f66021f?v=1bc32cf963e0810ca07e000c86c4c1e1
 
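Since the updated card states that SWE-Dev-9B is fine-tuned from glm-4-9b-chat, it should load the same way as its base model. The sketch below uses 🤗 Transformers under that assumption; the repository id is a hypothetical placeholder (the card does not give one), and the prompt is only illustrative, not the agent pipeline itself.

```python
# Minimal sketch, not from the original card: load an SWE-Dev checkpoint the same way
# as its glm-4-9b-chat base. The repo id below is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<SWE-Dev-9B-repo-id>"  # placeholder: substitute the actual repository

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,  # GLM-4-based checkpoints ship custom modeling code
)

# Illustrative prompt only; a real SWE-bench run would go through an agent loop such as OpenHands.
messages = [{"role": "user", "content": "Given this GitHub issue, locate the faulty function ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```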