jeffra committed · Commit d71c180 · verified · 1 Parent(s): 64b9559

Update README.md

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -20,3 +20,20 @@ For more details about ExCoT and how to use it:
  * 📝 [ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback (arxiv)](https://arxiv.org/pdf/2503.19988)
  * 🚀 [Getting started guide using ArcticTraining](https://github.com/snowflakedb/ArcticTraining/tree/main/projects/excot_dpo)
+ ## Evaluation results
+
+ | Model | Ex% Dev | Ex% Test |
+ |--------------------------------------|---------|----------|
+ | Arctic-ExCoT-70B (LLaMA 3.1 70B) | **68.51** | 68.53 |
+ | Arctic-ExCoT-32B (Qwen-2.5-Coder 32B) | 68.25 | 68.19 |
+ | XiYanSQL-QwenCoder* | 67.01 | **69.03** |
+ | OpenAI GPT-4o | 54.04 | – |
+ | OpenAI GPT-4 | 46.35 | 54.89 |
+ | Anthropic Claude 3.5-Sonnet | 50.13 | – |
+ | Claude-2 | 42.70 | 49.02 |
+ | OpenAI o1-mini | 52.41 | – |
+ | OpenAI o3-mini | 53.72 | – |
+ | Mistral-large-2407 (123B) | 53.52 | 55.84 |
+ | DeepSeek-V2 (236B) | 56.13 | 56.68 |
+
+ Top Single-Model, Single-Inference Results on the BIRD Leaderboard (as of March 25, 2025). *XiYanSQL-QwenCoder: there are some challenges in reproducing the reported numbers [[1]](https://github.com/XGenerationLab/XiYanSQL-QwenCoder/issues/4)[[2]](https://modelscope.cn/models/XGenerationLab/XiYanSQL-QwenCoder-32B-2412/feedback/issueDetail/22708).