---
license: mit
library_name: transformers
base_model:
- deepseek-ai/DeepSeek-V3-0324
- deepseek-ai/DeepSeek-R1
pipeline_tag: text-generation
---
# DeepSeek-R1T-Chimera
<div align="center">
<img src="https://www.tngtech.com/_astro/TNG_Logo.URm66zYr_Z2aCrIU.svg"
alt="TNG Logo"
width="400"
style="display: inline-block; vertical-align: middle;"/>
</div>
<br>
<div align="center">
<a href="LICENSE" style="margin: 2px;">
<img alt="License" src="https://img.shields.io/badge/License-MIT-f5de53?&color=f5de53" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
<br>
<div align="center">
<a href="LICENSE" style="margin: 2px;">
<img alt="R1T-Chimera benchmarks" src="R1T-Chimera_Benchmarks_20250427_V1.jpg" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
**Model merge of DeepSeek-R1 and DeepSeek-V3 (0324)**
An open-weights model combining the intelligence of R1 with the token efficiency of V3.
[Announcement on X](https://x.com/tngtech/status/1916284566127444468) | [LinkedIn post](https://www.linkedin.com/posts/tng-technology-consulting_on-the-weekend-we-released-deepseek-r1t-chimera-activity-7323008947236290560-Cf2m)
## Model Details
- **Architecture**: DeepSeek-MoE Transformer-based language model
- **Combination Method**: Merged model weights from DeepSeek-R1 and DeepSeek-V3 (0324)
- **Release Date**: 2025-04-27
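
This card does not say how the two parents' weights were combined. For readers new to weight merging, the general idea can be illustrated with a minimal sketch of plain linear interpolation between two checkpoints that share parameter names and shapes. This is an assumption for illustration only, not the procedure used to build this model; the function `merge_state_dicts`, the `weight_a` parameter, and the toy checkpoints are all hypothetical, and lists of floats stand in for real tensors.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def merge_state_dicts(a: StateDict, b: StateDict, weight_a: float = 0.5) -> StateDict:
    """Linearly interpolate two checkpoints: weight_a * a + (1 - weight_a) * b.

    Illustrative only; this is NOT the method documented for this model.
    """
    if a.keys() != b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged: StateDict = {}
    for name, values_a in a.items():
        values_b = b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            weight_a * x + (1.0 - weight_a) * y
            for x, y in zip(values_a, values_b)
        ]
    return merged


# Toy checkpoints (made-up values, not real model weights).
r1_like = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
v3_like = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_state_dicts(r1_like, v3_like, weight_a=0.5)
print(merged)
```

Real merges operate on tensors and can treat different parameter groups differently; the sketch only shows why both parents must share an architecture for an element-wise merge to be defined.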
## Contact
- Email: research@tngtech.com
- X.com: @tngtech