---
license: mit
language:
- en
- vi
base_model:
- BlossomsAI/BloomVN-8B-chat
pipeline_tag: text-generation
library_name: transformers
tags:
- conversational
---
<div align="center">
<img src="https://github.com/bloomifycafe/blossomsAI/blob/main/assets/logo.png?raw=true" alt="Logo"/>
</div>

<br>

<div align="center">

# BloomVN-8B-Chat-Reasoning

</div>

### A fine-tuned multilingual model for Vietnamese reasoning

## NOTE

- This model is a test version for our new training pipeline. IT IS NOT SUITABLE FOR PRODUCTION.
- The full model will be released soon.

## Overview

- A reasoning language model that works through a problem step by step in Vietnamese before giving its final answer.
- Output follows a structured XML format with explicit reasoning tags.
- Designed for educational applications and complex problem-solving tasks in Vietnamese.

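Because the output is tagged, the reasoning and the final answer can be separated with a small parser. The sketch below is illustrative only: this card does not list the actual tag names, so `suy_luan` (reasoning) and `tra_loi` (answer) are hypothetical placeholders, and the sample reply is hand-written rather than real model output.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical tag names: the model card does not list the real ones.
REASONING_TAG = "suy_luan"
ANSWER_TAG = "tra_loi"


class ParsedReply(NamedTuple):
    reasoning: Optional[str]
    answer: Optional[str]


def _extract(tag: str, text: str) -> Optional[str]:
    """Return the stripped text inside the first <tag>...</tag> pair, or None."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


def parse_reply(
    text: str,
    reasoning_tag: str = REASONING_TAG,
    answer_tag: str = ANSWER_TAG,
) -> ParsedReply:
    """Split a tagged reply into its reasoning and answer parts."""
    return ParsedReply(_extract(reasoning_tag, text), _extract(answer_tag, text))


# Hand-written sample reply, not real model output.
sample = (
    "<suy_luan>\n2 + 3 = 5, then 5 x 4 = 20.\n</suy_luan>\n"
    "<tra_loi>\n20\n</tra_loi>"
)
parsed = parse_reply(sample)
print(parsed.answer)
```

Passing the tag names as parameters makes it easy to swap in the real ones once they are known. A missing tag yields `None` rather than an exception, which is convenient when a reply is cut off mid-generation.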
## Method

- Fine-tuned with Group Relative Policy Optimization (GRPO), using [Unsloth](https://github.com/unslothai/unsloth) for hardware efficiency.
- Employs rule-based reward functions to encourage adherence to the Vietnamese XML reasoning format.
- Uses LoRA adaptation on a Vietnamese dataset spanning various task types.

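To make the first two points concrete, here is a minimal sketch of a rule-based format reward and of the group-relative advantage that GRPO derives from such rewards. It is not the training code used for this model: the tag names `suy_luan` and `tra_loi`, the 1.0/0.0 reward values, and the use of the population standard deviation are all assumptions made for illustration, and real implementations differ in these details.

```python
import re
from statistics import mean, pstdev
from typing import List

# Hypothetical tag names: the model card does not list the real ones.
FORMAT_PATTERN = re.compile(
    r"\s*<suy_luan>.+?</suy_luan>\s*<tra_loi>.+?</tra_loi>\s*",
    re.DOTALL,
)


def format_reward(completion: str) -> float:
    """Rule-based reward: 1.0 if the whole completion follows the tag layout, else 0.0."""
    return 1.0 if FORMAT_PATTERN.fullmatch(completion) else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group: several hand-written completions for the same prompt.
group = [
    "<suy_luan>1 + 1 = 2</suy_luan><tra_loi>2</tra_loi>",  # well-formed
    "2",                                                   # no tags at all
    "<suy_luan>1 + 1 = 2</suy_luan>",                      # missing answer tag
    "<tra_loi>2</tra_loi>",                                # missing reasoning tag
]
rewards = [format_reward(c) for c in group]
advantages = group_relative_advantages(rewards)

for completion, adv in zip(group, advantages):
    print(f"{adv:+.3f}  {completion}")
```

Only the well-formed completion receives a positive advantage, so the policy update favors it over its siblings in the same group. No separate value model is involved, since each completion is scored relative to the others sampled for the same prompt.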
## Quantization

Coming soon!

## Contributors

Developed with ❤️ by [BlossomAI](https://github.com/BlossomAI)

<p align="left">
<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made with unsloth.png" width="200" />
</p>

---

<div align="center">
<sub>Star ⭐️ this repo if you find it valuable!</sub>
</div>