Improve model card and add missing metadata
This PR significantly improves the model card by:
- Adding the `pipeline_tag: text-classification` to reflect the model's functionality.
- Specifying the `license: cc-by-nc-4.0`.
- Adding the `library_name: transformers`.
- Filling in missing information in the Model Details section with information from the GitHub README.
- Adding the base model and training dataset to the metadata.
- Adding a usage example to the model card.

The model card is now much more informative and useful for potential users.
README.md
---
pipeline_tag: text-classification
library_name: transformers
license: cc-by-nc-4.0
base_model: meta-llama/Meta-Llama-3-8B
datasets:
- m-a-p/COIG-P-CRM
---

# Model Card for CRM_llama3

This repository contains the CRM_llama3 model, a Chinese Reward Model (CRM), described in the paper [COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values](https://huggingface.co/papers/2504.05535). This model performs text classification, specifically evaluating the quality of text responses in a reward-modeling context.
## Model Details

### Model Description

This model is a Chinese Reward Model (CRM) trained on the COIG-P-CRM dataset. It is designed to assess the quality of text responses generated by LLMs, providing a crucial component in reinforcement learning from human feedback (RLHF) pipelines.

- **Developed by:** [Multimodal-Art-Projection Team](https://github.com/multimodal-art-projection/COIG-P)
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Chinese Reward Model (CRM), text classification
- **Language(s) (NLP):** Chinese
- **License:** cc-by-nc-4.0
- **Finetuned from model [optional]:** meta-llama/Meta-Llama-3-8B
### Model Sources [optional]

- **Repository:** [This Hugging Face Repository](https://huggingface.co/m-a-p/CRM_llama3)
- **Paper [optional]:** [COIG-P Paper](https://huggingface.co/papers/2504.05535)
- **Demo [optional]:** [More Information Needed]
## Uses

### Direct Use

The CRM_llama3 model can be used directly to score Chinese text responses. The model takes a prompt and a response as input and outputs a score representing the quality of the response. See the "Using our CRM" section below for an example.

### Downstream Use [optional]

This model can be integrated into RLHF pipelines for training and aligning LLMs with human preferences.
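One common integration pattern is best-of-n sampling: generate several candidate responses and keep the one the reward model scores highest. The sketch below assumes a generic scoring interface; `toy_score` is a stand-in for illustration only, not the CRM itself.

```python
def best_of_n(prompt, candidates, score):
    """Return the candidate response the reward model scores highest."""
    return max(candidates, key=lambda response: score(prompt, response))


# Stand-in scorer for illustration; a real pipeline would call the CRM here,
# formatting (prompt, response) with the chat template and reading the logit.
def toy_score(prompt, response):
    return len(response)


candidates = ["短回答。", "一个更详细的回答。", "一个非常详细、内容充分、结构清晰的回答。"]
print(best_of_n("示例问题", candidates, toy_score))
```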
### Out-of-Scope Use

This model is specifically trained for evaluating Chinese text responses within a reward-modeling context. It is not suitable for other NLP tasks such as text generation or translation.
## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

[More Information Needed]

## How to Get Started with the Model

See the "Using our CRM" section below for a code example.
## Training Details

### Training Data

The COIG-P-CRM dataset ([Hugging Face Dataset](https://huggingface.co/datasets/m-a-p/COIG-P-CRM)) was used for training. This dataset consists of Chinese preference pairs: each pair includes a prompt and two responses, one preferred and one rejected, scored by LLMs.
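Reward models over preference pairs of this shape are commonly trained with a Bradley-Terry style pairwise objective, `-log(sigmoid(r_chosen - r_rejected))`; this is the standard formulation, not a detail confirmed by this card, so consult the paper for the exact objective. A minimal sketch:

```python
import math


def pairwise_loss(r_chosen, r_rejected):
    """Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss is small when the chosen response outscores the rejected one
    by a wide margin, and large when the ordering is inverted.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


# A correctly ordered pair incurs less loss than an inverted one.
print(pairwise_loss(2.0, -1.0))
print(pairwise_loss(-1.0, 2.0))
```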
### Training Procedure

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

#### Summary

[More Information Needed]
## Citation [optional]

**BibTeX:**

```bibtex
@misc{pteam2025coigphighqualitylargescalechinese,
      title={COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values},
      author={P Team and Siwei Wu and Jincheng Ren and Xinrun Du and Shuyue Guo and Xingwei Qu and Yiming Liang and Jie Liu and Yunwen Li and Tianyu Zheng and Boyu Feng and Huaqing Yuan and Zenith Wang and Jiaheng Liu and Wenhao Huang and Chenglin Cai and Haoran Que and Jian Yang and Yuelin Bai and Zekun Moore Wang and Zhouliang Yu and Qunshu Lin and Ding Pan and Yuchen Jiang and Tiannan Wang and Wangchunshu Zhou and Shenzhi Wang and Xingyuan Bu and Minghao Liu and Guoyin Wang and Ge Zhang and Chenghua Lin},
      year={2025},
      eprint={2504.05535},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.05535},
}
```

**APA:**

[More Information Needed]
## Using our CRM

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = "cuda"
path = "m-a-p/CRM_llama3"
model = AutoModelForSequenceClassification.from_pretrained(
    path, device_map=device, trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(path, use_fast=True)

# Example Chinese prompt (a medical question) and a candidate response to score.
prompt = "我朋友出现全身浮肿,心脏衰竭,丧失劳动能力,只能卧床,就是很短时间内发生,帮忙解释一下是什么原因造成的?"

response = """全身浮肿和心脏衰竭可能是由多种原因引起的,包括:

1. **心脏疾病**:心脏衰竭可能导致血液循环不畅,从而引起全身浮肿。心脏无法有效地泵血时,血液可能会在身体下部积聚,导致下肢和腹部水肿。

2. **肾脏疾病**:肾脏功能不全也会导致体内液体平衡失调,引起全身浮肿。肾脏是调节体内水分和电解质平衡的重要器官,如果肾功能受损,可能会导致体内积液。

3. **肝脏疾病**:严重的肝脏疾病,如肝硬化,会导致体液从血管渗出到周围组织中,造成浮肿。

4. **营养不良**:长期营养不良,特别是蛋白质摄入不足,可能导致低蛋白血症,从而引起全身浮肿。

5. **药物副作用**:某些药物,如钙通道阻滞剂、非甾体抗炎药等,可能引起浮肿作为副作用。

6. **内分泌失调**:甲状腺功能减退(甲减)等疾病也可能导致全身浮肿。

7. **严重感染或炎症**:某些严重感染或炎症性疾病可能导致全身性反应,进而引发浮肿。

8. **淋巴系统疾病**:淋巴循环受阻,如淋巴水肿,也可能导致全身浮肿。

鉴于您朋友的情况发展迅速,并且已经出现了严重的症状,如心脏衰竭和丧失劳动能力,这提示可能存在急性或严重的健康问题。建议立即就医,进行全面的医学评估和必要的检查,以确定具体原因并接受适当的治疗。未经专业医生诊断,不应自行推断病因或进行治疗。"""

messages = [
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": response},
]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(device)
with torch.no_grad():
    output = model(input_ids)
# The scalar logit is the reward score; higher means a better-rated response.
print(output.logits.item())
```