lbourdois committed
Commit 2758bbc · verified · Parent: 105ee4a

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve discoverability. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.
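For reference, here is a quick sketch of what the 13 tags added here resolve to. The codes are ISO 639-3 identifiers; the name mapping below comes from the ISO 639-3 registry, not from this repository:

```python
# ISO 639-3 codes added to the `language:` front matter in this PR,
# mapped to their English names for quick reference.
ADDED_LANGUAGES = {
    "zho": "Chinese",
    "eng": "English",
    "fra": "French",
    "spa": "Spanish",
    "por": "Portuguese",
    "deu": "German",
    "ita": "Italian",
    "rus": "Russian",
    "jpn": "Japanese",
    "kor": "Korean",
    "vie": "Vietnamese",
    "tha": "Thai",
    "ara": "Arabic",
}

for code, name in ADDED_LANGUAGES.items():
    print(f"{code} -> {name}")
```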

Files changed (1)
  1. README.md +106 -95
README.md CHANGED
@@ -1,96 +1,107 @@
- ---
- library_name: transformers
- tags:
- - text-generation-inference
- - code
- - fact
- - math
- - short-context-reasoning
- license: apache-2.0
- language:
- - en
- - zh
- base_model:
- - Qwen/Qwen2.5-0.5B-Instruct
- pipeline_tag: text-generation
- ---
- ![5.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/UWb3145w-rFhJLCytbsBV.png)
-
- # **TESS-QwenRe-Fact-0.5B**
-
- > **TESS-QwenRe-Fact-0.5B** is a **compact fact-checking and short reasoning model** built upon **Qwen2.5 0.5B**. Designed for rapid response, real-world fact verification, and concise logical reasoning, this lightweight model is ideal for digital assistants, quick-response tools, and misinformation detection systems in **English** and **Chinese**.
-
- ## **Key Features**
-
- 1. **Fact Verification & Correction**
- Trained to analyze factual accuracy in statements and offer corrected or clarified responses, making it ideal for real-time verification tasks and misinformation mitigation.
-
- 2. **Concise Reasoning**
- Specializes in **short-form reasoning**, capable of analyzing and explaining claims, decisions, or statements in just a few logical steps — perfect for Q&A bots and assistant systems.
-
- 3. **Multilingual Support (EN + ZH)**
- Supports fact-checking tasks in both **English** and **Simplified Chinese**, enhancing accessibility for bilingual or regional use cases.
-
- 4. **Built on Qwen2.5 0.5B**
- Combines the latest architectural improvements from **Qwen2.5** with a small parameter footprint (0.5B), optimized for **speed**, **efficiency**, and **edge-device compatibility**.
-
- 5. **Prompt-Friendly Output**
- Responds well to well-structured queries, returning clean, interpretable answers — especially for true/false classification, source-based fact validation, and yes/no reasoning.
-
- ## **Quickstart with Transformers**
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "prithivMLmods/TESS-QwenRe-Fact-0.5B"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- prompt = "Is the capital of Australia Sydney? Explain briefly."
- messages = [
-     {"role": "system", "content": "You are a concise and accurate fact-checking assistant."},
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=256
- )
- generated_ids = [
-     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
- ]
-
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- ```
-
- ## **Intended Use**
-
- - **Fact-Checking Assistants**: Quickly verify factual claims in conversation or content.
- - **Digital Truth Detectors**: Misinformation and rumor detection in social feeds or news summaries.
- - **Micro-Reasoning Bots**: Smart agents for short-form logic and rationale generation.
- - **Multilingual Knowledge Tools**: Fact reasoning in **EN/ZH**, ideal for diverse platforms.
-
- ## **Limitations**
-
- 1. **Limited Depth**
- Focused on **short-form reasoning** — may not perform well on multi-step or abstract logic tasks.
-
- 2. **Compact Model Scale**
- At 0.5B parameters, it prioritizes **efficiency over complexity** — best for straightforward fact-based tasks.
-
- 3. **Language & Topic Bias**
- Inherits limitations and biases from its base model Qwen2.5 0.5B. Use carefully in sensitive contexts.
-
- 4. **Prompt Clarity Required**
+ ---
+ library_name: transformers
+ tags:
+ - text-generation-inference
+ - code
+ - fact
+ - math
+ - short-context-reasoning
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-0.5B-Instruct
+ pipeline_tag: text-generation
+ ---
+ ![5.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/UWb3145w-rFhJLCytbsBV.png)
+
+ # **TESS-QwenRe-Fact-0.5B**
+
+ > **TESS-QwenRe-Fact-0.5B** is a **compact fact-checking and short reasoning model** built upon **Qwen2.5 0.5B**. Designed for rapid response, real-world fact verification, and concise logical reasoning, this lightweight model is ideal for digital assistants, quick-response tools, and misinformation detection systems in **English** and **Chinese**.
+
+ ## **Key Features**
+
+ 1. **Fact Verification & Correction**
+ Trained to analyze factual accuracy in statements and offer corrected or clarified responses, making it ideal for real-time verification tasks and misinformation mitigation.
+
+ 2. **Concise Reasoning**
+ Specializes in **short-form reasoning**, capable of analyzing and explaining claims, decisions, or statements in just a few logical steps — perfect for Q&A bots and assistant systems.
+
+ 3. **Multilingual Support (EN + ZH)**
+ Supports fact-checking tasks in both **English** and **Simplified Chinese**, enhancing accessibility for bilingual or regional use cases.
+
+ 4. **Built on Qwen2.5 0.5B**
+ Combines the latest architectural improvements from **Qwen2.5** with a small parameter footprint (0.5B), optimized for **speed**, **efficiency**, and **edge-device compatibility**.
+
+ 5. **Prompt-Friendly Output**
+ Responds well to well-structured queries, returning clean, interpretable answers — especially for true/false classification, source-based fact validation, and yes/no reasoning.
+
+ ## **Quickstart with Transformers**
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "prithivMLmods/TESS-QwenRe-Fact-0.5B"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Is the capital of Australia Sydney? Explain briefly."
+ messages = [
+     {"role": "system", "content": "You are a concise and accurate fact-checking assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=256
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
+
+ ## **Intended Use**
+
+ - **Fact-Checking Assistants**: Quickly verify factual claims in conversation or content.
+ - **Digital Truth Detectors**: Misinformation and rumor detection in social feeds or news summaries.
+ - **Micro-Reasoning Bots**: Smart agents for short-form logic and rationale generation.
+ - **Multilingual Knowledge Tools**: Fact reasoning in **EN/ZH**, ideal for diverse platforms.
+
+ ## **Limitations**
+
+ 1. **Limited Depth**
+ Focused on **short-form reasoning** — may not perform well on multi-step or abstract logic tasks.
+
+ 2. **Compact Model Scale**
+ At 0.5B parameters, it prioritizes **efficiency over complexity** — best for straightforward fact-based tasks.
+
+ 3. **Language & Topic Bias**
+ Inherits limitations and biases from its base model Qwen2.5 0.5B. Use carefully in sensitive contexts.
+
+ 4. **Prompt Clarity Required**
  Structured prompts result in higher factual accuracy and shorter response latency.
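
Once merged, the new tags can be checked programmatically. A minimal sketch, assuming the `ModelCard` API from `huggingface_hub` (the exact attribute layout may differ across library versions):

```python
from huggingface_hub import ModelCard

# Load the model card; the YAML front matter is parsed into `card.data`.
card = ModelCard.load("prithivMLmods/TESS-QwenRe-Fact-0.5B")

# `language` should list the 13 ISO 639-3 codes after this PR is merged.
print(card.data.language)
```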