huihui-ai · lbourdois committed
Commit d5e2ee7 · verified · 1 parent: 3bc3a93

Improve language tag (#4)

- Improve language tag (cdca744511604444ea3545e76021836d76301203)

Co-authored-by: Loïck BOURDOIS <lbourdois@users.noreply.huggingface.co>

Files changed (1):
  1. README.md +123 -111
README.md CHANGED
---
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated/blob/main/LICENSE
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- chat
- abliterated
- uncensored
---

# huihui-ai/Qwen2.5-7B-Instruct-abliterated

This is an uncensored version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) created with abliteration (see [this article](https://huggingface.co/blog/mlabonne/abliteration) to learn more about the technique).
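
In brief, abliteration contrasts the model's internal activations on prompts it refuses with activations on benign prompts to estimate a "refusal direction", then removes that direction from the weights. The sketch below illustrates only the core idea; the prompt sets and the probed layer are placeholder assumptions, not the settings used to build this model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal sketch of the abliteration idea. The prompt lists and the probed
# layer are illustrative placeholders, not the actual settings for this model.
base_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_name)

harmful_prompts = ["..."]   # prompts the base model tends to refuse (placeholder)
harmless_prompts = ["..."]  # matched benign prompts (placeholder)

def mean_last_hidden(prompts, layer=-1):
    # Average the hidden state of the final token at the chosen layer.
    states = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model(**inputs, output_hidden_states=True)
        states.append(out.hidden_states[layer][0, -1].float())
    return torch.stack(states).mean(dim=0)

# The "refusal direction" is the normalized difference of the mean activations.
refusal_dir = mean_last_hidden(harmful_prompts) - mean_last_hidden(harmless_prompts)
refusal_dir = refusal_dir / refusal_dir.norm()

# Ablation then projects this direction out of the relevant weight matrices,
# e.g. W <- W - r (r^T W) with r = refusal_dir, so the model can no longer
# write along it; see the linked article for the full procedure.
```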

Special thanks to [@FailSpy](https://huggingface.co/failspy) for the original code and technique. Please follow him if you're interested in abliterated models.

**Important note:** a newer version is available; please try [Qwen2.5-7B-Instruct-abliterated-v2](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated-v2) instead.

## Usage
You can use this model in your applications by loading it with Hugging Face's `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "huihui-ai/Qwen2.5-7B-Instruct-abliterated"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Initialize conversation context
initial_messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
]
messages = initial_messages.copy()  # Copy the initial conversation context

# Enter the conversation loop
while True:
    # Get user input
    user_input = input("User: ").strip()  # Strip leading and trailing spaces

    # If the user types '/exit', end the conversation
    if user_input.lower() == "/exit":
        print("Exiting chat.")
        break

    # If the user types '/clean', reset the conversation context
    if user_input.lower() == "/clean":
        messages = initial_messages.copy()  # Reset conversation context
        print("Chat history cleared. Starting a new conversation.")
        continue

    # If input is empty, prompt the user and continue
    if not user_input:
        print("Input cannot be empty. Please enter something.")
        continue

    # Add user input to the conversation
    messages.append({"role": "user", "content": user_input})

    # Build the chat template
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    # Tokenize the input and move it to the model's device
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generate a response from the model
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=8192
    )

    # Slice off the prompt tokens, keeping only the newly generated ones
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    # Add the model's response to the conversation
    messages.append({"role": "assistant", "content": response})

    # Print the model's response
    print(f"Qwen: {response}")
```
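
If you prefer the reply to print token-by-token instead of all at once, `transformers` ships a `TextStreamer` that can be passed to `generate`. This is an optional variant of the `generate` call above, not part of the original script:

```python
from transformers import TextStreamer

# Optional: stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192,
    streamer=streamer
)
```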
## Evaluations
Both models were re-evaluated, and each score below is the average over the test runs; the better score in each row is bolded.

| Benchmark  | Qwen2.5-7B-Instruct | Qwen2.5-7B-Instruct-abliterated |
|------------|---------------------|---------------------------------|
| IF_Eval    | 76.44               | **76.49**                       |
| MMLU Pro   | **43.12**           | 41.71                           |
| TruthfulQA | 62.46               | **64.92**                       |
| BBH        | **53.92**           | 52.77                           |
| GPQA       | 31.91               | **31.97**                       |

The evaluation script is available in this repository at [`eval.sh`](https://huggingface.co/huihui-ai/Qwen2.5-7B-Instruct-abliterated/blob/main/eval.sh).
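
For reference, such a run can be sketched with EleutherAI's lm-evaluation-harness Python API. The task names and settings below are assumptions; `eval.sh` remains the authoritative configuration:

```python
# A sketch, assuming the benchmarks map onto these lm-evaluation-harness
# task names; see eval.sh in this repository for the actual setup.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=huihui-ai/Qwen2.5-7B-Instruct-abliterated,dtype=auto",
    tasks=["ifeval", "mmlu_pro", "truthfulqa", "bbh", "gpqa"],
    batch_size="auto",
)
print(results["results"])
```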