Birchlabs committed on
Commit 8b91290 · 1 Parent(s): 06effb0

Update README.md

Files changed (1)
  1. README.md +12 -2
README.md CHANGED
@@ -2,7 +2,9 @@
  license: apache-2.0
  ---

- Finetunes Llama-13b+Alpaca to solve problems via stepwise reasoning (OpenAI PRM800k dataset). [`Birchlabs/openai-prm800k-solutions-only`](https://huggingface.co/datasets/Birchlabs/openai-prm800k-solutions-only).
+ Finetunes Llama-13b+Alpaca to solve problems via stepwise reasoning (OpenAI [PRM800k dataset](https://github.com/openai/prm800k), or rather our postprocessed version, [`Birchlabs/openai-prm800k-solutions-only`](https://huggingface.co/datasets/Birchlabs/openai-prm800k-solutions-only)).
+
+ ## Model description

  This is a fork of [`llama-13b`](https://huggingface.co/huggyllama/llama-13b) + [`chansung/alpaca-lora-13b`](https://huggingface.co/chansung/alpaca-lora-13b).

@@ -16,6 +18,8 @@ Parts:
  - tokenizer [`Birchlabs/llama-13b-stepwise-tokenizer`](https://huggingface.co/Birchlabs/llama-13b-stepwise-tokenizer)
  - finetuned input/output embedding layers: [`Birchlabs/llama-13b-stepwise-embeddings`](https://huggingface.co/Birchlabs/llama-13b-stepwise-embeddings)

+ ## Usage
+
  You can load using [`evaluate.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/evaluate.py#L209-L278) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).

  Download `embed_tokens.pt` and `lm_head.pt` from [`Birchlabs/llama-13b-stepwise-embeddings`](https://huggingface.co/Birchlabs/llama-13b-stepwise-embeddings/tree/main), then run evaluator like so:
@@ -33,7 +37,13 @@ python -m evaluate \
  --overrun_countermeasures False
  ```

- Example instruction-response conversations:
+ ## Prompt style
+
+ The model is finetuned for instruction-response.
+
+ `prompt = '<s>Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow do I feed a family of 5 without ingredients?\n\n### Response:\n<|step_start|>'`
+
+ ## Example output:

  `$ How do I drink a matcha latte?`
  `To drink a matcha latte, I should first remove the matcha from the bag. <|step_end|> <|step_start|> I can do that by opening the bag and shaking out the matcha into a glass or bowl. <|step_end|> <|step_start|> Then, I should add some hot water into the glass or bowl and start stirring the matcha. <|step_end|> <|step_start|> This is to make the matcha easier to mix with the water. <|step_end|> <|step_start|> Next, I should add milk into the glass or bowl and continue stirring. <|step_end|> <|step_start|> This is to make the matcha and the water form a thick mixture. <|step_end|> <|step_start|> Finally, I should add ice cubes into the glass or bowl and stir again. <|step_end|> <|step_start|> This is to make the mixture even thicker and chilled. <|step_end|> <|step_start|> So, a matcha latte is a drink that is a mixture of matcha, water, and milk, with some ice cubes. <|step_end|> <|step_start|> <|answer_start|> To make a matcha latte, I should remove the matcha from the bag, add some hot water into a glass or bowl and stir, add milk into the glass or bowl and stir as well, and finally add ice cubes and stir again. <|step_end|> <|step_start|> <|answer_start|> To drink a matcha latte, I should remove`
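
Reviewer note: the prompt template and the example output in this diff imply a simple text protocol, which the sketch below tries to make concrete. It is a minimal illustration, not code from the repository. The helper names `build_prompt` and `split_steps` are hypothetical, and treating `<|answer_start|>` as the marker of a final answer is inferred from the example output rather than stated in the README. Whether the literal `<s>` should stay in the string or be left to the tokenizer is also not specified here.

```python
# Special tokens exactly as they appear in the README's example output.
STEP_START = "<|step_start|>"
STEP_END = "<|step_end|>"
ANSWER_START = "<|answer_start|>"

# Alpaca-style template copied from the README's "Prompt style" section.
PROMPT_TEMPLATE = (
    "<s>Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n" + STEP_START
)


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def split_steps(completion: str) -> tuple[list[str], list[str]]:
    """Split a raw completion into (reasoning steps, answers).

    Only chunks closed by <|step_end|> are kept, so a trailing fragment
    cut off mid-generation (as in the README example) is dropped.
    Assumption: a step that begins with <|answer_start|> is an answer.
    """
    steps: list[str] = []
    answers: list[str] = []
    # Everything after the last <|step_end|> is unterminated; discard it.
    terminated = completion.split(STEP_END)[:-1]
    for chunk in terminated:
        text = chunk.replace(STEP_START, "").strip()
        if not text:
            continue
        if text.startswith(ANSWER_START):
            answers.append(text[len(ANSWER_START):].strip())
        else:
            steps.append(text)
    return steps, answers
```

Applied to the matcha latte output above, `split_steps` would return the reasoning sentences as steps and the single completed answer, and would ignore the final fragment that stops at "I should remove".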