Update README.md
README.md
CHANGED
@@ -2,7 +2,9 @@
 license: apache-2.0
 ---

-Finetunes Llama-13b+Alpaca to solve problems via stepwise reasoning (OpenAI PRM800k dataset)
+Finetunes Llama-13b+Alpaca to solve problems via stepwise reasoning (OpenAI [PRM800k dataset](https://github.com/openai/prm800k), or rather our postprocessed version, [`Birchlabs/openai-prm800k-solutions-only`](https://huggingface.co/datasets/Birchlabs/openai-prm800k-solutions-only)).
+
+## Model description

 This is a fork of [`llama-13b`](https://huggingface.co/huggyllama/llama-13b) + [`chansung/alpaca-lora-13b`](https://huggingface.co/chansung/alpaca-lora-13b).

@@ -16,6 +18,8 @@ Parts:
 - tokenizer [`Birchlabs/llama-13b-stepwise-tokenizer`](https://huggingface.co/Birchlabs/llama-13b-stepwise-tokenizer)
 - finetuned input/output embedding layers: [`Birchlabs/llama-13b-stepwise-embeddings`](https://huggingface.co/Birchlabs/llama-13b-stepwise-embeddings)

+## Usage
+
 You can load the model using [`evaluate.py`](https://github.com/scottlogic-alex/qlora/blob/stepwise/evaluate.py#L209-L278) from our [`stepwise`](https://github.com/scottlogic-alex/qlora/tree/stepwise) branch of [qlora](https://github.com/artidoro/qlora).

 Download `embed_tokens.pt` and `lm_head.pt` from [`Birchlabs/llama-13b-stepwise-embeddings`](https://huggingface.co/Birchlabs/llama-13b-stepwise-embeddings/tree/main), then run the evaluator like so:
@@ -33,7 +37,13 @@ python -m evaluate \
 --overrun_countermeasures False
 ```

-
+## Prompt style
+
+The model is finetuned for instruction-response.
+
+`prompt = '<s>Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nHow do I feed a family of 5 without ingredients?\n\n### Response:\n<|step_start|>'`
+
+## Example output:

 `$ How do I drink a matcha latte?`
 `To drink a matcha latte, I should first remove the matcha from the bag. <|step_end|> <|step_start|> I can do that by opening the bag and shaking out the matcha into a glass or bowl. <|step_end|> <|step_start|> Then, I should add some hot water into the glass or bowl and start stirring the matcha. <|step_end|> <|step_start|> This is to make the matcha easier to mix with the water. <|step_end|> <|step_start|> Next, I should add milk into the glass or bowl and continue stirring. <|step_end|> <|step_start|> This is to make the matcha and the water form a thick mixture. <|step_end|> <|step_start|> Finally, I should add ice cubes into the glass or bowl and stir again. <|step_end|> <|step_start|> This is to make the mixture even thicker and chilled. <|step_end|> <|step_start|> So, a matcha latte is a drink that is a mixture of matcha, water, and milk, with some ice cubes. <|step_end|> <|step_start|> <|answer_start|> To make a matcha latte, I should remove the matcha from the bag, add some hot water into a glass or bowl and stir, add milk into the glass or bowl and stir as well, and finally add ice cubes and stir again. <|step_end|> <|step_start|> <|answer_start|> To drink a matcha latte, I should remove`
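
The example output interleaves `<|step_end|>`, `<|step_start|>` and `<|answer_start|>` markers, and starts repeating itself after the first answer. A minimal sketch of how such a completion could be split into reasoning steps plus a final answer follows. It assumes the markers appear as literal text in the decoded string, as in the sample; the function name `split_steps` is hypothetical and not part of `evaluate.py`.

```python
from typing import List, Optional, Tuple

# Marker strings copied from the example output.
STEP_START = "<|step_start|>"
STEP_END = "<|step_end|>"
ANSWER_START = "<|answer_start|>"


def split_steps(completion: str) -> Tuple[List[str], Optional[str]]:
    """Return (reasoning steps, first answer or None) for a decoded completion."""
    steps: List[str] = []
    answer: Optional[str] = None
    for chunk in completion.split(STEP_END):
        # Drop the step-start marker and surrounding whitespace.
        text = chunk.replace(STEP_START, "").strip()
        if not text:
            continue
        if text.startswith(ANSWER_START):
            answer = text[len(ANSWER_START):].strip()
            # Stop at the first answer: anything after it is overrun.
            break
        # A trailing chunk with no answer marker is kept as a (possibly cut-off) step.
        steps.append(text)
    return steps, answer


sample = (
    "First. <|step_end|> <|step_start|> Second. <|step_end|> "
    "<|step_start|> <|answer_start|> Done. <|step_end|> "
    "<|step_start|> <|answer_start|> Do"
)
steps, answer = split_steps(sample)
print(steps, answer)
```

Stopping at the first `<|answer_start|>` is a design choice for this sketch: it discards the repeated, truncated answer seen at the end of the sample.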
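
The prompt quoted under "Prompt style" follows the Alpaca instruction template and ends with `<|step_start|>` so that the model begins its first reasoning step. As a rough sketch, it could be assembled for an arbitrary instruction as below. The helper name `build_prompt` is hypothetical, and whether the literal `<s>` should stay in the string (rather than being added by the tokenizer) is an assumption carried over from the README's example.

```python
# Template copied from the README's example prompt; only the instruction varies.
PROMPT_TEMPLATE = (
    "<s>Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n<|step_start|>"
)


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the instruction-response template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


prompt = build_prompt("How do I drink a matcha latte?")
print(prompt)
```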