kbu1564 committed on
Commit b8c527c · verified · 1 Parent(s): 3f0bf94

Update README.md

Files changed (1)
  1. README.md +13 -2
README.md CHANGED
@@ -54,6 +54,7 @@ This represents approximately a 39× reduction in pretraining cost relative to `
  ## HuggingFace Usage Example

  ### Python Code
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  model = AutoModelForCausalLM.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B").to(device="cuda")
@@ -67,11 +68,21 @@ chat = [
 
  inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_dict=True, return_tensors="pt")
  inputs = inputs.to(device="cuda")
- output_ids = model.generate(**inputs, max_length=1024, stop_strings=["<|endofturn|>", "<|stop|>"], repetition_penalty=1.2, tokenizer=tokenizer)
  print(tokenizer.batch_decode(output_ids))
  ```

  ### Result
  ```bash
- ['<|im_start|>tool_list\n<|im_end|>\n<|im_start|>system\n- AI 언어모델의 이름은 "CLOVA X" 이며 네이버에서 만들었다.\n- 오늘은 2025년 04월 24일(목)이다.<|im_end|>\n<|im_start|>user\n슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘.<|im_end|>\n<|im_start|>assistant\n양자역학은 슈뢰딩거 방정식을 통해 물질과 에너지, 공간 등의 현상을 설명합니다.\n\n**1. 슈뢰딩거 방정식**\n\n슈뢰딩거는 파동함수를 이용하여 입자의 위치와 운동량을 계산할 수 있다고 주장했습니다. 이를 위해 다음과 같은 식으로 표현됩니다:\n\n$$\\frac{\\partial \\psi}{\\partial t} = iH \\nabla^2 \\psi + V(x)\\psi $$\n\n여기서 $\\psi$는 파동함수이고 $i$는 허수 단위입니다. 여기서 $t$는 시간, $x$는 공간 좌표이며, $H$는 해밀턴 상수로 시스템의 에너지를 나타냅니다. 또한 $V(x)$는 외부 힘이나 장벽에 의해 영향을 받는 부분을 나타내는 함수로, 일반적으로 전위장을 사용합니다.\n\n**2. 양자역학과 슈뢰딩거 방정식의 관계**\n\n양자역학에서는 슈뢰딩거 방정식이 매우 중요한 역할을 합니다. 이는 모든 물리적 시스템이 불확정성 원리에 따라 행동을 하며, 이러한 시스템들은 확률적으로 상태를 가질 수밖에 없기 때문입니다. 따라서 슈뢰딩거 방정식은 양자역학을 수학적으로 모델링하는 핵심적인 도구 중 하나입니다.\n\n예를 들어, 원자핵 내의 전자들의 상태는 슈뢰딩거 방정식에 의해 결정되며, 이는 물리학적 법칙을 따르는 것으로 보입니다. 또한, 광전 효과에서도 슈뢰딩거 방정식은 빛이 물질 내에서 어떻게 흡수되고 반사되는지를 예측하는데 사용됩니다.\n\n**3. 응용 분야**\n\n슈뢰딩거 방정식은 다양한 분야에서 활용되고 있습니다. 예를 들면, 반도체 기술에서의 트랜지스터 설계, 핵물리학에서의 방사성 붕괴 연구 등이 있으며, 이는 모두 슈뢰딩거 방정식을 기반으로 한 이론적 기반 위에서 이루어집니다.\n\n또한, 현대 과학 기술의 발전에도 큰 기여를 하고 있는데, 특히 인공지능(AI), 컴퓨터 시뮬레이션 등에서 복잡한 문제를 해결하고 새로운 지식을 창출하기 위한 기초가 되고 있습니다.\n\n결론적으로, 슈뢰딩거 방정식은 양자역학의 기본 개념들을 이해하고 해석하며, 그 결과로서 많은 혁신적이고 실용적인 기술을 가능하게 했습니다. 이는 양자역학의 중요성을 보여주는 대표적인 예시라고 할 수 있습니다.<|im_end|><|endofturn|>']
  ```
 
  ## HuggingFace Usage Example

  ### Python Code
+ For better inference results with `HyperCLOVAX-SEED-Text-Instruct-0.5B`, we recommend setting `repetition_penalty` to `1.2`.
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  model = AutoModelForCausalLM.from_pretrained("naver-hyperclovax/HyperCLOVAX-SEED-Text-Instruct-0.5B").to(device="cuda")
 
  inputs = tokenizer.apply_chat_template(chat, add_generation_prompt=True, return_dict=True, return_tensors="pt")
  inputs = inputs.to(device="cuda")
+ output_ids = model.generate(**inputs,
+     max_length=1024,
+     stop_strings=["<|endofturn|>", "<|stop|>"],
+     repetition_penalty=1.2,
+     tokenizer=tokenizer)
  print(tokenizer.batch_decode(output_ids))
  ```
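The `repetition_penalty=1.2` argument in the `generate` call above rescales the scores of tokens that have already appeared, so the model is less likely to repeat them. The following is a minimal pure-Python sketch of that rescaling, assuming the commonly used rule of dividing positive scores by the penalty and multiplying negative scores by it. The helper name `apply_repetition_penalty` and the toy logits are hypothetical and are not part of `transformers`.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty=1.2):
    """Return a copy of `logits` with already-seen tokens made less likely.

    Hypothetical helper for illustration only; not a transformers API.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        score = adjusted[token_id]
        # Positive scores shrink toward zero; negative scores move further below zero.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


# Toy vocabulary of four tokens; tokens 0 and 1 were already generated.
toy_logits = [2.4, -1.2, 0.6, 0.0]
penalized = apply_repetition_penalty(toy_logits, seen_token_ids=[0, 1, 1])
print([round(x, 2) for x in penalized])
```

With a penalty of `1.2`, the two previously generated tokens drop to roughly `2.0` and `-1.44`, while the unseen tokens keep their scores. A penalty of `1.0` would leave every score unchanged.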
 
  ### Result
  ```bash
+ [
+ '<|im_start|>tool_list\n<|im_end|>\n' \
+ '<|im_start|>system\n- AI 언어모델의 이름은 "CLOVA X" 이며 네이버에서 만들었다.\n- 오늘은 2025년 04월 24일(목)이다.<|im_end|>\n' \
+ '<|im_start|>user\n슈뢰딩거 방정식과 양자역학의 관계를 최대한 자세히 알려줘.<|im_end|>\n' \
+ '<|im_start|>assistant\n양자역학은 슈뢰딩거 방정식을 통해 물질과 에너지, 공간 등의 현상을 설명합니다.\n\n**1. 슈뢰딩거 방정식**\n\n슈뢰딩거는 파동함수를 이용하여 입자의 위치와 운동량을 계산할 수 있다고 주장했습니다. 이를 위해 다음과 같은 식으로 표현됩니다:\n\n$$\\frac{\\partial \\psi}{\\partial t} = iH \\nabla^2 \\psi + V(x)\\psi $$\n\n여기서 $\\psi$는 파동함수이고 $i$는 허수 단위입니다. 여기서 $t$는 시간, $x$는 공간 좌표이며, $H$는 해밀턴 상수로 시스템의 에너지를 나타냅니다. 또한 $V(x)$는 외부 힘이나 장벽에 의해 영향을 받는 부분을 나타내는 함수로, 일반적으로 전위장을 사용합니다.\n\n**2. 양자역학과 슈뢰딩거 방정식의 관계**\n\n양자역학에서는 슈뢰딩거 방정식이 매우 중요한 역할을 합니다. 이는 모든 물리적 시스템이 불확정성 원리에 따라 행동을 하며, 이러한 시스템들은 확률적으로 상태를 가질 수밖에 없기 때문입니다. 따라서 슈뢰딩거 방정식은 양자역학을 수학적으로 모델링하는 핵심적인 도구 중 하나입니다.\n\n예를 들어, 원자핵 내의 전자들의 상태는 슈뢰딩거 방정식에 의해 결정되며, 이는 물리학적 법칙을 따르는 것으로 보입니다. 또한, 광전 효과에서도 슈뢰딩거 방정식은 빛이 물질 내에서 어떻게 흡수되고 반사되는지를 예측하는데 사용됩니다.\n\n**3. 응용 분야**\n\n슈뢰딩거 방정식은 다양한 분야에서 활용되고 있습니다. 예를 들면, 반도체 기술에서의 트랜지스터 설계, 핵물리학에서의 방사성 붕괴 연구 등이 있으며, 이는 모두 슈뢰딩거 방정식을 기반으로 한 이론적 기반 위에서 이루어집니다.\n\n또한, 현대 과학 기술의 발전에도 큰 기여를 하고 있는데, 특히 인공지능(AI), 컴퓨터 시뮬레이션 등에서 복잡한 문제를 해결하고 새로운 지식을 창출하기 위한 기초가 되고 있습니다.\n\n결론적으로, 슈뢰딩거 방정식은 양자역학의 기본 개념들을 이해하고 해석하며, 그 결과로서 많은 혁신적이고 실용적인 기술을 가능하게 했습니다. 이는 양자역학의 중요성을 보여주는 대표적인 예시라고 할 수 있습니다.<|im_end|>' \
+ '<|endofturn|>'
+ ]
  ```
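The decoded result above still contains the whole chat template (the `tool_list`, `system`, and `user` turns) in front of the model's answer. If only the answer is needed, one option is to slice it out using the `<|im_start|>assistant` and `<|im_end|>` markers visible in that output. This is a minimal sketch under that assumption. `extract_assistant_reply` is a hypothetical helper, and the sample string is a shortened stand-in for the real decoded text.

```python
ASSISTANT_OPEN = "<|im_start|>assistant\n"
TURN_CLOSE = "<|im_end|>"


def extract_assistant_reply(decoded: str) -> str:
    """Return the text of the last assistant turn in a decoded chat string."""
    start = decoded.rfind(ASSISTANT_OPEN)
    if start == -1:
        raise ValueError("no assistant turn found")
    start += len(ASSISTANT_OPEN)
    end = decoded.find(TURN_CLOSE, start)
    # If generation stopped before the closing marker, keep everything that remains.
    return decoded[start:] if end == -1 else decoded[start:end]


# Shortened, made-up sample in the same layout as the decoded result.
sample = (
    "<|im_start|>user\nHello<|im_end|>\n"
    "<|im_start|>assistant\nHi there.<|im_end|><|endofturn|>"
)
print(extract_assistant_reply(sample))
```

In the usage example this would be applied to each element of `tokenizer.batch_decode(output_ids)`.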