---
language:
- en
- vi
license: other
license_name: ghost-7b
license_link: https://ghost-x.vercel.app/ghost-7b-license
library_name: transformers
tags:
- ghost
- tools
- chat
pipeline_tag: text-generation
widget:
  - text: Why is the sky blue ?
    output:
      text: >-
        The sky appears blue because of a phenomenon called Rayleigh scattering.
        Sunlight is composed of all colors of the visible spectrum, but blue light
        has a shorter wavelength than other colors. When sunlight enters the
        Earth's atmosphere, it interacts with molecules of nitrogen and oxygen,
        which are the main components of our atmosphere. These molecules scatter
        the shorter wavelengths of light, such as blue and violet, more than the
        longer wavelengths, such as red and orange. As a result, when we look up
        at the sky, we see more blue light than other colors because the blue
        light has been scattered more by the molecules in the atmosphere.
---
# Ghost 7B Alpha
<img src="https://cdn-uploads.huggingface.co/production/uploads/600ae38cc92b79f54efd4556/QPrQZMQX_jzyYngmreP0_.jpeg" alt="Ghost 7B Alpha Logo" width="400" style="margin-left: auto; margin-right: auto; display: block"/>
A large language model focused on strong reasoning, multi-task knowledge, and tool support.
# Introduction
Ghost 7B Alpha is a large language model fine-tuned from Mistral 7B, with a size of 7 billion parameters. The model was developed with the goal of optimizing reasoning ability and multi-task knowledge, and of supporting tool usage. The model works best in its main trained and optimized languages, English and Vietnamese.
Overall, the model is well suited as a pretrained base for continued development toward your desired tasks: building virtual assistants, coding, translation, question answering, document generation, and more. It is an efficient, fast, and extremely inexpensive open model.
## Specifications
- Name: Ghost 7B Alpha.
- Model size: 7 billion parameters.
- Context length: 8K (8,192 tokens).
- Languages: English and Vietnamese.
- Main tasks: reasoning, multi-task knowledge, and function tools.
- License: [Ghost 7B LICENSE AGREEMENT](https://ghost-x.vercel.app/ghost-7b-license).
- Based on: Mistral 7B.
- Distributions: Standard (BF16), GGUF, AWQ.
- Developed by: Ghost X, [Hieu Lam](https://huggingface.co/lamhieu).
## Distributions
We provide several distributions to give you the access options that best suit your needs. Make sure you know which version you need and which will work best for you.
| Version | Model card |
| ------- | -------------------------------------------------------------------- |
| BF16 | [🤗 HuggingFace](https://huggingface.co/ghost-x/ghost-7b-alpha) |
| GGUF | [🤗 HuggingFace](https://huggingface.co/ghost-x/ghost-7b-alpha-guff) |
| AWQ | [🤗 HuggingFace](https://huggingface.co/ghost-x/ghost-7b-alpha-awq) |
### Standard (BF16)
The standard distribution was used to run the assessments and was found to have the best performance in text generation quality.
### GGUF
GGUF is a new format introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp.
### AWQ
The AWQ distribution uses 4-bit quantization, offering roughly a 2x speedup during inference.
## Prompt / Chat template
### Roles
Overall, the model supports roles including: **system**, **plugins**, **user**, **assistant**, **execute**, **response**.
- **system**: Similar to OpenAI GPT4, this will be the place for you to declare context or specify characteristics and expertise for the model.
- **user**: This is user input, which can be a question or an instruction for the model to follow.
- **assistant**: The model's response to a question or instruction received from **user**.
- **plugins**: Declare information about modeling tools that are allowed to be used according to predetermined standards.
- **execute**: Mocks the data structure returned by the model for execution; contains information about the tool to use and its arguments.
- **response**: The result after executing the tool, so that the model gives the appropriate answer according to the user's input.
See the actual code [here](https://huggingface.co/ghost-x/ghost-7b-alpha/blob/main/tokenizer_config.json#L36).
### Tips
- Setting the system prompt with the right context and characteristics helps the model perform noticeably better.
- You can use the system prompt to prevent unwanted behavior.
- In most cases, if you have no specific requirements, you can leave the system prompt blank; performance will then be best for general use.
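As a small illustration of these tips, a guardrail-style system prompt might look like this (the prompt wording is only an example, not a prescribed format):

```python
# Illustrative only: the system prompt content is an example guardrail,
# not a prescribed format. The roles follow the conventions described above.
messages = [
    {
        "role": "system",
        "content": (
            "You are a concise customer-support assistant. "
            "Only answer questions about our products. "
            "If asked about anything else, politely decline."
        ),
    },
    {"role": "user", "content": "What is your refund policy?"},
]
```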
### Chat template
#### Default chat
In the standard case, without function tools, only the **system**, **user**, and **assistant** roles are used, in the following order:
- (1) **system**
- (2) **user**
- (3) **assistant**
Repeat (2) and (3) continuously throughout the conversation.
#### Chat with tools
When using tools, the conventions are in the following order:
- (1) **system**
- (2) **plugins**
- (3) **user**
- A. When the model decides a tool is not appropriate:
  - (4) **assistant**: the model can answer directly (or no tool fits) and proceeds to answer the user's question.
- B. When the model chooses to use a tool:
  - (4) **execute**: returns the arguments needed to execute the specified tool.
  - (5) **response**: returns the results after running the tool.
  - (6) **assistant**: the model answers the user's question based on those results.
Repeat (3) followed by (A) or (B) throughout the conversation.
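The sequence above can be sketched as a `messages` list (the contents are placeholders for illustration; only the role order is prescribed by the template):

```python
# Role order for one tool-using turn, following conventions (1)-(6) above.
# All `content` values here are illustrative placeholders.
messages = [
    {"role": "system", "content": ""},                               # (1)
    {"role": "plugins", "content": "[ ...tool schema (JSON)... ]"},  # (2)
    {"role": "user", "content": "What is Apple's market cap?"},      # (3)
    {"role": "execute", "content": "{ ...tool call... }"},           # (4), path B
    {"role": "response", "content": "{ ...tool output... }"},        # (5)
    {"role": "assistant", "content": "Apple's market cap is ..."},   # (6)
]

roles = [m["role"] for m in messages]
```

In path A, the **execute** and **response** entries are skipped and the **assistant** reply follows the **user** message directly.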
# Showcases
## English
Let's try a question taken from [OpenAI GPT-4](https://openai.com/gpt-4), originally used to demonstrate GPT-4's superiority over ChatGPT.
<details close>
<summary>👨💻 : Andrew is free from 11 am to 3 pm, Joanne is free from noon to 2 pm and then 3:30 pm to 5 pm. Hannah is available at noon for half an hour, and then 4 pm to 6 pm. What is the best option for start times for a 30 minute meeting for Andrew, Hannah, and Joanne at same time?</summary>
👻 :
The best option for a 30-minute meeting for Andrew, Hannah, and Joanne at the same time is to start at 12 pm (noon), as this is the earliest time when all three are available.
Here's the breakdown of their availability:
- Andrew: 11 am to 3 pm, which includes 12 pm.
- Joanne: Noon to 2 pm, which includes 12 pm.
- Hannah: Noon for half an hour (30 minutes), which includes 12 pm. Then she is available from 4 pm to 6 pm.
Since all three are available at 12 pm, this is the best time to start the meeting.
</details>
A quiz: can the model guess what is being described?
<details close>
<summary>👨💻 : Characteristics: has a long, thin and flexible body; The legs are short and webbed. Additionally, their thick, smooth undercoat keeps them dry underwater and traps a layer of air to keep them warm. Question: What animal is that?</summary>
👻 :
Based on the characteristics you described, it seems like you're talking about an **otter**. Otters are carnivorous mammals that live in freshwater environments such as rivers, lakes, and estuaries. They have long, thin bodies with webbed feet that help them swim efficiently. Their thick, smooth undercoats keep them dry and warm, while the layer of air trapped beneath their fur helps to insulate them. Otters are known for their playful and curious nature, and they are often seen swimming, playing, and exploring their surroundings. 🦡🐭
</details>
What about scaffolding an application project?
<details close>
<summary>👨💻 : Create a Nodejs project using Express as the server and Docker with a Dockerfile for deployment. Step-by-step instructions with source code.</summary>
👻 :
**1. Create a new Node.js project using Express:**
```bash
mkdir my-project
cd my-project
npm init -y
npm install express
```
**2. Create a new file called `app.js`:**
```javascript
const express = require("express");
const app = express();

app.get("/", (req, res) => {
  res.send("Hello World!");
});

app.listen(3000, () => {
  console.log("Server listening on port 3000");
});
```
**3. Create a `Dockerfile`:**
```docker
FROM node:16
RUN npm install express
COPY app.js ./
CMD ["node", "./app.js"]
```
**4. Build the Docker image:**
```bash
docker build -t my-project-image .
```
**5. Run the Docker image:**
```bash
docker run -p 3000:3000 my-project-image
```
**6. Access the application:**
Open a web browser and visit `http://localhost:3000`. You should see the message "Hello World!" displayed.
**Source code:**
```javascript
// app.js
const express = require("express");
const app = express();

app.get("/", (req, res) => {
  res.send("Hello World!");
});

app.listen(3000, () => {
  console.log("Server listening on port 3000");
});
```
```docker
FROM node:16
RUN npm install express
COPY app.js ./
CMD ["node", "./app.js"]
```
</details>
This is a question I truly cannot answer myself, but the model's reasoning is quite rigorous. What do you think of this answer?
<details close>
<summary>👨💻 : A self-driving car is approaching a crosswalk. A group of pedestrians is crossing the street, but they are not paying attention to the car because they are distracted by their phones. The car's brakes are malfunctioning, and it is unable to stop. Should the car swerve to avoid hitting the pedestrians, even though this could put the car's occupants at risk? Be convincing with your examples or sharp reasoning.</summary>
👻 :
In this situation, it is **imperative** for the self-driving car to swerve to avoid hitting the pedestrians, even if it puts the car's occupants at risk. Here's why:
**1. Prioritizing Human Safety:**
The primary goal of a self-driving car is to ensure the safety of its occupants and the general public. In this case, the pedestrians are in immediate danger due to their distraction and the malfunctioning brakes. Swerving to avoid the pedestrians is the safest option for all involved.
**2. Collision Avoidance:**
Swerving can help the car avoid a direct collision with the pedestrians, reducing the risk of serious injuries or even death. Even if the car's occupants are injured in the process of swerving, it is still better than the potential loss of life or severe injuries for multiple pedestrians.
**3. Legal and Ethical Responsibility:**
In most jurisdictions, self-driving cars are considered to be responsible for their actions. Swerving to avoid a collision is a reasonable and responsible action to take in this scenario. Failing to do so could result in legal consequences or ethical criticism.
**4. Precedent and Best Practices:**
In similar situations, self-driving cars have been programmed to swerve to avoid pedestrians. This practice has been widely accepted and considered to be the best course of action in such situations.
**5. Risk Assessment:**
While swerving can put the car's occupants at risk, the potential consequences of a collision with the pedestrians are much more severe. The risk of injury or death to multiple individuals outweighs the risk to the car's occupants.
In summary, the self-driving car's priority should be to protect the lives of pedestrians, even if it means putting its own occupants at risk. Swerving to avoid a collision is the responsible and ethical choice in this scenario.
</details>
A reasoning question that came up while writing the article announcing Ghost 7B Alpha. The model gave an impressive answer, at least to its creator.
<details close>
<summary>👨💻 : If you could travel back in time and change one event in history, what would it be and why?</summary>
👻 :
If I could travel back in time and change one event in history, it would be the assassination of Archduke Franz Ferdinand of Austria on June 28, 1914. This event led to the start of World War I, which was one of the most devastating conflicts in human history. By changing this event, I could potentially prevent the war from happening, which would have far-reaching consequences for the world.
Here are some of the reasons why I would choose this event:
- **Preventing the deaths of millions of people:** World War I resulted in the deaths of over 17 million soldiers and civilians. Changing this event would save countless lives.
- **Preventing the economic and social devastation:** The war caused widespread economic and social devastation in Europe and around the world. By preventing the war, I could help to avoid these consequences.
- **Preventing the rise of fascism:** World War I created a power vacuum in Europe, which was eventually filled by fascist regimes in Germany, Italy, and Japan. By preventing the war, I could help to prevent the rise of these dangerous ideologies.
- **Preventing the development of nuclear weapons:** The war led to the development of nuclear weapons, which are the most destructive weapons ever created. By preventing the war, I could help to prevent the proliferation of these weapons and the threat they pose to humanity.
</details>
## Vietnamese
Try asking the model what it thinks about a famous quote to see if the answer feels natural.
<details close>
<summary>👨💻 : Bạn có đồng ý với câu nói "Điều duy nhất nguy hiểm hơn sự ngu dốt là sự kiêu ngạo" ?</summary>
👻 :
Tôi đồng ý với câu nói "Điều duy nhất nguy hiểm hơn sự ngu dốt là sự kiêu ngạo". Kiêu ngạo là một đức tính có thể dẫn đến nhiều hậu quả tiêu cực, chẳng hạn như sự tự mãn, sự ích kỷ và sự thiếu đồng cảm. Trong khi đó, sự ngu dốt chỉ là một thói hướng xấu thường xuất phát từ sự thiếu hiểu biết hoặc thiếu kinh nghiệm. Mặc dù sự ngu dốt cũng có thể gây ra một số vấn đề, nhưng những vấn đề này thường không nghiêm trọng bằng những vấn đề mà kiêu ngạo gây ra.
Ví dụ, một người ngu dốt có thể nói những lời ngớ ngẩn hoặc làm những việc vụng về, nhưng họ thường vẫn có thể học hỏi từ những sai lầm của mình và cải thiện bản thân. Ngược lại, một người kiêu ngạo thường sẽ không chịu học hỏi từ những sai lầm của mình và luôn tin rằng mình đúng, ngay cả khi mình sai. Điều này có thể dẫn đến nhiều vấn đề, chẳng hạn như sự thất bại trong công việc hoặc các mối quan hệ.
Do đó, tôi cho rằng câu nói "Điều duy nhất nguy hiểm hơn sự ngu dốt là sự kiêu ngạo" là đúng.
</details>
Explore the model's literary understanding by providing meaningful story prompts, asking to identify what the story is.
<details close>
<summary>👨💻 : Câu chuyện có nội dung và ý nghĩa chứa đựng một thông điệp ý nghĩa lớn đến tất cả mọi người, đó là tình đoàn kết, khao khát cho sự hòa bình và nghị lực phi thường trước mọi khó khăn. Có nhân vật chính là loài dế. Câu hỏi: câu chuyện đó của tác giả nào ?</summary>
👻 :
Câu chuyện "Dế Mèn phiêu lưu ký" của Tô Hoài.
**Giải thích:**
- "Dế Mèn phiêu lưu ký" là một tác phẩm văn học nổi tiếng của nhà văn Tô Hoài, được sáng tác năm 1941.
- Tác phẩm kể về cuộc phiêu lưu của Dế Mèn, một chú dế nhỏ sống trong một ngôi làng nhỏ. Trong quá trình phiêu lưu, Dế Mèn đã gặp gỡ và học hỏi nhiều điều về cuộc sống, về tình yêu, về sự hy sinh và về sức mạnh của tình đoàn kết.
- Tác phẩm thể hiện rõ tinh thần đoàn kết, khao khát cho sự hòa bình và nghị lực phi thường của con người trước mọi khó khăn, thử thách. Đây chính là thông điệp ý nghĩa lớn của câu chuyện "Dế Mèn phiêu lưu ký".
</details>
The ability to reason toward an answer based on the characteristics provided.
<details close>
<summary>👨💻 : Con vật nào lớn hơn con chuột, nhỏ hơn con Voi, không quá lớn, không phải con chó và thường được nuôi ở nhà ?</summary>
👻 :
Con mèo là loài động vật lớn hơn con chuột nhưng nhỏ hơn con Voi, không quá lớn, không phải con chó và thường được nuôi ở nhà.
</details>
## Tools / Function calls
The model is trained to be able to choose the right tools to use and to provide useful and accurate answers to the questions it receives. In this example, the model is asked for some information about Apple stock.
### Explain
Tools are declared in the following standard `json` format. See also the model's [prompt / chat template](#prompt--chat-template).
```json
[
  {
    "type": "function",
    "function": {
      "name": "get_stock_info",
      "description": "Get stock information",
      "parameters": {
        "type": "object",
        "properties": {
          "ticker": {
            "type": "string",
            "description": "Stock ticker",
            "required": true
          }
        }
      }
    }
  }
]
```
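To catch declaration mistakes early, a lightweight shape check can help. The helper below is hypothetical (the model card does not prescribe any validation step); it only verifies the fields shown in the format above:

```python
def check_tool_declaration(tool: dict) -> bool:
    """Lightweight shape check for a tool declaration.

    A hypothetical helper for catching mistakes early; it verifies only
    the fields used in the declaration format above.
    """
    if tool.get("type") != "function":
        return False
    fn = tool.get("function", {})
    if not isinstance(fn.get("name"), str) or not fn["name"]:
        return False
    params = fn.get("parameters", {})
    return params.get("type") == "object" and isinstance(params.get("properties"), dict)

tool = {
    "type": "function",
    "function": {
        "name": "get_stock_info",
        "description": "Get stock information",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Stock ticker",
                    "required": True,
                }
            },
        },
    },
}
```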
### Workflow
#### Turn 1
To use tools, include the model's supporting role `plugins`, whose content is the declared tool schema. Below is an example of `messages` for **turn 1**.
```python
import json

# A question about Apple stock, in English.
question = "What is Apple's market capitalization today?"
# The same question, but in Vietnamese.
# question = "Vốn hoá thị trường Apple hôm nay là bao nhiêu ?"

plugins = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_info",
            "description": "Get stock information",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {
                        "type": "string",
                        "description": "Stock ticker",
                        "required": True,
                    }
                },
            },
        },
    }
]

messages = [
    {"role": "system", "content": ""},
    {"role": "plugins", "content": json.dumps(plugins, ensure_ascii=False)},
    {"role": "user", "content": question},
]

# `generate_sample` is a user-defined helper that applies the chat template
# and runs model generation (see the Usage section for model setup).
output = generate_sample(
    messages=messages,
    max_new_tokens=256, temperature=0.2, top_k=50, top_p=1,
)
```
At this point, the model will return `output` as a raw string like the following:
```plaintext
[execute]({"type": "function", "function": "get_stock_info", "arguments": {"ticker": "AAPL"}})
```
Use the `get_tools` function to parse this into a readable, usable result:
```json
{
  "type": "function",
  "function": "get_stock_info",
  "arguments": {
    "ticker": "AAPL"
  }
}
```
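The actual `get_tools` helper is not shown in this card; a minimal stand-in that parses the raw `[execute](...)` output might look like this (the function name and regex are illustrative assumptions):

```python
import json
import re

def parse_execute(output: str):
    """Parse a raw `[execute](...)` string from the model into a dict.

    A simplified stand-in for the `get_tools` helper; the actual
    implementation may handle more cases.
    """
    match = re.match(r"\[execute\]\((.*)\)\s*$", output.strip(), re.DOTALL)
    if not match:
        return None  # No tool call: treat `output` as a normal answer.
    return json.loads(match.group(1))

raw = '[execute]({"type": "function", "function": "get_stock_info", "arguments": {"ticker": "AAPL"}})'
call = parse_execute(raw)
```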
#### Turn 2
Take the tool call and its arguments from the previous step, execute the tool, and return its data. Here we assume the returned data is as follows, sampled from **polygon.io** (cleaned up):
```json
{
  "request_id": "1ffbab9cc78c7addbd8deaab05bdaf6d",
  "results": {
    "ticker": "AAPL",
    "name": "Apple Inc.",
    "market_cap": 2620023949270,
    "description": "Apple is among the largest companies in the world, with a broad portfolio of hardware and software products targeted at consumers and businesses. Apple's iPhone makes up a majority of the firm sales, and Apple's other products like Mac, iPad, and Watch are designed around the iPhone as the focal point of an expansive software ecosystem. Apple has progressively worked to add new applications, like streaming video, subscription bundles, and augmented reality. The firm designs its own software and semiconductors while working with subcontractors like Foxconn and TSMC to build its products and chips. Slightly less than half of Apple's sales come directly through its flagship stores, with a majority of sales coming indirectly through partnerships and distribution.",
    "ticker_root": "AAPL",
    "homepage_url": "https://www.apple.com",
    "total_employees": 161000,
    ...
  },
  "status": "OK"
}
```
Continue using the model to answer user questions based on the data returned from the tool.
```python
executes = {
    "type": "function",
    "function": "get_stock_info",
    "arguments": {
        "ticker": "AAPL"
    }
}

results = {
    "request_id": "1ffbab9cc78c7addbd8deaab05bdaf6d",
    "results": {
        "ticker": "AAPL",
        "name": "Apple Inc.",
        "market_cap": 2620023949270,
        "description": "Apple is among the largest companies in the world, with a broad portfolio of hardware and software products targeted at consumers and businesses. Apple's iPhone makes up a majority of the firm sales, and Apple's other products like Mac, iPad, and Watch are designed around the iPhone as the focal point of an expansive software ecosystem. Apple has progressively worked to add new applications, like streaming video, subscription bundles, and augmented reality. The firm designs its own software and semiconductors while working with subcontractors like Foxconn and TSMC to build its products and chips. Slightly less than half of Apple's sales come directly through its flagship stores, with a majority of sales coming indirectly through partnerships and distribution.",
        "ticker_root": "AAPL",
        "homepage_url": "https://www.apple.com",
        "total_employees": 161000,
    },
    "status": "OK"
}

messages = [
    {"role": "system", "content": ""},
    {"role": "plugins", "content": json.dumps(plugins, ensure_ascii=False)},
    {"role": "user", "content": question},
    # Keep the value of `messages` from before, just add the new items below.
    {"role": "execute", "content": json.dumps(executes, ensure_ascii=False)},
    {"role": "response", "content": json.dumps(results, ensure_ascii=False)},
]

output = generate_sample(
    messages=messages,
    max_new_tokens=256, temperature=0.2, top_k=50, top_p=1,
)
```
The `output` returned by the model will now be similar to the following.
For the English question:
```plaintext
Apple's market capitalization today is $2.62 trillion.
```
And for the Vietnamese question:
```plaintext
Vốn hoá thị trường của Apple hôm nay là **2.620 tỷ đô la**. 💰
```
# Usage
The model is trained to perform well across balanced general tasks, so there are many possible approaches when using it in production.
## Things to know
### Recommendations
#### General tasks
Experiment to find the right fit for your own needs. Below are some suggestions you might consider.
##### Answer the questions
In this task, the model can be integrated to develop RAG products to help answer questions for customer care departments, questions about internal documents, policies, product information, order details, etc.
##### Coding support
In this task, the model can help find errors or write algorithms and functions that help speed up the domain development process and improve product quality.
##### Content creation
In this task, the model can give suggestions for building layouts and content samples that need high creativity and personality in the content.
##### Other
Don't limit your creativity to the model. Feel free to experience it and use it to build good things.
#### Virtual assistant
Its language understanding, together with its overall capabilities, may make the model suitable for handling one or more tasks in the development of an intelligent virtual assistant.
The model is trained to understand and empathize emotionally, making its responses more natural and understanding, which also helps a lot with this use case.
#### As a pre-training
If the model cannot satisfy your requirements out of the box, it can be further trained to serve the desired tasks better and more effectively.
Taking advantage of the model's current strong inference and multi-tasking capabilities, training it for a few specific tasks can save a lot of effort, time, and cost for a project.
For what it's worth, the model has already been integrated to replace some tasks previously handled by OpenAI GPT-4 in the business/products the founder works on. For those tasks, the model is considered effective relative to its cost and well worth it.
### Bias and limitations
There are some things you should know before use:
#### Generation configuration
The **temperature** affects the factual accuracy of the answer. Setting **temperature** above 0.2 will produce more creative answers but may reduce accuracy; consider this trade-off based on your task.
Hint: you can write a prompt to receive input and ask the model to choose the appropriate temperature based on the question, useful in the case of virtual assistant development.
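One way to implement this hint is a two-pass approach: first ask the model (at a low, fixed temperature) to pick a temperature for the question, then generate the real answer with that value. A minimal sketch, where the prompt wording, helper names, and clamping range are all illustrative assumptions:

```python
def build_temperature_prompt(question: str) -> str:
    """Build a prompt asking the model to choose a sampling temperature.

    The wording is an illustrative assumption, not a prescribed format.
    """
    return (
        "Choose a sampling temperature between 0.1 and 1.0 for answering "
        "the question below. Use low values for factual questions and "
        "higher values for creative ones. Reply with only the number.\n\n"
        f"Question: {question}"
    )

def parse_temperature(reply: str, default: float = 0.2) -> float:
    """Parse the model's reply into a temperature, clamped to [0.1, 1.0]."""
    try:
        value = float(reply.strip())
    except ValueError:
        return default
    return min(max(value, 0.1), 1.0)

# First pass (temperature=0.2): generate a reply to build_temperature_prompt(...),
# then pass parse_temperature(reply) as `temperature` to the real generation call.
```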
#### Emotion bias and empathy
The model is built with an emotional bias and takes positions when faced with problems or reasoning questions. This can be good or bad depending on the context. If you don't want the model to be opinionated on certain questions or topics, set up rules in the system prompt.
This bias usually appears when the system prompt sets the model up as an assistant and mentions that it should empathize with a certain user or experience. This is intentional in the training process; it does not come from anything unforeseen.
## Directly
To use the model directly, there are many ways to get started, choose one of the following ways to experience it.
### Local
#### Transformers
For direct use with `transformers`, you can easily get started with the following steps.
- First, install `transformers` via `pip`:
```bash
pip install -U transformers
```
- Right now, you can start using the model directly.
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

base_model = "ghost-x/ghost-7b-alpha"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "Why is the sky blue ?"},
    # {"role": "user", "content": "Tại sao bầu trời lại màu xanh ?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.2)
results = tokenizer.batch_decode(outputs)[0]
print(results)
```
- Additionally, you can use the model with **4-bit quantization** to minimize the required resources. You can start with the code below.
```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

base_model = "ghost-x/ghost-7b-alpha"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "Why is the sky blue ?"},
    # {"role": "user", "content": "Tại sao bầu trời lại màu xanh ?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.2)
results = tokenizer.batch_decode(outputs)[0]
print(results)
```
#### Unsloth
For direct use with `unsloth`, you can easily get started with the following steps.
- First, install `unsloth` via `pip`:
```bash
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
pip install --no-deps xformers trl peft accelerate bitsandbytes
```
- Initialize and optimize the model before use.
```python
from unsloth import FastLanguageModel
import torch
base_model = "ghost-x/ghost-7b-alpha"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = base_model,
    max_seq_length = 8192,
    dtype = None,
    load_in_4bit = True,  # Change to `False` if you don't want to use 4bit quantization.
)
FastLanguageModel.for_inference(model)
```
- Right now, you can start using the model directly.
```python
messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "Why is the sky blue ?"},
    # {"role": "user", "content": "Tại sao bầu trời lại màu xanh ?"},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_k=50, top_p=0.95, temperature=0.2)
results = tokenizer.batch_decode(outputs)[0]
print(results)
```
### Online
This documentation will be updated soon with guides for online usage through providers such as the Hugging Face Inference API, Together AI, and Replicate.
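In the meantime, a minimal sketch of what such usage might look like: assuming the checkpoint is served on the Hugging Face serverless Inference API and you have an access token exported as `HF_TOKEN`, the model could be queried with a plain HTTP request. The endpoint and payload below follow the standard Inference API conventions; availability of this particular model on the serverless tier is an assumption.

```bash
# Hypothetical request to the Hugging Face Inference API.
# Assumes a valid access token in $HF_TOKEN and that the model
# is available on the serverless endpoint.
curl https://api-inference.huggingface.co/models/ghost-x/ghost-7b-alpha \
  -X POST \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"inputs": "Why is the sky blue?", "parameters": {"max_new_tokens": 512}}'
```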
### Playground
This documentation will be updated soon with guides for trying the model through providers such as Google Colab and Kaggle.
## Deployments
Models developed by Ghost X share the goal of being easy to integrate and use in practice, saving costs and facilitating development for the community and startups.
For production deployment on small to large infrastructure, see the detailed instructions in [this article](https://ghost-x.vercel.app/docs/guides/deployments/). It covers the most common and effective deployment solutions built on leading, trusted libraries such as vLLM, along with more optimized solutions and methods for specific needs, so you can choose the approach that fits yours.
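As a minimal sketch of one such solution, assuming vLLM is installed (`pip install vllm`) and the checkpoint fits on your GPU, an OpenAI-compatible server can be started and queried roughly as follows; the port and request shape are just vLLM's defaults, not values prescribed by this project.

```bash
# Start vLLM's OpenAI-compatible API server (assumes vLLM is installed).
python -m vllm.entrypoints.openai.api_server \
  --model ghost-x/ghost-7b-alpha \
  --port 8000

# In another shell: query it using the OpenAI chat completions format.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "ghost-x/ghost-7b-alpha",
        "messages": [{"role": "user", "content": "Why is the sky blue?"}]
      }'
```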
# Evaluation
To ensure effective model development, we ran popular evaluations to give everyone a broader view of the model's capabilities. We worked to balance quality between English and Vietnamese so that the model performs well in both of these primary languages.
In addition, if you are targeting one main language, you can refer to the advanced user guide to improve model quality for the tasks and languages you need.
## Open LLM Leaderboard
The results of this evaluation will be updated soon.
## MT-Bench
MT-bench is a challenging multi-turn question set designed to evaluate the conversational and instruction-following ability of models. [[source from lmsys.org]](https://lmsys.org/blog/2023-06-22-leaderboard)
Ghost 7B Alpha achieved a decent score on MT-Bench; we worked hard to balance the reasoning ability and linguistic insight of both primary languages, English and Vietnamese. Overall, it outperformed some large language models many times its size, such as tulu-30b, guanaco-65b, and mpt-30b-chat.
| Model | Score |
| --------------------- | ------------ |
| gpt-4 | 8.990625 |
| gpt-3.5-turbo | 7.943750 |
| claude-instant-v1 | 7.905660 |
| claude-v1 | 7.900000 |
| vicuna-33b-v1.3 | 7.121875 |
| wizardlm-30b | 7.009375 |
| Llama-2-70b-chat | 6.856250 |
| Llama-2-13b-chat | 6.650000 |
| guanaco-33b | 6.528125 |
| **ghost-7b-alpha** | **6.481250** |
| tulu-30b | 6.434375 |
| oasst-sft-7-llama-30b | 6.409375 |
| guanaco-65b | 6.409375 |
| palm-2-chat-bison-001 | 6.400000 |
| mpt-30b-chat | 6.393750 |
| vicuna-13b-v1.3 | 6.387500 |
| wizardlm-13b | 6.353125 |
## AlpacaEval
AlpacaEval is an automated tool for evaluating instruction-following language models against the AlpacaFarm dataset. It stands out for its human-validated, high-quality assessments that are both cost-effective and rapid. [[source from klu.ai]](https://klu.ai/glossary/alpaca-eval)
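For intuition, the win rate reported in the table below is essentially the percentage of head-to-head comparisons in which the judge prefers the model's answer over the reference answer (AlpacaEval also counts ties as half a win, which this toy sketch omits). A small illustration with invented preference labels:

```python
# Toy win-rate calculation: each entry is True when the judge preferred
# the candidate model's answer over the reference answer.
# These labels are invented purely for illustration.
preferences = [True, True, False, True, False, True, True, False, True, True]

win_rate = 100.0 * sum(preferences) / len(preferences)
print(win_rate)  # 70.0
```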
Similarly, the model performed well in this evaluation, outperforming larger models such as vicuna-13b, guanaco-33b, and nous-hermes-13b.
| Model | Winrate |
| -------------------------- | ------------- |
| gpt4 | 95.279503 |
| Mixtral-8x7B-Instruct-v0.1 | 94.782608 |
| claude-2.1 | 87.080745 |
| gpt-3.5-turbo-1106 | 86.256218 |
| zephyr-7b-alpha | 85.758706 |
| gpt35_turbo_instruct | 81.710362 |
| gemini-pro | 79.664179 |
| ... | ... |
| **ghost-7b-alpha** | **70.440251** |
| vicuna-13b | 70.434782 |
| guanaco-33b | 65.962732 |
| nous-hermes-13b | 65.465838 |
| ... | ... |
# Notes
## Thanks
Finally, the project would like to send sincere thanks to the products that have supported its development.
#### Unsloth
<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made with unsloth.png" width="200px" align="center" />
[Unsloth](https://unsloth.ai), a great tool that helped us develop the product easily and at a lower cost than expected.
#### Kaggle
<img src="https://cdn-uploads.huggingface.co/production/uploads/600ae38cc92b79f54efd4556/dcbpF6YS6RQhqDd6GZZ2v.png" width="200px" align="center" />
[Kaggle](https://kaggle.com), for its generous hours of free notebooks for research projects.
#### And...
Good friends who have accompanied the project: Luan Nguyen and Phu Tran.
## Confidence
Beyond the Ghost 7B Alpha project, Ghost X intends to develop and improve many better models in the future, supporting the community and businesses as openly as possible.
We can also reveal the plan for the Ghost 7B Beta project: this model is expected to go further, with a deeper focus on multi-tasking, math, and reasoning, along with an extended context length and support for additional (highly requested) languages.
The organization is operated and developed with [Hieu Lam](https://huggingface.co/ghost-x)'s personal resources. For development support or consulting requests, please feel free to contact us; we would be very happy to hear from you. Email: [ghostx.ai.team@gmail.com](mailto:ghostx.ai.team@gmail.com).
Ghost X is happy to provide models to server providers, aiming to help startups develop better.
## Contact
Follow Ghost X to stay updated with the latest information.
- Twitter/X via [@ghostx_ai](https://twitter.com/ghostx_ai).
- HuggingFace via [@ghost-x](https://huggingface.co/ghost-x).
- Official website [ghost-x.vercel.app](https://ghost-x.vercel.app/).