---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---

# Model Details

This is a 1B-parameter Llama 3 model pretrained from scratch with torchtitan on fineweb-edu using the C_AdamW (cautious AdamW) optimizer. It was trained on 100B tokens.

# How to use

```
import torch
from transformers import pipeline

# Load the checkpoint as a text-generation pipeline
pipe = pipeline(
    "text-generation",
    model="kz919/llama3_1b_cautious_100B_token_8222025",
)

print(pipe("The key to life is"))
```

# Downstream Eval

## ARC, HellaSwag, Lambada_OpenAI, OpenBookQA, PIQA

```
lm_eval --model hf --model_args pretrained=kz919/llama3_1b_cautious_100B_token_8222025,dtype="bfloat16",add_bos_token=True --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,openbookqa --device cuda:7 --batch_size 8
```

|     Tasks    |Version|Filter|n-shot|  Metric  |   | Value |   |Stderr|
|--------------|------:|------|-----:|----------|---|------:|---|-----:|
|arc_challenge |      1|none  |     0|acc       |↑  | 0.3183|±  |0.0136|
|              |       |none  |     0|acc_norm  |↑  | 0.3379|±  |0.0138|
|arc_easy      |      1|none  |     0|acc       |↑  | 0.6650|±  |0.0097|
|              |       |none  |     0|acc_norm  |↑  | 0.6061|±  |0.0100|
|hellaswag     |      1|none  |     0|acc       |↑  | 0.3999|±  |0.0049|
|              |       |none  |     0|acc_norm  |↑  | 0.5025|±  |0.0050|
|lambada_openai|      1|none  |     0|acc       |↑  | 0.3912|±  |0.0068|
|              |       |none  |     0|perplexity|↓  |23.8709|±  |0.8855|
|openbookqa    |      1|none  |     0|acc       |↑  | 0.2580|±  |0.0196|
|              |       |none  |     0|acc_norm  |↑  | 0.3740|±  |0.0217|
|piqa          |      1|none  |     0|acc       |↑  | 0.7116|±  |0.0106|
|              |       |none  |     0|acc_norm  |↑  | 0.7149|±  |0.0105|

## MMLU

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.2519|±  |0.0037|
| - humanities     |      2|none  |      |acc   |↑  |0.2540|±  |0.0064|
| - other          |      2|none  |      |acc   |↑  |0.2527|±  |0.0078|
| - social sciences|      2|none  |      |acc   |↑  |0.2480|±  |0.0078|
| - stem           |      2|none  |      |acc   |↑  |0.2518|±  |0.0077|
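
The exact command used for the MMLU run is not given in this card. A plausible invocation that mirrors the one above is sketched below; the `mmlu` task group name and the default few-shot setting are assumptions, not confirmed by the card.

```
lm_eval --model hf --model_args pretrained=kz919/llama3_1b_cautious_100B_token_8222025,dtype="bfloat16",add_bos_token=True --tasks mmlu --device cuda:7 --batch_size 8
```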