---
language: en
tags:
- rlhf
- checkpoint
- irl
- gpt-neo-125m
library_name: transformers
pipeline_tag: text-generation
---

# yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10

This is an RLHF model checkpoint saved at training epoch 10.

## Model Information

- **Base Model**: EleutherAI/gpt-neo-125M
- **Reward Type**: irl
- **Dataset**: allenai/real-toxicity-prompts
- **Training Epoch**: 10

## IRL Configuration

- **Likelihood Type**: bradley_terry (a standard form of this likelihood is sketched at the end of this card)
- **Normalization Strategy**: none
- **IRL Artifact**: matthieubou-imperial-college-london/bayes_irl_vi/posterior_bradley_terry_rkiq5pd8:v0
- **Use Raw Score**: True

## Usage

This checkpoint can be loaded with the TRL value-head wrapper (see the generation sketch near the end of this card):

```python
from trl import AutoModelForCausalLMWithValueHead

# Load the checkpoint (a causal LM with a value head, as saved by TRL)
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10"
)
```

## Training Configuration

The full training configuration is saved alongside the checkpoint in `training_config.yaml`.
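As an illustration, the sketch below fetches and parses that file; it assumes `training_config.yaml` sits at the root of the checkpoint repository and is plain YAML.

```python
import yaml
from huggingface_hub import hf_hub_download

# Assumption: training_config.yaml lives at the root of the checkpoint repo.
config_path = hf_hub_download(
    repo_id="MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10",
    filename="training_config.yaml",
)
with open(config_path) as f:
    training_config = yaml.safe_load(f)
print(training_config)
```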
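For a quick end-to-end check of the loaded checkpoint, here is a minimal generation sketch. It assumes the base model's tokenizer (EleutherAI/gpt-neo-125M) is compatible with this checkpoint; the prompt and sampling parameters are arbitrary choices for illustration.

```python
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

# Assumption: the checkpoint shares the base model's tokenizer; if the
# checkpoint repo ships its own tokenizer files, load from there instead.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10"
)

inputs = tokenizer("The weather today is", return_tensors="pt")
# The value-head wrapper delegates generate() to the underlying causal LM.
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```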
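Finally, for reference on the `bradley_terry` likelihood listed under IRL Configuration: a Bradley-Terry preference model scores the probability that completion `y1` is preferred over `y2` under a learned reward `r`. The exact parameterization used by the IRL artifact is defined by its training code; the form below is the conventional one and is given here only as an assumption.

```latex
P(y_1 \succ y_2 \mid x)
  = \sigma\bigl(r(x, y_1) - r(x, y_2)\bigr)
  = \frac{\exp r(x, y_1)}{\exp r(x, y_1) + \exp r(x, y_2)}
```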