---
language: en
tags:
- rlhf
- checkpoint
- irl
- gpt-neo-125m
library_name: transformers
pipeline_tag: text-generation
---

# yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10

This is an RLHF model checkpoint saved at training epoch 10.

## Model Information

- **Base Model**: EleutherAI/gpt-neo-125M
- **Reward Type**: irl
- **Dataset**: allenai/real-toxicity-prompts
- **Training Epoch**: 10

## IRL Configuration

- **Likelihood Type**: bradley_terry (a standard form of this likelihood is sketched at the end of this card)
- **Normalization Strategy**: none
- **IRL Artifact**: matthieubou-imperial-college-london/bayes_irl_vi/posterior_bradley_terry_rkiq5pd8:v0
- **Use Raw Score**: True

## Usage

This checkpoint can be loaded with the TRL value-head wrapper (see the generation sketch near the end of this card):

```python
from trl import AutoModelForCausalLMWithValueHead

# Load the checkpoint (a causal LM with a value head, as saved by TRL)
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10"
)
```

## Training Configuration

The full training configuration is saved alongside the checkpoint in `training_config.yaml`.
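As an illustration, the sketch below fetches and parses that file; it assumes `training_config.yaml` sits at the root of the checkpoint repository and is plain YAML.

```python
import yaml
from huggingface_hub import hf_hub_download

# Assumption: training_config.yaml lives at the root of the checkpoint repo.
config_path = hf_hub_download(
    repo_id="MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10",
    filename="training_config.yaml",
)
with open(config_path) as f:
    training_config = yaml.safe_load(f)
print(training_config)
```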
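For a quick end-to-end check of the loaded checkpoint, here is a minimal generation sketch. It assumes the base model's tokenizer (EleutherAI/gpt-neo-125M) is compatible with this checkpoint; the prompt and sampling parameters are arbitrary choices for illustration.

```python
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead

# Assumption: the checkpoint shares the base model's tokenizer; if the
# checkpoint repo ships its own tokenizer files, load from there instead.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = AutoModelForCausalLMWithValueHead.from_pretrained(
    "MattBou00/yzinuvge-rlhf-checkpoint-gpt-neo-125m-irl-epoch-10"
)

inputs = tokenizer("The weather today is", return_tensors="pt")
# The value-head wrapper delegates generate() to the underlying causal LM.
output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```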
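Finally, for reference on the `bradley_terry` likelihood listed under IRL Configuration: a Bradley-Terry preference model scores the probability that completion `y1` is preferred over `y2` under a learned reward `r`. The exact parameterization used by the IRL artifact is defined by its training code; the form below is the conventional one and is given here only as an assumption.

```latex
P(y_1 \succ y_2 \mid x)
  = \sigma\bigl(r(x, y_1) - r(x, y_2)\bigr)
  = \frac{\exp r(x, y_1)}{\exp r(x, y_1) + \exp r(x, y_2)}
```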