Self-Training Generative Foundation Reward Models for Reward Reasoning
wangchenglong
wangclnlp
Recent Activity
Updated a model 2 days ago: wangclnlp/GRAM-RR-LLaMA-3.2-3B-RewardModel
Updated a model 2 days ago: wangclnlp/GRAM-RR-LLaMA-3.1-8B-RewardModel
Updated a dataset 2 days ago: wangclnlp/GRAM-RR-TrainingData