GRAM-Qwen3-4B-RewardModel-GGUF

GRAM-Qwen3-4B-RewardModel is a generative reward model for aligning Large Language Models (LLMs), released by NiuTrans. Unlike traditional reward models that depend heavily on task-specific labeled data, it leverages both labeled and unlabeled data, an approach that allows it to generalize better across tasks: the model is first pre-trained on large amounts of unlabeled data and subsequently fine-tuned with supervised data. The methodology also employs label smoothing and a regularized ranking loss to further boost performance, effectively bridging the gap between generative and discriminative reward modeling.

This model is built on the Qwen3-4B base and can be used directly, or adapted, for aligning LLMs without training a reward model from scratch on extensive datasets. On the JudgeBench benchmark, which covers Chat, Code, Math, and Safety tasks, GRAM-Qwen3-4B-RewardModel achieves a competitive average score of 65.9, making it suitable as an open-source, plug-and-play reward model for a variety of LLM alignment scenarios. The repository provides usage instructions and demonstration code for immediate adoption in research and development.
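For running these GGUF files locally, one option is llama-cpp-python. Below is a minimal sketch of pairwise response judging with a quant from the table that follows; the judging prompt here is illustrative only, and the exact GRAM template and demonstration code are documented in the upstream NiuTrans repository.

```python
# Minimal sketch: pairwise response judging with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and a quant file downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="GRAM-Qwen3-4B-RewardModel.Q4_K_M.gguf",  # any quant from the table below
    n_ctx=4096,
)

instruction = "Explain why the sky is blue."
response_a = "Air molecules scatter short (blue) wavelengths of sunlight more strongly."
response_b = "The sky reflects the color of the ocean."

# Illustrative judging prompt; the official GRAM template may differ.
prompt = (
    "Compare the two responses to the instruction and decide which is better.\n"
    f"Instruction: {instruction}\n"
    f"Response A: {response_a}\n"
    f"Response B: {response_b}\n"
    "Answer with a single letter, A or B."
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=8,
    temperature=0.0,  # deterministic verdict
)
print(out["choices"][0]["message"]["content"])
```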

Model Files

| Model File Name | Size | Quant Type |
| --- | --- | --- |
| GRAM-Qwen3-4B-RewardModel.BF16.gguf | 8.05 GB | BF16 |
| GRAM-Qwen3-4B-RewardModel.F16.gguf | 8.05 GB | F16 |
| GRAM-Qwen3-4B-RewardModel.F32.gguf | 16.1 GB | F32 |
| GRAM-Qwen3-4B-RewardModel.Q2_K.gguf | 1.67 GB | Q2_K |
| GRAM-Qwen3-4B-RewardModel.Q3_K_L.gguf | 2.24 GB | Q3_K_L |
| GRAM-Qwen3-4B-RewardModel.Q3_K_M.gguf | 2.08 GB | Q3_K_M |
| GRAM-Qwen3-4B-RewardModel.Q3_K_S.gguf | 1.89 GB | Q3_K_S |
| GRAM-Qwen3-4B-RewardModel.Q4_K_M.gguf | 2.5 GB | Q4_K_M |
| GRAM-Qwen3-4B-RewardModel.Q4_K_S.gguf | 2.38 GB | Q4_K_S |
| GRAM-Qwen3-4B-RewardModel.Q5_K_M.gguf | 2.89 GB | Q5_K_M |
| GRAM-Qwen3-4B-RewardModel.Q5_K_S.gguf | 2.82 GB | Q5_K_S |
| GRAM-Qwen3-4B-RewardModel.Q6_K.gguf | 3.31 GB | Q6_K |
| GRAM-Qwen3-4B-RewardModel.Q8_0.gguf | 4.28 GB | Q8_0 |
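To fetch a single quant file rather than cloning the whole repository, huggingface_hub can download by filename. A minimal sketch, using the repo id of this model card:

```python
# Sketch: download one quant file from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; repo id taken from this model card.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="prithivMLmods/GRAM-Qwen3-4B-RewardModel-GGUF",
    filename="GRAM-Qwen3-4B-RewardModel.Q4_K_M.gguf",  # 2.5 GB, a common quality/size trade-off
)
print(path)  # local cache path, usable as model_path above
```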

Quants Usage

(listed by quant type, not by quality; IQ-quants are often preferable to similar-sized non-IQ quants, though none are included in this repository)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).

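As a rough rule of thumb, the model file plus KV cache and runtime overhead must fit in available RAM or VRAM. The helper below uses the sizes from the table above to pick the largest quant within a memory budget; the 1.3x headroom factor is a rough assumption, not an official guideline.

```python
# Sketch: pick the largest quant that fits a memory budget.
# File sizes (GB) are copied from the Model Files table above; the
# headroom factor for KV cache and overhead is a rough assumption.
QUANT_SIZES_GB = {
    "Q2_K": 1.67, "Q3_K_S": 1.89, "Q3_K_M": 2.08, "Q3_K_L": 2.24,
    "Q4_K_S": 2.38, "Q4_K_M": 2.50, "Q5_K_S": 2.82, "Q5_K_M": 2.89,
    "Q6_K": 3.31, "Q8_0": 4.28, "F16": 8.05, "BF16": 8.05, "F32": 16.1,
}

def pick_quant(budget_gb: float, headroom: float = 1.3) -> str:
    """Return the largest quant whose size * headroom fits within budget_gb."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s * headroom <= budget_gb}
    if not fitting:
        raise ValueError("No quant fits this budget.")
    return max(fitting, key=fitting.get)

print(pick_quant(8.0))  # -> 'Q8_0' (4.28 GB * 1.3 fits within 8 GB)
```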

Format: GGUF | Model size: 4.02B params | Architecture: qwen3

Model tree for prithivMLmods/GRAM-Qwen3-4B-RewardModel-GGUF

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B
Quantized: this model