# Fact-Check Model v1
A fine-tuned DeBERTa-v3-large model for 6-class fact-checking and fake news detection.
## Model Description

This model is based on `microsoft/deberta-v3-large` and has been fine-tuned on the LIAR dataset for fact-checking tasks. It classifies statements into six truthfulness categories.
## Performance
- Validation Accuracy: 40.81%
- Test Accuracy: 37.57%
- F1 Score (macro): 36.89%
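
The evaluation script is not published; the following is a minimal sketch of how the metrics above are typically computed, using scikit-learn (an assumption) and placeholder predictions in place of real model outputs over the 1,267-example test split.

```python
# Hedged sketch of the reported metrics; scikit-learn usage and the
# placeholder arrays below are assumptions, not the original eval code.
from sklearn.metrics import accuracy_score, f1_score

# Placeholder gold labels and predictions (class indices 0-5); in practice
# these come from running the model over the LIAR test split.
gold  = [0, 1, 2, 3, 4, 5, 2, 4]
preds = [0, 1, 1, 3, 5, 5, 2, 4]

print(f"Accuracy: {accuracy_score(gold, preds):.4f}")
print(f"Macro F1: {f1_score(gold, preds, average='macro'):.4f}")
```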
## Labels

The model predicts one of six truthfulness labels:

- `true` (0): The statement is accurate
- `mostly-true` (1): The statement is mostly accurate
- `half-true` (2): The statement has some truth but is incomplete/misleading
- `barely-true` (3): The statement has minimal truth
- `false` (4): The statement is inaccurate
- `pants-fire` (5): The statement is completely false and ridiculous
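
If the label mapping was saved into the model configuration at export time (an assumption; otherwise it has to be hardcoded as in the Usage snippet below), it can be read directly:

```python
# Read the index-to-label mapping from the checkpoint config, if it was exported.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Arko007/fact-check-v1")
print(config.id2label)  # expected: {0: "true", 1: "mostly-true", ..., 5: "pants-fire"}
```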
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Arko007/fact-check-v1")
model = AutoModelForSequenceClassification.from_pretrained("Arko007/fact-check-v1")

# Example usage
text = "The economy is doing great under this administration"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=384)

with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1)

# Map prediction to label
labels = ["true", "mostly-true", "half-true", "barely-true", "false", "pants-fire"]
print(f"Prediction: {labels[predicted_class.item()]}")
print(f"Confidence: {predictions[0][predicted_class.item()].item():.4f}")
```
## Training Details

### Training Data
- Dataset: LIAR dataset
- Training samples: 10,240
- Validation samples: 1,284
- Test samples: 1,267
### Training Configuration
- Base Model: microsoft/deberta-v3-large (435M parameters)
- Hardware: NVIDIA A100 80GB
- Training Time: 7 minutes 21 seconds
- Batch Size: 64
- Learning Rate: 1e-5
- Epochs: 4
- Optimizer: AdamW with cosine scheduling
- Class Weighting: Balanced for imbalanced dataset
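
The training script itself is not published; the following is a minimal sketch of the configuration listed above using the Hugging Face `Trainer`, where the class-weight computation, dataset preparation, and variable names are assumptions for illustration.

```python
import torch
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

class WeightedLossTrainer(Trainer):
    """Trainer variant that applies per-class weights to the cross-entropy loss."""
    def __init__(self, class_weights, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=self.class_weights.to(outputs.logits.device))
        loss = loss_fct(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=6)

args = TrainingArguments(
    output_dir="fact-check-v1",
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    num_train_epochs=4,
    lr_scheduler_type="cosine",  # AdamW is the Trainer's default optimizer
)

# class_weights would be derived from the inverse label frequencies of the
# training split; train_ds / eval_ds are tokenized LIAR splits (not shown).
# trainer = WeightedLossTrainer(class_weights=class_weights, model=model,
#                               args=args, train_dataset=train_ds,
#                               eval_dataset=eval_ds)
# trainer.train()
```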
### Features Used

The model was trained with enhanced features, including the following (a sketch of how they might be combined into the model input follows the list):
- Original statement text
- Speaker information and credibility scores
- Political party affiliation
- Historical claim statistics
- Context and subject matter
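
How these fields were combined into a single model input is not documented; a hedged sketch, using column names from the Hugging Face `liar` dataset schema, might look like this:

```python
# Hypothetical input-construction helper; the actual template used for training
# is not published, and field names follow the "liar" dataset schema.
def build_input(example: dict) -> str:
    return (
        f"statement: {example['statement']} "
        f"speaker: {example['speaker']} "
        f"party: {example['party_affiliation']} "
        f"subject: {example['subject']} "
        f"context: {example['context']}"
    )

example = {
    "statement": "The economy is doing great under this administration",
    "speaker": "john-doe",
    "party_affiliation": "none",
    "subject": "economy",
    "context": "a campaign rally",
}
print(build_input(example))
```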
## Limitations
- The model was trained specifically on political statements and may not generalize well to other domains
- Performance is limited by the inherent difficulty of the 6-class fact-checking task
- May exhibit bias present in the training data
- Should not be used as the sole source for fact-checking decisions
## Citation

If you use this model, please cite:

```bibtex
@misc{fact-check-v1,
  title={Fact-Check Model v1: DeBERTa-based Fake News Detection},
  author={Arko007},
  year={2025},
  url={https://huggingface.co/Arko007/fact-check-v1}
}
```
## License
Apache 2.0