# Fact-Check Model v1
A fine-tuned DeBERTa-v3-large model for 6-class fact-checking and fake news detection.
## Model Description

This model is based on `microsoft/deberta-v3-large` and has been fine-tuned on the LIAR dataset for fact-checking tasks. It classifies statements into six truthfulness categories.
## Performance
- Validation Accuracy: 40.81%
- Test Accuracy: 37.57%
- F1 Score (macro): 36.89%
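
The evaluation script is not published; the following is a minimal sketch of how the metrics above are typically computed, using scikit-learn (an assumption) and placeholder predictions in place of real model outputs over the 1,267-example test split.

```python
# Hedged sketch of the reported metrics; scikit-learn usage and the
# placeholder arrays below are assumptions, not the original eval code.
from sklearn.metrics import accuracy_score, f1_score

# Placeholder gold labels and predictions (class indices 0-5); in practice
# these come from running the model over the LIAR test split.
gold  = [0, 1, 2, 3, 4, 5, 2, 4]
preds = [0, 1, 1, 3, 5, 5, 2, 4]

print(f"Accuracy: {accuracy_score(gold, preds):.4f}")
print(f"Macro F1: {f1_score(gold, preds, average='macro'):.4f}")
```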
## Labels

The model predicts one of six truthfulness labels:

- `true` (0): The statement is accurate
- `mostly-true` (1): The statement is mostly accurate
- `half-true` (2): The statement has some truth but is incomplete/misleading
- `barely-true` (3): The statement has minimal truth
- `false` (4): The statement is inaccurate
- `pants-fire` (5): The statement is completely false and ridiculous
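
If the label mapping was saved into the model configuration at export time (an assumption; otherwise it has to be hardcoded as in the Usage snippet below), it can be read directly:

```python
# Read the index-to-label mapping from the checkpoint config, if it was exported.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Arko007/fact-check-v1")
print(config.id2label)  # expected: {0: "true", 1: "mostly-true", ..., 5: "pants-fire"}
```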
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Arko007/fact-check-v1")
model = AutoModelForSequenceClassification.from_pretrained("Arko007/fact-check-v1")

# Example usage
text = "The economy is doing great under this administration"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=384)

with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(predictions, dim=-1)

# Map prediction to label
labels = ["true", "mostly-true", "half-true", "barely-true", "false", "pants-fire"]
print(f"Prediction: {labels[predicted_class.item()]}")
print(f"Confidence: {predictions[0][predicted_class.item()].item():.4f}")
```
## Training Details

### Training Data
- Dataset: LIAR dataset
- Training samples: 10,240
- Validation samples: 1,284
- Test samples: 1,267
### Training Configuration
- Base Model: microsoft/deberta-v3-large (435M parameters)
- Hardware: NVIDIA A100 80GB
- Training Time: 7 minutes 21 seconds
- Batch Size: 64
- Learning Rate: 1e-5
- Epochs: 4
- Optimizer: AdamW with cosine scheduling
- Class Weighting: Balanced for imbalanced dataset
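
The training script itself is not published; the following is a minimal sketch of the configuration listed above using the Hugging Face `Trainer`, where the class-weight computation, dataset preparation, and variable names are assumptions for illustration.

```python
import torch
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

class WeightedLossTrainer(Trainer):
    """Trainer variant that applies per-class weights to the cross-entropy loss."""
    def __init__(self, class_weights, **kwargs):
        super().__init__(**kwargs)
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = torch.nn.CrossEntropyLoss(
            weight=self.class_weights.to(outputs.logits.device))
        loss = loss_fct(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-large", num_labels=6)

args = TrainingArguments(
    output_dir="fact-check-v1",
    learning_rate=1e-5,
    per_device_train_batch_size=64,
    num_train_epochs=4,
    lr_scheduler_type="cosine",  # AdamW is the Trainer's default optimizer
)

# class_weights would be derived from the inverse label frequencies of the
# training split; train_ds / eval_ds are tokenized LIAR splits (not shown).
# trainer = WeightedLossTrainer(class_weights=class_weights, model=model,
#                               args=args, train_dataset=train_ds,
#                               eval_dataset=eval_ds)
# trainer.train()
```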
### Features Used

The model was trained with enhanced features, including the following (a sketch of how they might be combined into the model input follows the list):
- Original statement text
- Speaker information and credibility scores
- Political party affiliation
- Historical claim statistics
- Context and subject matter
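
How these fields were combined into a single model input is not documented; a hedged sketch, using column names from the Hugging Face `liar` dataset schema, might look like this:

```python
# Hypothetical input-construction helper; the actual template used for training
# is not published, and field names follow the "liar" dataset schema.
def build_input(example: dict) -> str:
    return (
        f"statement: {example['statement']} "
        f"speaker: {example['speaker']} "
        f"party: {example['party_affiliation']} "
        f"subject: {example['subject']} "
        f"context: {example['context']}"
    )

example = {
    "statement": "The economy is doing great under this administration",
    "speaker": "john-doe",
    "party_affiliation": "none",
    "subject": "economy",
    "context": "a campaign rally",
}
print(build_input(example))
```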
## Limitations
- The model was trained specifically on political statements and may not generalize well to other domains
- Performance is limited by the inherent difficulty of the 6-class fact-checking task
- May exhibit bias present in the training data
- Should not be used as the sole source for fact-checking decisions
## Citation

If you use this model, please cite:

```bibtex
@misc{fact-check-v1,
  title={Fact-Check Model v1: DeBERTa-based Fake News Detection},
  author={Arko007},
  year={2025},
  url={https://huggingface.co/Arko007/fact-check-v1}
}
```
## License
Apache 2.0