Fact-Check Model v1

A fine-tuned DeBERTa-v3-large model for 6-class fact-checking and fake news detection.

Model Description

This model is based on microsoft/deberta-v3-large and has been fine-tuned on the LIAR dataset for fact-checking tasks. It classifies statements into 6 truthfulness categories.

Performance

  • Validation Accuracy: 40.81%
  • Test Accuracy: 37.57%
  • F1 Score (macro): 36.89%

Labels

The model predicts one of six truthfulness labels (the mapping can also be read from the model configuration, as sketched after this list):

  • true (0): The statement is accurate
  • mostly-true (1): The statement is mostly accurate
  • half-true (2): The statement has some truth but is incomplete/misleading
  • barely-true (3): The statement has minimal truth
  • false (4): The statement is inaccurate
  • pants-fire (5): The statement is a blatant, absurd falsehood ("pants on fire")
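If the repository's config defines id2label, the same mapping can be read from the loaded model instead of being hard-coded. This is a minimal sketch that assumes the config's id2label field was populated with the labels above at training time.

from transformers import AutoModelForSequenceClassification

# Assumes config.json defines id2label; otherwise fall back to the order listed above
model = AutoModelForSequenceClassification.from_pretrained("Arko007/fact-check-v1")
print(model.config.id2label)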

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Arko007/fact-check-v1")
model = AutoModelForSequenceClassification.from_pretrained("Arko007/fact-check-v1")

# Example usage
text = "The economy is doing great under this administration"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=384)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

# Map prediction to label (order must match the model's id2label mapping)
labels = ["true", "mostly-true", "half-true", "barely-true", "false", "pants-fire"]
predicted_idx = predicted_class.item()
print(f"Prediction: {labels[predicted_idx]}")
print(f"Confidence: {predictions[0][predicted_idx].item():.4f}")
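Several statements can also be scored at once with batched inputs. The snippet below is a sketch that reuses the tokenizer, model, and labels defined above; the example statements are placeholders.

# Batch classification (reuses tokenizer, model, and labels from the example above)
statements = [
    "Unemployment fell to a 50-year low last year",
    "The new bill eliminates all income taxes",
]
batch = tokenizer(statements, return_tensors="pt", truncation=True, padding=True, max_length=384)

with torch.no_grad():
    probs = torch.softmax(model(**batch).logits, dim=-1)

for text, p in zip(statements, probs):
    idx = int(p.argmax())
    print(f"{labels[idx]} ({p[idx].item():.4f}): {text}")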

Training Details

Training Data

  • Dataset: LIAR (political fact-checking statements from PolitiFact; see the loading sketch after this list)
  • Training samples: 10,240
  • Validation samples: 1,284
  • Test samples: 1,267
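A hedged sketch of loading these splits with the datasets library is shown below. The Hub dataset id is an assumption (the LIAR loading script has been mirrored under several names), and the raw dataset's integer label order is not guaranteed to match this model's id-to-label mapping.

from datasets import load_dataset

# Dataset id is an assumption; substitute the LIAR mirror you actually use
# (script-based loading may require trust_remote_code on recent datasets versions)
liar = load_dataset("liar")

print({split: len(liar[split]) for split in liar})
# Expected sizes per this card: train 10,240 / validation 1,284 / test 1,267

# Caution: the raw dataset's integer labels may be ordered differently from
# this model's id2label mapping; remap them before computing metrics.
print(liar["train"].features["label"].names)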

Training Configuration

  • Base Model: microsoft/deberta-v3-large (435M parameters)
  • Hardware: NVIDIA A100 80GB
  • Training Time: 7 minutes 21 seconds
  • Batch Size: 64
  • Learning Rate: 1e-5
  • Epochs: 4
  • Optimizer: AdamW with cosine scheduling
  • Class Weighting: Balanced class weights to counter label imbalance (see the training sketch after this list)
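The original training script is not included in this card. The following is a minimal sketch of how the listed configuration (batch size 64, learning rate 1e-5, 4 epochs, cosine schedule, balanced class weights) could be reproduced with the transformers Trainer; the weighted-loss subclass and the class-weight values are assumptions, not the author's code.

import torch
from torch.nn import CrossEntropyLoss
from transformers import Trainer, TrainingArguments

# Hypothetical class weights; in practice compute them from the training label counts,
# e.g. with sklearn.utils.class_weight.compute_class_weight("balanced", ...)
class_weights = torch.tensor([1.2, 0.9, 0.8, 1.0, 0.9, 1.6])

class WeightedLossTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        loss_fct = CrossEntropyLoss(weight=class_weights.to(outputs.logits.device))
        loss = loss_fct(outputs.logits.view(-1, model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss

args = TrainingArguments(
    output_dir="fact-check-v1",
    per_device_train_batch_size=64,
    learning_rate=1e-5,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
)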

Features Used

The model was trained with enhanced input features (see the formatting sketch after this list), including:

  • Original statement text
  • Speaker information and credibility scores
  • Political party affiliation
  • Historical claim statistics
  • Context and subject matter
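The card does not specify how this metadata was serialized into the model input, so the formatting below is purely illustrative: one common approach is to concatenate the metadata fields onto the statement text before tokenization (reusing the tokenizer from the Usage example).

# Illustrative only: the actual feature format used during training is not documented here
def build_input(statement, speaker, party, subject, context, credit_counts):
    history = " ".join(f"{k}={v}" for k, v in credit_counts.items())
    return (
        f"{statement} [SEP] speaker: {speaker} [SEP] party: {party} "
        f"[SEP] subject: {subject} [SEP] context: {context} [SEP] history: {history}"
    )

example = build_input(
    "The economy is doing great under this administration",
    speaker="john-doe",
    party="independent",
    subject="economy",
    context="a campaign rally",
    credit_counts={"barely_true": 2, "false": 1, "half_true": 3},
)
inputs = tokenizer(example, return_tensors="pt", truncation=True, max_length=384)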

Limitations

  • The model was trained specifically on political statements and may not generalize well to other domains
  • Performance is limited by the inherent difficulty of the 6-class fact-checking task
  • May exhibit bias present in the training data
  • Should not be used as the sole source for fact-checking decisions

Citation

If you use this model, please cite:

@misc{fact-check-v1,
  title={Fact-Check Model v1: DeBERTa-based Fake News Detection},
  author={Arko007},
  year={2025},
  url={https://huggingface.co/Arko007/fact-check-v1}
}

License

Apache 2.0
