---
base_model: answerdotai/ModernBERT-base
library_name: peft
tags:
- text-classification
- reddit
- conversation-analysis
- constructive-dialogue
- modernbert
- lora
- transformers
- lightweight
- high-throughput
language:
- en
datasets:
- reddit
pipeline_tag: text-classification
repo_url: https://github.com/Niklas257/Reddit-Constructiveness-Classification.git
---

# ModernBERT Reddit Discussion Classifier

A lightweight, high-throughput ModernBERT-based model for classifying constructive vs. non-constructive conversations in online forums such as Reddit, optimized to process large volumes of Reddit discussion data efficiently.

## Model Description

This model is a QLoRA (Quantized LoRA) fine-tuned version of `answerdotai/ModernBERT-base`, designed as a **lightweight** solution for large-scale Reddit discussion analysis.

- **Model Type**: Text Classification (Binary)
- **Base Model**: answerdotai/ModernBERT-base
- **Training Method**: QLoRA with self-training
- **Task**: Binary classification of conversation constructiveness
- **Language**: English

### Model Source

- **Repository**: https://github.com/Niklas257/Reddit-Constructiveness-Classification.git

## Intended Uses

### Primary Use Cases

- Classifying Reddit discussions as constructive or non-constructive
- Content moderation assistance
- Large-scale conversation quality analysis
- Social media research

### Direct Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model_name = "answerdotai/ModernBERT-base"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name,
    num_labels=2
)

# Load the fine-tuned adapters
model = PeftModel.from_pretrained(model, "NiklasKoch/modernbert-discussion-classifier")
model.eval()

# Classify text (optimized for batch processing)
def classify_text(text):
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=4096
    )

    # Move inputs to the same device as the model (important for GPU usage)
    inputs = {k: v.to(next(model.parameters()).device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

    # 0 = non-constructive, 1 = constructive
    predicted_class = torch.argmax(predictions, dim=-1).item()
    confidence = predictions[0][predicted_class].item()

    return {
        'class': 'constructive' if predicted_class == 1 else 'non-constructive',
        'confidence': confidence,
        'scores': {
            'non-constructive': predictions[0][0].item(),
            'constructive': predictions[0][1].item()
        }
    }

# Example usage - Reddit discussion
text = "[author0] LEGO: What do you think you're doing?!? [author1] I don't get it did he reveal bionicle reboot or smthn? [author2] Not really, he did announce something but was super vague, seems like a sort of passion project we wants to do with the community, he even said it might not even be bionicle. [author1] So is that image fan made or is it one of his passion projects [author2] Those pictures are real and on his insta, he did a stream talking about it I'm sure you can find somewhere, search up Fabre bionicle stream 2020 or something. [author1] OK thanks"
result = classify_text(text)
print(result)
```
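
### Batched Inference

The model is intended for high-throughput analysis, so batching many threads per forward pass is usually far more efficient than calling `classify_text` one thread at a time. The sketch below is illustrative rather than part of the released code: it reuses the `model` and `tokenizer` objects loaded above, and the `classify_batch` helper and `batch_size` value are assumptions you can tune to your hardware.

```python
# Illustrative batched-inference helper (not part of the released code).
# Assumes `model` and `tokenizer` from the Direct Use example above.
def classify_batch(texts, batch_size=32):
    device = next(model.parameters()).device
    results = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            truncation=True,
            padding=True,
            max_length=4096
        )
        inputs = {k: v.to(device) for k, v in inputs.items()}
        with torch.no_grad():
            probs = torch.nn.functional.softmax(model(**inputs).logits, dim=-1)
        for row in probs:
            label = int(torch.argmax(row).item())
            results.append({
                'class': 'constructive' if label == 1 else 'non-constructive',
                'confidence': row[label].item()
            })
    return results

# Example: classify several threads in one call
print(classify_batch([text, text]))
```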
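
### Low-Memory Loading (4-bit)

The adapters were trained on top of a 4-bit quantized base model (see Training Details and Technical Specifications below). If GPU memory is limited, the base model can likewise be loaded in 4-bit for inference. This is a sketch under assumptions, not the exact training configuration: it requires a CUDA GPU and the `bitsandbytes` package, and the specific `BitsAndBytesConfig` values shown are illustrative.

```python
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import PeftModel
import torch

# Illustrative 4-bit NF4 quantization settings (the training run may have used
# different bitsandbytes options)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model, then attach the full-precision LoRA adapters
quantized_model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=2,
    quantization_config=bnb_config,
    device_map="auto",
)
quantized_model = PeftModel.from_pretrained(
    quantized_model, "NiklasKoch/modernbert-discussion-classifier"
)
quantized_model.eval()
```

Expect small numerical differences compared with the full-precision loading shown under Direct Use.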

## Training Details

### Training Data

- **Source**: https://archive.org/download/pushshift_reddit_200506_to_202212/
- **Size**: ~1.4 million Reddit threads, filtered for English and a minimum of 2 authors
- **Labels**: Binary (constructive/non-constructive conversations)
- **Additional Data**: YNACC and IAC datasets for initial supervised training

### Training Procedure

- **Training Method**: Self-training
- **Quantization**: 4-bit QLoRA for efficiency
- **LoRA Config**:
  - `r`: 16
  - `lora_alpha`: 32
  - `lora_dropout`: 0.1
  - Target modules: `Wqkv`, `Wo`, `Wi`, `dense`
- **Loss Function**: Focal Loss with class weighting
- **Max Sequence Length**: 4096 tokens
- **Batch Size**: 64
- **Learning Rate**: 2e-6

### Training Hardware

- 48 hours on 4x NVIDIA A100 40GB GPUs

## Performance

### Evaluation Results

```
Dataset   Accuracy  Precision  F1-Score
YNACC       0.63      0.63       0.65
IAC         0.79      0.85       0.87
Reddit      0.57      0.74       0.67
```

## Limitations and Bias

- **Language**: English only
- **Bias**: May reflect biases present in Reddit discussions and training data

## Ethical Considerations

- Human oversight is recommended for important moderation decisions

## Technical Specifications

- **Model Architecture**: ModernBERT + classification head
- **Parameters**: ~150M base + LoRA adapters + classification head
- **Precision**: 4-bit quantized base model with full-precision adapters
- **Framework**: PyTorch, Transformers, PEFT (any recent version; you may see harmless warnings about configuration parameters)

## Model Card Authors

Niklas Koch, Georg August University of Göttingen

## Model Card Contact

niklas.koch01@stud.uni-goettingen.de