---
# Metadata for Hugging Face repo card
library_name: transformers
pipeline_tag: feature-extraction
license: apache-2.0
tags:
- autoencoder
- pytorch
- reconstruction
- preprocessing
- normalizing-flow
- scaler
---
# Autoencoder Implementation for Hugging Face Transformers
A complete autoencoder implementation that integrates seamlessly with the Hugging Face Transformers ecosystem, providing all the standard functionality you expect from transformer models.
### Install-and-Use from the Hub (code repo)
If you want to use the implementation directly from the Hub code repository (without a packaged pip install), you can download the repo and add it to `sys.path`:
```python
from huggingface_hub import snapshot_download
import sys, torch

# 1) Download the code + weights for the repo "as is"
repo_dir = snapshot_download(
    repo_id="amaye15/autoencoder",
    repo_type="model",
    allow_patterns=["*.py", "config.json", "*.safetensors"],  # note the * wildcards
)

# 2) Add to the import path so plain imports work
sys.path.append(repo_dir)

# 3) Import the classes from the repo code
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# 4) Load the (placeholder) weights from the local folder (no internet, no code refresh)
model = AutoencoderForReconstruction.from_pretrained(repo_dir)

# 5) Quick smoke test
x = torch.randn(8, 20)
out = model(input_values=x)
print("latent:", out.last_hidden_state.shape, "reconstructed:", out.reconstructed.shape)
```
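If the Hub repo also wires the custom classes into `auto_map` in its `config.json` (not verified here), the standard remote-code pattern is an alternative to the manual `sys.path` approach:

```python
from transformers import AutoModel

# Assumes the repo's config.json declares auto_map entries for its
# custom classes; otherwise use the snapshot_download approach above.
model = AutoModel.from_pretrained("amaye15/autoencoder", trust_remote_code=True)
```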
## 🚀 Features
- **Full Hugging Face Integration**: Compatible with `AutoModel`, `AutoConfig`, and `AutoTokenizer` patterns
- **Standard Training Workflows**: Works with `Trainer`, `TrainingArguments`, and all HF training utilities
- **Model Hub Compatible**: Save and share models on Hugging Face Hub with `push_to_hub()`
- **Flexible Architecture**: Configurable encoder-decoder architecture with various activation functions
- **Multiple Loss Functions**: Support for MSE, BCE, L1, Huber, Smooth L1, KL Divergence, Cosine, Focal, Dice, Tversky, SSIM, and Perceptual loss
- **Multiple Autoencoder Types (7)**: Classic, Variational (VAE), Beta-VAE, Denoising, Sparse, Contractive, and Recurrent autoencoders
- **Extended Activation Functions**: 18+ activation functions including ReLU, GELU, Swish, Mish, ELU, and more
- **Learnable Preprocessing**: Neural Scaler, Normalizing Flow, MinMax Scaler (learnable), Robust Scaler (learnable), and Yeo-Johnson preprocessors (2D and 3D tensors)
- **Extensible Design**: Easy to extend for new autoencoder variants and custom loss functions
- **Production Ready**: Proper serialization, checkpointing, and inference support
## 🏗️ Architecture
The implementation consists of three main components:
### 1. AutoencoderConfig
Configuration class that inherits from `PretrainedConfig`:
- Defines model architecture parameters
- Handles validation and serialization
- Enables `AutoConfig.from_pretrained()` functionality
### 2. AutoencoderModel
Base model class that inherits from `PreTrainedModel`:
- Implements encoder-decoder architecture
- Provides latent space representation
- Returns structured outputs with `AutoencoderOutput`
### 3. AutoencoderForReconstruction
Task-specific model for reconstruction:
- Adds reconstruction loss calculation
- Compatible with `Trainer` for easy training
- Returns `AutoencoderForReconstructionOutput` with loss
## 🔧 Quick Start
### Basic Usage
```python
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction
import torch
# Create configuration
config = AutoencoderConfig(
    input_dim=784,               # Input dimensionality (e.g., 28x28 images flattened)
    hidden_dims=[512, 256],      # Encoder hidden layers
    latent_dim=64,               # Latent space dimension
    activation="gelu",           # Activation function (18+ options available)
    reconstruction_loss="mse",   # Loss function (12+ options available)
    autoencoder_type="classic",  # Autoencoder type (7 types available)
    # Optional learnable preprocessing
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",  # or "normalizing_flow", "minmax_scaler", "robust_scaler", "yeo_johnson"
)
# Create model
model = AutoencoderForReconstruction(config)
# Forward pass
input_data = torch.randn(32, 784) # Batch of 32 samples
outputs = model(input_values=input_data)
print(f"Reconstruction loss: {outputs.loss}")
print(f"Latent shape: {outputs.last_hidden_state.shape}")
print(f"Reconstructed shape: {outputs.reconstructed.shape}")
```
### Training with Hugging Face Trainer
```python
from transformers import Trainer, TrainingArguments
from torch.utils.data import Dataset
import torch

class AutoencoderDataset(Dataset):
    def __init__(self, data):
        self.data = torch.FloatTensor(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return {
            "input_values": self.data[idx],
            "labels": self.data[idx],  # For an autoencoder, input = target
        }

# Prepare data
train_dataset = AutoencoderDataset(your_training_data)
val_dataset = AutoencoderDataset(your_validation_data)

# Training arguments
training_args = TrainingArguments(
    output_dir="./autoencoder_output",
    num_train_epochs=10,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    load_best_model_at_end=True,
)

# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)

# Train
trainer.train()

# Save model
model.save_pretrained("./my_autoencoder")
config.save_pretrained("./my_autoencoder")
```
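Once training finishes, a quick reconstruction check on held-out data confirms the model learned something. A minimal sketch, assuming `your_validation_data` from above and that the forward pass accepts `labels` (as the `Trainer` dataset implies):

```python
import torch

model.eval()
with torch.no_grad():
    batch = torch.FloatTensor(your_validation_data[:8])
    outputs = model(input_values=batch, labels=batch)
print(f"Validation reconstruction loss: {outputs.loss.item():.4f}")
```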
### Using AutoModel Framework
```python
from register_autoencoder import register_autoencoder_models
from transformers import AutoConfig, AutoModel
# Register models with AutoModel framework
register_autoencoder_models()
# Now you can use standard HF patterns
config = AutoConfig.from_pretrained("./my_autoencoder")
model = AutoModel.from_pretrained("./my_autoencoder")
# Use the model
outputs = model(input_values=your_data)
```
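For reference, registration typically follows the standard `transformers` pattern. A minimal sketch of what `register_autoencoder_models()` presumably does (the actual implementation lives in `register_autoencoder.py`):

```python
from transformers import AutoConfig, AutoModel
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderModel

def register_autoencoder_models():
    # Map the custom model_type to the config class, then the config
    # class to the model class, so the Auto* factories can resolve them.
    AutoConfig.register("autoencoder", AutoencoderConfig)
    AutoModel.register(AutoencoderConfig, AutoencoderModel)
```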
## ⚙️ Configuration Options
The `AutoencoderConfig` class supports extensive customization:
```python
config = AutoencoderConfig(
    input_dim=784,                    # Input dimension
    hidden_dims=[512, 256, 128],      # Encoder hidden layers
    latent_dim=64,                    # Latent space dimension
    activation="gelu",                # Activation function (see full list below)
    dropout_rate=0.1,                 # Dropout rate (0.0 to 1.0)
    use_batch_norm=True,              # Use batch normalization
    tie_weights=False,                # Tie encoder/decoder weights
    reconstruction_loss="mse",        # Loss function (see full list below)
    autoencoder_type="variational",   # Autoencoder type (see types below)
    beta=0.5,                         # Beta parameter for β-VAE
    temperature=1.0,                  # Temperature for Gumbel softmax
    noise_factor=0.1,                 # Noise factor for denoising AE
    # Recurrent autoencoder parameters
    rnn_type="lstm",                  # RNN type: "lstm", "gru", "rnn"
    num_layers=2,                     # Number of RNN layers
    bidirectional=True,               # Bidirectional encoding
    sequence_length=None,             # Fixed sequence length (None for variable)
    teacher_forcing_ratio=0.5,        # Teacher forcing ratio during training
    # Learnable preprocessing parameters
    use_learnable_preprocessing=False,  # Enable learnable preprocessing
    preprocessing_type="none",        # "none", "neural_scaler", "normalizing_flow", "minmax_scaler", "robust_scaler", "yeo_johnson"
    preprocessing_hidden_dim=64,      # Hidden dimension for preprocessing networks
    preprocessing_num_layers=2,       # Number of layers in preprocessing networks
    learn_inverse_preprocessing=True, # Learn inverse transformation
    flow_coupling_layers=4,           # Number of coupling layers for flows
)
```
### 🎛️ Available Activation Functions
**Standard Activations:**
- `relu`, `leaky_relu`, `relu6`, `elu`, `prelu`
- `tanh`, `sigmoid`, `hardsigmoid`, `hardtanh`
- `gelu`, `swish`, `silu`, `hardswish`
- `mish`, `softplus`, `softsign`, `tanhshrink`, `threshold`
### 📊 Available Loss Functions
**Regression Losses:**
- `mse` - Mean Squared Error
- `l1` - L1/MAE Loss
- `huber` - Huber Loss
- `smooth_l1` - Smooth L1 Loss
**Classification/Probability Losses:**
- `bce` - Binary Cross Entropy
- `kl_div` - KL Divergence
- `focal` - Focal Loss
**Similarity Losses:**
- `cosine` - Cosine Similarity Loss
- `ssim` - Structural Similarity Loss
- `perceptual` - Perceptual Loss
**Segmentation Losses:**
- `dice` - Dice Loss
- `tversky` - Tversky Loss
### 🏗️ Available Autoencoder Types
**Classic Autoencoder (`classic`)**
- Standard encoder-decoder architecture
- Direct reconstruction loss minimization
**Variational Autoencoder (`variational`)**
- Probabilistic latent space with mean and variance
- KL divergence regularization
- Reparameterization trick for sampling
**Beta-VAE (`beta_vae`)**
- Variational autoencoder with adjustable β parameter
- Better disentanglement of latent factors
**Denoising Autoencoder (`denoising`)**
- Adds noise to input during training
- Learns robust representations
- Configurable noise factor
**Sparse Autoencoder (`sparse`)**
- Encourages sparse latent representations
- L1 regularization on latent activations
- Useful for feature selection
**Contractive Autoencoder (`contractive`)**
- Penalizes large gradients of latent w.r.t. input
- Learns smooth manifold representations
- Robust to small input perturbations
**Recurrent Autoencoder (`recurrent`)**
- LSTM/GRU/RNN encoder-decoder architecture
- Bidirectional encoding for better sequence representations
- Variable length sequence support with padding
- Teacher forcing during training for stable learning
- Sequence-to-sequence reconstruction
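Each type is selected purely through the configuration. As a quick illustration, a denoising autoencoder needs only a type switch and a noise setting (parameter names as documented in the configuration section above):

```python
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# Denoising autoencoder: noise is injected into the input during training,
# while the reconstruction target stays clean
config = AutoencoderConfig(
    input_dim=784,
    hidden_dims=[256, 128],
    latent_dim=32,
    autoencoder_type="denoising",
    noise_factor=0.2,  # strength of the injected noise
)
model = AutoencoderForReconstruction(config)
```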
## 📊 Model Outputs
### AutoencoderOutput
The base model `AutoencoderModel` returns the following output:
```python
@dataclass
class AutoencoderOutput(ModelOutput):
    last_hidden_state: torch.FloatTensor = None     # Latent representation
    reconstructed: torch.FloatTensor = None         # Reconstructed input
    hidden_states: Tuple[torch.FloatTensor] = None  # Intermediate states
    attentions: Tuple[torch.FloatTensor] = None     # Not used
```
### AutoencoderForReconstructionOutput
```python
@dataclass
class AutoencoderForReconstructionOutput(ModelOutput):
    loss: torch.FloatTensor = None                  # Reconstruction loss
    reconstructed: torch.FloatTensor = None         # Reconstructed input
    last_hidden_state: torch.FloatTensor = None     # Latent representation
    hidden_states: Tuple[torch.FloatTensor] = None  # Intermediate states
```
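For feature extraction, the latent code can be read straight off either output. A minimal sketch, reusing the model built in the Quick Start:

```python
import torch

# Encode a batch without tracking gradients and keep only the latent codes
model.eval()
with torch.no_grad():
    latents = model(input_values=torch.randn(32, 784)).last_hidden_state
print(latents.shape)  # (32, latent_dim)
```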
## 🔬 Advanced Usage
### Custom Loss Functions
You can easily extend the model with custom loss functions:
```python
class CustomAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Custom loss implementation
        return your_custom_loss(reconstructed, target)
```
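For instance, a hypothetical weighted MSE that up-weights the first half of the features could look like this (illustrative only; the weighting scheme is not part of the library):

```python
import torch
from modeling_autoencoder import AutoencoderForReconstruction

class WeightedMSEAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Hypothetical per-feature weights: the first half of the features
        # counts twice as much as the second half.
        weights = torch.ones(target.shape[-1], device=target.device)
        weights[: target.shape[-1] // 2] = 2.0
        return ((reconstructed - target) ** 2 * weights).mean()
```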
### Recurrent Autoencoder for Sequences
Perfect for time series, text, and sequential data:
```python
config = AutoencoderConfig(
    input_dim=50,               # Feature dimension per timestep
    latent_dim=32,              # Compressed representation size
    autoencoder_type="recurrent",
    rnn_type="lstm",            # or "gru", "rnn"
    num_layers=2,               # Number of RNN layers
    bidirectional=True,         # Bidirectional encoding
    teacher_forcing_ratio=0.7,  # Teacher forcing during training
    sequence_length=None,       # Variable-length sequences
)

# Usage with sequence data
model = AutoencoderForReconstruction(config)
sequence_data = torch.randn(16, 30, 50)  # (batch_size, seq_len, input_dim)
outputs = model(input_values=sequence_data)
```
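Since variable-length sequences are supported via padding, batches can be assembled with `torch.nn.utils.rnn.pad_sequence` before the forward pass. A sketch (how padding is masked internally is up to the model):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Three sequences of different lengths, each with 50 features per timestep
seqs = [torch.randn(n, 50) for n in (12, 20, 17)]

# Pad to the longest sequence -> shape (batch, max_len, features)
batch = pad_sequence(seqs, batch_first=True)
outputs = model(input_values=batch)
```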
### Learnable Preprocessing
Deep learning-based data normalization that adapts to your data:
```python
# Neural Scaler - learnable alternative to StandardScaler
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",
    preprocessing_hidden_dim=64,
)

# Normalizing Flow - invertible transformations
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="normalizing_flow",
    flow_coupling_layers=4,
)

# Works with all autoencoder types and sequence data
model = AutoencoderForReconstruction(config)
data = torch.randn(32, 20)  # batch of 20-dimensional samples
outputs = model(input_values=data)
print(f"Preprocessing loss: {outputs.preprocessing_loss}")
```
```python
# Learnable MinMax Scaler - scales to [0, 1] with learnable bounds
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="minmax_scaler",
)

# Learnable Robust Scaler - robust to outliers using median/IQR
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="robust_scaler",
)

# Learnable Yeo-Johnson - power transform for skewed distributions
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="yeo_johnson",
)
```
### Variational Autoencoder Extension
The configuration supports variational autoencoders:
```python
config = AutoencoderConfig(
    autoencoder_type="variational",
    beta=0.5,  # β-VAE parameter
    # ... other parameters
)
```
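Conceptually, the variational objective adds a KL term to the reconstruction loss, scaled by `beta`:

$$\mathcal{L} = \mathcal{L}_{\text{recon}} + \beta \, D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big)$$

With `beta=1` this is the standard VAE objective; larger values trade reconstruction quality for more disentangled latent factors.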
### Integration with Datasets Library
```python
from datasets import Dataset

# Convert your data to an HF Dataset
dataset = Dataset.from_dict({
    "input_values": your_data_list,
})

# Use with Trainer
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    # ... other arguments
)
```
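Depending on your `datasets` version, you may need to set the output format so `input_values` arrive as tensors rather than Python lists:

```python
# Return PyTorch tensors when indexing the dataset
dataset = dataset.with_format("torch")
```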
## 📁 Project Structure
```
autoencoder/
├── __init__.py                   # Package initialization
├── configuration_autoencoder.py  # Configuration class
├── modeling_autoencoder.py       # Model implementations
├── register_autoencoder.py       # AutoModel registration
├── pyproject.toml                # Project metadata and dependencies
└── README.md                     # This file
```
## 🤝 Contributing
This implementation follows Hugging Face conventions and can be easily extended:
1. **Adding new architectures**: Extend `AutoencoderModel` or create new model classes
2. **Custom configurations**: Add parameters to `AutoencoderConfig`
3. **Task-specific heads**: Create new classes like `AutoencoderForReconstruction`
4. **Integration**: Register new models with the AutoModel framework
## 📚 References
- [Hugging Face Transformers Documentation](https://huggingface.co/docs/transformers)
- [Custom Models Guide](https://huggingface.co/docs/transformers/custom_models)
- [AutoModel Documentation](https://huggingface.co/docs/transformers/model_doc/auto)
## 🎯 Use Cases
This autoencoder implementation is perfect for:
- **Dimensionality Reduction**: Compress high-dimensional data to lower dimensions
- **Anomaly Detection**: Identify outliers based on reconstruction error (see the sketch after this list)
- **Data Denoising**: Remove noise from corrupted data
- **Feature Learning**: Learn meaningful representations for downstream tasks
- **Data Generation**: Generate new samples similar to training data
- **Pretraining**: Initialize encoders for other tasks
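As an example of the anomaly-detection use case, a minimal sketch that flags samples whose reconstruction error is unusually high (`test_data` is a placeholder tensor of shape `(N, input_dim)`; the threshold choice is up to you):

```python
import torch

model.eval()
with torch.no_grad():
    out = model(input_values=test_data)

# Per-sample mean squared reconstruction error
errors = ((out.reconstructed - test_data) ** 2).mean(dim=-1)

# Flag samples above the 95th percentile of error
threshold = torch.quantile(errors, 0.95)
anomalies = errors > threshold
print(f"{anomalies.sum().item()} candidate anomalies")
```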
## 🔍 Model Comparison
| Feature | Standard PyTorch | This Implementation |
|---------|------------------|-------------------|
| HF Integration | ❌ | ✅ |
| AutoModel Support | ❌ | ✅ |
| Trainer Compatible | ❌ | ✅ |
| Hub Integration | ❌ | ✅ |
| Config Management | Manual | ✅ Automatic |
| Serialization | Manual | ✅ Built-in |
| Checkpointing | Manual | ✅ Built-in |
## 🚀 Performance Tips
1. **Batch Size**: Use larger batch sizes for better GPU utilization
2. **Learning Rate**: Start with 1e-3 and adjust based on convergence
3. **Architecture**: Gradually decrease hidden dimensions for better compression
4. **Regularization**: Use dropout and batch normalization for better generalization
5. **Loss Function**: Choose appropriate loss based on your data type
## 📄 License
This implementation is provided as an example and follows the same license terms as Hugging Face Transformers.