|
---
library_name: transformers
pipeline_tag: feature-extraction
license: apache-2.0
tags:
- autoencoder
- pytorch
- reconstruction
- preprocessing
- normalizing-flow
- scaler
---
|
|
|
# Autoencoder Implementation for Hugging Face Transformers |
|
|
|
A complete autoencoder implementation that integrates seamlessly with the Hugging Face Transformers ecosystem, providing all the standard functionality you expect from transformer models. |
|
|
|
|
|
### Install-and-Use from the Hub (code repo) |
|
|
|
If you want to use the implementation directly from the Hub code repository (without a packaged pip install), you can download the repo and add it to `sys.path`: |
|
|
|
```python
from huggingface_hub import snapshot_download
import sys, torch

# 1) Download the code+weights for your repo "as is"
repo_dir = snapshot_download(
    repo_id="amaye15/autoencoder",
    repo_type="model",
    allow_patterns=["*.py", "config.json", "*.safetensors"],  # note the * wildcards
)

# 2) Add to import path so plain imports work
sys.path.append(repo_dir)

# 3) Import your classes from the repo code
from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# 4) Load the placeholder weights from the local folder (no internet, no code refresh)
model = AutoencoderForReconstruction.from_pretrained(repo_dir)

# 5) Quick smoke test
x = torch.randn(8, 20)
out = model(input_values=x)
print("latent:", out.last_hidden_state.shape, "reconstructed:", out.reconstructed.shape)
```
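
Alternatively, if the repo's `config.json` registers its custom classes through an `auto_map` entry (an assumption to verify against the repo), the standard remote-code loading path works without touching `sys.path`:

```python
from transformers import AutoModel

# Remote-code path: requires an `auto_map` entry in the repo's config.json.
# trust_remote_code executes the repo's Python files, so review them first.
model = AutoModel.from_pretrained("amaye15/autoencoder", trust_remote_code=True)
```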
|
|
|
## Features
|
|
|
- **Full Hugging Face Integration**: Compatible with `AutoModel`, `AutoConfig`, and `AutoTokenizer` patterns |
|
- **Standard Training Workflows**: Works with `Trainer`, `TrainingArguments`, and all HF training utilities |
|
- **Model Hub Compatible**: Save and share models on Hugging Face Hub with `push_to_hub()` |
|
- **Flexible Architecture**: Configurable encoder-decoder architecture with various activation functions |
|
- **Multiple Loss Functions**: Support for MSE, BCE, L1, Huber, Smooth L1, KL Divergence, Cosine, Focal, Dice, Tversky, SSIM, and Perceptual loss |
|
- **Multiple Autoencoder Types (7)**: Classic, Variational (VAE), Beta-VAE, Denoising, Sparse, Contractive, and Recurrent autoencoders |
|
- **Extended Activation Functions**: 18+ activation functions including ReLU, GELU, Swish, Mish, ELU, and more |
|
- **Learnable Preprocessing**: Neural Scaler, Normalizing Flow, MinMax Scaler (learnable), Robust Scaler (learnable), and Yeo-Johnson preprocessors (2D and 3D tensors) |
|
- **Extensible Design**: Easy to extend for new autoencoder variants and custom loss functions |
|
- **Production Ready**: Proper serialization, checkpointing, and inference support |
|
|
|
|
|
## Architecture
|
|
|
The implementation consists of three main components: |
|
|
|
### 1. AutoencoderConfig |
|
Configuration class that inherits from `PretrainedConfig`: |
|
- Defines model architecture parameters |
|
- Handles validation and serialization |
|
- Enables `AutoConfig.from_pretrained()` functionality |
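
As a quick illustration, the config round-trips through the standard `PretrainedConfig` serialization API (a minimal sketch, assuming defaults for the parameters not shown):

```python
from configuration_autoencoder import AutoencoderConfig

# Save to and restore from a config.json on disk
config = AutoencoderConfig(input_dim=784, latent_dim=64)
config.save_pretrained("./ae_config")  # writes ./ae_config/config.json
restored = AutoencoderConfig.from_pretrained("./ae_config")
assert restored.latent_dim == 64
```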
|
|
|
### 2. AutoencoderModel |
|
Base model class that inherits from `PreTrainedModel`: |
|
- Implements encoder-decoder architecture |
|
- Provides latent space representation |
|
- Returns structured outputs with `AutoencoderOutput` |
|
|
|
### 3. AutoencoderForReconstruction |
|
Task-specific model for reconstruction: |
|
- Adds reconstruction loss calculation |
|
- Compatible with `Trainer` for easy training |
|
- Returns `AutoencoderForReconstructionOutput` with loss |
|
|
|
## Quick Start
|
|
|
### Basic Usage |
|
|
|
```python
import torch

from configuration_autoencoder import AutoencoderConfig
from modeling_autoencoder import AutoencoderForReconstruction

# Create configuration
config = AutoencoderConfig(
    input_dim=784,               # Input dimensionality (e.g., 28x28 images flattened)
    hidden_dims=[512, 256],      # Encoder hidden layers
    latent_dim=64,               # Latent space dimension
    activation="gelu",           # Activation function (18+ options available)
    reconstruction_loss="mse",   # Loss function (12+ options available)
    autoencoder_type="classic",  # Autoencoder type (7 types available)
    # Optional learnable preprocessing
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",  # or "normalizing_flow", "minmax_scaler", "robust_scaler", "yeo_johnson"
)

# Create model
model = AutoencoderForReconstruction(config)

# Forward pass
input_data = torch.randn(32, 784)  # Batch of 32 samples
outputs = model(input_values=input_data)

print(f"Reconstruction loss: {outputs.loss}")
print(f"Latent shape: {outputs.last_hidden_state.shape}")
print(f"Reconstructed shape: {outputs.reconstructed.shape}")
```
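
Because the model is tagged for feature extraction, a common follow-up is to keep only the latent codes; a minimal sketch, continuing from the snippet above:

```python
import torch

# Encode a batch to latent features without tracking gradients
model.eval()
with torch.no_grad():
    latents = model(input_values=input_data).last_hidden_state  # shape: (32, 64)
# `latents` can now feed a downstream classifier, clustering step, etc.
```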
|
|
|
|
|
### Training with Hugging Face Trainer |
|
|
|
```python
import torch
from torch.utils.data import Dataset
from transformers import Trainer, TrainingArguments

class AutoencoderDataset(Dataset):
    def __init__(self, data):
        self.data = torch.FloatTensor(data)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return {
            "input_values": self.data[idx],
            "labels": self.data[idx],  # For an autoencoder, input = target
        }

# Prepare data
train_dataset = AutoencoderDataset(your_training_data)
val_dataset = AutoencoderDataset(your_validation_data)

# Training arguments
training_args = TrainingArguments(
    output_dir="./autoencoder_output",
    num_train_epochs=10,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="steps",
    eval_steps=500,
    save_steps=1000,
    load_best_model_at_end=True,
)

# Create trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)

# Train
trainer.train()

# Save model
model.save_pretrained("./my_autoencoder")
config.save_pretrained("./my_autoencoder")
```
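
The saved directory can then be reloaded through the usual `from_pretrained` path:

```python
# Reload the trained autoencoder from the directory saved above
model = AutoencoderForReconstruction.from_pretrained("./my_autoencoder")
model.eval()
```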
|
|
|
### Using AutoModel Framework |
|
|
|
```python
from register_autoencoder import register_autoencoder_models
from transformers import AutoConfig, AutoModel

# Register models with the AutoModel framework
register_autoencoder_models()

# Now you can use standard HF patterns
config = AutoConfig.from_pretrained("./my_autoencoder")
model = AutoModel.from_pretrained("./my_autoencoder")

# Use the model
outputs = model(input_values=your_data)
```
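
Sharing on the Hub works through the standard `push_to_hub()` call (the repo id below is a placeholder; you need to be logged in via `huggingface-cli login`):

```python
# Push model weights and config to a Hub repo you control
model.push_to_hub("your-username/my-autoencoder")
config.push_to_hub("your-username/my-autoencoder")
```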
|
|
|
## Configuration Options
|
|
|
The `AutoencoderConfig` class supports extensive customization: |
|
|
|
```python
config = AutoencoderConfig(
    input_dim=784,                   # Input dimension
    hidden_dims=[512, 256, 128],     # Encoder hidden layers
    latent_dim=64,                   # Latent space dimension
    activation="gelu",               # Activation function (see full list below)
    dropout_rate=0.1,                # Dropout rate (0.0 to 1.0)
    use_batch_norm=True,             # Use batch normalization
    tie_weights=False,               # Tie encoder/decoder weights
    reconstruction_loss="mse",       # Loss function (see full list below)
    autoencoder_type="variational",  # Autoencoder type (see types below)
    beta=0.5,                        # Beta parameter for β-VAE
    temperature=1.0,                 # Temperature for Gumbel softmax
    noise_factor=0.1,                # Noise factor for denoising AE
    # Recurrent autoencoder parameters
    rnn_type="lstm",                 # RNN type: "lstm", "gru", "rnn"
    num_layers=2,                    # Number of RNN layers
    bidirectional=True,              # Bidirectional encoding
    sequence_length=None,            # Fixed sequence length (None for variable)
    teacher_forcing_ratio=0.5,       # Teacher forcing ratio during training
    # Learnable preprocessing parameters
    use_learnable_preprocessing=False,  # Enable learnable preprocessing
    preprocessing_type="none",          # "none", "neural_scaler", "normalizing_flow", "minmax_scaler", "robust_scaler", "yeo_johnson"
    preprocessing_hidden_dim=64,        # Hidden dimension for preprocessing networks
    preprocessing_num_layers=2,         # Number of layers in preprocessing networks
    learn_inverse_preprocessing=True,   # Learn inverse transformation
    flow_coupling_layers=4,             # Number of coupling layers for flows
)
```
|
|
|
### Available Activation Functions
|
|
|
**Standard Activations:** |
|
- `relu`, `leaky_relu`, `relu6`, `elu`, `prelu` |
|
- `tanh`, `sigmoid`, `hardsigmoid`, `hardtanh` |
|
- `gelu`, `swish`, `silu`, `hardswish` |
|
- `mish`, `softplus`, `softsign`, `tanhshrink`, `threshold` |
|
|
|
### Available Loss Functions
|
|
|
**Regression Losses:** |
|
- `mse` - Mean Squared Error |
|
- `l1` - L1/MAE Loss |
|
- `huber` - Huber Loss |
|
- `smooth_l1` - Smooth L1 Loss |
|
|
|
**Classification/Probability Losses:** |
|
- `bce` - Binary Cross Entropy |
|
- `kl_div` - KL Divergence |
|
- `focal` - Focal Loss |
|
|
|
**Similarity Losses:** |
|
- `cosine` - Cosine Similarity Loss |
|
- `ssim` - Structural Similarity Loss |
|
- `perceptual` - Perceptual Loss |
|
|
|
**Segmentation Losses:** |
|
- `dice` - Dice Loss |
|
- `tversky` - Tversky Loss |
|
|
|
### Available Autoencoder Types
|
|
|
**Classic Autoencoder (`classic`)** |
|
- Standard encoder-decoder architecture |
|
- Direct reconstruction loss minimization |
|
|
|
**Variational Autoencoder (`variational`)** |
|
- Probabilistic latent space with mean and variance |
|
- KL divergence regularization |
|
- Reparameterization trick for sampling |
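
For reference, the reparameterization trick in a few lines (an illustrative sketch; the helper name is hypothetical and not necessarily what `modeling_autoencoder.py` uses):

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps with eps ~ N(0, I), so gradients flow through mu/logvar
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + eps * std
```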
|
|
|
**Beta-VAE (`beta_vae`)** |
|
- Variational autoencoder with an adjustable β parameter
|
- Better disentanglement of latent factors |
|
|
|
**Denoising Autoencoder (`denoising`)** |
|
- Adds noise to input during training |
|
- Learns robust representations |
|
- Configurable noise factor |
|
|
|
**Sparse Autoencoder (`sparse`)** |
|
- Encourages sparse latent representations |
|
- L1 regularization on latent activations |
|
- Useful for feature selection |
|
|
|
**Contractive Autoencoder (`contractive`)** |
|
- Penalizes large gradients of latent w.r.t. input |
|
- Learns smooth manifold representations |
|
- Robust to small input perturbations |
|
|
|
**Recurrent Autoencoder (`recurrent`)** |
|
- LSTM/GRU/RNN encoder-decoder architecture |
|
- Bidirectional encoding for better sequence representations |
|
- Variable length sequence support with padding |
|
- Teacher forcing during training for stable learning |
|
- Sequence-to-sequence reconstruction |
|
|
|
|
## Model Outputs
|
|
|
### AutoencoderOutput |
|
|
|
The base model `AutoencoderModel` returns the following output: |
|
```python
@dataclass
class AutoencoderOutput(ModelOutput):
    last_hidden_state: torch.FloatTensor = None     # Latent representation
    reconstructed: torch.FloatTensor = None         # Reconstructed input
    hidden_states: Tuple[torch.FloatTensor] = None  # Intermediate states
    attentions: Tuple[torch.FloatTensor] = None     # Not used
```
|
|
|
### AutoencoderForReconstructionOutput |
|
```python
@dataclass
class AutoencoderForReconstructionOutput(ModelOutput):
    loss: torch.FloatTensor = None                  # Reconstruction loss
    reconstructed: torch.FloatTensor = None         # Reconstructed input
    last_hidden_state: torch.FloatTensor = None     # Latent representation
    hidden_states: Tuple[torch.FloatTensor] = None  # Intermediate states
```
|
|
|
## Advanced Usage
|
|
|
### Custom Loss Functions |
|
|
|
You can easily extend the model with custom loss functions: |
|
|
|
```python
class CustomAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Custom loss implementation
        return your_custom_loss(reconstructed, target)
```
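
For instance, a log-cosh reconstruction loss plugged into that hook (a sketch, assuming `_compute_reconstruction_loss(reconstructed, target)` is the override point as shown above):

```python
import torch

class LogCoshAutoencoder(AutoencoderForReconstruction):
    def _compute_reconstruction_loss(self, reconstructed, target):
        # Log-cosh: quadratic near zero like MSE, linear in the tails like L1
        return torch.log(torch.cosh(reconstructed - target)).mean()
```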
|
|
|
### Recurrent Autoencoder for Sequences |
|
|
|
Perfect for time series, text, and sequential data: |
|
|
|
```python
config = AutoencoderConfig(
    input_dim=50,                 # Feature dimension per timestep
    latent_dim=32,                # Compressed representation size
    autoencoder_type="recurrent",
    rnn_type="lstm",              # or "gru", "rnn"
    num_layers=2,                 # Number of RNN layers
    bidirectional=True,           # Bidirectional encoding
    teacher_forcing_ratio=0.7,    # Teacher forcing during training
    sequence_length=None,         # Variable length sequences
)

# Usage with sequence data
model = AutoencoderForReconstruction(config)
batch_size, seq_len, input_dim = 16, 30, 50  # example shapes
sequence_data = torch.randn(batch_size, seq_len, input_dim)
outputs = model(input_values=sequence_data)
```
|
|
|
### Learnable Preprocessing |
|
|
|
Deep learning-based data normalization that adapts to your data: |
|
|
|
```python
import torch

# Neural Scaler - a learnable alternative to StandardScaler
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="neural_scaler",
    preprocessing_hidden_dim=64,
)

# Normalizing Flow - invertible transformations
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="normalizing_flow",
    flow_coupling_layers=4,
)

# Works with all autoencoder types and sequence data
model = AutoencoderForReconstruction(config)
data = torch.randn(32, 20)  # example batch matching input_dim
outputs = model(input_values=data)
print(f"Preprocessing loss: {outputs.preprocessing_loss}")
```
|
|
|
```python
# Learnable MinMax Scaler - scales to [0, 1] with learnable bounds
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="minmax_scaler",
)

# Learnable Robust Scaler - robust to outliers using median/IQR
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="robust_scaler",
)

# Learnable Yeo-Johnson - power transform for skewed distributions
config = AutoencoderConfig(
    input_dim=20,
    latent_dim=10,
    use_learnable_preprocessing=True,
    preprocessing_type="yeo_johnson",
)
```
|
|
|
|
|
### Variational Autoencoder Extension |
|
|
|
The configuration supports variational autoencoders: |
|
|
|
```python
config = AutoencoderConfig(
    autoencoder_type="variational",
    beta=0.5,  # β-VAE parameter
    # ... other parameters
)
```
|
|
|
### Integration with Datasets Library |
|
|
|
```python
from datasets import Dataset

# Convert your data to an HF Dataset
dataset = Dataset.from_dict({
    "input_values": your_data_list,
})

# Use with Trainer
trainer = Trainer(
    model=model,
    train_dataset=dataset,
    # ... other arguments
)
```
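
One practical note (a general `datasets` tip, not something this repo requires): format the dataset as torch tensors so the `Trainer` receives tensors rather than Python lists:

```python
# Return torch tensors instead of Python lists when the dataset is indexed
dataset = dataset.with_format("torch")
```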
|
|
|
## Project Structure
|
|
|
```
autoencoder/
├── __init__.py                   # Package initialization
├── configuration_autoencoder.py  # Configuration class
├── modeling_autoencoder.py       # Model implementations
├── register_autoencoder.py       # AutoModel registration
├── pyproject.toml                # Project metadata and dependencies
└── README.md                     # This file
```
|
|
|
## Contributing
|
|
|
This implementation follows Hugging Face conventions and can be easily extended: |
|
|
|
1. **Adding new architectures**: Extend `AutoencoderModel` or create new model classes |
|
2. **Custom configurations**: Add parameters to `AutoencoderConfig` |
|
3. **Task-specific heads**: Create new classes like `AutoencoderForReconstruction` |
|
4. **Integration**: Register new models with the AutoModel framework |
|
|
|
## References
|
|
|
- [Hugging Face Transformers Documentation](https://huggingface.co/docs/transformers) |
|
- [Custom Models Guide](https://huggingface.co/docs/transformers/custom_models) |
|
- [AutoModel Documentation](https://huggingface.co/docs/transformers/model_doc/auto) |
|
|
|
## Use Cases
|
|
|
This autoencoder implementation is perfect for: |
|
|
|
- **Dimensionality Reduction**: Compress high-dimensional data to lower dimensions |
|
- **Anomaly Detection**: Identify outliers based on reconstruction error (see the sketch after this list)
|
- **Data Denoising**: Remove noise from corrupted data |
|
- **Feature Learning**: Learn meaningful representations for downstream tasks |
|
- **Data Generation**: Generate new samples similar to training data |
|
- **Pretraining**: Initialize encoders for other tasks |
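
For the anomaly-detection case, the usual recipe is to score each sample by its reconstruction error and flag the largest scores; a minimal sketch (the mean-plus-3-std threshold is an assumption, pick a rule suited to your data):

```python
import torch

batch = torch.randn(256, 784)  # stand-in for real data of shape (N, input_dim)

model.eval()
with torch.no_grad():
    out = model(input_values=batch)
    # Per-sample mean squared reconstruction error
    errors = ((out.reconstructed - batch) ** 2).mean(dim=-1)

# Example rule: flag anything beyond mean + 3 std of the error distribution
threshold = errors.mean() + 3 * errors.std()
anomalies = errors > threshold
print(f"Flagged {anomalies.sum().item()} of {len(batch)} samples")
```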
|
|
|
## Model Comparison
|
|
|
| Feature | Standard PyTorch | This Implementation |
|---------|------------------|---------------------|
| HF Integration | ❌ | ✅ |
| AutoModel Support | ❌ | ✅ |
| Trainer Compatible | ❌ | ✅ |
| Hub Integration | ❌ | ✅ |
| Config Management | Manual | ✅ Automatic |
| Serialization | Manual | ✅ Built-in |
| Checkpointing | Manual | ✅ Built-in |
|
|
|
## Performance Tips
|
|
|
1. **Batch Size**: Use larger batch sizes for better GPU utilization |
|
2. **Learning Rate**: Start with 1e-3 and adjust based on convergence (see the snippet after this list)
|
3. **Architecture**: Gradually decrease hidden dimensions for better compression |
|
4. **Regularization**: Use dropout and batch normalization for better generalization |
|
5. **Loss Function**: Choose appropriate loss based on your data type |
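
For instance, tips 1 and 2 map directly onto `TrainingArguments` (the values are starting points, not tuned recommendations):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./autoencoder_output",
    learning_rate=1e-3,               # tip 2: common starting point, adjust on convergence
    per_device_train_batch_size=128,  # tip 1: larger batches improve GPU utilization
)
```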
|
|
|
## License
|
|
|
This implementation is provided as an example and follows the same license terms as Hugging Face Transformers. |
|
|