---
license: mit
license_link: https://huggingface.co/TheoVincent/EauDeQN/blob/main/LICENSE
tags:
- reinforcement-learning
- jax
- atari
co2_eq_emissions:
  emissions: 1800000
---

# Sparse model parameters on [10 Atari games](#games)

The sparse model parameters were obtained with [EauDeQN](https://arxiv.org/pdf/2503.01437) and [PolyPruneQN](https://arxiv.org/pdf/2402.12479), leading to ```EauDeDQN``` and ```PolyPruneDQN``` in the online scenario, and to ```EauDeCQL``` and ```PolyPruneCQL``` in the offline scenario 🎮

**While PolyPruneQN applies a *fixed polynomial pruning schedule* to reach a final sparsity level of 95%, EauDeQN prunes the network parameters at the *agent's learning pace* 🪡**

*We also release the model parameters of the dense baselines [DQN](https://www.nature.com/articles/nature14236.pdf) and [CQL](https://papers.neurips.cc/paper_files/paper/2020/file/0d2b2061826a5df3221116a5085a6052-Paper.pdf)* 🏋️

The online trainings were run for 40M frames and the offline trainings for 50 $\times$ 62,500 gradient steps ⏱️. We used the CNN architecture, where the number of neurons of the first linear layer is reported as the "Feature Size" in the columns below:

| Training type | Algorithm | **32** (Small) | **512** (Medium) | **2048** (Large) |
|----------|--------------|:------:|:------:|:-------:|
| Online | ```EauDeDQN``` | ✅ | ✅ | ✅ |
| Online | ```PolyPruneDQN``` | ✅ | ✅ | ✅ |
| Online | ```DQN``` (dense) | ✅ | ✅ | ✅ |
| Offline | ```EauDeCQL``` | ✅ | | ✅ |
| Offline | ```PolyPruneCQL``` | ✅ | | ✅ |
| Offline | ```CQL``` (dense) | ✅ | | ✅ |

5 seeds are available for each configuration, which makes a total of **750 available models** (15 configurations $\times$ 10 games $\times$ 5 seeds) 📈.

The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example showing how to evaluate the model parameters 🧑‍🏫 It uses JAX 🚀

The hyperparameters used during training are reported in [config.json](./config.json) 🔧

The training code will be available soon ⏳

### Model sparsity & performances
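
In the meantime, here is a minimal sketch of how the sparsity of a saved parameter tree could be measured with JAX. The file name ```EauDeDQN_small_seed0.pkl``` and the pickle format are assumptions made for illustration, not the repository's actual layout; see [evaluate.ipynb](./evaluate.ipynb) for the supported way of loading the parameters.

```python
# Minimal sketch (not the repository's evaluate.ipynb): measure the fraction
# of zeroed weights in a saved parameter pytree.
import pickle

import jax
import jax.numpy as jnp

# Hypothetical file name and format; adapt to the actual repository layout.
with open("EauDeDQN_small_seed0.pkl", "rb") as f:
    params = pickle.load(f)  # assumed to be a pytree of arrays

# Flatten the pytree and count zero entries across all weight tensors.
leaves = jax.tree_util.tree_leaves(params)
n_zero = sum(int(jnp.sum(leaf == 0)) for leaf in leaves)
n_total = sum(leaf.size for leaf in leaves)
print(f"sparsity: {n_zero / n_total:.2%}")
```

For ```PolyPruneDQN``` models this ratio should land close to the 95% target, while the final sparsity of ```EauDeDQN``` models depends on the run, since EauDeQN prunes at the agent's learning pace.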