---
license: mit
license_link: https://huggingface.co/TheoVincent/EauDeQN/blob/main/LICENSE
tags:
- reinforcement-learning
- jax
- atari
co2_eq_emissions:
  emissions: 1800000
---

# Sparse model parameters on [10 Atari games](#games)

The sparse model parameters were obtained with [EauDeQN](https://arxiv.org/pdf/2503.01437) and [PolyPruneQN](https://arxiv.org/pdf/2402.12479), leading to ```EauDeDQN``` and ```PolyPruneDQN``` in the online scenario, and to ```EauDeCQL``` and ```PolyPruneCQL``` in the offline scenario 🎮

**While PolyPruneQN applies a *fixed polynomial pruning schedule* to reach a final sparsity level of 95%, EauDeQN prunes the network parameters at the *agent's learning pace* 🪡**

*We also release the model parameters of the dense baselines [DQN](https://www.nature.com/articles/nature14236.pdf) and [CQL](https://papers.neurips.cc/paper_files/paper/2020/file/0d2b2061826a5df3221116a5085a6052-Paper.pdf)* 🏋️

The online trainings were run for 40M frames and the offline trainings for 50 $\times$ 62,500 gradient steps ⏱️. We used the CNN architecture, where the number of neurons of the first linear layer is reported as the "Feature Size" in the columns below:

| Training type | Algorithm | **32** (Small) | **512** (Medium) | **2048** (Large) |
|----------|--------------|:------:|:------:|:-------:|
| Online | ```EauDeDQN``` | ✅ | ✅ | ✅ |
| Online | ```PolyPruneDQN``` | ✅ | ✅ | ✅ |
| Online | ```DQN``` (dense) | ✅ | ✅ | ✅ |
| Offline | ```EauDeCQL``` | ✅ | | ✅ |
| Offline | ```PolyPruneCQL``` | ✅ | | ✅ |
| Offline | ```CQL``` (dense) | ✅ | | ✅ |

5 seeds are available for each configuration, which makes a total of **750 available models** (15 configurations $\times$ 10 games $\times$ 5 seeds) 📈.

The [evaluate.ipynb](./evaluate.ipynb) notebook contains a minimal example showing how to evaluate the model parameters 🧑‍🏫 It uses JAX 🚀

The hyperparameters used during training are reported in [config.json](./config.json) 🔧

The training code will be available soon ⏳

### Model sparsity & performances
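
In the meantime, here is a minimal sketch of how the sparsity of a saved parameter tree could be measured with JAX. The file name ```EauDeDQN_small_seed0.pkl``` and the pickle format are assumptions made for illustration, not the repository's actual layout; see [evaluate.ipynb](./evaluate.ipynb) for the supported way of loading the parameters.

```python
# Minimal sketch (not the repository's evaluate.ipynb): measure the fraction
# of zeroed weights in a saved parameter pytree.
import pickle

import jax
import jax.numpy as jnp

# Hypothetical file name and format; adapt to the actual repository layout.
with open("EauDeDQN_small_seed0.pkl", "rb") as f:
    params = pickle.load(f)  # assumed to be a pytree of arrays

# Flatten the pytree and count zero entries across all weight tensors.
leaves = jax.tree_util.tree_leaves(params)
n_zero = sum(int(jnp.sum(leaf == 0)) for leaf in leaves)
n_total = sum(leaf.size for leaf in leaves)
print(f"sparsity: {n_zero / n_total:.2%}")
```

For ```PolyPruneDQN``` models this ratio should land close to the 95% target, while the final sparsity of ```EauDeDQN``` models depends on the run, since EauDeQN prunes at the agent's learning pace.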