Models: 7

Active filters: llama-moe/LLaMA-MoE-v1-3_5B-4_16

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-HQQ-1bit-smashed

Text Generation • Updated Jul 12, 2024 • 3

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-HQQ-2bit-smashed

Text Generation • Updated Jul 12, 2024 • 5

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-HQQ-4bit-smashed

Text Generation • Updated Jul 12, 2024 • 3

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-QUANTO-int2bit-smashed

Updated Jul 19, 2024

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-QUANTO-int4bit-smashed

Updated Jul 19, 2024

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-QUANTO-int8bit-smashed

Updated Jul 19, 2024 • 1

PrunaAI/llama-moe-LLaMA-MoE-v1-3_5B-4_16-QUANTO-float8bit-smashed

Updated Jul 19, 2024