---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- aquif_moe
- text-generation-inference
- 17b
- qwen-like
- bailing-like
- science
- math
- code
base_model:
- aquiffoo/aquif-3-moe-17b-a2.8b
language:
- en
---

# aquif-3-moe (17B) Thinking

A high-performance mixture-of-experts language model optimized for efficiency, coding, science, and general use. With 17B total parameters and 2.8B active parameters, aquif-3-moe delivers competitive performance across multiple domains while maintaining computational efficiency.

## Model Details

- **Architecture**: Mixture of Experts (MoE)
- **Total Parameters**: 17 billion
- **Active Parameters**: 2.8 billion
- **License**: Apache 2.0
- **Library**: transformers

## Performance Benchmarks

| Metric | aquif-3-moe (Thinking, 17B a2.8B) | Phi-4 (Thinking, 14B) | Qwen3 (Thinking, 8B) | DeepSeek R1 (Qwen3 8B) | Magistral Small (24B) | Gemini 2.5 Flash-Lite (Proprietary) |
| ---------------------- | --------------------------------- | --------------------- | -------------------- | ---------------------- | --------------------- | ----------------------------------- |
| LiveCodeBench (Coding) | **63.2**                          | *53.8*                | 58.1                 | 60.5                   | 51.4                  | 59.3                                |
| AIME 2024 (Math)       | **80.2**                          | *75.3*                | 74.7                 | 65.0                   | 71.3                  | 70.3                                |
| GPQA Diamond (Science) | *64.2*                            | **65.8**              | 62.0                 | 61.1                   | 64.1                  | 62.5                                |
| **Average**            | **69.2**                          | *65.0*                | 64.9                 | 62.2                   | 62.3                  | 64.0                                |

## Key Strengths

- **Mathematical Reasoning**: Achieves 91.4% on MATH-500 and the best AIME 2024 score in the comparison above (80.2), demonstrating exceptional mathematical problem-solving capabilities
- **Scientific Understanding**: Scores 64.2 on GPQA Diamond, within 1.6 points of the best model in the comparison, showing strong scientific reasoning
- **Efficiency**: Delivers competitive performance while activating only 2.8B of its 17B parameters
- **General Knowledge**: Strong MMLU performance at 83.2%

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "aquiffoo/aquif-3-moe-17b-a2.8b-thinking"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" requires the accelerate package; drop it to load on CPU
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Generate text
inputs = tokenizer("Explain quantum entanglement:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

For chat-style prompting, see the illustrative example at the end of this card.

## Intended Use Cases

- Mathematical problem solving and reasoning
- Scientific research and analysis
- Code generation and programming assistance
- General question answering and text generation
- Educational content creation

## Model Architecture

The mixture-of-experts architecture enables efficient scaling by routing each input token to only a subset of expert networks, so only 2.8B of the model's 17B parameters are active per forward pass. This provides the benefits of a larger model at a computational cost comparable to a much smaller dense model. An illustrative routing sketch is included at the end of this card.

## License

Apache 2.0
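## Chat-Style Prompting (Illustrative)

Thinking models are usually driven through a chat template rather than raw text completion. The snippet below is a minimal sketch that assumes this repository ships a chat template in its tokenizer config; the exact message format and any reasoning-trace tags the model emits are not documented on this card, so treat them as assumptions and inspect the raw output.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "aquiffoo/aquif-3-moe-17b-a2.8b-thinking"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Assumption: the repo ships a chat template; if it does not,
# apply_chat_template will fail and plain prompting (see Usage) applies.
messages = [
    {"role": "user", "content": "Prove that the sum of two even numbers is even."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```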
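## How Top-k MoE Routing Works (Illustrative)

To make the architecture note above concrete, here is a minimal PyTorch sketch of top-k expert routing. It is not aquif-3-moe's actual implementation: the expert count, `k`, and layer sizes are made-up placeholders, and production routers add details (such as load-balancing losses) that are omitted here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k mixture-of-experts layer (not aquif-3-moe's code).

    Only k of n_experts expert MLPs run per token, which is how an MoE model
    can hold 17B total parameters while activating only ~2.8B per token.
    """

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                        # x: (tokens, d_model)
        scores = self.router(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # normalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e         # tokens sent to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = TopKMoE()
tokens = torch.randn(4, 512)
print(layer(tokens).shape)  # torch.Size([4, 512])
```

Because each token touches only `k` experts, compute per token scales with the active parameter count, while model capacity scales with the total parameter count.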