---
license: apache-2.0
language:
- en
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
pipeline_tag: text-generation
library_name: transformers
tags:
- text-generation-inference
- Math
- Reasoning
- Code
- RL
---

![9.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/AB65eZr2rKLVDUSjtZn-d.png)

# **Geminorum-Wasat-14B-Instruct**

> **Geminorum-Wasat-14B-Instruct** is built on the Qwen 2.5 14B architecture and is engineered to excel in mathematical reasoning, distributed reinforcement learning (RL), and general-purpose problem solving. The model is fine-tuned on chain-of-thought reasoning datasets, optimization-focused corpora, and advanced structured reasoning datasets to strengthen logical deduction, multi-step reasoning, and intelligent decision-making.

## **Key Improvements**

1. **Advanced Mathematical Reasoning**: Excels at solving complex equations, symbolic computation, theorem proving, and step-by-step mathematical problem solving.
2. **Distributed Reinforcement Learning Expertise**: Fine-tuned for robust policy optimization using distributed RL techniques, providing resilience and strong performance across dynamic problem spaces.
3. **General-Purpose Reasoning and Problem Solving**: Strong across a broad range of domains, handling factual questions, logical analysis, and multi-step cognitive tasks.
4. **Long-Context Mastery**: Supports up to 128K tokens of context and can generate up to 8K tokens, enabling detailed, coherent long-form outputs and complex derivations.
5. **Superior Instruction Following**: Follows complex, structured prompts precisely and maintains focus and clarity over extended dialogues.
6. **Coding and Algorithmic Fluency**: Highly effective at code generation, debugging, algorithm design, and optimization-problem modeling across programming languages.

## **Quickstart with transformers**

You can load and use the model with the `transformers` library and `apply_chat_template`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Geminorum-Wasat-14B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Explain the connection between distributed reinforcement learning and robust policy optimization."
messages = [
    {"role": "system", "content": "You are an expert assistant specializing in mathematics, optimization, and reinforcement learning."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
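For interactive use, you may want to stream tokens as they are generated and enable sampling. The minimal sketch below reuses the `model`, `tokenizer`, and `model_inputs` objects from the Quickstart above; the sampling values (`temperature=0.6`, `top_p=0.95`) are illustrative choices commonly used with reasoning-tuned checkpoints, not published settings for this model, so adjust them to your task.

```python
from transformers import TextStreamer

# Stream decoded text to stdout as it is produced, skipping the prompt tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

_ = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,   # illustrative value, not an official recommendation for this checkpoint
    top_p=0.95,        # illustrative value
    streamer=streamer,
)
```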
## **Intended Use**

1. **Mathematical and Optimization Problem Solving**: Designed for complex mathematical problems, optimization modeling, symbolic logic, and structured derivations.
2. **Distributed Reinforcement Learning Research**: Supports designing, analyzing, and explaining distributed RL systems, robust policy optimization, and autonomous decision systems.
3. **General Knowledge and Reasoning**: Effective at answering a wide range of questions and performing structured reasoning across scientific, technical, and educational domains.
4. **Educational and Research Support**: Suited to students, researchers, and professionals seeking detailed explanations, derivations, and robust scientific insights.
5. **Code Writing and Algorithm Design**: Excels at creating, optimizing, and explaining algorithms, particularly those relevant to mathematical computation and optimization.
6. **Intelligent Conversational Systems**: Well suited to technical conversational agents and educational bots that require deep understanding and detailed reasoning.
7. **Long-Form Technical Content Generation**: Capable of producing structured, coherent articles, tutorials, and research papers, especially in technical and mathematical fields.
8. **Structured Data Generation**: Supports structured output formats such as proofs, equations, tables, and JSON, useful for scientific and technical workflows (see the prompting sketch at the end of this card).

## **Limitations**

1. **Heavy Hardware Requirements**: The large parameter count and long-context handling require powerful GPUs or TPUs with substantial memory.
2. **Potential for Training Biases**: Outputs may reflect biases present in the mathematical, technical, or optimization-specific datasets used during training.
3. **Less Effective in Creative Tasks**: Focused more on technical and logical reasoning than on freeform creative writing or storytelling.
4. **No Real-Time Event Awareness**: Limited to knowledge available before its training cutoff, with no access to live or real-world updates.
5. **Prompt Sensitivity**: Performance varies with the clarity, structure, and specificity of the prompt, particularly for complex multi-step tasks.
6. **Error Propagation Risk**: Small inaccuracies early in long-form outputs can propagate and affect the coherence of the overall answer.
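As an illustration of the *Structured Data Generation* use case above, the sketch below reuses the Quickstart setup (`model` and `tokenizer`) to request a JSON answer to a simple word problem and parse it defensively. The prompt wording, JSON keys, and example problem are hypothetical and only meant to show the pattern.

```python
import json

user_prompt = (
    "Solve the problem and reply with JSON only, using the keys "
    '"steps" (a list of strings) and "answer" (a number). '
    "Problem: A train travels 120 km in 1.5 hours. What is its average speed in km/h?"
)
messages = [
    {"role": "system", "content": "You are a careful mathematical assistant that answers in valid JSON."},
    {"role": "user", "content": user_prompt},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
reply = tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# The model may emit explanatory text or a reasoning trace around the JSON, so
# extract the outermost braces before parsing rather than assuming a clean object.
start, end = reply.find("{"), reply.rfind("}")
try:
    parsed = json.loads(reply[start:end + 1]) if start != -1 and end > start else None
except ValueError:
    parsed = None

print(parsed if parsed is not None else reply)
```

For production workflows, a constrained-decoding library or JSON-schema validation would be more robust than the string slicing shown here.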