NeuralKartMocker

AI & ML interests

Gen AI, GAN, LLMs, NLP, Gen Music

NeuralKartMocker's activity

upvoted 3 papers about 23 hours ago

Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

Paper • 2505.04528 • Published 3 days ago • 10

Beyond Recognition: Evaluating Visual Perspective Taking in Vision Language Models

Paper • 2505.03821 • Published 7 days ago • 21

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Paper • 2505.04512 • Published 3 days ago • 27

upvoted a paper about 24 hours ago

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

Paper • 2505.02707 • Published 5 days ago • 77

upvoted 5 papers 1 day ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published 4 days ago • 91

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published 5 days ago • 62

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Paper • 2312.14238 • Published Dec 21, 2023 • 20

Augmenting CLIP with Improved Visio-Linguistic Reasoning

Paper • 2307.09233 • Published Jul 18, 2023 • 9

Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining

Paper • 2308.03235 • Published Aug 7, 2023 • 2

upvoted 6 papers 2 days ago

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published 4 days ago • 83

RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale

Paper • 2505.03005 • Published 4 days ago • 26

Multi-Agent System for Comprehensive Soccer Understanding

Paper • 2505.03735 • Published 3 days ago • 17

upvoted 5 papers 7 days ago

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Paper • 2505.00703 • Published 8 days ago • 39

A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic

Paper • 2505.00534 • Published 9 days ago • 2

Spatial Speech Translation: Translating Across Space With Binaural Hearables

Paper • 2504.18715 • Published 14 days ago • 7

LLMs for Engineering: Teaching Models to Design High Powered Rockets

Paper • 2504.19394 • Published 12 days ago • 13

AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

Paper • 2504.21659 • Published 10 days ago • 11