Andrew Zhao

andrewzh

https://andrewzh112.github.io/

AI & ML interests

Reinforcement Learning, Agents

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published 3 days ago • 131

upvoted a paper about 2 months ago

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28 • 81

upvoted a paper 2 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

upvoted a paper 3 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 180

upvoted 2 papers 4 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 27

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 183

upvoted a collection 4 months ago

Absolute Zero Reasoner

Collection

6 items • Updated May 9 • 56

upvoted 2 papers 5 months ago

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Paper • 2504.13820 • Published Apr 18 • 16

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 133

upvoted 2 papers 7 months ago

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Paper • 2502.18364 • Published Feb 25 • 37

Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity

Paper • 2502.11901 • Published Feb 17 • 6

upvoted 2 papers 10 months ago

DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Paper • 2411.02359 • Published Nov 4, 2024 • 13

How Far is Video Generation from World Model: A Physical Law Perspective

Paper • 2411.02385 • Published Nov 4, 2024 • 34

upvoted a paper 11 months ago

LLM-based Optimization of Compound AI Systems: A Survey

Paper • 2410.16392 • Published Oct 21, 2024 • 17

upvoted 2 papers about 1 year ago

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Paper • 2407.08770 • Published Jul 11, 2024 • 21

Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models

Paper • 2406.11230 • Published Jun 17, 2024 • 34

upvoted a paper over 1 year ago

DiveR-CT: Diversity-enhanced Red Teaming with Relaxing Constraints

Paper • 2405.19026 • Published May 29, 2024 • 8

upvoted 3 papers almost 2 years ago
