arxiv:2509.26603

DeepScientist: Advancing Frontier-Pushing Scientific Findings Progressively

Published on Sep 30

Submitted by WENGSYX on Oct 1

Text Intelligence Lab of Westlake University

Abstract

DeepScientist autonomously conducts scientific discovery through Bayesian Optimization, surpassing human state-of-the-art methods on multiple AI tasks.

AI-generated summary

While previous AI Scientist systems can generate novel findings, they often lack the focus to produce scientifically valuable contributions that address pressing human-defined challenges. We introduce DeepScientist, a system designed to overcome this by conducting goal-oriented, fully autonomous scientific discovery over month-long timelines. It formalizes discovery as a Bayesian Optimization problem, operationalized through a hierarchical evaluation process consisting of "hypothesize, verify, and analyze". Leveraging a cumulative Findings Memory, this loop intelligently balances the exploration of novel hypotheses with exploitation, selectively promoting the most promising findings to higher-fidelity levels of validation. Consuming over 20,000 GPU hours, the system generated about 5,000 unique scientific ideas and experimentally validated approximately 1,100 of them, ultimately surpassing human-designed state-of-the-art (SOTA) methods on three frontier AI tasks by 183.7%, 1.9%, and 7.9%. This work provides the first large-scale evidence of an AI achieving discoveries that progressively surpass human SOTA on scientific tasks, producing valuable findings that genuinely push the frontier of scientific discovery. To facilitate further research into this process, we will open-source all experimental logs and system code at https://github.com/ResearAI/DeepScientist/.


Community

WENGSYX

Paper submitter 8 days ago

🚀 Check out our new work: 《DEEPSCIENTIST: ADVANCING FRONTIER-PUSHING SCIENTIFIC FINDINGS PROGRESSIVELY》

We introduce DeepScientist, the first system to demonstrate fully autonomous, goal-oriented scientific discovery that progressively surpasses the human state-of-the-art. In a striking demonstration, it achieved progress in two weeks comparable to three years of human research and delivered performance gains of up to 183.7% over human SOTA methods. Operating without human intervention, DeepScientist requires only a baseline method and its code repository to launch a month-long discovery cycle. It works day and night to generate thousands of hypotheses, perform repository-level code modifications, and validate its own ideas. This entire autonomous process is governed by a multi-agent framework, which leverages a cumulative Findings Memory to intelligently navigate the vast space of possibilities. This work heralds the beginning of an era where AI and human scientists will conduct scientific discovery in parallel!

WENGSYX

Paper submitter 8 days ago

DeepScientist is the first system designed for goal-oriented, fully autonomous scientific discovery that progressively surpasses human state-of-the-art (SOTA) research.

Current AI Scientist systems often lack the focus to produce scientifically valuable contributions that address human-defined challenges.

We model discovery as a Bayesian Optimization problem, enabling continuous, progressive breakthroughs via a "hypothesize-verify-analyze" closed loop.
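For readers who want a feel for the loop, here is a minimal toy sketch of a "hypothesize-verify-analyze" cycle. It is not the DeepScientist implementation: the post does not state the acquisition function, so an upper-confidence-bound (UCB) rule is swapped in as an assumption, and `Finding`, `true_score`, `uncertainty`, and `discovery_loop` are hypothetical names invented for illustration. Ideas are reduced to a single number, the noisy estimate plays the role of the cheap low-fidelity check, and `true_score` plays the role of the expensive experiment.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Finding:
    idea: float                      # toy stand-in for a hypothesis
    estimate: float                  # cheap, noisy score from "hypothesize"
    result: Optional[float] = None   # measured score, filled in by "verify"
    note: str = ""                   # outcome recorded by "analyze"


def true_score(idea: float) -> float:
    """Hypothetical expensive experiment; the best idea is 0.7."""
    return 1.0 - (idea - 0.7) ** 2


def uncertainty(idea: float, memory: List[Finding]) -> float:
    """Crude proxy: distance to the nearest already-verified idea."""
    tested = [f.idea for f in memory if f.result is not None]
    return min((abs(idea - t) for t in tested), default=1.0)


def discovery_loop(rounds: int, ideas_per_round: int,
                   baseline: float, kappa: float = 0.5, seed: int = 0):
    rng = random.Random(seed)
    memory: List[Finding] = []       # cumulative Findings Memory
    best = baseline
    for _ in range(rounds):
        # Hypothesize: propose ideas, each with a noisy low-cost estimate.
        for _ in range(ideas_per_round):
            idea = rng.random()
            memory.append(Finding(idea, true_score(idea) + rng.gauss(0.0, 0.1)))
        # Select: promote the untested finding with the highest UCB score
        # (estimate rewards exploitation, uncertainty rewards exploration).
        untested = [f for f in memory if f.result is None]
        chosen = max(untested,
                     key=lambda f: f.estimate + kappa * uncertainty(f.idea, memory))
        # Verify: run the expensive experiment on the promoted finding only.
        chosen.result = true_score(chosen.idea)
        # Analyze: record whether it improved on the best result so far.
        if chosen.result > best:
            best, chosen.note = chosen.result, "new best"
        else:
            chosen.note = "no gain"
    return best, memory


best, memory = discovery_loop(rounds=20, ideas_per_round=5, baseline=0.5)
verified = sum(f.result is not None for f in memory)
print(f"ideas: {len(memory)}, verified: {verified}, best score: {best:.3f}")
```

Most ideas stay at the cheap-estimate level and only one per round is promoted to the expensive check, which is the same funnel shape the post describes at a much larger scale.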

🔥 DeepScientist Core Highlights:

Light-speed Research: In AI text detection, DeepScientist achieved progress in just two weeks that is comparable to three years of cumulative human research.

Beyond Human SOTA: Outperformed human 2025 SOTA methods on three frontier AI tasks by up to 183.7%.

Massive Scale: Consumed over 20,000 GPU hours, generated about 5,000 unique ideas, and experimentally validated approximately 1,100 of them.

Peer-Reviewed Quality: The AI-generated papers are on par with the average quality of human submissions to ICLR 2025, and achieved a 60% acceptance rate in an automated AI review (vs. 0% for other systems).

📃 Paper Page: DeepScientist
🌐 Project: https://ai-researcher.net
🛠️ Code: https://github.com/ResearAI/DeepScientist

DeepScientist generated about 5,000 ideas, but only 21 ultimately led to scientific innovations: a success rate of less than 0.5%! This reveals that the central question in AI science has shifted from "Can AI innovate?" to "How can we efficiently guide its powerful exploratory process?" Come explore the future of automated scientific discovery with us!
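The funnel behind that success rate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch using the rounded figures quoted in this post (about 5,000 ideas, approximately 1,100 validated, 21 innovations, over 20,000 GPU hours); the variable names are made up for illustration, and the GPU-hours figure is a lower bound because the post says "over 20,000".

```python
# Rounded figures as quoted in the post, not exact counts.
ideas_generated = 5_000
ideas_validated = 1_100
innovations = 21
gpu_hours = 20_000  # lower bound ("over 20,000")

validation_rate = ideas_validated / ideas_generated        # share of ideas tested
success_per_idea = innovations / ideas_generated           # share of ideas that succeeded
success_per_validated = innovations / ideas_validated      # share of tested ideas that succeeded
gpu_hours_per_innovation = gpu_hours / innovations         # at least this many hours each

print(f"validated:            {validation_rate:.1%}")
print(f"success per idea:     {success_per_idea:.2%}")
print(f"success per tested:   {success_per_validated:.2%}")
print(f"GPU hours/innovation: {gpu_hours_per_innovation:.0f}+")
```

With these inputs, the per-idea success rate comes out at 0.42%, consistent with the "less than 0.5%" figure above.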

Please consider giving our paper an upvote on the paper page. Thank you for your support! ❤️