Collections
Collections including paper arxiv:2404.12253
-
Octopus v2: On-device language model for super agent
Paper • 2404.01744 • Published • 58 -
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper • 2404.05719 • Published • 82 -
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Paper • 2404.07972 • Published • 50 -
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55
-
Communicative Agents for Software Development
Paper • 2307.07924 • Published • 6 -
Self-Refine: Iterative Refinement with Self-Feedback
Paper • 2303.17651 • Published • 2 -
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
Paper • 2312.10003 • Published • 44 -
ReAct: Synergizing Reasoning and Acting in Language Models
Paper • 2210.03629 • Published • 29
-
Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
Paper • 2403.09029 • Published • 55 -
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 25 -
RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 72 -
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 78
-
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
Paper • 2310.04406 • Published • 10 -
Chain-of-Thought Reasoning Without Prompting
Paper • 2402.10200 • Published • 109 -
ICDPO: Effectively Borrowing Alignment Capability of Others via In-context Direct Preference Optimization
Paper • 2402.09320 • Published • 6 -
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper • 2402.03620 • Published • 117
-
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Paper • 2404.02575 • Published • 50 -
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55 -
SnapKV: LLM Knows What You are Looking for Before Generation
Paper • 2404.14469 • Published • 27 -
FlowMind: Automatic Workflow Generation with LLMs
Paper • 2404.13050 • Published • 34
-
Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs
Paper • 2312.17080 • Published • 1 -
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55 -
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
Paper • 2404.16790 • Published • 10 -
A Thorough Examination of Decoding Methods in the Era of LLMs
Paper • 2402.06925 • Published • 1
-
Can large language models explore in-context?
Paper • 2403.15371 • Published • 33 -
Advancing LLM Reasoning Generalists with Preference Trees
Paper • 2404.02078 • Published • 46 -
Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 37 -
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Paper • 2404.03715 • Published • 62
-
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper • 2402.17753 • Published • 20 -
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 29 -
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Paper • 2402.16837 • Published • 29 -
Divide-or-Conquer? Which Part Should You Distill Your LLM?
Paper • 2402.15000 • Published • 24
-
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
Paper • 2402.14848 • Published • 20 -
Teaching Large Language Models to Reason with Reinforcement Learning
Paper • 2403.04642 • Published • 50 -
How Far Are We from Intelligent Visual Deductive Reasoning?
Paper • 2403.04732 • Published • 23 -
Learning to Reason and Memorize with Self-Notes
Paper • 2305.00833 • Published • 5