START is a tool-enhanced reasoning model that improves the reasoning capabilities of large language models by integrating external tools like Python code executors.
What is START?
START (Self-Taught Reasoner with Tools) is a novel reasoning model developed by Alibaba Group and the University of Science and Technology of China. It enhances the reasoning capabilities of large language models (LLMs) by integrating external tools such as Python code executors. START employs the "Hint-infer" technique to insert prompts during the reasoning process that encourage the model to use external tools, and it uses the "Hint-RFT" framework for self-learning and fine-tuning. By adding tool invocation on top of long chain-of-thought reasoning (long CoT), START significantly improves accuracy and efficiency on complex mathematical problems, scientific questions, and programming challenges. It has outperformed existing models on multiple benchmarks and is the first open-source model to combine long-chain reasoning with tool integration.

Main Features of START
- Complex Calculations and Verification: Calls Python code executors to perform complex mathematical calculations, logical verifications, and simulations.
- Self-Debugging and Optimization: Uses tools to execute code and verify outputs, automatically detecting errors and debugging to improve answer accuracy.
- Multi-Strategy Exploration: Guides the model to try various reasoning paths and methods based on hints, enhancing flexibility and adaptability when facing complex problems.
- Improved Reasoning Reliability: Reduces hallucinations in complex tasks through tool invocation and self-verification, improving both the efficiency and the reliability of reasoning.
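The first two features above describe a generate-execute-verify loop: the model writes a code snippet, an executor runs it, and any error message is fed back so the model can propose a fix. A minimal sketch of that loop is shown below; the function names are hypothetical, `exec()` stands in for START's sandboxed executor, and the "revised" snippet is hard-coded where the real system would re-prompt the model.

```python
import contextlib
import io


def run_snippet(code: str) -> tuple[bool, str]:
    """Execute a model-generated Python snippet and capture its stdout.

    Returns (success, output). A real deployment would use a sandboxed
    executor; exec() here is only for illustration.
    """
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
        return True, buf.getvalue().strip()
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def reason_with_tool(snippets: list[str]) -> str:
    """Try candidate snippets in order, mimicking self-debugging:
    when one fails, its error message would be fed back to the model,
    which then proposes the next candidate."""
    feedback = ""
    for code in snippets:
        ok, out = run_snippet(code)
        if ok:
            return out
        feedback = out  # error text returned to the model as feedback
    return f"unresolved: {feedback}"


# The first snippet has a typo (NameError); the "revised" one succeeds.
buggy = "print(facorial(10))"
fixed = "import math\nprint(math.factorial(10))"
print(reason_with_tool([buggy, fixed]))  # → 3628800
```

In the actual model, the error string is appended to the reasoning context rather than handled by a Python loop, but the control flow is the same: execute, inspect the result, and retry until the output can be trusted as verification.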
Technical Principles of START
- Long-Chain Reasoning: Inherits the advantages of long-chain reasoning, breaking down problems into multiple intermediate reasoning steps to simulate deep human thinking and improve reasoning capabilities in complex tasks.
- Tool Integration: Compensates for the shortcomings of traditional long-chain reasoning by invoking external tools like Python code executors. The model generates code during reasoning and uses tools to verify results.
- Hint-infer: Inserts manually designed hints during the reasoning process to encourage the model to invoke external tools, guiding it to call tools at specific points without requiring additional demonstration data.
- Hint-RFT: Combines Hint-infer with Rejection Sampling Fine-Tuning (RFT) to score, filter, and modify the reasoning trajectories generated by the model, further optimizing the model's tool usage capabilities.
- Self-Learning Framework: Uses active learning to select valuable data from the model's reasoning trajectories for fine-tuning, enabling the model to learn how to use tools more effectively.
- Test-Time Expansion: Inserts hints at the end of reasoning to increase the model's thinking time and tool invocation frequency, improving reasoning accuracy and success rates.
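The Hint-infer and Hint-RFT steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the marker strings, hint text, and trajectory format are hypothetical, and the real Hint-RFT pipeline also scores and modifies trajectories rather than just filtering them.

```python
# Hypothetical markers; START's actual hint texts and stop tokens differ.
END_OF_THINK = "</think>"
TOOL_HINT = "\nWait, maybe I can verify this with Python.\n"


def hint_infer(partial_reasoning: str) -> str:
    """Hint-infer: splice a tool-use hint in just before the model would
    stop thinking, nudging it to invoke the code executor (this is also
    how test-time expansion lengthens thinking)."""
    if partial_reasoning.endswith(END_OF_THINK):
        body = partial_reasoning[: -len(END_OF_THINK)]
        return body + TOOL_HINT  # the model resumes decoding from here
    return partial_reasoning + TOOL_HINT


def hint_rft_filter(trajectories: list[dict], ground_truth: str) -> list[dict]:
    """Hint-RFT, reduced to its rejection-sampling core: keep only
    trajectories that both invoked the tool and reached the correct
    final answer; the kept set becomes fine-tuning data."""
    return [
        t for t in trajectories
        if "```python" in t["text"] and t["answer"] == ground_truth
    ]


trace = "The integral evaluates to 42." + END_OF_THINK
print(hint_infer(trace).endswith(TOOL_HINT))  # → True

samples = [
    {"text": "... ```python ... ``` ...", "answer": "42"},  # tool + correct
    {"text": "no tool call here", "answer": "42"},          # correct, no tool
    {"text": "... ```python ... ``` ...", "answer": "41"},  # tool, wrong
]
print(len(hint_rft_filter(samples, "42")))  # → 1
```

The key design point is that no human-written tool-use demonstrations are needed: hints elicit tool calls from the base model, and rejection sampling keeps only the trajectories worth learning from.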
Project Address of START
Application Scenarios of START
- Mathematical Problem Solving: Solves complex mathematical problems, such as math competitions and advanced mathematics, using code verification to improve accuracy.
- Scientific Research Assistance: Helps with complex calculations and scientific problems in physics, chemistry, biology, and other fields.
- Programming and Debugging: Generates code and automatically debugs to solve programming challenges, improving development efficiency.
- Interdisciplinary Problem Solving: Integrates knowledge from multiple disciplines to solve complex tasks in engineering design, data analysis, and more.
- Education and Learning: Serves as an intelligent tutoring tool to assist students in learning mathematics and science, providing detailed problem-solving processes and feedback.