Light R1 14B DS : An open-source 14B-parameter mathematical model, trained using reinforcement learning, with excellent performance.

Light R1 14B DS

AI Model Research Tools #Reinforcement Learning #Mathematical Model #Open Source #Natural Language Processing #Education Standard Picks Open Source

Overview :

Light-R1-14B-DS is an open-source mathematical model developed by Qihoo 360 Technology Co., Ltd. Trained using reinforcement learning based on DeepSeek-R1-Distill-Qwen-14B, it achieved high scores of 74.0 and 60.2 on the AIME24 and AIME25 mathematics competition benchmarks, respectively, surpassing many 32B parameter models. It successfully implemented reinforcement learning on an already long-chain reasoning fine-tuned model under a lightweight budget, providing the open-source community with a powerful mathematical model tool. Its open-source nature promotes the application of natural language processing in education, particularly in mathematical problem-solving, offering researchers and developers valuable research foundations and practical tools.

Target Users :

This model is suitable for researchers and developers in natural language processing, especially those focusing on mathematical problem-solving, educational applications, and reinforcement learning. It provides an excellent reference for teams aiming for high-performance model training on a lightweight budget, enabling quick adoption and research and development.

Total Visits： 25.3M

Top Region： US(17.94%)

Website Views ： 65.7K

Use Cases

Researchers can utilize this model to research and improve mathematical problem-solving algorithms.

Developers can build educational applications based on this model to help students better solve mathematical problems.

Businesses can apply this model to intelligent customer service systems to improve the ability to answer math-related questions.

Features

Reinforcement learning-based long-chain reasoning training enhances mathematical problem-solving capabilities.

Open-source model facilitates secondary development and research by researchers and developers.

Excellent performance in mathematical benchmark tests such as AIME24 and AIME25, with high accuracy.

Supports efficient training under a lightweight budget, reducing computational costs.

Provides detailed training logs and technical reports for easy understanding and reproducibility.

How to Use

1. Visit the Hugging Face website and locate the Light-R1-14B-DS model page.

2. Download the model files and related resources, including training logs and technical reports.

3. Load the model using a supported framework, such as PyTorch or TensorFlow.

4. Fine-tune the model or apply it directly to mathematical problem-solving tasks based on specific needs.

5. Refer to the technical report and training logs to understand the model's training process and optimization methods for better use and improvement.