Yuan2.0-M32
Overview
Yuan2.0-M32 is a mixture-of-experts (MoE) language model with 32 experts, of which 2 are active per token. It introduces a novel routing network, the attention router, which improves expert selection and yields a 3.8% accuracy gain over a classical linear router. The model is trained from scratch on 2000B tokens, with a training compute budget only 9.25% of that required by a dense model of the same parameter scale. It delivers competitive performance in coding, mathematics, and various specialized fields while using just 3.7B active parameters, and its forward pass needs only 7.4 GFLOPs per token, 1/19 of what Llama3-70B requires. It surpasses Llama3-70B on the MATH and ARC-Challenge benchmarks, scoring 55.9% and 95.8%, respectively.
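To make the attention-router idea concrete, the PyTorch sketch below scores experts with a small attention step instead of a single linear layer and then routes each token to its top-2 experts. This is a hypothetical illustration, not the paper's exact formulation: the class name, the three projections, and the dimensions are all assumptions.

import torch
import torch.nn as nn

class AttentionRouter(nn.Module):
    """Hypothetical sketch of an attention-style MoE router.

    The real Yuan2.0-M32 router differs in detail; this only illustrates
    scoring experts with an attention step (so expert correlations can
    influence the routing) rather than a single linear gate.
    """

    def __init__(self, hidden_dim: int, num_experts: int = 32, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Three projections play query/key/value roles over the expert axis.
        self.w_q = nn.Linear(hidden_dim, num_experts, bias=False)
        self.w_k = nn.Linear(hidden_dim, num_experts, bias=False)
        self.w_v = nn.Linear(hidden_dim, num_experts, bias=False)

    def forward(self, x: torch.Tensor):
        # x: (batch, hidden_dim) -> per-expert coefficients: (batch, num_experts)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        # Attention over the expert axis models correlations between experts.
        scores = torch.softmax(q.unsqueeze(-1) * k.unsqueeze(-2), dim=-1)  # (B, N, N)
        logits = torch.einsum("bnm,bm->bn", scores, v)                     # (B, N)
        # Route each token to its top-k experts, renormalising their weights.
        weights, experts = torch.topk(torch.softmax(logits, dim=-1), self.top_k)
        return weights / weights.sum(dim=-1, keepdim=True), experts

router = AttentionRouter(hidden_dim=2048)
w, idx = router(torch.randn(4, 2048))
print(idx.shape)  # torch.Size([4, 2]) -> 2 active experts per token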
Target Users
Yuan2.0-M32 is designed for developers and researchers who need efficient computation and reasoning in coding, mathematics, and specialized fields. Its low computational demand and high accuracy make it well suited to large-scale language model applications.
Use Cases
Developing natural language understanding applications.
Providing precise computational support for solving mathematical problems.
Serving as an assistive tool for knowledge acquisition and reasoning in specialized fields.
Features
Mixture-of-experts (MoE) model with 32 experts, of which 2 are active per token.
Uses a novel attention router network to improve the efficiency of expert selection.
Trained from scratch on 2000B tokens at only 9.25% of the training compute of a comparable dense model.
Delivers competitive performance in coding, mathematics, and specialized fields.
Outperforms Llama3-70B on the MATH and ARC-Challenge benchmarks.
Runs with just 3.7B active parameters, about 7.4 GFLOPs per token (see the quick arithmetic check after this list).
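The 7.4 GFLOPs figure is consistent with the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token. The snippet below is a back-of-the-envelope check under that assumption, not the authors' exact accounting.

# Rough sanity check of the per-token compute claim, assuming the
# ~2-FLOPs-per-active-parameter rule of thumb for a forward pass.
active_params = 3.7e9                     # Yuan2.0-M32 active parameters
flops_per_token = 2 * active_params       # ~7.4e9 FLOPs
print(f"{flops_per_token / 1e9:.1f} GFLOPs per token")   # -> 7.4

llama3_70b = 2 * 70e9                     # same rule applied to Llama3-70B
print(f"ratio ~ 1/{llama3_70b / flops_per_token:.0f}")   # -> 1/19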
How to Use
1. Set up the environment; the latest Yuan2.0-M32 Docker image is recommended.
2. Preprocess the data with the provided scripts.
3. Use the sample scripts for model pre-training.
4. Follow the detailed vLLM deployment guide to stand up the inference service (a minimal sketch follows this list).
5. Check the GitHub repository for further information and documentation.
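As a rough illustration of step 4, here is a minimal vLLM inference sketch in Python. The model ID, sampling settings, and prompt are assumptions for illustration; consult the repository's deployment guide for the exact checkpoint name and serving configuration.

# Minimal vLLM inference sketch (assumes vllm is installed and the
# checkpoint is on Hugging Face; the model ID below is a guess -- check
# the Yuan2.0-M32 GitHub repo for the exact ID and serving flags).
from vllm import LLM, SamplingParams

llm = LLM(model="IEITYuan/Yuan2-M32-hf", trust_remote_code=True)
params = SamplingParams(temperature=0.8, max_tokens=256)

outputs = llm.generate(["Write a Python function that checks for primes."], params)
print(outputs[0].outputs[0].text)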