

Yuan2.0-M32-hf-int8
Overview
Yuan2.0-M32-hf-int8 is a mixture-of-experts (MoE) language model with 32 experts, of which 2 are active per token. By adopting a new routing network, the attention router, it makes expert selection more effective, yielding an accuracy improvement of 3.8% over models that use a classical routing network. Yuan2.0-M32 was trained from scratch on 2 trillion (2000B) tokens, and its training computation demand is only 9.25% of that of a dense model at the same parameter scale. The model is competitive in programming, mathematics, and various specialized fields while activating only 3.7 billion of its 40 billion total parameters. Forward computation per token requires only 7.4 GFLOPs, about 1/19th of what Llama3-70B demands. Yuan2.0-M32 outperforms Llama3-70B on the MATH and ARC-Challenge benchmarks, reaching accuracies of 55.9% and 95.8%, respectively.
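As a quick sanity check on these figures, a dense transformer forward pass costs roughly 2 FLOPs per active parameter per token (one multiply and one add per weight); applying that rule of thumb to the 3.7 billion active parameters reproduces the quoted ~7.4 GFLOPs. The snippet below is only a back-of-the-envelope illustration of that relationship, not a measured number.

```python
# Rough check: forward FLOPs per token ≈ 2 × active parameters.
# The 2× factor is a common rule of thumb, not a figure from the Yuan2.0-M32 report.
active_params = 3.7e9                      # active parameters per token
flops_per_token = 2 * active_params        # ≈ 7.4e9 FLOPs
print(f"~{flops_per_token / 1e9:.1f} GFLOPs per token")  # prints "~7.4 GFLOPs per token"
```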
Target Users
The Yuan2.0-M32-hf-int8 model is designed for developers and researchers who need to handle large volumes of data and complex tasks, particularly in programming, mathematics, and specialized fields. Its high efficiency and accuracy make it an ideal choice in these areas.
Use Cases
Developing complex programming projects with more accurate code generation.
Precise calculation and reasoning for solving mathematical problems.
Knowledge acquisition and text generation in specialized professional fields.
Features
Only 2 of 32 experts are active per token, keeping per-token compute low.
Uses an attention router for expert selection, improving accuracy by 3.8% over a classical routing network (see the conceptual sketch after this list).
Trained from scratch on 2 trillion (2000B) tokens.
Training computation cost is only 9.25% of that of a dense model at the same parameter scale.
Competitive performance in programming, mathematics, and other fields.
Excels in MATH and ARC-Challenge benchmark tests.
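For intuition only, the sketch below shows one way an attention-style router can score experts: a query derived from the token's hidden state attends over learnable expert embeddings, and the two highest-scoring experts are selected. The class name, dimensions, and projection layout here are illustrative assumptions, not the actual Yuan2.0-M32 attention router; refer to the paper and repository for the real design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionStyleRouter(nn.Module):
    """Toy attention-style router: token queries attend over learnable expert keys."""
    def __init__(self, hidden_dim: int, num_experts: int = 32, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)                     # token -> query
        self.expert_keys = nn.Parameter(torch.randn(num_experts, hidden_dim))   # one embedding per expert

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, hidden_dim)
        q = self.query_proj(hidden_states)                                      # (batch, hidden_dim)
        scores = q @ self.expert_keys.t() / self.expert_keys.shape[-1] ** 0.5   # (batch, num_experts)
        probs = F.softmax(scores, dim=-1)
        top_probs, top_idx = probs.topk(self.top_k, dim=-1)                     # keep the 2 best experts
        gates = top_probs / top_probs.sum(dim=-1, keepdim=True)                 # renormalize gate weights
        return top_idx, gates

router = AttentionStyleRouter(hidden_dim=2048)
expert_ids, gate_weights = router(torch.randn(4, 2048))
print(expert_ids.shape, gate_weights.shape)  # torch.Size([4, 2]) torch.Size([4, 2])
```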
How to Use
1. Set up the environment and start the Yuan2.0 container using the recommended Docker image.
2. Perform data preprocessing according to the provided scripts.
3. Use example scripts for model pre-training.
4. Follow the vLLM documentation for detailed deployment to provide inference services (a minimal sketch follows this list).
5. Visit the GitHub repository for more information.
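As a starting point for step 4, the snippet below sketches offline inference with vLLM. The repository id `IEITYuan/Yuan2-M32-hf-int8`, the `max_model_len` value, and the need for `trust_remote_code` are assumptions; verify them against the Yuan2.0 vLLM deployment guide before use.

```python
# Minimal vLLM offline-inference sketch (assumed model id and settings;
# check the official Yuan2.0 vLLM deployment docs for the exact configuration).
from vllm import LLM, SamplingParams

llm = LLM(
    model="IEITYuan/Yuan2-M32-hf-int8",  # assumed Hugging Face repo id
    trust_remote_code=True,              # Yuan models ship custom modeling code
    max_model_len=4096,                  # assumed context length for this demo
)

sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)
outputs = llm.generate(
    ["Write a Python function that checks whether a number is prime."],
    sampling,
)
print(outputs[0].outputs[0].text)
```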