

Yuan2.0-M32 HF Int4
Overview
Yuan2.0-M32 is a mixture-of-experts (MoE) language model with 32 experts, of which 2 are active per token. It introduces a new routing network, an attention router, to improve the efficiency of expert selection, yielding a 3.8% accuracy gain over models using a classical routing network. Yuan2.0-M32 was trained from scratch, at a training compute cost of only 9.25% of that required by a dense model of the same parameter scale. With only 3.7 billion active parameters out of 40 billion in total, and a forward computation of just 7.4 GFLOPs per token (roughly 1/19th of Llama3-70B's), it demonstrates competitive performance in coding, mathematics, and various professional fields. On the MATH and ARC-Challenge benchmarks, Yuan2.0-M32 surpassed Llama3-70B, achieving accuracies of 55.9% and 95.8%, respectively.
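The top-2 expert selection described above can be illustrated with a toy routing sketch. This is a plain scaled-dot-product scorer, not the paper's exact attention-router formulation; all names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, D, TOP_K = 32, 64, 2  # 32 experts, 2 active, as in Yuan2.0-M32

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def route(token_h, expert_emb, top_k=TOP_K):
    """Score each expert against the token with scaled dot products,
    then keep the top-k experts and renormalize their weights."""
    scores = expert_emb @ token_h / np.sqrt(D)   # (N_EXPERTS,)
    probs = softmax(scores)
    top = np.argsort(probs)[-top_k:][::-1]       # indices of the top-k experts
    weights = probs[top] / probs[top].sum()      # renormalize over chosen experts
    return top, weights

token = rng.standard_normal(D)
experts = rng.standard_normal((N_EXPERTS, D))
idx, w = route(token, experts)
print(len(idx))            # 2 experts selected
print(float(w.sum()))      # 1.0 — weights renormalized over the chosen pair
```

A real router would learn the expert embeddings and combine the selected experts' outputs with these weights; here they are random, purely to show the top-2 mechanics.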
Target Users
The Yuan2.0-M32 model is designed for developers and researchers who need to handle vast amounts of data and complex computational tasks, particularly in programming, mathematical calculations, and specialized fields. Its high efficiency and low computational requirements make it an ideal choice for large-scale language model applications.
Use Cases
In the programming domain, Yuan2.0-M32 can be used for code generation and code quality assessment.
In mathematics, the model can solve complex mathematical problems and perform logical reasoning.
In specialized fields such as healthcare or law, Yuan2.0-M32 can assist professionals in knowledge retrieval and document analysis.
Features
Mixture-of-experts (MoE) model with 32 experts, 2 of which are active per token.
Uses an attention router for more efficient expert selection.
Trained from scratch.
Training compute only 9.25% of that of a similarly sized dense model.
Shows competitive performance in coding, mathematics, and specialized areas.
Low forward computation cost of only 7.4 GFLOPs per token.
Excels in MATH and ARC-Challenge benchmark tests.
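The compute figures above can be sanity-checked with the common rule of thumb that a dense forward pass costs about 2 FLOPs per active parameter per token (the 2x rule is an assumption for illustration, not stated in the source):

```python
# Rule of thumb (assumption): ~2 FLOPs per active parameter per token
# for a forward pass.
FLOPS_PER_PARAM = 2.0

yuan_m32_gflops = 7.4                       # per-token forward cost from the text
llama3_70b_gflops = FLOPS_PER_PARAM * 70    # 70B dense params -> ~140 GFLOPs/token

ratio = llama3_70b_gflops / yuan_m32_gflops
print(round(ratio))             # 19 — matches the "1/19th of Llama3-70B" claim

implied_active_b = yuan_m32_gflops / FLOPS_PER_PARAM
print(implied_active_b)         # 3.7 — billions of active parameters implied
```

The same rule run backwards from 7.4 GFLOPs implies about 3.7B active parameters, which is consistent with the model's stated active-parameter count.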
How to Use
1. Set up the environment by launching the Yuan2.0 container using the recommended Docker image.
2. Prepare the data according to the documentation instructions.
3. Use the provided scripts to pre-train the model.
4. Deploy inference services with vLLM, following the detailed deployment guide.
5. Visit the GitHub repository for additional information and documentation.
6. Comply with the Apache 2.0 open-source license and review the 'Yuan2.0 Model License Agreement'.
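Steps 1 and 4 might look roughly like the following shell commands. The Docker image tag and the model id are hypothetical placeholders; check the GitHub repository and the model card for the actual names:

```shell
# Step 1: launch the recommended Yuan2.0 container
# (image name below is a placeholder).
docker run --gpus all -it --rm yuanmodel/yuan2.0:latest bash

# Step 4: serve the model via vLLM's OpenAI-compatible API server
# (model id is a placeholder; --trust-remote-code allows the model's
# custom code to load).
python -m vllm.entrypoints.openai.api_server \
  --model IEITYuan/Yuan2-M32-hf-int4 \
  --trust-remote-code
```

Once the server is up, any OpenAI-compatible client can send completion requests to it.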