

Phi-3 Mini 128K Instruct ONNX
Overview:
Phi-3 Mini is a lightweight, state-of-the-art open model built on the synthetic data and filtered web data used for Phi-2, with a focus on high-quality, reasoning-dense data. It belongs to the Phi-3 family, and the Mini version comes in two variants supporting 4K and 128K context lengths. The model has undergone a rigorous enhancement process, including supervised fine-tuning and direct preference optimization, to ensure precise instruction following and robust safety measures. These ONNX-optimized Phi-3 Mini models run efficiently on CPUs, GPUs, and mobile devices. Microsoft has also introduced the ONNX Runtime Generate() API, which simplifies running Phi-3.
Target Users:
["? Machine learning researchers and developers can leverage this optimized model to enhance inference performance","? Enterprises and organizations that need to deploy large language models on various devices (servers, Windows, Linux, Mac, mobile devices)","? Professionals in dialogue systems, Q&A systems, content generation, and other tasks can utilize this model to generate high-quality outputs","? Any application requiring natural language processing can benefit from the model's powerful performance"]
Use Cases
1. A technology company can use the Phi-3 Mini model to build high-performance conversational agents to provide automated Q&A services to customers.
2. A news agency can leverage the model to automatically generate high-quality news article summaries and headlines.
3. Researchers can use the model to conduct natural language processing experiments and studies, exploring new applications of language models.
Features
- Supports the ONNX format to accelerate inference on CPUs, GPUs, and mobile devices
- Provides multiple optimization configurations, including int4 quantization for DirectML, fp16 and int4 quantization for NVIDIA GPUs, and int4 quantization for CPUs and mobile devices (see the download sketch after this list)
- Enhanced through supervised fine-tuning and direct preference optimization to ensure precise instruction following and robust safety
- A lightweight design focused on high-quality, reasoning-dense data
- Offers the new ONNX Runtime Generate() API to simplify running Phi-3
- Performance tested and optimized on a variety of hardware and platforms
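
As a rough illustration of fetching one of these optimization variants, the sketch below uses the huggingface_hub package to download only the CPU/mobile int4 files. The repository id is the published one; the subfolder name is an assumption based on the variant naming in that repository, so check the repository listing before relying on it.

```python
# Minimal sketch: download a single ONNX variant rather than the whole repo.
# Assumption: the subfolder "cpu_and_mobile/cpu-int4-rtn-block-32" matches the
# variant layout published in the Hugging Face repository; verify before use.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="microsoft/Phi-3-mini-128k-instruct-onnx",
    allow_patterns=["cpu_and_mobile/cpu-int4-rtn-block-32/*"],
)
print("Model files downloaded to:", local_dir)
```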
How to Use
1. Download the ONNX model files suited to your hardware configuration from the Hugging Face model repository.
2. Install the necessary Python packages, such as onnxruntime-genai, which provides the Generate() API.
3. Load the model and run inference through the ONNX Runtime Generate() API (see the sketch after this list).
4. Prepare your input text or instructions.
5. Call the model to make predictions or generate output.
6. Post-process the output results if necessary.
7. Integrate the generated output into your application or workflow.
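
A minimal sketch of steps 2 through 5, assuming the onnxruntime-genai package: the Model/Tokenizer/Generator flow below follows Microsoft's published examples, but exact call names have shifted between onnxruntime-genai releases, and the model path is a placeholder for whichever variant you downloaded.

```python
# Minimal sketch of text generation with the ONNX Runtime Generate() API,
# following the pattern in Microsoft's onnxruntime-genai examples.
# Assumptions: onnxruntime-genai ~0.3-era API; the model path is a placeholder.
import onnxruntime_genai as og

model = og.Model("cpu_and_mobile/cpu-int4-rtn-block-32")  # downloaded variant
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

# Phi-3 expects a chat template with <|user|> / <|assistant|> markers.
prompt = "<|user|>\nSummarize the benefits of int4 quantization.<|end|>\n<|assistant|>\n"

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)
params.input_ids = tokenizer.encode(prompt)

generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()
    # Decode and print each token as it streams out.
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
```

The same flow applies to the GPU variants; installing the CUDA or DirectML build of onnxruntime-genai selects the corresponding execution provider.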