Mobile-Agent
Overview:
Mobile-Agent is an autonomous multi-modal mobile device agent built on Multi-Modal Large Language Model (MLLM) technology. It first uses visual perception tools to recognize and locate the visual and textual elements on an application's front-end interface. Based on the perceived visual context, it then autonomously plans and decomposes complex operational tasks and navigates mobile applications step by step. Unlike previous solutions that rely on application-specific XML files or mobile system metadata, Mobile-Agent's vision-centric approach adapts more readily to different mobile operating environments and requires no system-specific customization. To evaluate Mobile-Agent, the authors introduced Mobile-Eval, a benchmark for evaluating mobile device operations, and ran a comprehensive evaluation on it. The experimental results show that Mobile-Agent achieved high accuracy and completion rates, and that it could still fulfill challenging instructions such as multi-app operations.
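The perceive, plan, and operate cycle described above can be illustrated with a minimal sketch. This is not Mobile-Agent's actual code: the `Action` type, the `run_agent` loop, the `FakeDevice` class, and the rule-based `toy_plan` function are all hypothetical names invented for illustration. In a real agent the planner would be an MLLM call, and perception would run on a screenshot.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "tap" or "stop" (hypothetical action names)
    target: str = ""   # label of the on-screen element to act on


def run_agent(
    instruction: str,
    perceive: Callable[[], List[str]],
    plan: Callable[[str, List[str], List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, choose one action, execute it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        visible = perceive()                          # elements seen on the current screen
        action = plan(instruction, visible, history)  # an MLLM call in a real agent
        if action.kind == "stop":
            break
        execute(action)
        history.append(action)
    return history


class FakeDevice:
    """Toy stand-in for a phone: each screen lists its visible element labels."""

    def __init__(self) -> None:
        self.screen = "home"
        self.screens = {
            "home": ["Settings", "Camera"],
            "settings": ["Wi-Fi", "Bluetooth"],
        }

    def perceive(self) -> List[str]:
        return list(self.screens[self.screen])

    def execute(self, action: Action) -> None:
        # Only one transition is modeled: tapping "Settings" opens the settings screen.
        if action.kind == "tap" and action.target == "Settings":
            self.screen = "settings"


def toy_plan(instruction: str, visible: List[str], history: List[Action]) -> Action:
    """Rule-based placeholder for the planner; it ignores the instruction text."""
    if history and history[-1].target == "Wi-Fi":
        return Action("stop")                 # goal element was already tapped
    if "Wi-Fi" in visible:
        return Action("tap", target="Wi-Fi")
    if "Settings" in visible:
        return Action("tap", target="Settings")
    return Action("stop")


device = FakeDevice()
steps = run_agent("Open the Wi-Fi settings", device.perceive, toy_plan, device.execute)
print([(a.kind, a.target) for a in steps])
```

The loop passes the action history back to the planner so that each step can depend on what has already been done, which is one simple way to decompose a multi-step task.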
Target Users:
Mobile-Agent can be used to automate mobile device operations, evaluate mobile device performance, and improve the adaptability of mobile applications.
Use Cases
Automation of Mobile Device Operations: Mobile-Agent can be used to automate the execution of tasks within mobile applications, increasing efficiency.
Mobile Device Performance Evaluation: Mobile-Agent can be used to evaluate operations on mobile devices, with the aim of improving performance.
Improved Adaptability of Mobile Applications: Mobile-Agent can help mobile applications achieve greater adaptability across different environments.
Features
Leverages Multi-Modal Large Language Model (MLLM) technology
Utilizes visual perception tools to accurately recognize and locate visual and textual elements on the front-end interface of applications
Autonomously plans and decomposes complex operational tasks
Navigates mobile applications through step-by-step operations
Offers greater adaptability, eliminating the need for customization to specific systems
Introduces Mobile-Eval, a benchmark for evaluating mobile device operations
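As an illustration of the "recognize and locate" feature, the sketch below turns text detections into a tap coordinate. It assumes a separate perception step has already produced text boxes for a screenshot. The `Detection` type and `find_tap_point` function are hypothetical names, not part of Mobile-Agent, and the sample boxes are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Detection:
    text: str                        # recognized text
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


def find_tap_point(detections: List[Detection], query: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first box whose text contains the query.

    Matching is case-insensitive; returns None when nothing matches.
    """
    needle = query.strip().lower()
    if not needle:
        return None
    for det in detections:
        if needle in det.text.lower():
            left, top, right, bottom = det.box
            return ((left + right) // 2, (top + bottom) // 2)
    return None


# Made-up detections for a home screen with two labeled icons.
screen = [
    Detection("Settings", (40, 100, 200, 160)),
    Detection("Camera", (240, 100, 400, 160)),
]
print(find_tap_point(screen, "camera"))
```

Because the coordinate comes from what is visible on screen rather than from XML layout files or system metadata, the same lookup applies to any app whose text can be detected.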