autoglm

Last Updated: 2025-04-15

Information

## AutoGLM: Autonomous Foundation Agents for GUIs

AutoGLM is a new series developed from the ChatGLM family that targets autonomous mission-completion agents operating through Graphical User Interfaces (GUIs), such as phones and the web. Its web-use ability will be made progressively available to the public via the Qingyan Plugin, and its phone-use ability on Android is currently in invitation-only internal testing (application forms are available for mainland China and for outside mainland China).

We present AutoGLM, a new series in the ChatGLM family [glm2024chatglm], designed to serve as foundation agents for autonomous control of digital devices through Graphical User Interfaces (GUIs). While foundation models excel at acquiring human knowledge, they often struggle with decision-making in dynamic real-world environments, limiting their progress toward artificial general intelligence. This limitation underscores the importance of developing foundation agents capable of learning through autonomous environmental interactions by reinforcing existing models. Focusing on the web browser and Android as representative GUI scenarios, we have developed AutoGLM as a practical foundation agent system for real-world GUI interactions. Our approach integrates a comprehensive suite of techniques and infrastructures to create deployable agent systems suitable for user delivery.

Through this development, we have derived two key insights. First, the design of an appropriate "intermediate interface" for GUI control is crucial: it enables the separation of planning and grounding behaviors, which require distinct optimization for flexibility and accuracy, respectively. Second, we have developed a novel progressive training framework that enables self-evolving online curriculum reinforcement learning with AutoGLM.

Our evaluations demonstrate AutoGLM's effectiveness across multiple domains. For web browsing, AutoGLM achieves a 55.2% success rate on VAB-WebArena-Lite (improving to 59.1% with a second attempt) and 96.2% on OpenTable evaluation tasks. In Android device control, AutoGLM attains a 36.2% success rate on AndroidLab (VAB-Mobile) and 89.7% on common tasks in popular Chinese apps.
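
The separation of planning and grounding described above can be illustrated with a small sketch. This is not AutoGLM's implementation: the class names (`PlannedAction`, `UIElement`, `GroundedAction`), the `ground` function, the word-overlap matching rule, and the screen coordinates are all hypothetical, chosen only to show how an intermediate interface lets a planner describe a target in words while a separate grounder resolves it to a concrete screen position.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlannedAction:
    """Hypothetical planner output: what to do, in words, with no coordinates."""
    kind: str                   # e.g. "tap" or "type"
    target: str                 # natural-language description of the element
    text: Optional[str] = None  # text to enter, if any


@dataclass(frozen=True)
class UIElement:
    """Hypothetical on-screen element with a label and a centre point."""
    label: str
    x: int
    y: int


@dataclass(frozen=True)
class GroundedAction:
    """Hypothetical executable action: the same intent, now with coordinates."""
    kind: str
    x: int
    y: int
    text: Optional[str] = None


def ground(action: PlannedAction, screen: List[UIElement]) -> GroundedAction:
    """Toy grounder: pick the element whose label shares the most words
    with the planner's description. A real grounder would be a learned model."""
    target_words = set(action.target.lower().split())

    def overlap(element: UIElement) -> int:
        return len(target_words & set(element.label.lower().split()))

    best = max(screen, key=overlap)
    if overlap(best) == 0:
        raise LookupError(f"no element matches {action.target!r}")
    return GroundedAction(action.kind, best.x, best.y, action.text)


# Illustrative screen and plan (made-up labels and coordinates).
screen = [
    UIElement("Search", 540, 120),
    UIElement("Coconut Latte", 300, 800),
    UIElement("Add to cart", 540, 1800),
]
plan = [
    PlannedAction("tap", "coconut latte"),
    PlannedAction("tap", "add to cart button"),
]
grounded = [ground(step, screen) for step in plan]
```

Because the planner never emits coordinates, the two halves can be changed independently in this sketch: the plan stays valid if the screen layout moves, and the grounder can be made more accurate without touching the planning logic.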

Prompts

1

Order a hot coconut latte from Luckin Coffee with half sugar

Reviews

  • chenxz18 2025-01-06 09:12
    Interesting: 5, Helpfulness: 5, Correctness: 5
    Prompt: Order a hot coconut latte from Luckin Coffee with half sugar

    After reading the paper and trying some of the AutoGLM demos, I found the idea of using an LLM to convert prompts into user actions in Android mobile apps very promising. It's great work that deserves to be followed.
