Since its release at the end of 2025, the "Doubao AI Phone," a collaboration between ByteDance and ZTE, has not only sparked heated discussion in the Chinese market; its distinctive AI agent operating model has also drawn significant attention from the Western tech community. At a recent Asia Society seminar, several US and Chinese technology policy experts likened it to a "DeepSeek moment" for the hardware industry. They called the product a "game changer" capable of rewriting the industry's rules, but also pointed to the severe ecosystem challenges it faces in deployment and adoption.
From "dialogue" to "action": true OS-level AI
Paul Triolo, a partner at consulting firm DGA Group, pointed out that the biggest breakthrough of the Doubao AI phone lies in moving AI interaction from simple "text dialogue" to "operating-system-level action."
Unlike traditional phone voice assistants, which can only passively answer questions or execute simple commands, the Doubao AI phone emphasizes multi-agent collaboration.
For example, in a test, American investor Taylor Ogan simply said "Find someone to queue for me" in English, and his phone understood the intention, autonomously searching for services across different apps, placing orders, and setting tasks, all without any manual intervention from the user. This ability to "understand human language and get things done for you" is seen by experts as a major paradigm shift in the interaction logic of mobile devices.
The GUI Agent approach challenges the internet's "walled gardens"
However, technological innovation has run into significant challenges from commercial reality. Samm Sacks, a researcher at Yale Law School, points out that although China is leading the way in AI agents, its highly fragmented mobile ecosystem makes it the most difficult market in which to deploy them.
The Doubao AI phone uses GUI Agent (graphical user interface agent) technology to operate third-party apps by simulating human taps and clicks. This model directly bypasses the "user dwell time," "ad clicks," and "traffic redirection paths" that traditional apps rely on, striking at the business models of the major internet companies.
This also explains why several mainstream apps quickly blocked the Doubao AI phone after its launch (for example, by forcing logouts or refusing logins). Experts believe this is not simply a matter of security and privacy, but a battle over "user attention" and "data control." The major platforms are unwilling to give up their hard-earned "walled gardens," and even less willing to let a phone maker's AI become the sole entry point for traffic.
Whoever sets the standards defines the future
Beyond commercial competition, this contest also points to the right to set global standards for AI agent services.
Currently, there is no globally unified standard for AI agent interaction (for example, how can an AI prove it is authorized, and how can cross-application collaboration be made secure?). Although Anthropic launched the open-source Model Context Protocol (MCP), which subsequently gained support from companies such as Google and Microsoft, and the Linux Foundation established the Agentic AI Foundation, there is still no consensus for mobile devices.
Paul Triolo emphasized that just as TCP/IP enabled the interconnection of the Internet, the widespread adoption of AI agents will inevitably require a set of cross-border and cross-system interoperability standards. If Chinese manufacturers take the lead in establishing feasible operating specifications in this wave of hardware development, these standards are likely to expand globally, thereby defining the rules of the game for future AI devices.
Analysis
The predicament facing the Doubao AI phone foreshadows the core conflict that global tech giants will encounter over the next few years: "super gateway vs. super app."
For the past decade, we've been accustomed to opening individual apps to solve problems; however, the ultimate vision of generative AI is to help you manage all apps through a "super assistant." This presents an opportunity for companies like ByteDance that are trying to control a "new gateway," but for Chinese giants like Tencent and Alibaba, and even American giants like Meta and Google, it represents a potential threat to their survival through "disintermediation."
The so-called "DeepSeek moment" is not just about technological leadership, but also about forcing the industry to confront the clash between "AI agents" and "closed ecosystems." This battle has started in China, but the same drama is likely to play out between Gemini, ChatGPT, Apple Intelligence, and third-party apps in the future. Whoever first finds a balance among privacy and security, commercial interests, and user convenience will be the real winner.



