As smartphone brands race to build AI into their calling features, traditional telecom operators are attempting a counterattack through the underlying network technology. At MWC 2026, the GSMA (GSM Association) released the white paper "Giga Uplink, Deterministic Latency, and Network Evolution in the Era of Mobile AI" at the 5G Futures Summit. The white paper defines the development direction of operator-delivered "native AI voice calling" (AI Calling) and, for the first time, proposes an evaluation standard for the AI call experience, including the QoI metric.
While tech giants like Google and Apple are trying to keep the added value of voice services on the handset, the GSMA is calling on telecom operators worldwide to combine 5G-Advanced with network computing power and turn AI calling into an "inclusive subscription service" that does not rely on high-end mobile phones.
Tech Giants' Encroachment: The Mobile OS Takes Over the Call Experience
Before delving into the GSMA white paper, let's first look at the prevailing trend in the smartphone market.
In recent years, innovation in the call experience has gradually shifted from telecom operators to mobile operating systems and device manufacturers. Google has long promoted the Call Screen and Direct My Call features on its Pixel phones, allowing Google's AI assistant to answer calls and transcribe conversations. Apple introduced Live Voicemail in iOS 17 and recently integrated call recording and AI summarization into Apple Intelligence. Samsung is also heavily promoting in-call "two-way real-time translation" with Galaxy AI.
These features, spearheaded by the major smartphone manufacturers, share a common characteristic: they rely heavily on the NPU computing power built into the phone. This means that consumers must buy the latest and most expensive flagship models to enjoy these AI-powered calling features.
GSMA's Counterattack: "Native AI Calling" that Works with Any Phone
Faced with the crisis of becoming merely a provider of "dumb pipes," the GSMA, through this newly released white paper, proposes a solution led by telecom operators: directly overlaying AI algorithms and computing power onto the native IMS voice network.
The white paper defines several core applications of native AI Calling:
• Unlimited AI noise reduction (immersive call experience): AI algorithms running in the network remove ambient noise during transmission. Whether the caller is in an office (above 40 decibels), on a noisy street (above 60 decibels), or at a construction site with heavy machinery (above 80 decibels), the call remains clear. Most importantly, this feature is completely independent of the hardware specifications of the user's phone.
• Instant translation that breaks down language barriers (interactive calls): By combining the data channel (DC) and the video channel (VC), the network can provide accurate real-time voice transcription or translation directly during video calls, and can even support screen sharing and interaction with customer service chatbots.
From "Traffic Monetization" to "Experience Monetization" and the New QoI Metric
For telecom operators, the core purpose of promoting AI Calling is to find new business models. The white paper points out that operators can package these AI functions into subscription services, with users paying an additional monthly fee to unlock AI enhancements during calls. This would shift the business model from pure "data traffic monetization" (such as selling unlimited data) to "experience monetization".
To standardize this service, the GSMA has defined a more systematic experience evaluation model for AI Calling. In addition to the traditional QoE (Quality of Experience), QoS (Quality of Service), and coverage, three key items have been added:
• AI Immersive Experience: Evaluates the improvement in MOS (Mean Opinion Score) and signal-to-noise ratio (SNR) after noise reduction.
• AI Interactive Experience: Evaluates the interaction latency and accuracy of the new channels (DC/VC).
• QoI (Quality of Intelligence): A key metric for measuring how "intelligent" a voice network is, including the quality of its AI models, its state-aware decision-making capabilities, and its ability to provide inclusive AI services.
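To make the "AI Immersive Experience" item more concrete, the following is a minimal Python sketch of how an improvement in SNR and MOS could be computed from before-and-after measurements. It is an illustration only and is not taken from the white paper: the function names, the CallQuality structure, and the sample numbers are all assumptions, and the only standard element is the decibel definition of SNR as 10 * log10(signal power / noise power).

```python
import math
from dataclasses import dataclass


def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("signal and noise power must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


@dataclass
class CallQuality:
    """Hypothetical measurement of one call leg (names are illustrative)."""
    signal_power: float  # average speech power, arbitrary linear units
    noise_power: float   # average background noise power, same units
    mos: float           # Mean Opinion Score on the usual 1-5 scale


def immersive_gain(before: CallQuality, after: CallQuality) -> dict:
    """Return the SNR gain in dB and the MOS gain after noise reduction."""
    return {
        "snr_gain_db": snr_db(after.signal_power, after.noise_power)
        - snr_db(before.signal_power, before.noise_power),
        "mos_gain": after.mos - before.mos,
    }


# Made-up sample values: noise power drops by a factor of 100,
# which corresponds to an SNR gain of about 20 dB.
before = CallQuality(signal_power=1000.0, noise_power=100.0, mos=3.1)
after = CallQuality(signal_power=1000.0, noise_power=1.0, mos=4.0)
gain = immersive_gain(before, after)
print(f"SNR gain: {gain['snr_gain_db']:.1f} dB, MOS gain: {gain['mos_gain']:.1f}")
```

In practice MOS is obtained from listening tests or an objective estimator rather than entered by hand, so the mos field here is simply a placeholder for whatever score an operator's measurement system reports.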
The International Telecommunication Union (ITU) has launched a work item called P.AI-MOS to evaluate multimodal AI experiences, and the GSMA is calling on the industry to establish mapping rules between the key quality indicators (KQIs) of AI applications and network KPIs.
Analysis
From a user's perspective, the AI calling features that Apple and Google offer on the device provide an excellent experience, are typically deeply integrated with the phone's OS, and require no additional payment to telecom operators. Operator-led "network-based AI Calling" may require additional fees, but it has irreplaceable advantages: inclusivity and cross-platform interoperability.
As long as telecom operators deploy AI computing power in their data centers, users with inexpensive entry-level 5G phones, or even feature phones that only support VoLTE, can still enjoy top-tier AI noise reduction and real-time translation. It also breaks down the system barrier between iOS and Android, ensuring a consistent enhanced experience for both parties on a call.
The future voice call market is expected to become a hybrid of "device-side edge computing" and "network-side cloud computing power." Whether telecom operators can persuade consumers to pay a subscription fee for this "network-native AI experience" will depend on whether they can deliver value-added calling services that are more stable, lower in latency, and more accurate than those offered by Google and Apple.




