Although OpenAI's ChatGPT is already quite eloquent, the company is not content to stop there. According to sources cited by The Information, OpenAI is planning to expand into physical devices, centering its core technology entirely on audio interaction. To that end, OpenAI has reportedly undergone a major restructuring of its internal teams over the past two months, allocating more resources to audio-model development. All of this is geared towards the long-rumored mysterious AI hardware, expected to launch in about a year (early 2027).
Say goodbye to translation delays and create native auditory AI.
Current AI voice assistants (including ChatGPT Voice) mostly follow a "speech-to-text (STT) -> text model processing -> text-to-speech (TTS)" pipeline. While workable, each conversion step inevitably adds latency. Furthermore, according to industry insiders, most current audio models still lag behind pure text models in logical reasoning.
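To see why the cascaded pipeline is slow, consider this minimal sketch. The three stub functions below are hypothetical stand-ins for real STT, text-model, and TTS services (the `time.sleep` calls and the specific delay values are illustrative assumptions, not measurements of any actual system); the point is that the stages run sequentially, so the user-perceived latency is the sum of all three delays.

```python
import time

# Hypothetical stubs standing in for real STT / LLM / TTS services;
# the sleeps model assumed per-stage processing delays.
def speech_to_text(audio: bytes) -> str:
    time.sleep(0.3)          # assumed ~300 ms transcription delay
    return "what's the weather like?"

def text_model(prompt: str) -> str:
    time.sleep(0.5)          # assumed ~500 ms text-model inference
    return "It looks sunny today."

def text_to_speech(text: str) -> bytes:
    time.sleep(0.3)          # assumed ~300 ms synthesis delay
    return b"<synthesized audio>"

def cascaded_assistant(audio: bytes) -> bytes:
    # Each stage must finish before the next starts, so the
    # user-perceived latency is the *sum* of the three delays.
    transcript = speech_to_text(audio)
    reply_text = text_model(transcript)
    return text_to_speech(reply_text)

start = time.monotonic()
reply = cascaded_assistant(b"<microphone audio>")
elapsed = time.monotonic() - start
print(f"round-trip latency: {elapsed:.1f}s")  # sums to roughly 1.1s here
```

With real models the numbers differ, but the structural problem is the same: no audio can be played back until every stage has completed.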
The report indicates that OpenAI's new team is working on developing a new "audio-first" model, attempting to enable AI to directly understand and generate sound, eliminating the intermediate step of translating it into text. This would not only significantly improve the immediacy of conversations but also allow AI to more sensitively capture emotional changes in tone.
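An "audio-first" model changes the shape of the problem: instead of three sequential stages, one model consumes audio and streams audio back, so playback can begin as soon as the first chunk arrives. The sketch below is purely illustrative (the generator, chunk sizes, and delay values are all assumptions, not a real OpenAI API); it shows how the metric that matters shifts from total round-trip time to time-to-first-audio.

```python
import time
from typing import Iterator

# Hypothetical sketch of an "audio-first" model: one model consumes and
# emits audio directly, streaming reply chunks as they are generated
# instead of waiting for a full STT -> text -> TTS cascade.
def audio_first_model(audio: bytes) -> Iterator[bytes]:
    time.sleep(0.2)                      # assumed time-to-first-chunk
    for chunk in (b"<chunk-1>", b"<chunk-2>", b"<chunk-3>"):
        yield chunk                      # playback can begin immediately
        time.sleep(0.05)                 # later chunks stream during playback

start = time.monotonic()
stream = audio_first_model(b"<microphone audio>")
first_chunk = next(stream)               # the latency the user actually hears
first_chunk_latency = time.monotonic() - start
remaining = list(stream)                 # arrives while earlier audio plays
print(f"time to first audio: {first_chunk_latency:.2f}s")
```

Because the model also "hears" the raw waveform rather than a transcript, tone and emotion survive the trip, which is what the report means by capturing emotional changes more sensitively.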
More than just glasses, it's about being "Always On".
So what does this mysterious hardware actually look like?
Currently, AI development in Silicon Valley appears to be shifting from screen-based devices to wearables. Google, for example, is promoting its Audio Overviews voice feature, while Meta has had initial success with its Ray-Ban smart glasses and recently acquired Limitless, a startup focused on wearable AI recording.
OpenAI, for its part, says its hardware will be "more than just a pair of glasses." While specific details remain confidential, the device will emphasize "always-on" functionality.
This means the device may not need to be woken up or unlocked like a phone; instead, it would act like a portable, invisible secretary, continuously listening to and perceiving its environment, ready to assist at any moment. This also aligns with Silicon Valley's current vision of "screenless computing": letting AI blend into the background and appear only when needed.
Three devices manufactured by Foxconn?
Further reports suggest that OpenAI will have at least three hardware devices, not just one, with one, codenamed "Gumdrop," taking the form of an "AI pen." Previous rumors indicated that OpenAI's hardware would be designed to be worn on the body, similar to Humane's AI Pin, whose assets were later acquired by HP.
As for the device codenamed "Gumdrop," OpenAI reportedly originally intended to have Luxshare Precision manufacture it. However, with the US-China trade war driving up tariffs on "Made in China" products, production may instead be shifted to Foxconn's manufacturing lines in Vietnam and other regions, or possibly assembled on Foxconn's production lines in the United States.
Analysis: Hardware is merely a carrier; the soul lies in "response speed".
In my opinion, OpenAI's shift of focus to audio is a very accurate judgment.
Looking back at 2024-2025, devices such as the Humane AI Pin and Rabbit r1 failed so badly largely because of their "slow response" and "lack of intelligence." If OpenAI can truly achieve "zero-latency," emotionally aware dialogue between machines and humans through native audio models, then whether the hardware takes the form of glasses, a necklace, or headphones is merely a matter of packaging.
If a year from now, we can see a device that allows us to converse like a real person simply by speaking, without having to take out our phones or say "Hey Siri," that might be the true "iPhone Moment" for AI hardware.
