Apple researchers recently unveiled ReALM (Reference Resolution As Language Modeling), an advancement in artificial intelligence designed to improve voice assistant interactions by tackling a key challenge: understanding user references to what’s on their screen (via VentureBeat).
Voice assistants usually struggle to interpret ambiguous user commands, particularly those referencing visual elements on a device’s display. ReALM clears this hurdle by leveraging the power of large language models, which analyze the on-screen content and contextualize user queries, enabling the assistant to pinpoint the specific information being referenced.
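To make the idea concrete, here is a minimal, hypothetical sketch (not Apple’s code; the entity types, prompt wording, and function names are illustrative assumptions) of how reference resolution can be framed as a language-modeling task: candidate on-screen entities are written out as text, and a model is asked which one an ambiguous command refers to.

```python
# Illustrative sketch only: reference resolution framed as a language-modeling task.
# Candidate on-screen entities are serialized as text, and an LLM would be asked
# which entity the user's ambiguous command refers to.

from dataclasses import dataclass

@dataclass
class Entity:
    entity_id: int
    kind: str   # e.g. "phone_number", "address", "business_name" (assumed labels)
    text: str   # the visible text of the on-screen element

def build_prompt(entities: list[Entity], user_query: str) -> str:
    """Serialize candidate entities and the user's query into a single text prompt."""
    lines = ["Candidate entities on screen:"]
    for e in entities:
        lines.append(f"  [{e.entity_id}] ({e.kind}) {e.text}")
    lines.append(f'User request: "{user_query}"')
    lines.append("Which entity id does the request refer to? Answer with the id only.")
    return "\n".join(lines)

entities = [
    Entity(1, "business_name", "Joe's Pizza"),
    Entity(2, "phone_number", "(555) 010-7788"),
    Entity(3, "address", "123 Main St"),
]
print(build_prompt(entities, "call that number"))
# A model fine-tuned for this task would ideally complete the prompt with "2".
```

In this framing, picking the right entity becomes ordinary text completion, which is what lets an off-the-shelf or fine-tuned language model handle it.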
“Being able to understand context, including references, is essential for a conversational assistant,” the Apple research team writes. “Enabling the user to issue queries about what they see on their screen is a crucial step in ensuring a true hands-free experience in voice assistants.”
This innovation hinges on ReALM’s ability to reconstruct the user’s screen. By parsing on-screen elements and their locations, it generates a textual representation that captures the visual layout, translating visual information into a language model’s familiar territory. This approach, combined with fine-tuned language models, surpasses existing systems such as GPT-4 at understanding screen-based references.
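As a rough illustration of that screen-to-text step, the sketch below assumes parsed UI elements arrive with normalized position coordinates from some screen parser; the field names, row-grouping tolerance, and output format are assumptions for illustration, not the paper’s exact encoding. Elements are ordered top-to-bottom and left-to-right, then joined into a plain-text layout a language model can read.

```python
# Minimal sketch of the screen-to-text idea (illustrative, not Apple's implementation):
# parsed UI elements with normalized positions are grouped into visual rows and
# emitted as plain text that preserves the rough layout of the screen.

from dataclasses import dataclass

@dataclass
class ScreenElement:
    text: str
    x: float  # left edge, normalized 0..1
    y: float  # top edge, normalized 0..1

def screen_to_text(elements: list[ScreenElement], row_tolerance: float = 0.02) -> str:
    """Group elements into rows by vertical position, then order each row left-to-right."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    rows: list[list[ScreenElement]] = []
    for el in ordered:
        if rows and abs(el.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(el)   # close enough vertically: same visual row
        else:
            rows.append([el])     # start a new row
    return "\n".join(
        "  ".join(e.text for e in sorted(row, key=lambda e: e.x)) for row in rows
    )

elements = [
    ScreenElement("Joe's Pizza", 0.05, 0.10),
    ScreenElement("(555) 010-7788", 0.60, 0.10),
    ScreenElement("Open until 10 PM", 0.05, 0.14),
]
print(screen_to_text(elements))
# Joe's Pizza  (555) 010-7788
# Open until 10 PM
```

A text block like this can then be placed alongside the user’s query, so a reference such as “call that number” can be resolved against what is actually visible.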
The benefits extend beyond convenience. ReALM paves the way for a truly hands-free experience: users can interact with their devices seamlessly, issuing voice commands directly related to what they see on the screen. This is particularly valuable for visually impaired users, or in situations where touching the device is impractical.
Apple researchers acknowledge the limitations of this technology. ReALM relies on automated parsing, which can struggle with complex visual references, like distinguishing between multiple images. Future iterations might incorporate computer vision and multi-modal techniques to address these challenges.
Apple’s upcoming Worldwide Developers Conference (WWDC) on June 10 is expected to serve as a platform for showcasing its AI advancements alongside iOS 18, a major update for iPhones. Speculation also suggests the unveiling of a new large language model framework, an “Apple GPT” chatbot, and a broader integration of AI features across its ecosystem.