Apple is set to host its ‘Let Loose’ event on Tuesday, May 7, where it is expected to introduce new iPad Air and iPad Pro models and a new Apple Pencil. However, the Worldwide Developers Conference (WWDC) 2024 on June 10 could be the event to look out for, as it could change the company’s approach to its devices, particularly the iPhone. The Cupertino-based tech giant is said to be planning to unveil its artificial intelligence (AI) strategy and introduce new features with iOS 18. Papers published by Apple researchers offer a glimpse of the company’s vision for that strategy.
In the last couple of months, Apple researchers have published several new papers focusing on AI models and their functionalities. We have seen AI models with computer vision, an AI model that can detect what’s visible on the screen, and even image-editing AI models. There are also research papers that focus on improving an on-device chatbot and adding contextual prompt processing capabilities. Such a model could be meant for Siri, making it more efficient and capable of performing more complex tasks.
Most of Apple’s published research papers focus on small language models (SLMs) that can operate independently on a device. For example, the company published a paper on an AI model dubbed ReALM, short for Reference Resolution As Language Modeling. The model is described as carrying out tasks that are prompted using contextual language. That description has led to the belief that it could be used to upgrade Siri.
Another research paper describes Ferret-UI, a multimodal AI model that is “designed to execute precise referring and grounding tasks specific to UI screens, while adeptly interpreting and acting upon open-ended language instructions.” In essence, it can read your screen and perform actions on any interface, be it the Home Screen or an app. This functionality could make it much more intuitive to use an iPhone via verbal commands rather than finger gestures.
Then there is Keyframer, which is claimed to generate animations from static images, and another AI model that can edit images. These capabilities could significantly improve the Photos app and allow users to perform complex edits in simple steps, similar to what DALL-E and Midjourney offer.
However, it should be noted that these speculations are based on research papers published by Apple, and there is no guarantee that any of them will be turned into a feature. Apple’s vision for AI will be clearer after the keynote session at WWDC 2024.