
Thoughts on the Apple #VisionPro



Having had a chance to work with the Apple Vision Pro for some time now – not just professionally, but also participating in a dev jam this weekend to build an app – I’d like to share some thoughts on the device. My hope is to capture some of the technical nuance developers and AR/VR practitioners might want to know in real settings, rather than just general impressions or hype.

Regardless of anything below, I’m happy to be working with a new device and seeing the attention it’s drawing to the industry at large. Take the ramblings with a grain of salt; opinions are my own.

TL;DR: A good device for the right use-case. Some impressive features, some steps back, and overall an incremental improvement to the landscape. Despite its name, the #VisionPro lacks vision.

The UI/UX:

#VisionPro’s use of eye+hand tracking is fun, and if you haven’t experienced high-quality eye-tracking before it is very impressive. However, from a practical standpoint it can cause a lot of frustration. We often aren’t aware of how much our eyes move, or even the resting state or twitching of our hands – both of which have led users (including myself) to “mis-click” a number of times. While simple activities, such as navigating to Apple TV and watching a movie, are great (and even intuitive), more productivity-focused activities quickly show that frustrating circumstances are far from edge cases. More importantly, it also leads to what I would consider bad UX in some common use-cases.
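
One mitigation pattern is to gate consequential actions behind a deliberate gesture rather than a bare look-and-pinch. Below is a minimal SwiftUI sketch of the idea – not from any shipping app, and `deleteItem` is a hypothetical stand-in for whatever the real action would be:

```swift
import SwiftUI

// Minimal sketch: gate a consequential action behind a long press so a
// stray gaze plus a resting-hand twitch can't trigger it by accident.
// `deleteItem` is a hypothetical stand-in for the real action.
struct DeliberateDeleteButton: View {
    @GestureState private var isPressing = false
    let deleteItem: () -> Void

    var body: some View {
        Label("Delete", systemImage: "trash")
            .padding()
            .background(isPressing ? Color.red.opacity(0.4) : Color.clear, in: Capsule())
            .gesture(
                LongPressGesture(minimumDuration: 0.6)
                    .updating($isPressing) { value, state, _ in state = value }
                    .onEnded { _ in deleteItem() }
            )
    }
}
```

The visual state change during the hold matters as much as the delay itself: it substitutes for the tactile feedback the hands never get.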

The best example I saw of this is browsing image thumbnail carousels in Safari. For many websites, a button on the edge of the thumbnail has to be clicked to advance the carousel (as opposed to scrolling). This requires the user to look away from the image itself, advance the carousel, and then return their gaze to the image. Compare this to a controller/mouse-based interface where you can click through without taking your eyes off the image. That may seem like a small annoyance, but if you’re attempting to quickly assess image comparisons or preview several large collections the annoyance is substantial. And yes, developers on the #VisionPro can (and should) avoid such a situation, but in the case of the internet at large it's unlikely we will see broad-spectrum accommodations any time soon. And this is just one example: when productivity and efficiency are expected, the UX leaves something to be desired under current design paradigms. I’m also unconvinced those paradigms should change in all cases; rather, the headset shouldn’t be approached as a total replacement for many workflows.
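
For developers building native experiences, the fix is straightforward: let the content itself scroll. Here’s a minimal SwiftUI sketch of a swipe-driven carousel (the `images` array is a placeholder) that never asks the user to look away from the image:

```swift
import SwiftUI

// Minimal sketch: a swipe-driven carousel. Advancing is a scroll on the
// content itself, so the user's gaze never has to leave the image to
// find an edge button. `images` is a placeholder data source.
struct ThumbnailCarousel: View {
    let images: [String]  // asset catalog names, hypothetical

    var body: some View {
        TabView {
            ForEach(images, id: \.self) { name in
                Image(name)
                    .resizable()
                    .scaledToFit()
            }
        }
        .tabViewStyle(.page)  // paged swiping instead of edge buttons
        .frame(width: 600, height: 400)
    }
}
```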

There are also natural uncertainties to this type of interface that limit its usability. We’ve made no robust technical determination of the accuracy of the eye- or hand-tracking. Rather, what we are seeing in testing is that human variabilities (e.g., small variations in a given user’s motions) cause frustrating situations to occur frequently. And this should be expected, especially with a lack of tactile feedback – my hands have little to no real-time feedback until the system has responded to my action, especially with discrete user input like gestures.

This has a critical impact on what developers need to consider when designing their own UX for #VisionPro apps. Achieving axis-based inputs, such as one might want for even very casual interactive applications, requires accommodating the uncertainties of how a user moves their hands. As an example, a user is unlikely to be able to reliably move their hand in a perfectly horizontal manner. And within immersive applications, where one generally wants the user to freely observe the virtual surroundings, developers and designers must be conscious of avoiding accidental interactions driven by eye-gaze. In our designs over the past few weeks, we have encountered this exact issue not only in our own applications but also in Apple’s first-party applications. Consider the following simple thought experiment (or run it yourself if you have a headset!): How might you compare the hover and not-hover states of a UI element in-device, when looking at it activates hover? The casual reader can surely imagine solutions, but the problem introduces real considerations for generating and reviewing design comps.
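
To make the axis-input problem concrete, here is a minimal sketch of filtering noisy hand motion down to a usable horizontal axis. The thresholds are purely illustrative, and `palmPosition` stands in for whatever your hand-tracking provider reports each frame:

```swift
import simd

// Minimal sketch: reduce noisy 3-D hand motion to a 1-D horizontal axis.
// Thresholds are illustrative; `palmPosition` stands in for whatever the
// app's hand-tracking provider reports each frame.
struct HorizontalAxisFilter {
    private var origin: SIMD3<Float>?    // where the gesture started
    let deadZone: Float = 0.02           // metres of jitter to ignore
    let verticalTolerance: Float = 0.08  // metres of drift before rejecting

    mutating func update(palmPosition: SIMD3<Float>) -> Float? {
        guard let origin else {
            self.origin = palmPosition   // first sample anchors the gesture
            return nil
        }
        let delta = palmPosition - origin
        // Users can't move perfectly horizontally, so tolerate some
        // vertical drift before treating the motion as something else.
        guard abs(delta.y) < verticalTolerance else { return nil }
        // Swallow resting-hand jitter inside the dead zone.
        guard abs(delta.x) > deadZone else { return 0 }
        return delta.x - (delta.x > 0 ? deadZone : -deadZone)
    }
}
```

The dead zone absorbs the resting-hand twitches described above, and the vertical tolerance encodes the fact that no one swipes perfectly level.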

Finally, the lack of convenient mechanics for force-closing apps or returning to the home launcher without lifting one's hands to the headset leaves something to be desired. Hand gestures to return to the home screen are commonplace on other headsets, and even iOS has made those mechanics convenient. I imagine we’ll see this refined in visionOS as well. If an application crashes, the current force-quit workflow is quite clunky. From a practical perspective, this means #VisionPro developers need to ensure their UI workflows work robustly and provide methods for exit with minimal friction.
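
In the meantime, the least we can do is build the escape hatch ourselves. A minimal sketch of an in-app exit control, using SwiftUI’s environment action for dismissing an immersive space (this assumes the app opened an ImmersiveSpace elsewhere):

```swift
import SwiftUI

// Minimal sketch: an always-available, in-app way out of an immersive
// scene, rather than relying on the crown or the force-quit dance.
// Assumes the app opened an ImmersiveSpace elsewhere.
struct ExitControls: View {
    @Environment(\.dismissImmersiveSpace) private var dismissImmersiveSpace

    var body: some View {
        Button("Exit experience", systemImage: "xmark.circle") {
            Task { await dismissImmersiveSpace() }
        }
    }
}
```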

Screens and Passthrough:

The screens are amazing. Full stop. 

As a person who spends a great deal of time in various AR and VR headsets, the #VisionPro’s superiority is immediately noticeable. If screen resolution is the only factor in making a decision on a headset, don’t bother looking elsewhere. 

Unfortunately, the same cannot be said for the passthrough. I’ve been lucky to have worked closely with every major (and some not-so-major) headset offering passthrough for the past several generations. Each generation has shown dramatic gains, but we are seeing those gains start to level off. In a side-by-side comparison, the #VisionPro’s passthrough is somewhat better than the Quest 3’s, but not dramatically so. And like other headsets, the passthrough is highly impacted by lighting. Spatial warping at near distances is very noticeable, as one might expect given the headset design. In AR/MR applications we’re very concerned with motion-to-photon latency, for which there are several additional considerations even with Apple’s 12ms photon-to-photon latency. Our active testing in high-motion MR settings demonstrated that standalone passthrough still isn’t quite ready for some of the use-cases we care about.

Despite hoping for better, the #VisionPro continues in the same vein as other headsets, where the real world simply is not a first-class citizen in terms of content. It is still quite apparent you are viewing a video feed: graininess and color distortion are quite noticeable, and visibility at distance is severely diminished. This is, somewhat ironically, made even more noticeable when juxtaposed with the incredible resolution of the virtual content.

Overall, the lack of material improvements in passthrough is the greatest disappointment, and it shows that others in the industry are leading that charge. This is very much a VR headset, and expectations for its use should reflect that. As primarily a practitioner of optical AR, where rendering of the real world is literally as good as it gets, I’m somewhat biased. But I maintain that this is what we should be shooting for, and that the real value of the technology is in integrating the physical and the virtual. I can’t help but feel Apple didn’t push for a big enough vision with the #VisionPro.

Apps & Ecosystem:

Despite a lot of hope to the contrary, I think we are seeing the natural outcome of Apple giving developers only very limited access to the #VisionPro prior to release. The amount of content that makes even limited use of the new form factor and spatial capabilities is disappointingly small. Yes, this will change. But for the moment it has many rightfully wondering why, exactly, they got the device.

However, the screen resolution really does make using the #VisionPro as an external monitor a wonderful experience. Latency is acceptable, all windows remain readable, and of course driving everything from the computer avoids the frustrations of the eye-tracking UX in the headset. Importantly, this also helped me answer a critical question: Yes, one can have a monitor that is too large to use comfortably. I’m looking forward to multi-screen support in the future.

There are issues and limitations, such as a technically interesting one documented by Karl Guttag. But in general, this feature delivers exactly what many have come to expect from Apple.

As an aside: Comparisons to the iPhone launch are tired and fall short. The industry, user and societal landscapes are very different. The iPhone had a killer use-case (and apps) out the gate. But that’s a discussion for another time.

Development:

Both Apple and Unity have done a great job of providing a development ecosystem right from the start. Many have had understandable concerns about Unity given recent events, but their SDKs supporting visionOS have certainly not suffered. Whether doing native development or using Unity, we found it is possible to get up and running in under an hour. Everything from device simulation to integrating with existing codebases is smooth and parallels exactly what iOS developers have come to expect. In particular, I’d like to call out Unity and Apple’s fairly successful attempt at an API that is syntactically and semantically consistent (even if not fully compliant) with OpenXR.
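
For a sense of how little ceremony is involved, a minimal native entry point looks roughly like the following – close to what Xcode’s visionOS template generates, reproduced here from memory rather than copied from it:

```swift
import SwiftUI
import RealityKit

// Minimal sketch of a native visionOS entry point: a windowed scene
// hosting a RealityView with a single piece of 3-D content.
@main
struct HelloVisionApp: App {
    var body: some Scene {
        WindowGroup {
            RealityView { content in
                // Add a simple sphere entity to the scene.
                let sphere = ModelEntity(
                    mesh: .generateSphere(radius: 0.1),
                    materials: [SimpleMaterial(color: .cyan, isMetallic: false)]
                )
                content.add(sphere)
            }
        }
    }
}
```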

Yes, there are limitations and shortcomings in the Unity SDK (such as a lack of usable animation culling), but overall it’s been pleasant to develop in with documentation about on par with other SDKs (which isn’t saying much).

Device Comfort and Use:

The device is relatively heavy, and the (non-breakaway) tethered battery is not only an annoyance but a hindrance to certain use-cases due to safety concerns (even more so for a non-optical AR device).

For practical implementations, the #VisionPro does not currently look well-suited for multi-user environments. At the current price point, most organizations are unlikely to find the value in enterprise applications to justify purchasing a headset per user. But this drawback is more than just software support for multiple users, or the personalized nature of avatars – subtle aspects like the front glass will give many enterprise purchasers pause as to how well the device can survive being handed off between several people. Are there solutions to the lack of ruggedness? Of course. And more will hit the market over time. But when adopting new technologies, first impressions really do matter, particularly to enterprise buyers. Take all of that with this caveat: it obviously depends on the enterprise market being sold into.

On the plus side, the knitted headband is surprisingly comfortable. The distribution of the weight to a larger surface area of the skull, plus the choice of a soft fabric really do make a difference. I hope other headset manufacturers will consider similar designs.




Mik Bertolli, PhD

CSO Avrio Analytics

