Interacting in space
Natural interactions using hand tracking and gestures
By Doug Cook
21 Sep 2023
Readers of our newsletter and visitors to our website know that we love to play at the intersection of design and technology. We’re particularly drawn to products and experiences that explore what we call “emerging interactions.”
These experiences are unique applications of technology that typically require new and less proven ways of interacting with computers, often challenging the status quo. One of the more recent examples of this is Apple’s Vision Pro. But while the Vision Pro may be a technological marvel from a hardware perspective, its standout feature is the interaction paradigm it supports through spatial computing.
Recently, our team set out to explore Apple’s guiding principles for this new paradigm. While we’ve been exploring natural interactions and spatial computing with a few of our partners, much of this work remains under wraps. So the release of Apple’s Spatial Design Guidelines is a great opportunity to talk more about this work.
Designing for spatial input
One of the most compelling aspects of Apple’s visionOS is its support for hand gestures. Not only are these gestures extremely intuitive, they’re also simple and discreet. Tracked by cameras on the headset, they include affordances for tapping, dragging, and zooming with fingertip precision.
Hand tracking and gestures
With machine learning and a number of readily available models, these same interactions can be achieved with nothing more than a webcam and a browser.
One of the more popular libraries we’ve used in the past is OpenCV, but Google recently released an even more impressive solution called MediaPipe, a cross-platform framework for building custom machine learning pipelines with built-in support for streaming media.
Each has its advantages, but both support detailed hand and finger tracking, making them ideal for prototyping these types of interactions.
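Before any model comes into play, the browser needs that camera feed. As a minimal sketch (the element ID and resolution below are our own placeholders, not from our demos), the standard getUserMedia API is all it takes to pipe the webcam into a video element that a tracking model can read frames from:

```ts
// Minimal webcam setup: stream the camera into a <video> element
// that a hand-tracking model can read frames from.
async function startWebcam(): Promise<HTMLVideoElement> {
  // Assumes a <video id="webcam" autoplay playsinline> element on the page.
  const video = document.querySelector("#webcam") as HTMLVideoElement;
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { width: 640, height: 480 },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
  return video;
}
```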
With MediaPipe, users can detect hand landmarks by applying a machine learning model to a continuous stream, allowing them to plot live coordinates, determine handedness (left/right), and even detect multiple hands.
Trained on over 30,000 images, including both real-world photos and synthetic hand models, Google’s default model tracks 21 hand-knuckle coordinates. By looking at where each of these landmarks sits at a given moment, it’s possible to infer gestures in real time.
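To give a sense of what this looks like in code, here’s a rough sketch using MediaPipe’s JavaScript Tasks API. The CDN and model URLs follow Google’s published examples, but treat the exact options as illustrative rather than the code behind our demos:

```ts
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

// Load the WASM runtime and the hand landmark model.
async function createHandLandmarker(): Promise<HandLandmarker> {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm"
  );
  return HandLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath:
        "https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task",
    },
    runningMode: "VIDEO",
    numHands: 2, // track both hands for the two-handed gestures below
  });
}

// Run detection on every animation frame of the live webcam stream.
function trackHands(landmarker: HandLandmarker, video: HTMLVideoElement) {
  const loop = () => {
    // Each detected hand is an array of 21 normalized (x, y, z) landmarks;
    // the result also reports handedness (left/right) per hand.
    const result = landmarker.detectForVideo(video, performance.now());
    for (const hand of result.landmarks) {
      console.log("index fingertip:", hand[8]); // landmark 8 = index finger tip
    }
    requestAnimationFrame(loop);
  };
  requestAnimationFrame(loop);
}
```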
Pinching and dragging
To test this model, we prototyped a simple pinch and drag gesture. In the following example, pinching grabs the object, while pinching and holding allows you to drag. This gesture is super natural and even feels a little magical in this simple context.
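The demo itself isn’t shown here, but the underlying logic is simple enough to sketch. Assuming the 21-landmark output above and a hypothetical draggable object, a pinch is just the thumb tip (landmark 4) and index tip (landmark 8) coming within a small distance of each other, and holding the pinch moves the object with the hand:

```ts
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

const PINCH_THRESHOLD = 0.05; // normalized distance; tune per camera and scene

function distance(a: NormalizedLandmark, b: NormalizedLandmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// True when thumb tip (4) and index tip (8) are close enough to count as a pinch.
function isPinching(hand: NormalizedLandmark[]): boolean {
  return distance(hand[4], hand[8]) < PINCH_THRESHOLD;
}

// Hypothetical draggable object; in practice this maps to a DOM or canvas element.
const target = { x: 0.5, y: 0.5, grabbed: false };

function updateDrag(hand: NormalizedLandmark[]) {
  if (isPinching(hand)) {
    // Pinching grabs the object; holding the pinch drags it to the midpoint
    // between the thumb and index fingertips.
    target.grabbed = true;
    target.x = (hand[4].x + hand[8].x) / 2;
    target.y = (hand[4].y + hand[8].y) / 2;
  } else {
    target.grabbed = false; // releasing the pinch drops the object
  }
}
```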
Zooming with two hands
Using these models, we can also support macro hand gestures by detecting both hands and the presence of palms. The following example shows how these can be used to zoom in and out on an object.
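Again as a rough sketch rather than the demo’s actual code: with two hands in view, the distance between their wrist landmarks can drive a scale factor, so spreading the hands apart zooms in and bringing them together zooms out:

```ts
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

let startDistance: number | null = null;
let scale = 1;

// Distance between the wrists (landmark 0) of the two detected hands.
function handSpread(hands: NormalizedLandmark[][]): number {
  const [first, second] = hands;
  return Math.hypot(first[0].x - second[0].x, first[0].y - second[0].y);
}

function updateZoom(hands: NormalizedLandmark[][]) {
  if (hands.length < 2) {
    startDistance = null; // need both hands in view to zoom
    return;
  }
  const spread = handSpread(hands);
  if (startDistance === null) {
    startDistance = spread; // first frame with both hands: capture the baseline
    return;
  }
  // Moving the hands apart zooms in; bringing them together zooms out.
  scale = spread / startDistance;
}
```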
You can quickly see how, with a bit more exploration, we might extend this interaction to support rotation as well.
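As one speculative sketch (not something from Apple’s guidelines or our demo), the same two wrist landmarks also define an angle, and tracking how that angle changes from frame to frame gives you a rotation gesture:

```ts
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

let previousAngle: number | null = null;
let rotation = 0; // accumulated rotation in radians

// Angle of the line connecting the two wrists (landmark 0 of each hand).
function handsAngle(hands: NormalizedLandmark[][]): number {
  const [first, second] = hands;
  return Math.atan2(second[0].y - first[0].y, second[0].x - first[0].x);
}

function updateRotation(hands: NormalizedLandmark[][]) {
  if (hands.length < 2) {
    previousAngle = null; // reset when a hand leaves the frame
    return;
  }
  const angle = handsAngle(hands);
  if (previousAngle !== null) {
    rotation += angle - previousAngle; // rotate the object by the frame-to-frame delta
  }
  previousAngle = angle;
}
```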
Stay tuned for more of our experiments around gesture-based interactions and interfaces. Until then, let us know what you’ve been exploring.
Have an idea or interested in learning more? Reach out to us on Instagram or Twitter!
Doug Cook
Doug is the founder of thirteen23. When he’s not providing strategic creative leadership on our client engagements, he can be found practicing the time-honored art of getting out of the way.