On Spatial Computing

December 7, 2023, By Iris Cuppen

With the advent of spatial computing, interfaces are rapidly changing from two-dimensional screens to virtual environments that address all our senses, going beyond our eyes, ears, voice and fingertips. It’s time to embrace the physical space as a computational medium.

Towards a more fluid human-computer interaction

Over the past decades, digital technologies have drastically changed the ways we physically interact with computers and the external world. From the introduction of the mouse and virtual windows to the development of touch screens and voice commands, our bodies got used to dealing with machines, and tools attached to those machines, by means of an interface: a point where two systems, subjects, or organisations meet and interact.

The main purpose of the interfaces we design at Bakken & Bæck is to establish a link between humans and computers. Today, these human-computer interactions happen mostly by fingers pressing on keyboards, clicking on mice, and tapping on screens. Motion sensing input devices like Microsoft’s Azure Kinect, miniature radars like Google’s Soli chip, and hardware sensor devices like Leap Motion, however, promise us more fluid interfaces, redefining the ways we physically control and interact with the virtual spaces around us. In the last twenty years, the processors and sensor technologies that are used to build these virtual spaces have become cheaper, smaller and more reliable, while being embedded in a more established ecosystem of design tools. The spatial computing discipline is here to stay.

Unlike the two-dimensional interfaces of our phones, tablets and personal computers, spatial computing (i.e. human interaction with a computer that is spatially aware) allows us to be physically “present” in virtual environments. Here, we can walk, turn and look around, and use our whole bodies while doing so. The interactions we have in these environments, however, still mostly follow the same User Interface (UI) guidelines and rules that we established for two-dimensional interfaces. The input mode is focused on precise hand movements (clicking, swiping, holding), ignoring the body as a complex system capable of doing so much more (gesturing, grasping, grooving, and so on). It’s time to investigate and integrate new modes of interaction that address our bodies, and the spaces around us, in a more holistic manner.

From immersion to interaction

Creating spatial environments by addressing (or deceiving) parts of the body is not a digital invention. From Panoramas to Cineramas to Sensoramas, various pre-digital inventions made use of human perception to create a sense of space, exploring monocular depth cues such as occlusion, relative size, texture gradient, and linear and aerial perspective, but also binocular disparity, stereopsis and, most effectively, motion. The so-called kinetic depth effect, in which flat images appear to become three-dimensional figures when rotated before the eyes, produced an optical illusion of three dimensions and could create a sense of “immersion”, of being in a space, long before digital technologies started to do the same.

What we can gather from these pre-digital examples of immersive media is that the techniques used to create illusions or imitations of spaces are not new as such. With or without computers, the biological basis of our perception of space remains the same. What digital technologies have brought us, however, is the ability to actively influence these spaces. With the advent of the personal computer, we started to utilise our motor systems (fingers) to “talk” to computers (input), while our sensory systems (eyes) could be utilised to “listen” to the computer (output). We started to interact through machines with the spaces around us, no longer as mere spectators, but as users.

For a long time, this “user-interaction” remained very limited. Back in the 70s, when computer hardware was expensive, programmers were mostly focused on optimising the use of limited computation, and the few specialists using computers were occupied with learning tedious but machine-efficient operations. You had to understand how the computer worked before you could interact with it, and all interaction was geared towards the machine. As hardware became more powerful and less expensive, the focus started to evolve from convenience for computers towards convenience for people.

Addressing the body as a whole

In Physical Computing (2004), Tom Igoe and Dan O’Sullivan write: “To change how the computer reacts to us, we have to change how it sees us”. They summarise how the computer currently sees us in one simple illustration: one big finger (to emphasise human input through sequential tapping, which might as well all come from the same single finger), with an eye in the middle (to emphasise the focus on two-dimensional screens, for which you basically only need one eye), and two small ears (to emphasise the lack of good audio experiences in our daily encounters with computers). To create human-computer interactions that speak to the whole body, Igoe and O’Sullivan argue, we need to extend this image to incorporate a more accurate sensorial representation of the human body.

If we were to draw this same illustration in 2022, it would look slightly different. Eighteen years later, we unlock our phones by just staring at them, we put on our favourite songs by singing to a home assistant, and we monitor our heartbeat by wearing a computer on our wrists. Emerging technologies like Artificial Intelligence (AI), the Internet of Things (IoT), and Augmented Reality (AR) all change the ways we interact with and through the spaces we move around in. Even though the underlying technologies have not yet reached their zenith, as many things are still in the development phase, it has become pretty clear that they are changing how we interact with computers, including the senses we use to do so.

Seeing has been, for a long time, the predominant sense through which we encounter output from the digital realm, mostly in the form of graphical user interfaces. Now that we start to use other parts of our bodies to provide the computer with input (our voice, for instance), the output or feedback we receive from the computer starts to take on new forms as well, which might not be visible (the audio of a voice assistant, for instance). Computers have become both a more integral and a more invisible part of our physical environment. While we see less of the machine, it sees more of us.

The intelligence your hands hold

So, what can our hands do beyond clicking, swiping, or tapping? Let’s expand from the idea of a finger to that of a hand. Embryologists perceive the hand as a kind of antenna that is equipped with five high-sensory organs. Hands are our principal organ of sensation and expression; they feel things (sensation), and they manipulate things (expression). They allow for complicated movement, while their skin has the highest tactile acuity of all body parts. An object can feel wet, dry, sharp, or soft, and those qualities teach us something about the world we live in.

We learn about the world by touching, weighing, feeling it. Imagine the architect who crafts and observes a physical model. The computer does a fine job in helping her design and envision the blueprint, but we sometimes need to hold things in order to truly understand them, to see them from a different angle, to connect with them in new ways. Yet, as Adam Gopnik noted in a 2016 essay in The New Yorker on touch, for every 50 research papers on the science of vision in the last half-century, there has been only one on touch.

We also talk to the world with our hands, often without realising it. Looking at children who are learning to communicate, we see that gestures precede speaking. And like any other form of communication, these gestures have different cultural and contextual meanings. In scuba diving, waving is not a greeting but an indication of danger. In India, you avoid using your left hand while eating or shaking hands. Each body has its own history, its own memories, its own learned behaviour. The virtual interfaces we build need to take these differences into account.

Reclaiming the physical space as a computation medium

When you learn how to play the piano, you teach your hands, that complex network of muscle, skin, hair and bone, to perform one task: play a musical piece. When we interact with a digital interface, we usually employ much simpler bodily actions (tapping, swiping, clicking) for a much wider variety of tasks. This universal language of traditional user interfaces is both a strength and a weakness. It allows us to use different tools in a similar way. On the other hand, it limits the ways in which we can express ourselves more broadly.

While we already swipe, tap and click our way through everyday life, we tend to forget that our bodies can do so much more. With the advent of spatial user interfaces, we can no longer solely rely on the design principles of two-dimensional interactions; we need to learn from disciplines that study the principles of the built environment (like architecture or urban studies) and the movements of the human body (like choreography or kinesics, the study of body language). This shift forces us to think differently about what interaction design is, and what new modes of interaction we need now that screens, keyboards and mice might be replaced by more fluid digital environments.

Former MIT Media Lab researcher Simon Greenwold, who introduced the term “spatial computing” in 2003, wrote: “now that computation’s denial of physicality has gone as far as it can, it is time for the reclamation of space as a computation medium.” Spatial computing brings the physical and digital worlds together, by bringing the digital interface closer to our bodies (VR) or seamlessly adding a virtual layer over the physical spaces around us (AR). As digital designers and developers, we are interested in reclaiming space as a computation medium, while shaping the bodily interactions required to intuitively communicate in these spaces.

Leveraging emerging technologies like machine learning, we are continuously trying to find more intuitive modes of interaction between humans and machines. Spatial computing offers an exciting new phase in that exploration.

Credits: Baron Lanteigne (visuals)