Aiming to explore new ways for designers to engage with generative neural networks, we worked towards a series of computer windows that invite you into the latent space: a mathematical space that represents endless visual possibilities.
What we did
- Contextual research
- Machine learning data training
- Machine learning model development
- UI design
- Frontend development
“[The computer’s] immense combinatorial capacity facilitates the systematic investigation of the infinite field of possibilities”
Vera Molnár (1)
“I was afraid that not a single thing on earth would ever again surprise me; I was afraid I would never again be free of all I had seen.”
Jorge Luis Borges
As a technology-driven studio, we have been infusing digital design with machine learning for many years. When based on human design principles and verified quality input data, we have seen that these technologies let us handle, analyse, and generate far more complicated patterns than a human could ever accurately manage. This helps us navigate the abundance of choice: from sorting spam mail and checking our spelling, to recommending songs or predicting the weather.
While these straightforward use cases have proven to be handy in our everyday lives, we have been interested in investigating a much more abstract domain that is central to our everyday practice: the role generated machine patterns could play in processes of visual discovery.
For this contextual research project, we assembled an internal team of machine learning engineers, designers, developers, writers, and researchers, who collaborated with machine learning researcher and resident Claartje Barkhof to develop interactive ways for exploring the possibilities and complexities of generative machine learning.
How can we see more, not less, of the visual material neural networks can generate, and the logics that produce them?
Today’s internet users increasingly rely on complex algorithmic technologies while understanding less and less about them. In public discourse, the “algorithm” has become shorthand for explaining a process that is opaque, almost magical, while having a growing influence over how we navigate and design the world. This tension only grows with the demand for user experiences that are as seamless as possible, with little need for human intervention or understanding.
To imagine, describe and make sense of the world that complex computational systems have constructed, given our inability to fully comprehend them, we fall back on metaphors, like the black box (2). While the mental image of the black box (rightfully!) conveys that deep learning processes are inherently hard for human cognition to grasp, it also reduces, simplifies and obscures the computational complexity it captures, and positions us as users without agency, and with little hope for further understanding.
The black box metaphor shrouds machine learning models in magical mystery, as if we’re dealing with an occult power instead of complex mathematical formulas.
Looking into the machine
Attempting to lift the veil, many experts have written about how the more complex contemporary machine learning techniques (e.g. deep learning) actually work, diving deep into their inner workings: that opaque interplay of logic, abstractions, randomisations and probabilities. Yet these explainers naturally rely on mathematical processes, which remain hard to interrogate for those who don’t speak the language of numbers, or lack the ability to decode their graphical representations. The human eye needs training in order to comprehend them, training not all of us have gone through.
We wanted to experiment with ways that we could provide glimpses into the complexity of machine learning, interactions that would allow us to explore without getting lost in mathematical mazes. We moved towards the idea of developing a machine window — an interface that acknowledges the barrier between human and computer logic, while still providing us some means to look through it.
We worked towards machine windows that could provide more users with means of orientation in the world of computational decision-making, while surfacing elements of its untranslatable, multidimensional logic.
Embracing visual otherness
Within a visual context, neural networks are used in numerous ways: from optimising existing images (e.g. super-resolution) to generating new images (e.g. deepfakes). The “intelligence” of these applications is evaluated based on the computer’s ability to draw general conclusions from specific examples we feed it, and predict what imagery we want to see in return. Applications like DALL-E are built on vast amounts of data, lots of machine power, and years of research to generate images that resemble and, when pushed by users, remix our world.
While the results are certainly fascinating, they don’t really tell us much about the machine’s reality and the complex mathematical process it has to go through to generate a visual language we recognise. By merely showcasing the input-output binary — human typing a string of words, computer miraculously spitting out one image — these types of interactions don’t chip away at the mystique around machine learning, but rather, enhance it.
Wanting to stay close to what happens in between, to the mathematical process, we wondered if we could go in the opposite direction, away from resemblance to existing visual realities, towards computational abstraction. We aimed to develop machine windows that would not deny that “neural” networks are wired differently than our own brains — despite being named after them — but instead enable more people to appreciate this otherness of the machine as a source of visual inspiration.
Can we build interfaces that visually highlight, instead of obscure, the otherness of the machine?
Before building computer windows onto mathematical landscapes, we had to decide on the viewpoint, and the view itself. In other words: we needed to decide on the neural networks, and the datasets we would use for training the model. We began with two neural image generating networks with limited capacity for ease of experimentation. Each network was trained on a different dataset, developing connections and ways of organising data that enabled them to eventually reproduce images of their own.
Instead of trying to directly explain this process of image production, we opted to surface and use the networks’ very behaviour as a creative resource — specifically, their use of the latent space (3).
Within the endless latent space, there exist millions of datapoints, resulting in millions of images that look almost identical, but with small differences. Between these datapoints, we can also find theoretically infinite “intermediate” samples, hybrid images that combine aspects of the datapoints between which they are positioned. In that way, there are any number of points between two selected points. It’s the infinity of this space, the endless visual routes that can be found within these landscapes, that we wanted the user to explore.
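To make the idea of “intermediate” samples concrete, here is a minimal sketch of linear interpolation between two latent points. The toy dimensionality, the random points and the function names are illustrative assumptions; a real network would also translate each resulting vector into an image, a step left out here.

```python
import random


def lerp(z_a, z_b, t):
    """Return the latent point a fraction t of the way from z_a to z_b."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


def intermediates(z_a, z_b, steps):
    """Return `steps` evenly spaced latent points from z_a to z_b, inclusive."""
    return [lerp(z_a, z_b, i / (steps - 1)) for i in range(steps)]


rng = random.Random(0)
TOY_DIM = 6  # assumption: a small size chosen only for readability

# Two hypothetical datapoints in the latent space.
z_a = [rng.gauss(0.0, 1.0) for _ in range(TOY_DIM)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(TOY_DIM)]

# Five samples here, but any number of steps fits between the same two points.
samples = intermediates(z_a, z_b, steps=5)
```

Raising `steps` never runs out of new points, which is one way to read the claim that there are any number of points between two selected points.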
Latent spaces represent images as points in a continuous space, meaning that they are filled to the brim with samples of potential images. What can we learn from being in such an infinite space?
Through the aperture
We saw the opportunity to both showcase the existence of this theoretically infinite manifold of newly discovered images, and glimpse the logics through which they are produced. Working with Claartje, we created two machine windows: Continuous Coordinates and Computational Compositions.
Continuous Coordinates invites you to walk the first network’s latent space on a path of your own making, giving you insight into how the model organises information, and the vast visual material it contains.
To do so, the interface selects a “slice” of the (in this case) 100-dimensional latent space (4) for you to roam around in. You can probe this space, and create a sequence of points to form a path. The network generates images for each latent point you select, as well as for the points in between. Once your path is complete, you can “travel” along it, viewing the infinite continuum of images between points.
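One way to picture such a slice is as a flat plane inside the 100-dimensional space, on which each selected point has two coordinates. The sketch below rests on that assumption: how the interface actually chooses its slice is not described here, so the random plane, the function names and the number of steps per segment are all hypothetical.

```python
import math
import random

LATENT_DIM = 100  # the dimensionality mentioned in the text


def random_unit_vector(rng, dim):
    """Draw a random direction of length 1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def make_slice(rng, dim=LATENT_DIM):
    """Assumed slice: a 2D plane through the origin, spanned by two orthonormal directions."""
    e1 = random_unit_vector(rng, dim)
    raw = random_unit_vector(rng, dim)
    # Gram-Schmidt step: remove the part of `raw` that points along e1, then renormalise.
    dot = sum(a * b for a, b in zip(raw, e1))
    e2 = [a - dot * b for a, b in zip(raw, e1)]
    norm = math.sqrt(sum(x * x for x in e2))
    return e1, [x / norm for x in e2]


def to_latent(point_2d, plane):
    """Map a 2D position on the slice to a full latent vector."""
    u, v = point_2d
    e1, e2 = plane
    return [u * a + v * b for a, b in zip(e1, e2)]


def travel(path_2d, plane, steps_per_segment=10):
    """Return latent points along the path, including the points in between."""
    latents = []
    for (u0, v0), (u1, v1) in zip(path_2d, path_2d[1:]):
        for i in range(steps_per_segment):
            t = i / steps_per_segment
            latents.append(to_latent((u0 + t * (u1 - u0), v0 + t * (v1 - v0)), plane))
    latents.append(to_latent(path_2d[-1], plane))
    return latents


rng = random.Random(0)
plane = make_slice(rng)
# Three selected points form a path of two segments.
route = travel([(0.0, 0.0), (1.0, 0.5), (-0.5, 2.0)], plane, steps_per_segment=10)
```

Each vector in `route` would be handed to the network to produce one frame of the journey; a larger `steps_per_segment` gives a smoother approximation of the continuum.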
Computational Compositions invites you to create a composition with our second neural network, and while doing so, observe the unique artefacts the network produces in its pursuit of visual connections.
Each composition is made of a set of tiles generated by the network, and curated by you. When you click on a cell, a random point from the latent space is chosen and translated to an image by the network. For each adjacent tile generated, an additional algorithmic scheme, stochastic gradient descent, roams the network’s latent space, seeking out a datapoint that can generate a tile that fits seamlessly with the ones that came before. This algorithm does so by working iteratively, measuring the in-paint defect of each version of the tile, and then translating this to a direction of movement within the latent space, towards a point that yields a better fit. The end result is a composition defined by you, and driven by a machine trained to generate images that fit certain visual constraints.
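The iterative search can be sketched on a toy problem. Everything below is an assumption made for illustration: the “network” is replaced by a small linear map that only produces the pixels along one shared edge, the fit is measured as a squared difference against the neighbouring tile’s edge, and the update is plain gradient descent rather than the stochastic variant named above.

```python
import random

rng = random.Random(0)
LATENT_DIM = 8  # toy size, far smaller than a real latent space
EDGE = 4        # number of pixels along the shared edge

# Hypothetical stand-in for the network: edge pixels are a linear function of z.
W = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(EDGE)]


def left_edge(z):
    """Edge pixels the toy generator produces for latent point z."""
    return [sum(w * x for w, x in zip(row, z)) for row in W]


def seam_loss(z, target):
    """Squared mismatch between the new tile's edge and the neighbour's edge."""
    return sum((p - t) ** 2 for p, t in zip(left_edge(z), target))


def seam_grad(z, target):
    """Gradient of seam_loss with respect to z, derived by hand for the linear map."""
    residual = [p - t for p, t in zip(left_edge(z), target)]
    return [2.0 * sum(residual[i] * W[i][j] for i in range(EDGE))
            for j in range(LATENT_DIM)]


def fit_tile(target, steps=500, lr=0.01):
    """Start from a random latent point and step towards a better-fitting one."""
    z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    start = seam_loss(z, target)
    for _ in range(steps):
        grad = seam_grad(z, target)
        # Translate the measured defect into a direction of movement in latent space.
        z = [x - lr * g for x, g in zip(z, grad)]
    return z, start, seam_loss(z, target)


neighbour_edge = [0.2, 0.4, 0.6, 0.8]  # made-up edge pixels of an existing tile
z_fit, loss_before, loss_after = fit_tile(neighbour_edge)
```

With a real generator the gradient would come from automatic differentiation rather than a hand-written formula, but the loop has the same shape: measure the defect, move in the latent space, repeat.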
For this particular interface, Claartje purposely overwhelmed a limited capacity model with a more complex dataset. Because the model was able to capture some, but not all of this complexity, it started producing intriguing “artefacts” — shapes that are more organic than the dataset it was trained on, exactly because of its lack of capacity. These unique patterns let us glimpse the machine’s visual otherness, an “intelligence” that can make unexpected combinations. Through the process of creating this composition, we have a window to the model’s sometimes unlikely decision-making, and the vast number of visual possibilities it contains.
A new palette and point-of-view
With Machine Windows, we invite the user on an interactive stroll through the land of the machine, creating pathways that draw attention to the breadth of material it contains, and the visual otherness it offers. Instead of seeking a single, optimised answer, we showcase a multitude of potential answers. By taking this approach, we have deliberately prioritised complexity over simplicity, variability over efficiency.
On the one hand, the windows offer us a new palette for making, a landscape of pre-generated images that we can take, remix, and build upon, shifting the role of these visuals from conclusive outcome, to process material. On the other hand, they offer us a new way of showcasing the richness of computer decision-making processes for those who don't speak the language of mathematics, a point-of-view that combines the imaginative with the informative.
The computer window acts as an interface between human and machine, designer and developer, image and datapoint, formula and formlessness, random walks and optimised routes.
Combining tech and design in one team, we are continuously building interfaces that help us see machine learning at work, within the image-generating domain and beyond. Taking the end-user into account, we need to constantly question how much of the machine to reveal and when, how to manage the complexity of a generative system without obscuring it completely, and the terms of engagement and intervention between the user and the network. This is an ongoing quest.
Through navigating these questions, we hope to move toward a clearer view of the world of machine decision-making, and how we interact with it.