The next 3 interfaces for music’s near future

Our changing media reality means everyone in music will have to come to grips with three important new trends.

Understanding the music business means understanding how people access, discover, and continuously listen to music. This used to happen through the record player, cassette player, radio, and CD player, and now increasingly happens on our computers and smartphones: first by playing downloads in media players like Winamp, Musicmatch Jukebox, or iTunes, but now mostly via streaming services like Spotify and Apple Music, as well as YouTube.

Whenever the interface for music changes, the rules of the game change. New challenges emerge, new players get to access the space, and those who best leverage the new media reality gain a significant lead over competing services or companies, as Spinnin’ Records did with its early YouTube success.

What is a media reality?

I was recently talking with Gigi Johnson, the Executive Director of the UCLA Center for Music Innovation, for their podcast, and as we were discussing innovation, I wanted to point out two different types of innovation. There is technological innovation, like invention, but you don’t have to be a scientist or an inventor to be innovative.

When technological innovations like these get rolled out, they create new realities. Peer-to-peer technology helped Spotify with the distribution of music early on (one of their lead engineers is Ludvig Strigeus, creator of the BitTorrent client µTorrent), and for this to work, Spotify needed a media reality in which computers were linked to each other in networks with decent bandwidth (i.e. the internet).

So that’s the second type of innovation: leveraging a reality created by the proliferation of a certain technology. Studios didn’t have to invent the television in order to dominate the medium. Facebook didn’t have to invent the world wide web.

A media reality is any reality in which innovation causes a shift to a new type of media. Our media reality is increasingly shifting towards smart assistants like Siri and an ‘internet of things’ (think smart home), and we’re creating, watching, and interacting through more high-quality video than ever before.

Any new media reality brings with it new interfaces through which people interact with knowledge, their environment, friends, entertainment, and whatever else might be presented through these interfaces. So let’s look at the new interfaces everyone in music will have to deal with in the coming years.

Chatbots are the new apps

People don’t download as many apps as they used to and it’s getting harder to get people to install an app. According to data by comScore, most smartphone users now download fewer than 1 app per month.

So, in dealing with this new media reality, you go to where the audience is. Apparently that’s no longer in app stores, but on social networks and messaging apps. Some of the latter, most prominently Facebook Messenger, allow people to build chatbots, which are basically apps inside the messenger.

Companies like Transferwise, CNN, Adidas, Nike, and many airlines already have their own bots running on Messenger. In music, well-known examples of artist chatbots are those by Katy Perry and DJ Hardwell. Record Bird, a company specializing in surfacing new releases by artists you like, launched their own bot on Messenger in 2016.

The challenge with chatbots is that designing for a conversational interface is quite different from designing visual user interfaces. Sometimes people will not understand what’s going on and start requesting things from your bot that it was not built to handle. Such behaviours need to be planned for, since people cannot see the confines of the interface.
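One common way to plan for unexpected requests is a fallback reply that tells people what the bot can actually do. Below is a minimal sketch of that idea, assuming a simple keyword-matching bot. The intents, keywords, and replies are hypothetical, and nothing here uses the actual Messenger platform API.

```python
# Hypothetical intents for an artist chatbot; the replies are illustrative only.
INTENTS = {
    "tour": "Here are the upcoming tour dates.",
    "release": "The latest release is out now.",
    "merch": "Here is a link to the merch store.",
}

# Hypothetical keywords that map free-form text onto an intent.
KEYWORDS = {
    "tour": ("tour", "concert", "show", "tickets"),
    "release": ("release", "album", "single", "song"),
    "merch": ("merch", "shirt", "hoodie"),
}


def reply(message: str) -> str:
    """Return the reply for the first matching intent, or a fallback."""
    # Lowercase the message and strip trailing punctuation from each word.
    words = [word.strip("?!.,") for word in message.lower().split()]
    for intent, keywords in KEYWORDS.items():
        if any(word in keywords for word in words):
            return INTENTS[intent]
    # Fallback: the user cannot see the confines of the interface,
    # so spell out what the bot is able to handle.
    options = ", ".join(sorted(INTENTS))
    return f"Sorry, I didn't get that. You can ask me about: {options}."


if __name__ == "__main__":
    print(reply("When is the next concert?"))
    print(reply("Turn off the lights"))
```

Real bots typically replace the keyword lookup with a language-understanding service, but the fallback branch stays just as important.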

Chatbots are set to improve a lot over time, as developments in machine learning and artificial intelligence will help the systems behind the interfaces to interpret what users may mean and come up with better answers.

VUIs: Alexa, play me music from… uhmm….

I’ve been living with an Amazon Echo for over a month and together with my Philips Hue lamps it has embedded itself into my life to the extent that I nearly asked Alexa, Amazon’s voice assistant, to turn off the lights in a hotel room last weekend.

It’s been a pleasure to trade in the frequent returns to touch-based user interfaces for voice user interfaces (VUIs). I thought I’d feel awkward, but it’s great to quickly ask for weather updates, planned activities, the time, changing music, changing the volume, turning the lights on or off or dimming them, setting alarms, etc. without having to grab my phone.

I also thought it would be awkward having friends over and interacting with it, but it turns into a type of play, with friends trying out all kinds of requests I had never even thought of, and finding out about new features I wasn’t aware of.

And there’s the challenge for artists and businesses.

As a user, there is no home screen. There is nothing to guide you. There is only what you remember, what’s top of mind. Which is why VUIs are sometimes referred to as ‘zero UI’.

I have hundreds of playlists on Spotify, but through Alexa I’ve only listened to around a dozen different playlists. When I feel like music that may or may not be contained inside one of my playlists, it’s easier to mentally navigate to an artist that plays music like that, than to remember the playlist. So you request the artist instead.

VUIs will make the branding of playlists crucial. For example, instead of asking Alexa to play hiphop from Spotify, I requested their RapCaviar playlist, because I felt the former query’s outcome would be too unpredictable. As the music plays, I’m less aware of the artist names, as I don’t even see them anymore and I hardly ever bother asking. For music composed by artificial intelligence, this could be a great opportunity to enter our music listening habits.

The VUI pairs well with the connected home, which is why tech giants like Google, Amazon, and Apple are all using music as the Trojan horse to get their home-controlling devices into our living rooms. They’re going to be the operating system for our houses, and that operating system will provide an invisible layer that we interact with through our voice.

Although many of the experiences through VUIs feel a bit limited currently, they’re supposed to get better over time (which is why Amazon calls their Alexa apps ‘skills’). And with AI improving and becoming more widespread, these skills will get better to the point that they can anticipate our intentions before we express them.

As voice-controlled user interfaces enter more of our lives, the question for artists, music companies, and startups is: how do we stand out when there is no visual component? How can you stay top of mind? How will people remember you?

Augmented reality

Google Glass was too early. Augmented reality will be nothing like it.

Instead of issuing awkward voice commands to a kind of head-mounted smartphone, augmented reality will take shape in a media reality of conversational interfaces through messaging apps and voice user interfaces that are part of connected smart environments, all utilizing powerful artificial intelligence.

You won’t have to issue requests, because you’ll see overlays with suggested actions that you can easily trigger. Voice commands are a last resort, and a sign of AI failing to predict your intent.

So what is music in that reality? In a way, we’re already there. Kids nowadays are not discovering music by watching professional video productions on MTV; they discover music because they see friends dancing to it on Musical.ly or they applied some music-enabled Snapchat filter. We are making ourselves part of the narrative of the music, we step into it, and forward our version of it into the world. Music is behaving like internet memes, because it’s just so easy to remix now.

One way in which augmented reality is going to change music is that music will become ‘smart’. It will learn to understand our behaviour and our intentions, and adapt to them, just like other aspects of our lives would. Some of Amazon Alexa’s most popular skills already include music and sound to augment our experience.

This is in line with the trend that music listeners are increasingly exhibiting a utilitarian orientation towards music: interacting with music not just for the aesthetic, but also for its practical value through playlists to study, focus, work out, clean the house, relax and drink coffee, etc.

As it becomes easier to manipulate music, and make ourselves part of the narrative, perhaps the creation of decent sounding music will become easier too. Just have a look at AI-powered music creation and mastering startups such as Jukedeck, Amper, and LANDR. More interestingly, check out Aitokaiku’s Vimu, which lets you create videos with reactive music (the music reacts to what you film).

Imagine releasing songs in such a way that fans can interact and share them this way, but even better since you’ll be able to use all the data from the smart sensors in the environment.

Imagine being able to bring your song, or your avatar, into a space shared by a group of friends. You can be like Pokémon.

It’s hard to predict what music will look like, but it’s safe to say that the changes music went through since the proliferation of the recording as the default way to listen to music are nothing compared to what’s coming in the years ahead. Music is about to become a whole lot more intelligent.


For more on how interfaces change the way we interact with music, I’ve previously written about how the interface design choices of pirate filesharing services such as Napster influence music streaming services like Spotify to this day.

If you like the concept of media realities and would like to get a better understanding of it, I recommend spending some time going through Marshall McLuhan’s work, as well as Timothy Leary’s perspective on our digital reality in the 90s.

The future of music, inspired by a cheap Vietnamese restaurant in Berlin

I spent the last week living out of an Airbnb while getting started with my new job at IDAGIO in Berlin. Down the street from my Airbnb was a cheap Vietnamese place, where I ate a couple of times. They always had Vietnamese pop music on, but one day they had a CD by a Vietnamese singer covering Western pop songs. In English. I thought about it for a little while: why wouldn’t they just play the originals?

These cover releases are often financially motivated, but since the restaurant has to pay some collection society, and a Spotify subscription gives you all the music for just $10, I figured that the reason for this music playing was probably not something financial.

I then wondered: could it be that they simply have more of a connection to the Vietnamese performer, and prefer to hear these works from his mouth?


I’ve been getting into a new way of thinking about music by stepping into classical. Suddenly, there’s not one original and then some ‘lesser’ remixes and covers. There’s a composition, with the author of that work often having died before modern recording technology, and then there are countless recordings of performances of that work. Sometimes there’s a relatively undisputed ‘best’, but often it comes down to personal taste, preference, and opinion.

(Image: IDAGIO screenshot)

In the last century, music went through an enormous change. It went from ‘folk’ to ‘pop’. Here’s what I mean by each term:

  • Folk: music that’s not ‘owned’ by a single individual or corporation, but rather by the culture in which it was born. A song is not necessarily known for a particular performer, but instead is performed by many performers: ones that reach success far and wide, as well as local performers who just like to sing in front of a crowd in evenings or weekends.
  • Pop: music that’s controlled and owned. Songs are known for their original version and original performers. In this sense, the meaning extends beyond the charts, and into modern day underground rock, metal, and to a certain extent hiphop and dance music.

Recording technology in the 20th century brought about a transition: where once music was ‘folk’ by default, it became ‘pop’ instead. The rise of mass consumerism and cheap global distribution decreased the amount of time a song needed to spread geographically. There was now also a default version through which basically everyone became familiar with the work, rather than through their local performer or traveling bands.

While this system has generated a vast amount of money, and a huge music economy, I also think that music as an experience has lost a lot through this. People’s relationships with works are more superficial and performers are less incentivized to be the best performer of a certain work, since they can basically be the only one.


Back to the Vietnamese restaurant.

I got to thinking: what if we can ‘folk-ify’ modern pop music? It’s already being done to a certain extent. The remix culture on Soundcloud is a great example of it, and so is the cover culture on YouTube. What if the way we’re structuring the navigation of content on IDAGIO (such as: composers > works > recordings and performers) some day could become relevant for ‘pop’?

It would mean people would be able to browse based on songwriter, and then see all the pop songs related to that writer. They’d then be able to explore each song, and all the performances of it. They could sort by proximity: either offline (geographic), or online (based on your social graph and digital footprint). This could make the performance they listen to more personally relevant, just like the CD in the Vietnamese restaurant is to the owners of the restaurant.
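To make that browsing idea concrete, here is a minimal sketch of a songwriter > songs > performances hierarchy, with performances sorted by geographic proximity to the listener. Everything in it is an assumption for illustration: the class names, the sample data, and the use of great-circle distance as the proximity measure. It does not describe how IDAGIO or any other service models its catalogue.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt
from typing import List


@dataclass
class Performance:
    performer: str
    lat: float  # where the performer is based (hypothetical data)
    lon: float


@dataclass
class Song:
    title: str
    performances: List[Performance] = field(default_factory=list)


@dataclass
class Songwriter:
    name: str
    songs: List[Song] = field(default_factory=list)


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km is the mean Earth radius


def nearest_first(song: Song, lat: float, lon: float) -> List[Performance]:
    """Sort a song's performances by distance from the listener."""
    return sorted(song.performances, key=lambda p: distance_km(lat, lon, p.lat, p.lon))


# Hypothetical catalogue: one songwriter, one song, three performances.
song = Song("Example Song", [
    Performance("Performer in Los Angeles", 34.05, -118.24),
    Performance("Performer in Hanoi", 21.03, 105.85),
    Performance("Performer in Berlin", 52.52, 13.40),
])
writer = Songwriter("Example Songwriter", [song])

if __name__ == "__main__":
    # A listener in Ho Chi Minh City sees the Hanoi performance first.
    for performance in nearest_first(writer.songs[0], 10.82, 106.63):
        print(performance.performer)
```

Sorting by online proximity would follow the same shape: swap `distance_km` for a score derived from the listener's social graph.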

It could make music more participative, and in a way it already is becoming so: YouTube, Soundcloud, remix apps, democratization of production tools, cheap hardware for recording (like our phones), Musical.ly, performances on livestream… The two most remix-heavy genres we know, dance and hiphop, are the ones most influential to the millennial demographic and younger. Both house and hiphop were born of affordable drum computers and samplers, of looping existing records, reinterpreting them, creating a new performance out of something that already existed.

The hard part has always been incentivizing the rights holders. Just look at the lawsuits.

We’re reaching an interesting time: we’re getting very good at interpreting really large datasets. Machine learning and AI are set to revolutionize our everyday existence in just a few years. Then there’s blockchain, which is a good technology for tracking the complexity involved with a very nested type of ownership if we indeed ‘folk-ify’ pop music (without radically overhauling modern notions of intellectual property).

Music doesn’t have to become more participative, but it can. I think there’s a good economic case for it, but it still needs to be the product of deliberate choice of individuals. People in government can look at funding music education, and modernizing it, because the computer is the most important instrument for our generation (I know some of you will strongly disagree: find me at Midem, Sónar+D, or c/o pop and we can discuss over a beer). Musicians can think of how they can invite fans to contribute to or interact with their music. People with entrepreneurial mindsets can think about solving some of the issues related to rights, or look at how musicians can monetize this type of interactivity.

And we all, as listeners, simply need to do one thing more often: sing.