by Edward Muldrew
A voice developer requires a unique set of skills and faces a different set of challenges: a different kind of input, a different kind of output and a different set of user needs to consider.
I want to tackle the key differences in perspective between a voice developer and a web or mobile developer, although there are similarities too, such as the consumer base and the dependence on a specific platform, much like an operating system.
I must also note that smart speakers are increasingly becoming multi-modal with devices like the Echo Show. Even so, voice developers are told to develop “voice-first”, and this blog will focus on development geared towards that.
The interface
The main difference for voice is obvious in the way people use the technology: the way people see and touch is different from how they hear and talk. A great example is how a home screen can show multiple apps on a page, whereas voice users do not want to hear a list of more than three options.
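The "no more than three options" guideline can be sketched in code. The following is a minimal illustration, not tied to any particular voice platform: the function name `build_menu_prompt`, the constant `MAX_SPOKEN_OPTIONS` and the "say 'more'" wording are all assumptions made for this example.

```python
from typing import List

# Assumed limit, taken from the "no more than three options" guideline.
MAX_SPOKEN_OPTIONS = 3


def build_menu_prompt(options: List[str], start: int = 0) -> str:
    """Build a spoken prompt that offers at most three options at a time."""
    chunk = options[start:start + MAX_SPOKEN_OPTIONS]
    if not chunk:
        return "There are no more options."
    if len(chunk) == 1:
        spoken = chunk[0]
    else:
        # Join as "x, y or z" so the list sounds natural when read aloud.
        spoken = ", ".join(chunk[:-1]) + " or " + chunk[-1]
    prompt = f"You can choose {spoken}."
    # Only offer "more" when there are options left beyond this chunk.
    if start + MAX_SPOKEN_OPTIONS < len(options):
        prompt += " Or say 'more' to hear other options."
    return prompt


menu = ["news", "weather", "music", "timers", "lights"]
print(build_menu_prompt(menu))
print(build_menu_prompt(menu, start=3))
```

A screen could show all five options at once; here the user hears three, and the rest only on request.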
Despite its obvious limitations, voice has the huge advantage of being eyes-free and hands-free. This opens up a huge opportunity, as many daily activities occupy our hands and eyes, which is why voice assistants are most often used for setting timers, playing songs, controlling smart lights and reading the news.
“VUIs must be as spontaneous and free-flowing as humans whilst retaining their ability to converse over defined topics. In the UI world, this would be akin to changing the color and position of a login button every time the user returns to keep the “conversation” fresh.”
Identity & Privacy
Mobile apps can be private and have an intimate relationship with the user. In technical terms, consider this one-to-one. Confirming medical appointments or asking personal questions is better suited to mobile or the web.
A smart speaker can broadcast your interaction to anyone who is listening. Consider this one-to-many. This type of interaction is much better suited to multiplayer games and experiences that involve the whole room.
“The coolest thing about voice-controlled games is that they can enable a live multiplayer experience,” says Child. “Voice gives us a unique power and a unique opportunity to create social experiences in a way that the web and mobile really didn’t.”
The term “fuzzy identity” best describes voice devices. They tend to be tied to a location, i.e. the kitchen, living room or conference room. Recognizing users based on their unique voice signature is not quite there yet, so developers find it more difficult to personalise an experience to a user's needs.
The interaction
Perhaps the most obvious difference is interacting through listening and talking. It is up to the voice technology to help manage the dialog to create a fluent and coherent app, so developers have to really understand the interaction. In many cases this makes error handling difficult because of the abundance of directions a conversation can take. A developer must narrow the possible flows down to simple options.
For example, a quiz app is better suited to handling user input of A, B or C than to taking a free-form answer as input. This reduces the chance of a misspoken or misheard word being interpreted as an incorrect answer. Further to this, as developers we often mistakenly assume our users have the knowledge we do about our product. For voice apps especially, we cannot assume our users have ever interacted with a voice app. Therefore hints, prompts and cues should be provided to help guide the user.
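As a rough sketch of the quiz example, the snippet below maps whatever the user says onto A, B or C and falls back to a guiding reprompt when nothing matches. It is plain Python rather than any voice SDK, and the names `SPOKEN_VARIANTS`, `normalize_choice` and `respond`, along with the small list of spoken variants, are hypothetical choices made for illustration.

```python
import re
from typing import Optional

# Hypothetical spellings a speech recogniser might return for each letter.
SPOKEN_VARIANTS = {
    "a": "A", "ay": "A",
    "b": "B", "bee": "B",
    "c": "C", "cee": "C",
}


def normalize_choice(utterance: str) -> Optional[str]:
    """Return 'A', 'B' or 'C' if the utterance contains a choice, else None."""
    words = re.findall(r"[a-z]+", utterance.lower())
    # Scan from the end so "is it a c" resolves to C, not the article "a".
    for word in reversed(words):
        if word in SPOKEN_VARIANTS:
            return SPOKEN_VARIANTS[word]
    return None


def respond(utterance: str, correct: str) -> str:
    """Score a recognised choice, or reprompt with a hint if none was heard."""
    choice = normalize_choice(utterance)
    if choice is None:
        # Unrecognised input is not marked wrong; the user gets a cue instead.
        return "Sorry, I didn't catch that. You can say A, B or C."
    if choice == correct:
        return "That's right!"
    return f"Not quite. The answer was {correct}."


print(respond("the answer is B", correct="B"))
print(respond("um, Paris?", correct="B"))
```

The design choice worth noting is that input outside the expected set never counts as a wrong answer. It triggers a reprompt that reminds the user what they can say.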
User sessions in voice are often short and direct, so consider simplifying long dialog, which will discourage a user. Low complexity with high functionality creates a good value proposition.
Engagement & User Retention
Keeping the user engaged and coming back to a voice app is one of the biggest struggles for any voice developer. Typical engagement triggers for mobile apps include email, push notifications and icon badges.
Typically, voice apps have shallow engagement because they are mainly used to be fast and functional. However, there are numerous ways to keep users engaged, such as updating content regularly and having a natural conversational interaction with the user. Alexa provides case studies about successful skills, and Headspace is one of them. Many of you will know Headspace as the mobile meditation app; they have brought their service to an Alexa Skill.
“The Alexa experience features custom audio to lead guided exercises, as well as a robust conversational repertoire with nearly 500 responses and reprompts. The skill enables subscribers to link their accounts for a personalized experience, but also allows non-subscribers to access the “basics” as a way to drive trial and new subscription conversion.”
Another great case study is Big Sky, an all-encompassing weather app.
“This high level of personalization led to Big Sky becoming one of the most popular weather skills in the Alexa Skills Store—and significant payouts from the Alexa Developer Rewards program, which rewards developers based on their skill engagement.”
The basis is making use of potential triggers, like integrating a skill with users' routines, and offering a unique selling point through premium, personalized features.
Monetization limitation
Another glaring difference is that making money from voice apps is not that easy. For example, on Alexa you can make money through:
- Alexa Developer Rewards Program
- In-skill purchases
- Amazon Pay for Alexa Skills
I would argue that achieving in-skill purchases and mass user adoption is not as easy for voice skills as it is for other technologies. There are many reasons for this.
Most users use their smart speaker for music, the weather and “how to” instructions. The adoption rate for skills created for Alexa is yet to reach that of mobile apps. The challenge is to create an engaging app that people actually use before it can make any money, which is no easy task. Web and mobile, meanwhile, have the advantage of being able to place ads in their applications.
Further to this, Alexa's skill store offers nowhere near the functionality of a typical app store, which could affect the visibility of some apps.
Conclusion
In conclusion, from a developer's perspective, voice app development presents some really interesting and unique challenges. The way in which you think about a user raises a lot more questions.
How will a user interact with a product? Will they know how to respond? What do we do with erroneous input?
It is simply more difficult, but arguably that is part of the fun. In my opinion, building an engaging voice app is more impressive than building an engaging website or mobile app. Right now we are in the early stages of voice as a technology, which I see as a great opportunity for developers to get involved in an emerging technology.
Sources Used:
https://blog.usejournal.com/what-to-know-before-building-your-first-voice-skill-597c99141805
https://developer.amazon.com/en-US/blogs/alexa/alexa-skills-kit/2018/11/steven-arkonovich-adds-in-skill-purchasing-to-personalize-alexa-skills-and-boost-his-voice-business
https://build.amazonalexadev.com/rs/365-EFI-026/images/RAIN_Headspace_Final05082020.pdf
https://www.voiceflow.com/blog/6-differences-between-mobile-apps-and-voice-apps