Playlist Helper | A Voice App Development Story
by Edward Muldrew
Today’s blog will be different from what you are used to. It is a simple story about the trials and tribulations I went through while creating a voice app.
A little bit of background on myself: I am a recent Software Engineering graduate and I have now created 3 Alexa apps. The voice app I am talking about is Playlist Helper, which is one of my favourites and the first Alexa app I created. Playlist Helper allows you to add the song you are currently playing to one of your Spotify playlists.
“Alexa, ask Playlist helper to add song to my Rock and Roll playlist”
You can check it out here!
https://www.amazon.com/Civica-Playlist-Helper/dp/B08D6PM1YM
Inspiration
What inspired me to build this app was that, as a student, I listened to a lot of music whilst revising, writing notes and writing code, and I would listen to it through Alexa. If you are like me, then you are always interested in discovering new music through Spotify’s “discover” and playlist/artist/song “radio” features.
However, when I wanted to add a song to my playlist, I would have to pick up my phone, open the Spotify app and add the song manually. Alexa could play the music but could not change anything in my Spotify account, so Spotify on Alexa was fairly limited in its range of account customisation and playlist management.
My ideal solution was that, rather than interrupting my studying and breaking focus, I could simply ask Alexa to add the song to my playlist. So, when I finished my degree, I decided to build this app.
What it does
At its current stage the app can add songs to and remove songs from your playlists through voice commands. You can also ask which song you just added and create new playlists. A fair criticism, however, is that the app is limited to Spotify and currently only works on Alexa.
The user listens to a song on Spotify and plays the song through Alexa or another device. The user asks Alexa to get Playlist Helper to add the song to their chosen playlist.
I created a video demo for marketing purposes and also to show users how the app is supposed to work.
https://www.youtube.com/watch?v=vGdzcMx5LEM
How I built it?
Originally, I built the app purely through the Alexa Developer Console, testing it with the simulator and using logs to verify that it behaved as expected.
One of the biggest challenges for this project was connecting to the Spotify API and handling user authentication. I had to create a developer account on Spotify, which was pretty simple. Once I had created my Spotify application I was able to get my client ID and client secret. After a bit of research I found that Alexa actually makes this process pretty easy through account linking, and ensures a working access token is available for each session.
I created a separate blog about this to help other developers out!
https://edwardmuldrew.medium.com/how-to-link-accounts-to-your-alexa-app-3c89ccdc323b
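To give a rough idea of what this looks like in code, here is a minimal sketch, assuming a Node.js skill built with the ASK SDK. Once an account is linked, Alexa includes the user's access token in each request envelope. The function names and the spoken prompt are hypothetical and are not taken from the Playlist Helper source.

```javascript
// Minimal sketch (assumption: Node.js skill using the ASK SDK v2 handlerInput shape).

// Read the linked-account token that Alexa attaches to each request.
// Returns undefined when the user has not linked their Spotify account.
function getSpotifyToken(handlerInput) {
  const user = handlerInput.requestEnvelope.context.System.user;
  return user ? user.accessToken : undefined;
}

// Build the Authorization header expected by the Spotify Web API.
function spotifyHeaders(token) {
  return { Authorization: `Bearer ${token}` };
}

// Hypothetical helper: ask the user to link their account when no token is present.
function promptToLinkAccount(handlerInput) {
  return handlerInput.responseBuilder
    .speak('Please link your Spotify account in the Alexa app.')
    .withLinkAccountCard()
    .getResponse();
}
```

A handler would typically call `getSpotifyToken` first and fall back to `promptToLinkAccount` when it returns nothing.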
After this I went about implementing the required API calls to get a user’s playlists, get the song currently playing and add a song to the chosen playlist. I later added calls to remove the song that had just been added and to create a new playlist.
Challenges I ran into
Exposure to the app
My first Alexa app was predominantly aimed at solving a personal problem. After developing it, I wanted to see if others had a similar problem, which brought about some questions.
How do I get people to use this? How will people know about it? How will people understand how to use the app?
At first I posted the app on forums and on Twitter. A couple of weeks went by and not many people had used the app, so I decided a promo video would be the best way to showcase it. I got my tripod, Alexa and iPad, made the demo video and put it out. The video is currently sitting at 54 views, so I can hardly say it was a grand slam success.
To increase exposure and use of my app, I made it available in all English-speaking countries. I believe some people have simply found the app organically rather than through any of my marketing initiatives, if you can call them that.
I think this is a problem most developers run into. However, my analytics suggest the app is being used by people.
Asynchronous calls
One of the first problems I had was that I had overcomplicated things by using too many intents for a user to add a song to their playlist. The interaction path looked something like this:
Invocation (get all of the user’s playlists) -> Add song to “playlist name” (get the song the user is currently listening to) -> What song did I add? (add the song to the user’s playlist)
This interaction path was too long because each intent’s purpose was to make a single call to the Spotify API. The aim had been to avoid a delay between the Spotify API responding and Alexa responding. I wanted the interaction to be as simple as possible so the user could complete their task in as few steps as possible.
I implemented async/await HTTP calls and used them specifically for the Add Song intent. So now the user can simply say “Alexa, ask Playlist Helper to add song to my playlist”. All the calls and logic are now contained in a single intent.
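The idea of chaining the three calls inside one intent can be sketched roughly as follows. This is an illustration, not the actual Playlist Helper code. The `request` helper is a hypothetical function, passed in by the caller, that makes an authenticated HTTP call and resolves to parsed JSON. The endpoint paths are those of the Spotify Web API, and the spoken responses are made up.

```javascript
// Minimal sketch: three sequential Spotify calls awaited inside a single handler.
const SPOTIFY_API = 'https://api.spotify.com/v1';

// `request(method, url, body)` is an assumed helper supplied by the caller.
async function addCurrentSongToPlaylist(request, playlistName) {
  // 1. Get the user's playlists and look for the requested one.
  const playlists = await request('GET', `${SPOTIFY_API}/me/playlists`);
  const playlist = playlists.items.find(
    (p) => p.name.toLowerCase() === playlistName.toLowerCase()
  );
  if (!playlist) {
    return `I couldn't find a playlist called ${playlistName}.`;
  }

  // 2. Get the song that is currently playing.
  const playing = await request('GET', `${SPOTIFY_API}/me/player/currently-playing`);
  if (!playing || !playing.item) {
    return 'Nothing seems to be playing right now.';
  }

  // 3. Add that song to the chosen playlist.
  await request('POST', `${SPOTIFY_API}/playlists/${playlist.id}/tracks`, {
    uris: [playing.item.uri],
  });
  return `I added ${playing.item.name} to ${playlist.name}.`;
}
```

Because each `await` only continues once the previous call has resolved, the handler can return a single spoken response after all three steps have finished.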
User Behaviour
One of the more intangible problems with creating any software is how users interact with the product. This is especially true for a voice app. As a developer you have to see your product from multiple perspectives, which is by no means an easy task. I spent a lot of time looking over CloudWatch logs to see how users were interacting with the app, but it is very difficult to gain real insight and feedback from consumers this way.
The main problem was that users simply didn’t understand how to use and interact with the product. This stemmed from a number of assumptions I had made as a developer. I had assumed that:
- Users knew the app only worked with Spotify
- Users knew the app could only add songs to a playlist they had created themselves on Spotify
- Users knew how the interaction path worked
- Users understood the difference between a 3rd party app and the Spotify app
- Users would read the product information provided
- The product-facing information made sense
- Users understood the error messages well enough to correct their behaviour
The biggest problem is trying to understand how a user is interacting with the product based on minimal logs. It was only recently, when I spoke to AJ, a member of the Voice Spark team, that I realized that although my customer-facing information made sense to me, it wasn’t as easily understood by the public and was far too technical. I have since refined it and I am hoping this will have an impact on how users interact with the product.
Data Input
Another problem I had was dealing with unexpected input. In testing I had only used a limited range of test data from my own Spotify account. I had not considered users who might not have any playlists of their own, or whose playlist names might contain emojis.
For example, a user might say Add Song to “Playlist Name Groovy Tunes” when their Spotify playlist is simply named “Groovy Tunes”. The app’s logic would treat this as an incorrect playlist because the comparison failed. I created some logic to strip out unnecessary text that a user might say. On top of this, some users’ playlist names contained emojis, which also caused problems. I therefore implemented a regex pattern to detect emojis in a playlist name and remove them, so that both the user input and the playlist name were clean for comparison.
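A rough sketch of that clean-up step is shown below. The filler phrases and function names are hypothetical, and the emoji pattern uses the Unicode `Extended_Pictographic` property, which is one way to match emojis and not necessarily the pattern used in Playlist Helper.

```javascript
// Minimal sketch: clean spoken input and playlist names before comparing them.

// Hypothetical filler a user might say before the real playlist name.
const FILLER_PREFIX = /^(?:(?:my|the)\s+)?playlist(?:\s+(?:name|named|called))?\s+/i;

// Emoji characters, plus the variation selectors and joiners they leave behind.
const EMOJI = /\p{Extended_Pictographic}/gu;
const EMOJI_LEFTOVERS = /[\uFE0F\u200D]/g;

// Remove emojis, collapse whitespace and lower-case the name.
function cleanPlaylistName(name) {
  return name
    .replace(EMOJI, '')
    .replace(EMOJI_LEFTOVERS, '')
    .replace(/\s+/g, ' ')
    .trim()
    .toLowerCase();
}

// Spoken input gets the same treatment, plus filler removal.
function cleanSpokenName(spoken) {
  return cleanPlaylistName(spoken.trim().replace(FILLER_PREFIX, ''));
}
```

The filler pattern is only applied to what the user said, so a playlist that is genuinely called something like “Playlist One” keeps its name.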
I also wanted to ensure a fair comparison for varied input. Input varies far more in a voice app than in other interfaces because of the nature of speech, so I brought in a 3rd party library which provides better string matching.
FuzzySet.JS is “a fuzzy string set for JavaScript. A data structure that performs something akin to full text search against data to determine likely misspellings and approximate string matching.”
This allowed me to determine how closely the user’s input matched each playlist in the user’s list. A score of over 60% is treated as a good match. This limits the range of possibilities and increases the probability of a successful match.
https://glench.github.io/fuzzyset.js/
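To illustrate the threshold idea without depending on the library, here is a self-contained sketch. It uses a plain normalised Levenshtein similarity as a stand-in. This is not FuzzySet.js's API or its exact scoring, so the scores will differ from the real library. The function names are hypothetical.

```javascript
// Stand-in sketch: approximate matching with a 0.6 threshold (not FuzzySet.js itself).

// Classic edit distance using a single rolling row.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(
        above + 1,                                  // deletion
        row[j - 1] + 1,                             // insertion
        diagonal + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity between 0 and 1, where 1 means identical strings.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

// Return the best playlist scoring above minScore, or null if none qualifies.
function bestMatch(spoken, playlistNames, minScore = 0.6) {
  let best = null;
  for (const name of playlistNames) {
    const score = similarity(spoken.toLowerCase(), name.toLowerCase());
    if (score > minScore && (best === null || score > best.score)) {
      best = { name, score };
    }
  }
  return best;
}
```

With this sketch, a slightly misheard name such as “grovy tune” would still resolve to “Groovy Tunes”, while an unrelated word falls below the threshold and returns `null`.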
What I learned?
From this project alone, I learned all about creating an Alexa app, as well as marketing an app, gathering feedback and fixing bugs. I am still regularly tweaking the app based on what I see in the user logs.
I still have a lot to learn in terms of understanding the Alexa analytics and gauging what the metrics mean. The most important thing for me is that I still feel there is a lot of value in my voice app, but my main issue is conveying that to the consumer. In an ideal world, iterative feedback would feed the development cycle. Unfortunately, the feedback loop for 3rd party voice app development is slow. Simply put, smart assistant voice apps are not used as much.
Only recently, when I spoke to one of my team members about Playlist Helper, did he mention that he didn’t really understand the product-facing information. This feedback is invaluable to a developer like myself. What I have found is that the intangible things matter most. Taking off your “engineering” hat, taking a step back and trying to evaluate a product you have developed can be one of the most difficult things, as can trying to remove all bias and seeing your product for what it is.
Future Developments
In the future I plan to make the app available as a Google Action. However, from my first attempt at this, it would appear it will not be as easy as expected. I also plan on adding some multimodal elements using Alexa’s APL library, as the app currently only functions using voice.
I am also considering experimenting with other features, such as searching for a song to add to your playlist. I plan to spend some time researching common Spotify features which are not currently offered by Alexa, and examining the limits of what Spotify can do on Alexa. I am also considering adding other music streaming platforms such as Apple Music. However, as this is a personal project, I would need to see whether there is enough value to the public to justify the investment of my time.
Ideally I would have some more user feedback to work with, to help refine the product in accordance with users’ needs.
Conclusion
To conclude, developing this app was a worthwhile experience. It has been a great side project for me and one that I wish to continue with. As a developer, a project like this gives you great exposure to Software Engineering as a whole. When you are a lone developer you have to think about everything, from research, ideation, development and testing to marketing the product, and make development decisions based on this. You get to experience what a software team has to go through at each stage.
This blog was a little different, being a case study from my personal experience of developing a voice app. I would love to get some feedback on what other developers’ experiences have been.