“New uses of speech technologies are changing the way people interact with companies, devices, and each other. Speech frees users from keyboards and tiny screens and enables valuable, effective interactions in a variety of contexts.” (SpeechTek)
Technologies and use cases for conversational interfaces are rapidly changing. To better understand these industry trends, I attended SpeechTEK 2017. Here are a few notes and observations from the field.
Mix of Yin and Yang, Old and New
The conference brought in a mix of old-school contact center vendors (Convergys, Aspect, etc.), new tech vendors (Google, Amazon, etc.), analysts, consultants, and enterprises. It was interesting to see the collision and friction between the old and the new.
Although new tech vendors got most of the buzz, it was clear that there is a gap that is not yet being addressed.
New tech is rapidly advancing, thanks in part to machine learning, but it is not yet on par with a good natural language IVR.
Creating a great conversational experience is not only about the tech tools but also about the skills needed to create an engaging and frictionless experience through a dialog.
The gap is how to create a great dialog-driven experience.
The decades of experience building speech-enabled IVRs should dominate the VUI developer community. But they don’t. Not yet, anyway.
So a lot of new tech is ignoring lessons learned from the call center on how to build a great conversational user experience. One that works. One that can do complex jobs. One that reduces user effort. One that supports enterprise scale.
The takeaway is that the old way has wisdom. Maybe this is more yin and yang: each perspective adds value.
Conversational User Interface
We are conditioned to “Just say It!” (Dan Miller)
Conversational interfaces are becoming increasingly important. Voice (conversational interface) is much more natural than web or text.
In the near future, 50% of searches will be voice searches (vs. web searches) and 85% of interactions will be self-service (i.e., with a human only on the requesting end). Expectations are that many products will have a conversational interface, from IVRs to robots, VAs, bots, Alexa, Google Assistant, cars, and so on.
(Of course at SpeechTek you’d expect some bias in favor of all things speech!)
People want to speak to an enterprise in their own language, not the enterprise’s language and not a machine’s language. Today if I talk to Alexa, it’s her language, not mine, because I have to frame requests in specific ways using specific words.
Conversational user interfaces (CUI) can listen, understand, and act (fulfill). Many presenters said it is very hard to build a good CUI. Building a good CUI requires:
- A good platform and tools to build a CUI on (some say these are already a commodity), and
- Know-how (skills, processes, frameworks, data) needed to use the tools and build a good system. Many say this is the most critical resource for building a good CUI.
Lessons learned from developing good speech IVRs have not yet transferred over to new tech (Google, Alexa, etc.). So these new tech CUIs are very basic.
And today’s CUIs are fairly restricted. “Spoken language systems, despite progress, continue to be knowledge-engineered particularly for dialog structure and domain integration” (Rudnicky)
So the old school (if you can call an NLU IVR old school) has really accomplished a lot and the tool sets used for NLU are getting much smarter.
Artificial Intelligence (AI) is Everywhere
New developments in machine learning (ML) such as Deep Learning are leading to rapid improvements in the technology. The sessions that included ML were packed as this is the hot topic.
AI is impacting all parts of the CUI technology stack, from listening and understanding to talking and taking action.
Google showed the recent advances made with the Google Speech API and api.ai using their machine learning expertise. The speed and accuracy of these Google tools are amazing.
But tools built on ML still pose challenges. One is the need for large data sets to train the ML model, data that most developers don’t have. So many platforms suggest starting with some basic ML and then either augmenting your own data with examples in a knowledge base or using prebuilt language objects.
Considering AI and its impact on customer service, Dan Miller suggests a new “grammar” is needed that includes:
- Adaptive Intelligence
- Cognitive Computing
- Dynamic Neural Networking
- Knowledge Management
- Customer Interaction Analytics
- Predictive Analytics
- …
These technologies are driving new and previously impossible insights into conversations and data. Example: Jeff Adams talked about how advanced ML speech analysis can detect early-stage diseases such as Alzheimer’s. Wow.
Integrated Customer Journey
Many vendors were showing the ability to track customers as they move across channels and provide an integrated experience. So I can go from a web interaction to mobile to IVR to an agent and my context goes with me.
Couple this context integration with Customer Interaction Analytics and you can remove a significant amount of customer effort from the interaction.
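To make the idea of context that "goes with me" a little more concrete, here is a minimal sketch in Python. It is not any vendor's implementation: the names `CustomerContext`, `ContextStore`, and `ivr_greeting`, the customer IDs, and the in-memory dictionary are all assumptions made for illustration. The point is only that each channel writes to a shared record and the next channel reads it before asking the customer to start over.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerContext:
    """Hypothetical shared record of what a customer has done so far."""
    customer_id: str
    events: list = field(default_factory=list)

    def record(self, channel: str, intent: str) -> None:
        # Each channel appends what the customer tried to do there.
        self.events.append({"channel": channel, "intent": intent})

    def last_intent(self):
        # Most recent intent, regardless of which channel captured it.
        return self.events[-1]["intent"] if self.events else None


class ContextStore:
    """In-memory stand-in for a store shared by all channels (assumption)."""

    def __init__(self):
        self._contexts = {}

    def get(self, customer_id: str) -> CustomerContext:
        # Create the context on first contact, reuse it afterwards.
        return self._contexts.setdefault(customer_id, CustomerContext(customer_id))


def ivr_greeting(store: ContextStore, customer_id: str) -> str:
    # The IVR checks earlier channels before falling back to an open question.
    intent = store.get(customer_id).last_intent()
    if intent is None:
        return "How can I help you today?"
    return f"Are you calling about {intent}?"


store = ContextStore()
store.get("c-123").record("web", "a disputed charge")
store.get("c-123").record("mobile", "a disputed charge")
print(ivr_greeting(store, "c-123"))  # customer with history on web and mobile
print(ivr_greeting(store, "c-999"))  # customer with no prior context
```

A real deployment would also need identity matching across channels and a persistent store, which this sketch leaves out.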
Self-Service
Self-service is seen as critically important because it is the user’s first choice and preference. Most customers, in most use cases, want to self-serve instead of talking to an agent. And self-service needs to work across all channels, not just voice.
Dan Miller talked about the shift from Old School to Snowflakes:
“Old School”
- Starts with an IVR
- The resource that answers the phone
- Short-cut to contact center agents
“Snowflakes”
- Preference for text
- Add tweets, texts & direct messages
- “Human-assistance” only as needed
So we find that a great conversational interface must be matched with great self-service that can get the job done easily and effortlessly. Yin and Yang.
RBC Bank NLU Case Study
RBC Bank provided a case study of its NLU implementation in their contact center. RBC handles 100 million calls a year, which puts the contact center in the mission-critical category.
RBC moved from an older DTMF IVR to an NLU IVR over the past two years. Their lessons learned are a great summary of the SpeechTEK event:
- Technology should be invisible to the customer!
- KPI Improvement = Continuous Training and Tuning = Data
- NLU (AI) testing methodology is not the normal way QA engineers think
- There are a billion ways people ask for the same thing – semantic tuning is key
- Adhere to standard human communication protocol with VUI design
- Selecting proper Voice Talent is extremely important for conversational feeling
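The point about "a billion ways people ask for the same thing" and continuous tuning can be illustrated with a toy sketch. This is not RBC's method or how a production NLU engine works; real systems use statistical models. Here simple word overlap (Jaccard similarity) stands in for the model, and the intent names, example phrasings, and threshold are all hypothetical. What it shows is the tuning loop: an utterance that the system misses is added as a new example, and the same utterance is then recognized.

```python
import re

# Hypothetical intents, each with a few example phrasings (assumption).
EXAMPLES = {
    "check_balance": ["what is my balance", "how much money do I have"],
    "report_lost_card": ["I lost my card", "my card was stolen"],
}


def tokens(text):
    # Lowercase word set; a crude stand-in for real language understanding.
    return set(re.findall(r"[a-z']+", text.lower()))


def classify(utterance, examples, threshold=0.3):
    """Return the best-matching intent, or None if nothing is close enough."""
    words = tokens(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in examples.items():
        for phrase in phrases:
            ref = tokens(phrase)
            # Jaccard similarity: shared words over all distinct words.
            score = len(words & ref) / len(words | ref)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


print(classify("what's my balance", EXAMPLES))

# Tuning step: a phrasing found in the logs that the system missed.
missed = "how much cash is in my account"
print(classify(missed, EXAMPLES))          # below threshold before tuning
EXAMPLES["check_balance"].append(missed)   # add it as a new example
print(classify(missed, EXAMPLES))          # recognized after tuning
```

The threshold matters as much as the examples: set too low, the system guesses wrong with confidence; set too high, it rejects requests it could have handled.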
Final Note
SpeechTek has posted many of the presentations online. The full list is here. Below are a few of my favorites.
- The Conversational User Interface Is a Minefield (Wolf Paulus) - A104_Paulus.pdf
- An Intelligent Assistant for High-Level Task Understanding (Alexander Rudnicky) - C101_Rudnicky.pdf
- Driving Enterprise Success through Conversational Virtual Assistants (Jordi Torras) - C102_Torras.pptx
- Extracting & Using Gender, Age, Emotion, & Language From Speech (Nagendra Goel) - D104_Goel(2).pptx
- The Deep Learning Breakthrough & How It Will Revolutionize Conversational AI (Yishay Carmiel) - D105_Carmiel.pptx
- Chatbots vs. Voicebots (Crispin Reedy) - SD201_Reedy.pptx
- Voice Services in the World of Bots (Dan Miller) - A203_Miller.pptx
- Speech Analysis Detects Early-Stage Diseases (Jeff Adams) - A301_Adams.pptx