Many consumer brands are trying to incorporate voice interactions or voice commerce into their products. Apple has Siri locked in its ecosystem. So does Google for the most part. But Amazon is pushing Alexa to be the first mainstream voice touch point.
I wonder if that is the end goal. Sure, Amazon would like a frictionless interface for consumers to buy more. But my hypothesis is that Alexa is meant to convince developers and startups that Amazon has all the products and tools any “technology” company would ever need. The product line has everything from $5/month VPS instances that compete with GoDaddy to the infrastructure running oil and energy market applications for GE. All infrastructure and data processing needs can be fulfilled within the Amazon ecosystem.
You would think that Google and Microsoft would leverage their expertise into selling more cloud infrastructure. Google’s revenue is overwhelmingly based on advertising. Microsoft’s comes from Office and licensing enterprise software. Both are in the process of shifting their mindset, but it is not a natural process. This direction might be in Amazon’s DNA, or at least related to their previous experience with AWS. They built their website and realized they could leverage that knowledge and rent it out to others. They were the first big cloud provider that enabled developers and shadow IT groups to test, build, prototype, and run their products in the cloud. Once hooked, those customers were not going anywhere else as they developed more sophisticated products and moved on to other jobs. The use of AWS was organic. It had to be. Amazon didn’t control any of the touch points like the operating system or devices (even though they tried).
How will we know if this hypothesis is correct? The easy answer: we see an uptick in new voice products developed on AWS. It will take time. Voice interactions and commerce have challenges. For example, the discovery of skills or apps: can you search for Alexa skills without a phone or going through a website? We can’t turn a knob and find more content like on radio. Any form of monetization will be intrusive. How about interjecting ads? We all know how much consumers love pop-ups and interstitial ads. These points of friction require resolution. Otherwise, voice technology will have a short hype cycle.
Scott Brinker interviewing the chief marketing technologist of Xerox:
8. Any advice you’d offer to someone starting out their career in marketing today?
Compare to the email sent out in preparation for the first course in Udacity’s Deep Learning Nanodegree Foundation:
Our team has compiled an excellent selection of resources for you, so please review these as you use this week of preparation to your full advantage:
Are you completely new to the world of Deep Learning? Read this fantastic introduction to Machine Learning.
Need a refresher on the basics of Linear Algebra? Have a look at this Udacity course.
Do you want to brush up on your Python skills? This Intro to Data Analysis Udacity course covers the Python libraries NumPy, Pandas, and Matplotlib.
Want to play around with a real neural network right in your browser? Check out this cool Neural Network Playground.
Never worked with TensorFlow before? Follow these instructions to download and install it.
We are severely underestimating the knowledge employees and agencies will need to have to optimize for the customer journey in the future.
If most brands can’t get digital right, let alone full digital transformation (which includes big data), how will they take advantage of AI? Which, to be successful, requires…big data.
You would think WordPress would be an easy sell nowadays. WordPress is not the right tool for every website, but the pool of sites it doesn’t fit is shrinking. It powers a significant chunk of the web.
Automattic has already leveraged WordPress into a business with its VIP WordPress service, meant for high-traffic sites. Facebook runs its company blogs on it. Practically every large news publisher has something running on VIP WordPress, including the NY Times, the Washington Post, CNN, and many others. Building on that platform, Automattic is now actively (read: spending ad dollars) going after the SMB market with a hosted solution, competing against lower-priced services like Bluehost and HostGator.
WordPress has a yearly “State of the Word” thingie. The most recent one:
The significant bit during the speech was the creation of the WordPress Growth Council. Why? Proprietary CMS competitors will spend $320 million on advertising, some of it directly against WordPress. In Matt’s words (around 27:30 in the video):
“Advertising does work…even though I think we have a better product and infinitely better community, we are starting to see in certain markets these tools which are typical proprietary, start to pick up shares.”
The Growth Council as a concept is not fleshed out, but Matt / Automattic wants to start working on it, and he wants a small number of companies that work with WordPress to help figure it out. It will be interesting to see what direction they take.
Time picked these photos. They are memorable. They are influential in part because everyone knows them. Is it possible for any image/video/story to be that prominent nowadays? There are too many channels vying for our attention. 75 million watched the Seinfeld series finale. The Walking Dead had ratings of 9.6 in 2016. Sure, comparing a series finale against a regular episode may not be fair, but how does any group of people have a set of common cultural stories if there are no standard channels for consumption? The assumption is that it is much easier and faster to share, so theoretically it is possible. If all the social platforms are algorithmically personalizing content for clicks/engagement/whatever, essentially customizing for the individual instead of the common, how do you build the common?
Fascinating topic. We think societal and cultural breakdowns led to the collapse of the Roman Empire. That might not be true. Rome had a thin layer of bureaucracy. Local officials ran the different regions and cities. They invested in their communities, and they connected the locals to Rome. Over time and changes in leadership, the governing model became more centralized. Rome co-opted the local officials, and those officials became more interested in making Rome happy and less invested in their local communities. Collapse coming in 10…9…8…
Ah, the early ’90s. Everyone remembers that story. The woman who sued McDonald’s became the canonical example of a frivolous lawsuit. But read the link. She wasn’t greedy, and McDonald’s knew it was handing out coffee that was too hot. McDonald’s presented a different story, which became embedded in society. Media and truth are fungible concepts.
To understand how the different pieces of building a chatbot fit together, I decided to prototype a Facebook Messenger insurance claim submission chatbot. The requirements were:
- Be able to submit an auto accident-related claim (location, vehicle, number of passengers) and be able to check the status of a claim.
- Be able to upload photos of the accident.
- Be able to understand and process when the claimant is submitting multiple pieces of relevant information at once.
- Leverage Facebook Messenger as the consumer touchpoint.
There are three components to chatbots:
- Consumer touch point (social platforms, website, SMS, etc.)
- Middleware that handles the business logic and serves as the knowledge base and database for customer information
- Tools and platforms that do the natural language processing to help the middleware understand input from the consumer.
Let’s discuss the tools and platforms. Like any technology, there are easy-to-use tools that perform limited functions, and there are complex ones that have the ability to provide incredible experiences but require time and resources to build. It’s the difference between using a free website builder like Wix or Weebly and creating a custom website on Sitecore or Drupal. For our prototype, we are going to tie into the IBM Watson platform, which won Jeopardy! in 2011.
There are two Watson services we used: Natural Language Classifier (NLC) and Dialog. The purpose of NLC is to understand the context and intent of the user input. We used NLC to know when a user was trying to submit a claim versus checking the status of a claim. The Dialog service is trained with a general script, which it uses as building blocks to create a logic flow for the conversation. This flow helps Dialog ask clarifying questions (or understand when it has all the information) and gather the information it needs to complete the given task. The natural language processing ability of Dialog allows it to parse the user input and extract the required information (location, vehicle, number of passengers). For all this magic to work, both NLC and Dialog have to be trained with a set of grammars the user might input, since these are domain specific.
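As a minimal sketch, the middleware’s side of the NLC call looks roughly like this. The endpoint path and response fields follow the shape of the Watson NLC v1 REST API from this era, but treat the exact names as assumptions; the helper functions are ours, not part of any SDK:

```javascript
// Sketch of the REST call shape for NLC's classify endpoint.
// URL and field names are assumptions based on the v1 API of the era.
function buildClassifyRequest(classifierId, userText) {
  return {
    method: 'POST',
    url: 'https://gateway.watsonplatform.net/natural-language-classifier/api' +
         '/v1/classifiers/' + classifierId + '/classify',
    body: { text: userText }
  };
}

// Pick the winning intent from a (mocked) NLC-style response. NLC returns
// a list of classes with confidence scores; we take the top one if it
// clears a threshold, otherwise fall back to null (ask the user again).
function topIntent(nlcResponse, threshold) {
  var best = nlcResponse.classes[0];
  return best && best.confidence >= threshold ? best.class_name : null;
}
```

With a mocked response like `{ classes: [{ class_name: 'submit_claim', confidence: 0.92 }] }`, `topIntent` hands the middleware a single intent string to route on.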
Our middleware provides several functions for the prototype. First, it is the interface to Facebook Messenger. The middleware takes in user input from Messenger and decides whether to forward it on to NLC or Dialog. At the beginning of a conversation, user input is received and forwarded to the NLC to process and analyze what task the claimant is attempting to complete. From then on, the middleware talks to the Dialog and Messenger services to facilitate the rest of the conversation.
Second, when the Dialog service is in use, we check to see if the user uploaded photos. If the user did upload a photo, the middleware stores it.
Third, once the claimant has submitted all the necessary data, the middleware stores it. Last, when the claimant is requesting a status update, the middleware creates a Messenger-defined image template, including the claimant submitted photo and information, and sends it to Messenger.
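The routing decision described above can be sketched in Node.js. This is illustrative only, assuming in-memory conversation state and invented names, not the actual prototype code:

```javascript
// Active conversations keyed by Messenger sender id (in-memory for the sketch).
var activeConversations = {};

// Decide which service should handle an incoming Messenger message.
function routeMessage(senderId, message) {
  if (!activeConversations[senderId]) {
    // First message of a conversation: ask NLC what the claimant wants.
    activeConversations[senderId] = { intent: null, photos: [], slots: {} };
    return { service: 'nlc', payload: message.text };
  }
  var convo = activeConversations[senderId];
  if (message.attachments && message.attachments.length) {
    // Photos bypass the NLP services; the middleware stores them itself.
    convo.photos = convo.photos.concat(message.attachments);
    return { service: 'store', payload: message.attachments };
  }
  // Everything else continues the Dialog-managed flow.
  return { service: 'dialog', payload: message.text };
}
```

The design point is that NLC is consulted only once per conversation, while Dialog (and photo storage) handle everything after the intent is known.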
Node.js was our programming language of choice for the middleware. Using Node.js is not strictly necessary, since the interactions with Facebook and the IBM Watson services are all REST API calls, but the majority of the documentation provides example code in Node.js.
MongoDB was used as the persistence layer since it does not impose pre-defined schemas, which makes prototyping easier. Mongoose was used to interface with MongoDB.
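For illustration, here is the rough shape a claim document might take, with field names inferred from the requirements above rather than taken from any published schema:

```javascript
// Illustrative shape of a claim document as stored in MongoDB.
// Field names are assumptions drawn from the prototype's requirements.
function newClaim(senderId) {
  return {
    senderId: senderId,   // Messenger user the claim belongs to
    status: 'open',       // e.g. 'open' -> 'submitted' -> 'closed'
    location: null,
    vehicle: null,
    passengers: null,
    photos: [],           // uploaded accident photos
    createdAt: new Date()
  };
}

// The claim can be submitted once every required slot is filled.
function isComplete(claim) {
  return claim.location !== null &&
         claim.vehicle !== null &&
         claim.passengers !== null;
}
```

With Mongoose, the same shape would be declared as a schema, but since MongoDB does not enforce structure, the prototype can evolve this document freely.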
Webhooks are a way of integrating systems. They are usually configured to call a URL when an event is triggered. When setting up a bot for Facebook Messenger, you must supply a webhook URL that Messenger will call with the user input (whether text, photos, or voice) as the payload. The middleware has to be accessible to Messenger at setup time. That is not a problem if the chatbot is deployed on cloud infrastructure like Heroku or AWS, but it can create problems if the chatbot resides inside a private network. Then, the best option is a service like ngrok, which can provide a public interface for the duration of the development cycle.
Like any good webhook consumer, Messenger expects a response status for each request. Obviously, this can be problematic while building the chatbot, especially when not all the error handling code is in place. Expect Messenger to keep resending messages until they are acknowledged.
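Put together, the webhook has two jobs: answer the one-time verification handshake at setup and acknowledge every incoming message with a 200. A framework-free sketch follows; the hub.* parameter names come from Facebook’s webhook handshake, while the token value and function names are illustrative:

```javascript
// Illustrative verify token, chosen by you when configuring the Facebook app.
var VERIFY_TOKEN = 'my-secret-token';

// Pure handler so the HTTP server framing is left out of the sketch.
function handleWebhook(method, query, body) {
  if (method === 'GET') {
    // Setup handshake: echo the challenge back if the token matches.
    if (query['hub.verify_token'] === VERIFY_TOKEN) {
      return { status: 200, body: query['hub.challenge'] };
    }
    return { status: 403, body: 'token mismatch' };
  }
  // Incoming message: acknowledge immediately with a 200 even if the
  // payload will be processed asynchronously. Without the 200, Messenger
  // keeps redelivering the same message.
  processLater(body);
  return { status: 200, body: '' };
}

function processLater(payload) {
  // Placeholder for handing the payload to the middleware.
}
```

The key habit is returning the 200 before doing any slow work, so redeliveries never start in the first place.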
Access to Watson APIs is through IBM’s Bluemix platform, which is the largest Cloud Foundry installation. Cloud Foundry CLI tools can be used to set up all the Watson services before use. While we used NLC and Dialog for the prototype, it is worth investigating the full array of Watson APIs. In the future, we plan on integrating the sentiment analysis, tone analyzer, and emotional analyzer to determine how the chatbot can communicate more intelligently with users.
As seen below, there is no shortage of platforms to leverage when building chatbots.
One platform of note is wit.ai. It is owned by Facebook (but not meant to be exclusively used on Facebook) and has an interface for building mock conversations, called Stories, in the browser. Wit creates models from the Stories and tracks user input that can be categorized to provide supervised learning for the AI.
The existing rules of product design apply to building chatbots: analyze the intended goal, identify where chatbots can help, prototype, evaluate, and repeat as necessary.
Gain real consensus on the purpose of the chatbot. Stakeholders need to understand what is possible and what the limitations are. Scope creep creates complexity at an exponential rate, making the chatbot hard to implement and frustrating to use. On the flip side, when the chatbot starts interacting with humans, it should make clear the scope of what it can do.
For humans, conversation is natural. It is a process we learn starting at birth. Codifying it for a machine is a challenge. Mocking up conversations becomes a critical part of the process. Tools like Twine help map out the flow of communication. The next step is user research. Would users interact with the chatbot as mapped? Understanding what users think, feel, and how they react while interacting with the chatbot will provide valuable feedback for gathering the data the NLP systems need.
Temporary middleware systems are necessary because of how some customer-facing touch points, like Facebook, interact with the middleware. If the chatbot middleware is down for maintenance or upgrades, a user may come in and try to initiate a conversation with a “Hello.” If the user receives no response, they will likely say “Hello” again. When the chatbot comes back online, Facebook will send both “Hello” inputs to the middleware. The chatbot will probably assume both statements are the start of a conversation and provide duplicate responses. Obviously a terrible user experience. A solution is to stand up a temporary middleware system that can respond to the user after the initial “Hello” and state that the chatbot is unavailable and will return soon. At the same time, the temporary system needs to store the user input so that the chatbot can respond appropriately when back online.
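A minimal sketch of such a holding system, with invented names: greet each user once per outage, queue everything, and hand the backlog to the real chatbot when it returns.

```javascript
var OFFLINE_REPLY = 'Sorry, I am unavailable right now. I will reply soon.';

// Factory for a stand-in middleware used only while the chatbot is down.
function makeHoldingBot() {
  var queued = [];
  return {
    // Store every input; only reply to the first message from each sender
    // so repeated "Hello"s do not produce a pile of duplicate notices.
    receive: function (senderId, text) {
      queued.push({ senderId: senderId, text: text });
      var firstFromSender = queued.filter(function (m) {
        return m.senderId === senderId;
      }).length === 1;
      return firstFromSender ? OFFLINE_REPLY : null;
    },
    // Hand the backlog to the real chatbot once it is back online.
    drain: function () {
      var pending = queued;
      queued = [];
      return pending;
    }
  };
}
```

On recovery, the real middleware would replay `drain()`’s output through its normal routing so each queued message gets a proper response.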
Remember to provide an escape route. In the early iterations of a chatbot, it will likely not know what to do in every scenario. Track whether the conversation is going sideways and provide alternative communication methods.
We are early in the consumer-facing AI chatbot revolution. There are opportunities to enhance existing methods of communication and create better engagement, but new types of interactions will develop that bring brands to consumers instead of depending on customers to land where brands want them to be.
While we are discussing text-based communications, the next evolution is already underway: voice interactions. The current incarnations of Siri, Google Now, and Alexa have shown us what will be possible shortly. The good news is the infrastructure and systems brands will build for chatbots will be portable to voice.
Since the mainstreaming of the Internet, the assumption has been that we are at the cusp of the next big revolution.
We are going on 20 years, and it’s not here yet. Will it ever?
Let’s back up a bit. My degree is in computer engineering, so I learned how to design circuits and chips and how to write operating systems. To me, there is no magic to digital or executions that are digital. Digital is what we want it to be. Nothing more, nothing less.
Then why do people have unrealistic expectations of it?
Look at your average consultant, CX expert, or futurist talking about digital. They understand the current applications of digital, and they have all used digital tools, the output of digital. The ones who are or have been in the bowels of digital are more neutral about its prospects (unless they are trying to sell you something).
I stumbled across Robert Gordon’s US Economic Growth is Over: The Short Run Meets the Long Run, which codified some of what I was seeing and felt.
Visit a third world country; it is easy to see what is revolutionary. Things like electricity, gas, water, and sewer access.
The combination of these connections resulted in the societal shifts of the 1920s to 1970s. It changed where and how we lived our lives. That is what we are seeing in the BRIC countries currently. Everyone expected the benefits of these new connections to continue and compound and get us our jetpacks in the 21st century. But Peter Thiel said it best: “We wanted flying cars, instead we got 140 characters.” Most people are still commuting and pushing around (digital) paper.
We don’t have the right combination to create the next epic change. Think of marketing; you need to get the 4Ps correct. Read the Ten Types of Innovation. The authors point out that for a company to differentiate, it needs to get a combination of four or five things right.
Will big data and AI get us there? If only applied to digital, probably not. We are going to need more significant changes from other fields. How about the nano-, neuro-, and green-?
With AI chatbots, the conversation is the hard part, not the frameworks and tools.
Think about prototyping an insurance bot, integrating IBM Watson, Facebook Messenger, and claims databases. The most challenging component, by far, is the conversation. We have to understand the audience and their communication patterns. Talk to customers who have filed claims or are filing claims over the phone and online. Talk to customer service reps and claims adjusters. Listen to logged phone calls and read case emails. Research conversation patterns in messaging.
Right now, chatbot developers are treating the conversation component like a simple decision tree. But people don’t talk in decision trees; conversation is a back-and-forth negotiation. Non-developers are using the “no coding” frameworks, and the results are disappointing. When developers try to integrate conversational intelligence, they fail. When there is a clear problem definition, developers are great at integrating tools and frameworks. But mapping out how ordinary people converse, in general or within a domain, is a different skill set. Most UX professionals aren’t prepared for that either, so there is a talent gap.
tl;dr: There are lots of good use cases for AI chatbots, but too many people see it as a gold rush. Most are not doing the work to solve real problems.
Chatbots are hot right now. They have all the hype. That, and virtual reality. But let’s talk about chatbots. I’m writing a post on the company blog about chatbots, but I thought I’d put down some thoughts here. Figuring out why chatbots even matter is an interesting exercise. It seems obvious: communication on the Internet is an utter disaster at the moment.
There are four factors. First, online ads downright suck. The ads themselves are not creative, and they are in your face. That is a terrible combination. Agencies and marketers don’t get excited about banner ads and text ads, unless they are “digital” marketers and that is all they know. Video and native ads have possibilities, but the executions are bad. Our brains have become dulled from creating banner and text ads. What’s wrong with interrupting the user experience with overlaying ads and tiny close buttons?
The more important question is why TV and radio get away with it. My main theory: even though we all grew up with interruptive ads in those mediums, we had no alternatives. There was no Facebook or Snapchat to occupy our attention during commercial breaks. Would TV and radio ads be possible if created today? Another theory is that Americans like breaks in stories. Look at popular American sports: football, baseball, and basketball. They are all situational. One team gets the opportunity to score, and then the next team tries to score. While not entirely applicable to basketball, timeouts make it situational.
If a viewer has watched a few shows on Netflix, how can they go back to live TV? It is painful. The only reason to watch live TV is live events like sports. But social media, which is taking eyeballs from TV ads, makes live TV worth watching. Sharing thoughts on social networks when watching live TV makes it more enjoyable.
Back to why chatbots are huge right now. The second factor: messaging is a big deal. Business Insider created the chart above showing how messaging usage has surpassed social networks. It’s understandable. As soon as a social network gets big, mom and dad show up and the advertisers move in. Then the cool kids migrate to another social network. But with messaging, you don’t have to see your dumb uncle’s Trump posts. Messaging gives users one-on-one interaction while also letting them interact in separate small groups. Kids are comfortable with the UX and asynchronous nature of messaging. They are the perfect audience for chatbot-type interactions.
Next, websites and apps are painful. Marketers and agencies design websites and apps without putting the customer at the center. Look at any “digital” agency website. Stuff is flying around; it’s punch-the-visitor-in-the-face annoying. The websites are a demonstration of how they can grab the visitor’s attention. Sure. Throw in the bloated nature of websites due to ads and trackers, and why are we surprised ad blocking is on the rise?
The last factor: AI is more accessible. Computing and infrastructure costs have plummeted. The resources to build AI and machine learning are available to anyone. Granted, a lot of what we see is more decision tree than AI, but it is a start in the right direction. The investments in these technologies are already evident: Siri (ok, not so much), Alexa, the multitude of Google products. The plugin architecture of these platforms increases the impact and value of what AI provides. Once users are comfy with AI as their primary interface, they will demand it everywhere. More on this next time.
Let’s hope brands don’t build crappy chatbots and screw up this game-changing opportunity.