May 31, 2021
Introduction to NLP, BERT and Transformers (short story)
Paul and Laura...
As Customer Service Managers, Paul and Laura struggle to deliver the level of Customer Experience that their customers and their management expect from their teams.
They know - vaguely - that new technologies such as Artificial Intelligence, Natural Language Processing and Robotic Process Automation could help them, but they don't know how. These terms are everywhere, yet they still struggle to understand them and the benefits they could bring.
That is why they have accepted the invitation to this exhibition about “cutting edge technologies”.
Let’s meet Laura and Paul as they are now walking into the NLP section of this exhibition.
They are passing a stand advertising a company that claims to "Empower Customer Service Agents".
That seems interesting, promising and understandable!
Let us find out ….
Laura: Good morning
Scott (the stand’s representative): Good morning, how may I help you ?
Laura: I am not sure as I don't know what you are doing ...
Scott: That is fair. (smile). Our solution helps Customer Service organizations by handling recurring and time-consuming activities so that your teams can focus on more meaningful, relationship-building activities. That is how we “Empower Customer Service Agents”.
Paul: OK...but how do you do that ?
Scott: We are using NLP. NLP stands for Natural Language Processing. NLP allows an IT system to understand the written messages you are receiving. Based on the intent of the message, we carry out some activities. A common example is when one of your customers asks a question about one of your products. Our solution understands the customer's intent from the email, gathers the appropriate information and sends an email back to the customer.
Paul: Does that really work ?
Scott: Of course (smile). Would you like me to explain how ?
Paul: Yes, please
Scott: The first thing to keep in mind about NLP is how an IT system understands words. There is nothing magical, it is all about numbers and statistics. Computers only understand zeros & ones, so all the English words need to be converted into strings of zeros & ones.
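(An aside for technically minded readers. What Scott describes can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the sample sentences, the helper names build_vocab, encode and to_bits, and the 8-bit width are all made up for this example, and real NLP systems use far larger vocabularies and smarter ways of splitting text.)

```python
# Toy illustration: turn words into integer IDs, then into zeros and ones.
# The sentences and helper names are hypothetical, chosen for this example only.

def build_vocab(sentences):
    """Give each distinct lowercase word an integer ID, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Replace each word with its ID; words missing from the vocabulary get -1."""
    return [vocab.get(word, -1) for word in sentence.lower().split()]

def to_bits(word_id, width=8):
    """Write a non-negative ID as a fixed-width string of zeros and ones."""
    return format(word_id, f"0{width}b")

corpus = ["where is my order", "my order is late"]
vocab = build_vocab(corpus)
ids = encode("my order is late", vocab)
bits = [to_bits(i) for i in ids]

print(ids)
print(bits)
```

A word the system has never seen, such as a product name specific to one organization, comes back as -1 here. That is the kind of gap Paul asks about next.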
Paul: Is this possible ? Can an IT system understand all the English words ? Even the ones specific to my organization ?
Scott: Yes, and they can do this in different languages.
Laura: That is great.
Scott: The practical aspects of this are actually a little bit more complicated.
Laura and Paul: We knew it (smile)
Getting the Context
Scott: Let me explain. Up to about two years ago, NLP systems worked based on a dictionary with the English words on one side and a corresponding series of zeros and ones on the other. However, results were not that great as a word might have different meanings. If you have only one unique series of zeros and ones for a word, you lose the context of the word.
Paul: Not sure I get this...
Scott: Let us take an example: let us say you are walking on a shaded river bank and you take out of your pocket one of your bank statements. Does the word "bank" have the same meaning next to the word “river” as it does next to the word “statement” ?
Paul: Well, no…
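(An aside for technically minded readers. The limitation Scott points to can be shown with a toy dictionary. The three-number codes below are invented for illustration; real systems learn hundreds of numbers per word. The point is only that a fixed dictionary returns the same code for "bank" whatever its neighbours are.)

```python
# Toy "dictionary" approach: one fixed code per word.
# The numbers are hypothetical and carry no real meaning.
static_codes = {
    "river": [0.9, 0.1, 0.0],
    "bank": [0.5, 0.5, 0.2],
    "statement": [0.0, 0.2, 0.9],
}

def lookup(sentence):
    """Return the fixed code of every known word in the sentence, in order."""
    return [static_codes[word] for word in sentence.split() if word in static_codes]

river_sentence = lookup("river bank")
money_sentence = lookup("bank statement")

bank_by_river = river_sentence[1]      # "bank" next to "river"
bank_by_statement = money_sentence[0]  # "bank" next to "statement"

same_code = bank_by_river == bank_by_statement
print(same_code)  # True: the dictionary cannot tell the two meanings apart
```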
Scott: That is why getting the context of a word is critical. And, that was the key issue in the NLP world: enabling a system to understand the context of a word. Some systems existed but their performance was limited. You might have heard of Recurrent Neural Network (RNN), Long Short Term Memory (LSTM) or even Convolutional Neural Network (CNN).
Laura: Uh no. I have never heard of these terms ...
Scott: That is normal. These are technologies which enable a system to process words. But they struggled to understand words in their context. They had trouble telling whether the word "bank" was meant as in a bank statement or a river bank. But things changed about two years ago: a revolution occurred in the NLP world.
Laura: A revolution ?
Transformer and the BERT family
Scott: Yes, that is really what it is. A new type of algorithm has been released by Google which enables technologies to understand the context of a word in a sentence. To do so, the algorithm pays attention to all the words in a sentence and how they relate to each other. This technology is called "Transformer" and it outperforms all the previous NLP algorithms... by far. Previous algorithms, like RNNs or LSTMs, read a sentence one word at a time and in practice struggled to retain context beyond a handful of previous words, which was not enough and led to poor results.
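(An aside for technically minded readers. The idea of "paying attention to all the words" can be sketched as a heavily simplified self-attention step. This is a sketch under assumptions, not the actual Transformer: real models add learned query, key and value projections, several attention heads and many stacked layers, and the two-number word codes below are invented for illustration.)

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Simplified self-attention: each word's new code is a weighted mix of
    the codes of ALL words in the sentence. Queries, keys and values are the
    input vectors themselves (no learned projections)."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(mixed)
    return outputs

# Hypothetical codes: first number ~ "nature", second number ~ "finance".
codes = {"river": [1.0, 0.0], "bank": [0.5, 0.5], "statement": [0.0, 1.0]}

river_context = self_attention([codes["river"], codes["bank"]])
money_context = self_attention([codes["bank"], codes["statement"]])

bank_near_river = river_context[1]
bank_near_statement = money_context[0]

print(bank_near_river)
print(bank_near_statement)
```

The word "bank" starts from the same code in both sentences but ends up with a different one in each, pulled toward "river" in the first and toward "statement" in the second. That is the contextual behaviour Scott describes.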
Paul: Interesting. So, there is a real breakthrough in NLP technology, a huge shift in the way NLP solutions handle text messages. At least for the ones using Transformer, unlike the others ?
Scott: Exactly. I can show you some benchmarks that demonstrate this and, even more interestingly, show how Transformer technology can match human scores on some reading comprehension tests.
Laura: That is very surprising. I did not know that.
Scott: The last key point on the technology side is that Transformer is the architecture which enables new models to achieve this performance. These models are called BERT. It is actually a family. You have BERT, the initial model from Google, then Facebook created their own, called RoBERTa, and French researchers created a French version called CamemBERT, like the cheese !!!
Laura: That is funny !
Paul: And these models work in any language ?
Scott: Yes, this is done by leveraging what we call "language models".
Laura: OK, what are these ?
From BERT to Language Models
Scott: Some companies have trained their own models, based on BERT, RoBERTa or CamemBERT, using vast amounts of text, like the whole of Wikipedia. The goal is for these models to understand the language on which they have been trained. So you have a language model for English, one for French, one for Spanish ....
Paul: All right, I understand but how do you make it specific to our organization ?
Scott: The second key strength of these new models, on top of understanding the contextual meaning of a word, is that they can be customized for your own organization so that they understand your specific vocabulary and data.
Laura: How can you do that ? Can you do that and still understand the context of all words and reach human comprehension ?
Scott: Yes, this is done through an activity called fine tuning. The BERT family of models is based on deep learning neural networks. So “fine tuning” them means tweaking some of their parameters so they match your specific terms better.
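(An aside for technically minded readers. The principle of fine tuning, starting from parameters that already work in general and nudging them with a few examples of your own, can be sketched as below. Note the stand-in: this is a tiny logistic regression with three parameters, not BERT, which has many millions of parameters and is fine tuned with dedicated libraries. The features, the starting weights and the example messages are all hypothetical.)

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Probability that a message is a billing request."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def fine_tune(weights, examples, learning_rate=0.5, epochs=200):
    """Continue gradient descent from existing weights on a few new examples."""
    weights = list(weights)  # copy, so the original weights stay untouched
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, features) - label
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
    return weights

# Hypothetical features: [mentions "refund", mentions "ticket", constant 1]
# Hypothetical generic model: "refund" signals billing, "ticket" means nothing special.
pretrained = [2.0, 0.0, -1.0]

# In this made-up organization, a "ticket" is a purchased travel ticket,
# so messages mentioning it are billing requests too.
company_examples = [
    ([0, 1, 1], 1),  # mentions "ticket"  -> billing
    ([1, 0, 1], 1),  # mentions "refund"  -> billing
    ([0, 0, 1], 0),  # mentions neither   -> not billing
]

tuned = fine_tune(pretrained, company_examples)

before = predict(pretrained, [0, 1, 1])
after = predict(tuned, [0, 1, 1])
print(round(before, 2), round(after, 2))
```

Before fine tuning, the generic model rates a "ticket" message as unlikely to be about billing. After a few passes over the organization's own examples, the same kind of message is rated as likely billing, while the original weights are left unchanged.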
Laura: I understand.
Scott: By the way, we have tools that show, in an easy-to-understand way (smile), why & how the model is performing the way it is. It is not a black box, as some other neural network models can be, where you do not know what is going on.
Laura: How do you do that ?
Scott: We use a tool called LIT. I can give you a demo later if you want.
Laura: That will be very interesting. What about your solution, could you give us a demo as well ?
Scott: Yes, of course (smile).
Paul: So, how can you "empower my people" ?
Scott: By doing some activities which are time-consuming, repetitive, and boring for a human. For instance, we can classify any type of message automatically. In your CS organization, do you have a ticketing system ?
Scott: And, therefore, some of your team members need to read the tickets to classify them and set up the right priority ?
Scott: Well, Y Meadows could do these activities for you: read the message, classify it, set the right priority .... and it could do that in a few seconds, 24/7. No ticket will sit waiting in a queue, sometimes for a long while. And your agents are freed from this boring activity.
Laura: Indeed that is very interesting. Does Y Meadows understand everything ?
Scott: On average, it correctly understands about 90% of the messages.
Paul: That is impressive !
Scott: Indeed. As you see, Y Meadows can understand a message and take an action, like classifying it. But Y Meadows can also do much more complicated activities, such as looking up information in a data source or updating a database. This is the RPA aspect of Y Meadows, and we can discuss it later if you want.
What I wanted to point out, from an NLP perspective, is that Y Meadows can also create messages. Let us say you receive a request from a user that would like some information about your product. Y Meadows will understand the request, grab the requested information about the product and write an answer back to the requestor.
Laura: We have loads of these requests. They take loads of time and don't bring much value to our company.
Scott: You will be even happier when you know that the answer from Y Meadows is immediate and that Y Meadows works 24/7. Your agents will be happy too: they are freed from this boring activity and can focus on more interesting work.
Paul: Indeed. I think this is the typical scenario. What do you think, Laura ?
Laura: Completely agree, Paul
Scott: By the way, if you prefer, Y Meadows can draft the answer and wait for one of your agents to validate it. In that case, the email is prepared, your agent reads the message that Y Meadows has drafted, modifies it if needed and then just clicks "ok". Y Meadows will then send the message and perform any follow-up activity, such as updating a database, as needed. It is a huge time saver.
Laura: Extremely interesting. I really did not know that such technologies were available and bundled into one solution like Y Meadows.
Scott: That is why I am here: to promote these new ways of working enabled by Y Meadows. Would you like a coffee before we discuss some of these scenarios in more detail, so that you get a complete picture of how to fully empower your CS agents?
To be continued ....