What is Natural Language Processing?

Natural Language Processing (NLP) is a field in Machine Learning which focuses on making written language understandable to computers so that they can perform tasks which humans can perform easily.

Recent NLP Technical Disruptions

NLP technology has changed dramatically in the last eighteen months, enabling enterprise solutions that simply were not feasible before.

New “transformer” models, which achieve a high level of accuracy on common tasks, can be pre-trained on large corpora of text without any annotations. This means that much less customer data is needed to apply the model’s accumulated language understanding to any particular task.

NLP Can Transform Customer Service

These are just some of the ways that NLP can transform your business: 

Automatic classification and tagging of messages and service tickets

Triage and routing of emails and service tickets

Automated response to forms and emails

Extracting information from customer documents and email attachments

Insights into customer sentiment

Scanning of social media to find relevant messages

All of these enable more efficient and effective customer service, leading to superior customer experiences and better productivity for your organization.

How does Natural Language Processing work?

This section introduces some of the key concepts of Natural Language Processing (NLP).

An NLP project is typically carried out as a series of steps, or a “pipeline”. The overall goal of the NLP pipeline is to convert sequences of text, like sentences or documents, into vectors of numbers that can then be compared quantitatively.

Segmenting the text into single tokens (Tokenization)

The first step of most NLP pipelines is to split the sequence of text into smaller units called “tokens”. There are three main “tokenization” approaches: splitting into words, into sub-words, or into individual characters.
Some NLP pipelines perform extra pre-processing. They may remove non-alphabetic data (such as punctuation), reduce tokens to a canonical form (for example, by finding the stem of the word), or attempt to automatically fix errors in the text.
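
To make the three approaches concrete, here is a minimal sketch in Python. The function names and the tiny VOCAB set are our own assumptions for illustration; real sub-word tokenizers learn their vocabulary from a large corpus, and the greedy longest-match split shown here is only one common way of applying such a vocabulary.

```python
import re

# Hypothetical sub-word vocabulary, hand-written for this example.
# The "##" prefix marks a piece that continues a word.
VOCAB = {"token", "##ization", "un", "##related"}

def word_tokens(text):
    """Word-level: lowercase the text and keep runs of letters, digits, apostrophes."""
    return re.findall(r"[a-z0-9']+", text.lower())

def char_tokens(text):
    """Character-level: every non-whitespace character is a token."""
    return [c for c in text if not c.isspace()]

def subword_tokens(word, vocab):
    """Sub-word level: greedily take the longest known piece, left to right."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No known piece fits: mark the whole word as unknown.
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces

print(word_tokens("Hello, world!"))             # ['hello', 'world']
print(char_tokens("hi you"))                    # ['h', 'i', 'y', 'o', 'u']
print(subword_tokens("tokenization", VOCAB))    # ['token', '##ization']
```

Note that the word-level function also drops punctuation, which is one of the optional pre-processing steps mentioned above.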

Converting the words into numbers (Embedding)

Once the sentence is broken into a collection of tokens, we have to convert these tokens into numerical vectors. These vectors represent the words as input to the model that will extract the needed information from them.
This conversion of words or sub-words to numerical values can be learned independently, typically from large corpora of text that contain many instances of the words in all sorts of contexts. With transformer models, this conversion step is typically learned together with the rest of the model, which is trained on sequences from the same large corpora of text.
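
As a rough illustration, an embedding can be pictured as a lookup table from tokens to vectors, and "comparing quantitatively" often means measuring the angle between vectors (cosine similarity). In the sketch below, the three-dimensional vectors in EMBEDDINGS are made up by hand purely for the example; learned embeddings usually have hundreds of dimensions.

```python
import math

# Hypothetical hand-picked vectors; a real model learns these from text.
EMBEDDINGS = {
    "refund":        [0.9, 0.1, 0.0],
    "reimbursement": [0.8, 0.2, 0.1],
    "password":      [0.0, 0.9, 0.3],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback for tokens missing from the table

def embed(tokens):
    """Look up one vector per token."""
    return [EMBEDDINGS.get(token, UNKNOWN) for token in tokens]

def cosine(a, b):
    """Cosine similarity: close to 1.0 for similar directions, 0.0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

refund, reimbursement, password = embed(["refund", "reimbursement", "password"])
print(cosine(refund, reimbursement))  # high: related meanings
print(cosine(refund, password))       # low: unrelated meanings
```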

Information Extraction

Once the sequences of words are converted to sequences of numerical vectors, models can be trained to recognize patterns in the sequences which correspond to information that we are interested in. For example, sentences and documents contain special words and phrases known as “entities”. Examples of “entities” are people’s names, addresses, dates, currency amounts or product names. Named Entity Recognition (NER) identifies these entities so that they can be processed appropriately. For example, we may want to find instances of personally identifiable information (PII) in documents, or we may be looking for company confidential information.
NER is only one of the ways in which models are able to extract information from text. NLP also includes tasks like sentiment analysis, phrase extraction, named entity disambiguation and linking, relation extraction, and event extraction.
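
To show what the output of entity extraction looks like, here is a deliberately simplified sketch. It uses hand-written regular expressions rather than a trained model, so it is a rule-based stand-in for NER and not NER itself. The labels and patterns are our own assumptions and only cover a few easy formats.

```python
import re

# Hypothetical patterns for illustration; a trained NER model would learn
# to recognize entities from labeled examples instead.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE":  re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),      # ISO dates only
    "MONEY": re.compile(r"[$€£]\d+(?:\.\d{2})?"),
}

def find_entities(text):
    """Return (start, end, label, matched_text) tuples in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), label, match.group()))
    return sorted(found)

message = "Contact jane.doe@example.com before 2020-06-30 about the $49.99 charge."
for start, end, label, value in find_entities(message):
    print(label, value)
```

Each result carries the position of the entity in the text, which is what lets a downstream step redact or route it.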

Model Development

Before Transformers, NLP models typically read a sentence one word at a time, in order. Transformer models, which were introduced in 2017 and became widely adopted in 2018, changed this paradigm with “attention mechanisms”. A Transformer model reads the whole sentence at once and uses attention to estimate how strongly each word relates to every other word in the sentence. This lets the model take the full context of a word into account, so that less context is lost or “forgotten” than with the earlier sequential algorithms.
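
The core of an attention mechanism can be sketched in a few lines. The version below is a simplification under stated assumptions: it applies scaled dot-product attention directly to the input vectors, whereas a real Transformer first passes them through learned query, key and value projections and uses several attention "heads". The example vectors are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections."""
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        # Score this token against every token in the sentence at once.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    # Each output is a weighted average of all token vectors.
    outputs = [
        [sum(w * v[i] for w, v in zip(row, vectors)) for i in range(dim)]
        for row in weights
    ]
    return weights, outputs

# Three made-up token vectors: the first two are similar, the third is not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights, outputs = self_attention(tokens)
print(weights[0])  # the first token attends more to the second than to the third
```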

The best-known Transformer model is BERT, which stands for Bidirectional Encoder Representations from Transformers. BERT encodes each word (or sub-word) in a sentence so that the context both before and after it is taken into account in its representation; that is, it is “bidirectional”. It is what is known as a “language model”: a model that transforms text into its own encoding using its statistical “knowledge” of the language. BERT is trained on unlabeled corpora of text, including the entire English Wikipedia (about 2,500 million words).

Once BERT is “pre-trained” to understand a language, it can be fine-tuned for a task relevant to a specific industry or business, producing state-of-the-art models for a wide range of NLP tasks. Such tasks include Sentiment Analysis, Question Answering and Intent Recognition. These tasks typically require manually labeled data, which is time-consuming and expensive to produce, so the labeled datasets are usually much smaller.

Model Evaluation

A model is evaluated on its internal performance and also on the business outcomes it provides. The internal performance of a model is measured over a dedicated “testing set” of data with indicators like precision (the share of the model’s positive predictions that are correct), recall (the share of actual positives that the model finds), and the F1 Score (the harmonic mean of the two). Once the internal performance of the model is sufficient, we need to ensure that the model fulfills the business objectives it has been developed for.
For instance, for a spam classification model, we would evaluate the internal performance of the model (using the F1 Score), and we would also verify with the end users that the model works well, for example by checking that they have not received any spam in their mailbox and that no legitimate email has been wrongly classified as spam.
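
For the spam example, the internal metrics can be computed directly from the true and predicted labels of the testing set. The sketch below uses the standard definitions; the function name and the six-email test set are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall and F1 for one class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # good mail flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up testing set: one spam email is missed, one good email is flagged.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "spam", "ham",  "spam", "ham", "ham"]
print(precision_recall_f1(y_true, y_pred))  # all three values are 2/3 here
```

Low precision corresponds to legitimate email being wrongly classified as spam, and low recall corresponds to spam reaching the mailbox, which are the two things we would check with end users.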

Model Monitoring and Retraining

Once the model is rolled out in the production environment, it needs to be monitored to ensure its performance remains appropriate. A common way to do this is known as “Human in the Loop”: humans are asked to identify whenever the model makes a mistake, and their corrected labels are saved and added to the training set.
From time to time, once enough new data is available, the model is retrained so that it takes this data into account.
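
One possible way to organize this loop is sketched below. The FeedbackLoop class, its threshold, and the retrain callback are all hypothetical names chosen for the example; how retraining actually runs depends on the model and infrastructure in use.

```python
class FeedbackLoop:
    """Collect human corrections and trigger retraining once enough accumulate."""

    def __init__(self, training_set, retrain, threshold=100):
        self.training_set = list(training_set)
        self.retrain = retrain        # placeholder callback that retrains the model
        self.threshold = threshold    # assumed: how many corrections justify a retrain
        self.pending = []

    def review(self, text, predicted, human_label):
        """Record one human review; return True if retraining was triggered."""
        if predicted != human_label:
            # The model made a mistake: save the corrected label.
            self.pending.append((text, human_label))
        if len(self.pending) >= self.threshold:
            self.training_set.extend(self.pending)
            self.pending.clear()
            self.retrain(self.training_set)
            return True
        return False

# Usage with a stand-in retrain function that only reports the data size.
loop = FeedbackLoop([("hello", "ham")],
                    retrain=lambda data: print("retraining on", len(data), "examples"),
                    threshold=2)
loop.review("win money now", predicted="ham", human_label="spam")
loop.review("free prize", predicted="ham", human_label="spam")
```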

Follow the latest trends in AI Customer Service Automation and NLP

We plan to regularly monitor the latest trends and publish our thoughts on AI Customer Service Automation and Natural Language Processing (NLP) from both a business and technical perspective.


© 2020 Y Meadows, Inc.