A Quick Guide to Natural Language Processing (NLP)
What is NLP?
Natural Language Processing (NLP) is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech. It is a sub-field of Artificial Intelligence (AI) focused on enabling computers to understand and process human languages, and on bringing them closer to a human-level understanding of language. Computers don’t yet have the same intuitive understanding of natural language that humans do. There is a big difference between the way humans communicate with one another and the way we “talk” with computers. When writing programs, we have to use very careful syntax and structure, but when talking with other people, we take a lot of liberties. We make short sentences. We make longer sentences. We layer in extra meaning, and we use puns and sarcasm. We find multiple ways to say the same thing.
That being said, recent advances in Machine Learning (ML) have enabled computers to do quite a lot of useful things with natural or human language. Deep Learning has enabled us to write programs that perform tasks like language translation, voice assistance (e.g. Apple’s Siri, Amazon’s Alexa, Google Home etc.), semantic understanding, and text summarization. As AI becomes ubiquitous by finding its way into more and more of our devices and tasks, it becomes critically important for us to be able to communicate with computers in the language we’re familiar with. We can always ask programmers to write more programs, but we can’t ask consumers to learn to write code just to ask Siri for the weather. Consumers have to be able to speak to computers in their “natural” language.
The Necessities & Challenges of NLP
A lot of information in the world is unstructured, i.e. raw text in English or another human language. As long as computers have been around, programmers have been trying to write programs that understand languages like English. Soon after the first appearance of ‘electronic calculators’, research began on using computers as aids for translating natural languages. The beginning may be dated to a letter in March 1947 from Warren Weaver of the Rockefeller Foundation to cyberneticist Norbert Wiener. Two years later, Weaver wrote a memorandum (July 1949) putting forward various proposals based on the wartime successes in code breaking, the developments by Claude Shannon in information theory, and speculations about universal principles underlying natural languages. The reason behind this continual interest in NLP is pretty obvious: humans have been writing things down for thousands of years, and it would be really helpful if a computer could read and understand all that data.
Human language is one of the most diverse and complex parts of us, considering that over 6,500 different languages are spoken around the world. It has been estimated that almost 80% of available big data from text messages, tweets, Facebook posts etc. is in unstructured or natural language form. The process of reading and understanding English is very complex, and that’s not even considering that English doesn’t always follow logical and consistent rules. There are many things that go into truly understanding what a piece of text means in the real world. For example, what do you think the following piece of text means? “I was on fire last night and completely destroyed the other team!” We humans get the layered meaning behind this statement quite easily and almost intuitively. Our evolved brains effortlessly make a connection between the phrases ‘other team’ and ‘was on fire’ and conclude that the speaker is most probably talking about some kind of sport that they excelled in the night before. This is not the case for (regular) computers, which rely on structured data and strict algorithms to make sense of things. To a computer, the sentence “I was on fire last night and completely destroyed the other team” would literally mean that the speaker was lit with fire, like actual fire, and literally destroyed or killed other people by burning them!
NLP with Machine Learning
Computers using regular non-AI algorithms are not very capable of processing natural language. Even where they can manage it, the approach is not efficient, as a new algorithm would have to be written for every new document or set of words. This is where machine learning comes to the rescue. Thanks to recent developments in ML, we can do some really clever things to quickly extract and understand information from natural language: rather than hand-writing rules for every case, we train a model that learns patterns it can then apply to any new phrase or document.
For example, take a look at the following paragraph taken from Wikipedia:
“Vancouver is a coastal seaport city in western Canada, located in the Lower Mainland region of British Columbia. As the most populous city in the province, the 2016 census recorded 631,486 people in the city, up from 603,502 in 2011. The Greater Vancouver area had a population of 2,463,431 in 2016, making it the third-largest metropolitan area in Canada. Vancouver has the highest population density in Canada with over 5,400 people per square kilometer, which makes it the fifth-most densely populated city with over 250,000 residents in North America behind New York City, Guadalajara, San Francisco, and Mexico City according to the 2011 census.”
This paragraph contains a lot of useful information about a city in Canada and, most importantly, this information is not in computer code; it’s in unstructured, natural form. ML can be used to make sense of this paragraph and extract useful information from it.
Doing anything complicated in machine learning usually means building a pipeline. The idea is to break up your problem into very small pieces and then use ML to solve each smaller piece separately. Then by chaining together several ML models that feed into each other, you can do very complicated things, like understanding the nuances of human language.
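To make the pipeline idea concrete, here is a minimal sketch in Python. The three stage functions (`split_sentences`, `split_words`, `lowercase`) are hypothetical stand-ins invented for this example; in a real pipeline each stage would typically be a trained model rather than a one-line rule.

```python
from functools import reduce
from typing import Any, Callable, List

Stage = Callable[[Any], Any]


def build_pipeline(stages: List[Stage]) -> Stage:
    """Return a function that feeds the output of each stage into the next."""
    def run(data: Any) -> Any:
        return reduce(lambda value, stage: stage(value), stages, data)
    return run


# Hypothetical stand-in stages; real pipelines would plug trained models in here.
def split_sentences(text: str) -> List[str]:
    return [part.strip() for part in text.split(".") if part.strip()]


def split_words(sentences: List[str]) -> List[List[str]]:
    return [sentence.split() for sentence in sentences]


def lowercase(tokenized: List[List[str]]) -> List[List[str]]:
    return [[word.lower() for word in words] for words in tokenized]


pipeline = build_pipeline([split_sentences, split_words, lowercase])
result = pipeline("Vancouver is a city. It is in Canada.")
print(result)
```

Because every stage only depends on the output of the previous one, any single stage can be swapped for a better model without touching the rest.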
The basic steps that an ML model follows in an NLP pipeline are the following:
- Step 1: Sentence Segmentation
The first thing the ML model does is break the given paragraph into separate sentences. This is quite intuitive, since human beings tend to do the same thing: they process the meaning one sentence at a time.
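As a rough illustration of sentence segmentation, the sketch below uses a hand-written regular expression instead of a trained model. The rule (split after ".", "!" or "?" when followed by whitespace and a capital letter or opening quote) and the function name `segment_sentences` are assumptions made for this example.

```python
import re
from typing import List

# Split after ., ! or ? when followed by whitespace and then an uppercase
# letter or an opening quotation mark. This is a heuristic, not a trained model.
_SENTENCE_BOUNDARY = re.compile(r'(?<=[.!?])\s+(?=[A-Z"“])')


def segment_sentences(text: str) -> List[str]:
    return [part.strip() for part in _SENTENCE_BOUNDARY.split(text) if part.strip()]


text = (
    "Vancouver is a coastal seaport city in western Canada. "
    "The 2016 census recorded 631,486 people in the city, up from 603,502 in 2011. "
    "Is it densely populated? Yes!"
)
sentences = segment_sentences(text)
for sentence in sentences:
    print(sentence)
```

A rule this simple breaks on abbreviations such as “Dr. Smith”, which is one reason real pipelines tend to learn sentence boundaries from data.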
- Step 2: Word Tokenization
After separating the different sentences, the next step is to extract the words from each sentence one by one. The algorithm for tokenization can be as simple as identifying a word every time a ‘space’ is noticed, with punctuation marks treated as separate tokens. In the first sentence of the given paragraph, the tokens will be the following: “Vancouver”, “is”, “a”, “coastal”, “seaport”, “city”, “in”, “western”, “Canada”, “,”, “located”, “in”, “the”, “Lower”, “Mainland”, “region”, “of”, “British”, “Columbia”, “.”.
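Here is a small sketch of tokenization on the first sentence of the Vancouver paragraph. Splitting on spaces alone leaves punctuation glued to words, so this example assumes a regular expression that keeps numbers like “631,486” together and emits punctuation marks as their own tokens. The pattern and the name `tokenize` are choices made for this illustration.

```python
import re
from typing import List

# A token is either a run of word characters (allowing internal , . ' - as in
# "631,486" or "fifth-most") or a single punctuation character.
_TOKEN = re.compile(r"\w+(?:[-',.]\w+)*|[^\w\s]")


def tokenize(sentence: str) -> List[str]:
    return _TOKEN.findall(sentence)


sentence = (
    "Vancouver is a coastal seaport city in western Canada, "
    "located in the Lower Mainland region of British Columbia."
)
tokens = tokenize(sentence)
naive_tokens = sentence.split()  # whitespace only: "Canada," keeps its comma

print(tokens)
print(naive_tokens[8])
```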
- Step 3: ‘Parts of Speech’ Prediction
As the name suggests, this step involves identifying each word’s part of speech, i.e. whether it is a noun, verb, adjective and so on. Identifying a word’s part of speech helps the ML model understand the role it plays in the sentence. It’s important to highlight that the ML model does not actually understand the word’s meaning in the context of the sentence the way a human being would. The model first has to be fed a lot of data, i.e. millions of English sentences along with the correct part-of-speech tag for each word. Learning from labelled examples in this way is, in essence, the main characteristic of machine learning and deep learning.
In the first sentence our ML model will identify the word “Vancouver” as a Proper Noun, based on the patterns it has learned from that training data.
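The idea of learning tags from labelled sentences can be sketched with a very simple “most frequent tag” model. This is a toy stand-in for the much larger statistical models used in practice; the three training sentences and the tag names (`PROPN`, `VERB`, `DET`, `ADJ`, `NOUN`) are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

TaggedSentence = List[Tuple[str, str]]

# Tiny hand-labelled corpus, invented for illustration only.
TRAINING_DATA: List[TaggedSentence] = [
    [("Vancouver", "PROPN"), ("is", "VERB"), ("a", "DET"), ("city", "NOUN")],
    [("Canada", "PROPN"), ("is", "VERB"), ("a", "DET"), ("large", "ADJ"), ("country", "NOUN")],
    [("the", "DET"), ("city", "NOUN"), ("is", "VERB"), ("coastal", "ADJ")],
]


def train_unigram_tagger(corpus: List[TaggedSentence]) -> Dict[str, str]:
    """Remember, for each word, the tag it was labelled with most often."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag(tokens: List[str], model: Dict[str, str], default: str = "NOUN") -> TaggedSentence:
    """Tag each token with its learned tag, falling back to a default guess."""
    return [(token, model.get(token.lower(), default)) for token in tokens]


model = train_unigram_tagger(TRAINING_DATA)
tagged = tag(["Vancouver", "is", "a", "coastal", "city"], model)
print(tagged)
```

Note that the model never “understands” the words; it only repeats the statistics of its training data, and unseen words fall back to a default guess.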
- Step 4: Text Lemmatization
This step teaches the ML model to figure out the most basic form, or lemma, of each word in a sentence. For example, the words (or more appropriately strings) “horse” and “horses” might be processed by an ML model as words with 2 completely different meanings. But in reality, this is not the case. A human being will not consider a “horse” and “horses” to be 2 different animals! Lemmatization maps both strings to the same lemma, “horse”.
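A minimal sketch of lemmatization, assuming a handful of hand-written suffix rules plus a small lookup table for irregular forms. Both the rules and the `IRREGULAR` table are illustrative; real lemmatizers usually rely on large dictionaries and on the part-of-speech tag of each word.

```python
from typing import Dict

# Irregular forms need a lookup table; these few entries are illustrative.
IRREGULAR: Dict[str, str] = {
    "is": "be",
    "was": "be",
    "are": "be",
    "mice": "mouse",
    "people": "person",
}


def lemmatize(word: str) -> str:
    lower = word.lower()
    if lower in IRREGULAR:
        return IRREGULAR[lower]
    if lower.endswith("ies") and len(lower) > 4:
        return lower[:-3] + "y"  # cities -> city
    if lower.endswith(("ches", "shes", "sses", "xes")):
        return lower[:-2]  # boxes -> box
    if lower.endswith("s") and not lower.endswith(("ss", "us", "is")):
        return lower[:-1]  # horses -> horse
    return lower


for word in ["horse", "horses", "cities", "boxes", "is", "census"]:
    print(word, "->", lemmatize(word))
```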
- Step 5: ‘Stop Words’ Identification
Next, the importance of each word in the sentence is identified. English has a lot of filler words that appear very frequently, like “and”, “the”, and “a”. When doing statistics on text, these words introduce a lot of noise since they appear far more frequently than other words. Some NLP pipelines will flag them as stop words, i.e. words that you might want to filter out before doing any statistical analysis.
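The effect of stop words on word statistics can be seen in a short sketch. The `STOP_WORDS` set below is a tiny illustrative list, not a standard one, and the example sentence is made up for the demonstration.

```python
from collections import Counter
from typing import Iterable, List

# A very small illustrative stop-word list; real pipelines ship much longer ones.
STOP_WORDS = frozenset({"a", "an", "and", "the", "is", "in", "of", "to", "it", "with"})


def remove_stop_words(tokens: Iterable[str]) -> List[str]:
    return [token for token in tokens if token.lower() not in STOP_WORDS]


tokens = (
    "Vancouver is a coastal seaport city in the province of British Columbia "
    "and the city is in Canada"
).split()

raw_counts = Counter(token.lower() for token in tokens)
content_counts = Counter(token.lower() for token in remove_stop_words(tokens))

print(raw_counts.most_common(4))      # dominated by words like "is", "in", "the"
print(content_counts.most_common(1))  # the most frequent content word
```

Whether to drop stop words depends on the task: they are noise for word-frequency statistics, but they can matter a lot for steps that rely on grammar.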
- Step 6: Dependency Parsing
In this step the model uses the grammatical rules of the English language to figure out how the words relate to one another. For example, in the first sentence of our paragraph, the ML model will identify the word “is” as the root linking the proper noun “Vancouver” and the noun “city”, hence extracting the simple meaning: Vancouver is a city!
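Training a parser is beyond a short example, but the output of dependency parsing is easy to sketch as a data structure. The parse below for “Vancouver is a city” is written by hand to match the description above, with “is” as the root; the `Token` class and the relation labels are assumptions for this illustration, and a trained parser would predict the heads and labels itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    text: str
    head: Optional[int]  # index of the parent token; None marks the root
    relation: str        # label of the edge to the parent


# Hand-written parse of "Vancouver is a city" (not produced by a model).
parse: List[Token] = [
    Token("Vancouver", 1, "subject"),
    Token("is", None, "root"),
    Token("a", 3, "determiner"),
    Token("city", 1, "attribute"),
]


def find_root(tokens: List[Token]) -> int:
    return next(index for index, token in enumerate(tokens) if token.head is None)


def children(tokens: List[Token], head_index: int, relation: str) -> List[str]:
    return [t.text for t in tokens if t.head == head_index and t.relation == relation]


root = find_root(parse)
subject = children(parse, root, "subject")
attribute = children(parse, root, "attribute")
fact = (subject[0], parse[root].text, attribute[0])
print(fact)
```

Walking the tree from the root is what lets later steps pull out simple facts such as “Vancouver is a city”.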
- Step 7: Entity Analysis
Entity analysis will go through the text and identify all of the important words or “entities” in the text. When we say “important”, what we really mean is words that have some kind of real-world semantic meaning or significance. The ML model will categorize the entities in each sentence as one of the following: Place, Organization, Person, Date, Event, Sum of money etc.
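As a rough sketch of entity analysis, the example below uses a lookup table of known names (often called a gazetteer) plus a regular expression for four-digit years. This is a swapped-in rule-based technique, not the trained model the article describes; the `GAZETTEER` entries and labels are invented for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical gazetteer: a tiny lookup of known names and their categories.
GAZETTEER = {
    "Vancouver": "Place",
    "Canada": "Place",
    "British Columbia": "Place",
    "New York City": "Place",
}
# Four-digit years from 1900 to 2099, treated here as dates.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def find_entities(text: str) -> List[Tuple[str, str]]:
    """Return (entity text, label) pairs in the order they appear in the text."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), match.group(), label))
    for match in YEAR.finditer(text):
        found.append((match.start(), match.group(), "Date"))
    return [(entity, label) for _, entity, label in sorted(found)]


text = "As the most populous city in British Columbia, Vancouver recorded 631,486 people in 2016."
entities = find_entities(text)
print(entities)
```

A lookup table can only find names it already knows, which is why trained models that use the surrounding context are preferred for this step.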
- Step 8: Pronouns Parsing
This is one of the toughest steps for the ML model to carry out. This step requires the ML model to keep track of the pronouns with respect to the context of the sentence. Simply put, we want our ML model to understand that the word “it” in a sentence is referring to, say, the place “Vancouver”. In English we use words (pronouns) such as “it”, “he” or “she” as substitutes for names of people and places. Humans can understand the meaning simply from the context of the sentence, but computers can’t. Hence an ML model needs to be fed a lot of data along with the correct tags for it to learn how to identify what each pronoun in a sentence refers to.
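To show why this step is hard, here is a deliberately naive sketch that links each pronoun to the most recently mentioned entity. The heuristic, the `PRONOUNS` set, and the `KNOWN_ENTITIES` set (standing in for the output of an entity step) are all assumptions made for this example.

```python
from typing import Dict, List, Optional

PRONOUNS = {"it", "he", "she", "they"}
KNOWN_ENTITIES = {"Vancouver", "Canada"}  # hypothetical output of an entity step


def resolve_pronouns(tokens: List[str]) -> Dict[int, Optional[str]]:
    """Map each pronoun's position to the most recently seen entity."""
    last_entity: Optional[str] = None
    resolved: Dict[int, Optional[str]] = {}
    for index, token in enumerate(tokens):
        if token in KNOWN_ENTITIES:
            last_entity = token
        elif token.lower() in PRONOUNS:
            resolved[index] = last_entity
    return resolved


easy = "Vancouver is a coastal city . It has a large port .".split()
hard = "Vancouver is in Canada . It has a large port .".split()

easy_links = resolve_pronouns(easy)
hard_links = resolve_pronouns(hard)
print(easy_links)
print(hard_links)
```

In the first sentence pair the heuristic links “It” to “Vancouver”. In the second it picks “Canada” simply because that name appeared last, even though a human reader would most likely still read “It” as Vancouver. Closing that gap is what the labelled training data is for.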
Using NLP to Analyze Human Sentiments
One of the most exciting applications of NLP is known as ‘sentiment analysis’. It is the process computers can use (through NLP) not only to understand the literal meaning of words, but also to extract the emotions behind them. Sentiment analysis, also known as ‘opinion mining’, is the automated process of understanding an opinion about a given subject from written or spoken language.
Besides identifying the opinion, these systems extract the following 3 attributes of an expression:
- Polarity: if the speaker expresses a positive or negative opinion,
- Subject: the thing that is being talked about,
- Opinion holder: the person, or entity that expresses the opinion.
As a simple example let’s consider giving the following sentence to an NLP model: “How the hell could you do this to me?”
If we were to use a basic NLP model, a computer would have no problem identifying that the sentence is in the form of a question. But that is clearly not enough information about the sentence. Any human being can easily make a connection between the words ‘hell’, ‘to me’ and the ‘?’ mark at the end of the sentence to realize that the person uttering these words is either not happy or not satisfied, or maybe even furious. This is where sentiment analysis comes in and enables the NLP model to truly understand both the literal and emotional message behind a phrase.
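One simple way to estimate polarity is a word-score lexicon, sketched below. This is a basic lexicon-based approach rather than a trained sentiment model; the scores in `LEXICON` and the one-word negation rule are invented for illustration.

```python
import re
from typing import Dict

# Hypothetical word scores; real lexicons contain thousands of scored entries.
LEXICON: Dict[str, float] = {
    "great": 1.0, "love": 1.0, "happy": 0.8, "good": 0.6,
    "bad": -0.6, "hell": -0.8, "hate": -1.0, "terrible": -1.0,
}
NEGATIONS = {"not", "never", "no"}


def polarity(text: str) -> float:
    """Sum word scores, flipping the sign of a word that directly follows a negation."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for index, word in enumerate(words):
        value = LEXICON.get(word, 0.0)
        if index > 0 and words[index - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def label(score: float) -> str:
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in [
    "How the hell could you do this to me?",
    "I love this great product",
    "I am not happy",
]:
    print(sentence, "->", label(polarity(sentence)))
```

A lexicon only covers polarity. Identifying the subject and the opinion holder requires the earlier pipeline steps, such as entity analysis and dependency parsing.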
Currently, sentiment analysis is a topic of great interest and development since it has many practical applications, such as social media monitoring, product analytics, market research and analysis, brand monitoring, and workforce analytics. Effective sentiment analysis means understanding people better and more accurately than ever before, with far-reaching implications for marketing, research, politics and security.
Sentiment analysis has moved beyond merely an interesting, high-tech whim, and will soon become an indispensable tool for all companies of the modern age. Ultimately, sentiment analysis enables us to collect new insights, better understand our customers, and empower our own teams more effectively so that they do better and more productive work.
References
Wikipedia, the free encyclopedia. [18 March, 2019]. Natural Language Processing.
Hutchins J. [November, 2005]. The History of Machine Translation in a Nutshell.
Gonfalonieri A. [21 November, 2018]. How Amazon Alexa works? Your guide to Natural Language Processing (AI).
Ashish. [3 March, 2016]. How Does Apple’s Siri Work?
Finkel J, Bethard S, Bauer J. [23 June, 2014]. The Stanford CoreNLP Natural Language Processing Toolkit.
Liddy E. Natural Language Processing. Syracuse University.
Wikipedia, the free encyclopedia. [5 April, 2019]. Vancouver.
Pang B & Lee L. Opinion Mining and Sentiment Analysis.
Lexalytics - Blogs - Applications.