Natural language processing
Natural language processing (NLP) is a subfield of AI that powers everyday applications such as digital assistants like Siri and Alexa, GPS navigation systems and predictive text on smartphones.
Earlier NLP systems combined rule-based computational linguistics with statistical methods and machine learning to understand and gather insights from social messages, reviews and other data. More recent approaches leverage neural networks and large language models (LLMs) to accomplish the tasks below.
To facilitate NLP, a number of sub-tasks are often conducted, including:
- Tokenization: Text is broken down into smaller units called tokens, such as words, subwords or sentences.
- Stemming: Words are reduced to a stem by stripping suffixes, without consulting a dictionary. For example, a suffix-stripping stemmer can reduce reading, reader and reads to “read”.
- Lemmatization: Inflected forms of a word are reduced to their dictionary form, or lemma, taking part of speech into account. For example, better and best are reduced to “good”.
- Stop word removal: Common words that carry little meaning on their own, such as prepositions and articles, are removed.
- Part-of-speech tagging: Each word is tagged with its grammatical category, such as noun, verb, adjective, adverb or pronoun.
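To make these sub-tasks concrete, here is a minimal Python sketch of tokenization, stop word removal and stemming chained together. It uses only the standard library. The stop word list, the suffix list and the function names are illustrative assumptions, and the stemmer is deliberately crude. Production systems would normally rely on an established library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "is", "are", "and"}

# Suffixes stripped by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "er", "s")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The reader is reading and reads in the library"))
# → ['read', 'read', 'read', 'library']
```

A stemmer like this only strips characters, so it cannot map “better” to “good”. That requires the dictionary lookup that lemmatization performs.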
To facilitate conversational communication with a human, NLP employs two other sub-branches called natural language understanding (NLU) and natural language generation (NLG). NLU comprises algorithms that analyze text to understand words in context, while NLG generates meaningful text as a human would. Together, they power intelligent chatbots such as ChatGPT.
Here are the main NLP techniques used in business and B2C environments.
- Text summarization: NLP algorithms scan large amounts of text and condense the information into a summary of the key points.
- Speech recognition: This technique analyzes audio data to translate it into text or map it to known words. It’s used to caption audio and has been pivotal in improving accessibility for people who are deaf or hard of hearing.
- Machine translation: Automatically translates text from one language to another so that users can benefit from foreign-language information with minimal effort. Google Translate is a good example.
- Question answering systems: NLP algorithms scan data and search for relevant information to provide answers to a user. These systems can be rules-based or based on generative pre-trained models, such as those behind ChatGPT, which draw on patterns learned from large volumes of text, including publicly available data on the internet.
- Named entity recognition: Also known as NER, this technique identifies and extracts entities such as people, locations, brands, objects and currencies.
- Semantic search: A search technique that allows a user to retrieve information by understanding the intent of the search rather than just matching keywords.
- Sentiment analysis: NLP algorithms categorize the emotions in a text to show whether it is positive, negative or neutral, and to what extent.
- Aspect-based sentiment analysis: This advanced technique analyzes sentiment for each aspect extracted from the topics in a text. This fine-grained view of market sentiment tells brands exactly where they need to improve and what’s going well.
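The difference between overall sentiment and aspect-based sentiment can be illustrated with a small lexicon-based sketch in Python. The lexicon, the aspect keywords and the function names below are hypothetical and chosen only for this example. Real systems typically use trained models rather than hand-written word lists, so treat this as an illustration of the idea, not a reference implementation.

```python
import re

# Hypothetical sentiment lexicon: word -> polarity score.
LEXICON = {"great": 1, "love": 1, "fast": 1, "slow": -1, "poor": -1, "terrible": -1}

# Hypothetical aspect keywords: word -> aspect name.
ASPECTS = {"battery": "battery", "screen": "screen",
           "delivery": "delivery", "shipping": "delivery"}


def words(text):
    """Return the lowercase word tokens in text."""
    return re.findall(r"[a-z']+", text.lower())


def score(text):
    """Sum the lexicon polarities of the words in text."""
    return sum(LEXICON.get(w, 0) for w in words(text))


def label(value):
    """Map a numeric score to a sentiment label."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def aspect_sentiment(review):
    """Score each clause and attribute it to the aspects it mentions."""
    totals = {}
    # Split on punctuation and on "but" so contrasting clauses are scored separately.
    for clause in re.split(r"[.!?;,]|\bbut\b", review.lower()):
        clause_score = score(clause)
        for w in words(clause):
            if w in ASPECTS:
                aspect = ASPECTS[w]
                totals[aspect] = totals.get(aspect, 0) + clause_score
    return {aspect: label(total) for aspect, total in totals.items()}


review = "The screen is great, but the battery is poor. Delivery was fast."
print(label(score(review)))
# → positive
print(aspect_sentiment(review))
# → {'screen': 'positive', 'battery': 'negative', 'delivery': 'positive'}
```

The overall label hides the complaint about the battery, while the per-aspect result surfaces it. That is the kind of detail the aspect-based view is meant to expose.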
Together, these NLP techniques and subtasks provide analytics about customer and brand sentiment from social data and other sources.