Natural Language Processing for SMEs
Natural language processing (NLP) is a theory-motivated range of computational techniques for the automatic analysis and representation of human languages [1].
Much mainstream NLP research relies on language models built on a popular machine learning architecture called the transformer, which now powers general-purpose applications such as automatic translation, question answering, chatbots, and search engines, to name but a few. Two main application fields of natural language processing are text mining and natural language generation. Text mining is the process of transforming unstructured text into a structured format to identify meaningful patterns and new insights. Its techniques include information retrieval, information extraction, text classification, textual entailment, and sentiment analysis. The other main application field is natural language generation, which covers text generation and, in this broad grouping, speech technologies such as text-to-speech (speech synthesis) and speech recognition.
For SMEs that do not focus specifically on the language processing business or on the general-purpose applications mentioned above, this article briefly introduces three text mining techniques, namely information extraction, text classification, and sentiment analysis, together with their underlying approaches and applications.
1. Information Extraction
One NLP technique that offers great value for SMEs is Information Extraction (IE). IE identifies a predefined set of concepts in a specific domain and ignores irrelevant information, where a domain consists of a corpus of texts together with a clearly specified information need. In other words, IE derives structured information from unstructured text [2]. In its AI Predictions 2021 report, the professional services firm PwC found that 28% of executives had prioritized using AI and machine learning for information extraction [3].
The information extraction technique discussed here differs from optical character recognition (OCR), a technology commonly used in Robotic Process Automation (RPA) to automate paperwork involving PDF documents. It also differs from Information Retrieval (IR), which selects from a collection of textual documents the subset relevant to a particular query, typically based on keyword search. Information Retrieval [4] returns a ranked list of documents, while Information Extraction pulls out of the documents salient facts about pre-specified types of entities, events, or relationships. These facts can in turn be used, for instance, to answer questions or to find relevant documents more effectively. Because IE turns unstructured text into machine-readable structured information, its output can feed further AI-based processing. Two example applications are described below: document summarization (to deal with paperwork more effectively) and media monitoring.
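As a rough illustration of the difference between IR and IE described above, the sketch below ranks a few toy documents by keyword overlap (retrieval) and then pulls two pre-specified fields out of each retrieved document (extraction). The document texts, the field names `account` and `due_date`, and the regular expressions are all assumptions made up for this example; production systems use far more robust ranking and extraction models.

```python
import re

# Hypothetical mini-corpus, invented for this illustration.
documents = {
    "notice_a.txt": "Payment of 1,250.00 GBP is due on 2021-03-31 for account 4471.",
    "newsletter.txt": "Our spring newsletter covers new products and events.",
    "notice_b.txt": "Account 9902 has a payment discrepancy; payment is due on 2021-04-15.",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, docs):
    """IR: return document names ranked by the number of query-term occurrences."""
    terms = set(tokenize(query))
    scores = {}
    for name, text in docs.items():
        score = sum(1 for token in tokenize(text) if token in terms)
        if score > 0:
            scores[name] = score
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scores, key=lambda name: (-scores[name], name))

def extract(text):
    """IE: pull pre-specified fields (account number, due date) out of one document."""
    account = re.search(r"[Aa]ccount (\d+)", text)
    due = re.search(r"due on (\d{4}-\d{2}-\d{2})", text)
    return {
        "account": account.group(1) if account else None,
        "due_date": due.group(1) if due else None,
    }

ranked = retrieve("payment due", documents)           # a ranked list of documents
facts = [extract(documents[name]) for name in ranked]  # structured records
```

Retrieval answers "which documents should I read?", while extraction answers "what do those documents say?" in a form that can be stored in a database or passed to another system.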
Summarize documents and deal with paperwork
Named-Entity Recognition (NER) is a subfield of information extraction. Applied to business documents, such techniques can identify key terms, provisions, or clauses, and can also identify clusters of documents that require similar actions. With information extraction, companies can read and summarize long documents from third parties, competitors, or internal sources efficiently, which supports agile management and decision-making.
For example, many companies have difficulty interpreting and responding to tax notices or letters issued by government revenue agencies. Tax notices, account changes, payment requests, and tax return discrepancies need to be read and interpreted; employees need to verify their accuracy, catalog them, and respond. Under manual handling, data-entry errors can occur and documents can get lost. To automate the handling of tax notices and letters, an integrated machine learning solution consisting of OCR, information extraction, text classification, and language generation can be used, bringing a 30–40% reduction in the hours spent on such processes [3].
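To make the idea of entity recognition on a tax notice more concrete, here is a minimal rule-based sketch. It is not how modern NER systems work (those are usually trained on annotated text); the three entity labels, the patterns, and the sample notice are hypothetical and only show the kind of output such a component produces.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Hypothetical hand-written patterns; a real NER model would learn these from data.
ENTITY_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b"),
    "MONEY": re.compile(r"\bGBP \d[\d,]*(?:\.\d{2})?"),
    "REFERENCE": re.compile(r"\b[A-Z]{2}\d{6}\b"),
}

def tag_entities(text):
    """Return (start_offset, label, matched_text) tuples sorted by position."""
    found = []
    for label, pattern in ENTITY_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return sorted(found)

# Sample text invented for this illustration.
notice = ("Notice TX204518: an underpayment of GBP 1,340.50 "
          "must be settled by 30 April 2021.")
entities = tag_entities(notice)
```

Each tuple pairs a label with the text it covers, so downstream steps such as cataloguing the notice or drafting a reply can work with fields rather than free text.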
Media monitoring
Another main application of information extraction is media monitoring. For many SMEs, building brand recognition is a top priority. Through media monitoring, companies can find out where potential clients discuss their products or services, which gives them a chance to engage with those clients, answer their questions, or offer solutions to their problems.
Companies can monitor keywords or hashtags such as a brand name, a product name, campaign-specific hashtags, or the names of recognisable employees. Monitoring partners' reputations, industry trends, and competitors' communication or marketing can also yield useful insights for SMEs.
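A very simple form of keyword and hashtag monitoring can be sketched as follows. The watch list, the brand and product names, and the sample posts are invented for illustration, and the posts are assumed to have already been collected from whichever platform is being monitored.

```python
import re
from collections import Counter

# Hypothetical watch list: a brand name, a product name, and a campaign hashtag.
WATCH_TERMS = ["acme bikes", "trailblazer", "#rideacme"]

def find_mentions(posts, terms):
    """Return (post_index, term) pairs for every watched term found in a post."""
    mentions = []
    for index, post in enumerate(posts):
        lowered = post.lower()
        for term in terms:
            # Match case-insensitively, but not inside a longer word or hashtag.
            pattern = r"(?<![\w#])" + re.escape(term.lower()) + r"(?!\w)"
            if re.search(pattern, lowered):
                mentions.append((index, term))
    return mentions

# Sample posts invented for this illustration.
posts = [
    "Just bought the Trailblazer from Acme Bikes, love it! #RideAcme",
    "Anyone know a good bike shop in town?",
    "Is the Trailblazer worth the price?",
]
mentions = find_mentions(posts, WATCH_TERMS)
counts = Counter(term for _, term in mentions)  # how often each term was mentioned
```

Here the third post is a question from a potential client about the product, which is exactly the kind of conversation a company might want to join.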
2. Text classification
Text classification (TC) is one of the core tasks in supervised machine learning. It covers a set of techniques for labeling natural language texts with relevant categories from a predefined set. Assigning categories to documents helps companies structure and analyze text automatically, quickly, and cost-effectively.
Text classification has wide applications in marketing, such as customer relationship management, competitor analysis, website optimization, and even search engine optimization. It can also form part of an integrated solution, classifying incoming data, whether structured or unstructured, after the information extraction step.
One application of text classification is in customer relationship management (CRM). Companies use CRM software to retain customers, automate marketing communications, and boost customer loyalty and sales conversions; text classification is one of the techniques that can create more value from a CRM database. By applying text classification to customer feedback, companies can categorize customers' attitudes towards a specific service or product and gain insight into changes in customer needs, which helps them improve their services and products. This supports customer interaction management and customer retention. Analyzing competitors' customer feedback with text classification can also provide considerable value and insight for SMEs, if the relevant data is accessible.
Companies can also use text classification to provide a better customer experience. For e-commerce companies, tagging content or products with categories is a way to improve browsing and surface related content on their websites, which improves the user experience and can increase sales. Moreover, text classification can be used to analyze the tags and keywords used by competitors, helping companies differentiate their content and achieve better search engine optimization (SEO) for their websites.
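The sketch below shows what "labeling texts with categories from a predefined set" can look like in practice, using a small multinomial Naive Bayes classifier written from scratch. Naive Bayes is chosen here only because it is short and easy to follow, not because the article prescribes it; the two categories (`delivery`, `billing`) and the training sentences are hypothetical, and a real system would need far more labeled data.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)      # documents per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {word for counts in self.word_counts.values() for word in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Score = log P(label) + sum of log P(word | label).
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labeled customer messages.
training_texts = [
    "my parcel arrived late and the box was damaged",
    "delivery took two weeks, still waiting for my parcel",
    "I was charged twice on my invoice",
    "please refund the wrong charge on my invoice",
]
training_labels = ["delivery", "delivery", "billing", "billing"]

classifier = NaiveBayesClassifier().fit(training_texts, training_labels)
prediction = classifier.predict("where is my parcel, the delivery is late")
```

The supervised part is the `fit` step: the classifier only knows the categories that appear in the labeled examples, so the predefined category set and the quality of the labels largely determine how useful the results are.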
3. Sentiment analysis
Another useful NLP technique is sentiment analysis. Sentiment analysis assigns a sentiment score to a text or categorizes it as having positive, neutral, or negative polarity. The traditional lexicon-based method uses a list of words annotated with polarity scores to compute an overall score for a given text [5]. This method does not require any training data; however, words that are missing from the sentiment lexicon cannot contribute to the final score, which can distort it. In contrast, the machine learning based method treats the task as a two- or three-class classification problem, which can be solved by the same algorithms used for text classification, and these algorithms have become the most popular solution for sentiment analysis.
Sentiment analysis classifies the sentiment of information from internal or external data sources, such as news media, social media, or CRM databases, allowing companies to track changes in customer attitudes over time. It can help companies prioritize the key pain points to solve for their customers and respond to urgent issues in real time. Sentiment analysis can also be used to monitor how a new product is perceived by the public, for instance on social media. These insights can help businesses connect with customers and improve processes and user experiences.
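The lexicon-based method can be sketched in a few lines. The tiny lexicon, its scores, and the one-word negation rule below are assumptions made for this example; published sentiment lexicons are much larger and are scored more carefully.

```python
import re

# Tiny hypothetical lexicon: word -> polarity score.
SENTIMENT_LEXICON = {
    "great": 2, "love": 2, "good": 1, "helpful": 1,
    "slow": -1, "bad": -1, "broken": -2, "terrible": -2,
}
# Assumed simplification: a negation flips only the word right after it.
NEGATIONS = {"not", "never", "no"}

def sentiment_score(text):
    """Sum the lexicon scores of the words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for position, token in enumerate(tokens):
        value = SENTIMENT_LEXICON.get(token, 0)  # unknown words contribute nothing
        if position > 0 and tokens[position - 1] in NEGATIONS:
            value = -value
        score += value
    return score

def polarity(text):
    """Map the numeric score to a positive, neutral, or negative label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The support team was great and very helpful"))
print(polarity("Delivery was not good and the box arrived broken"))
print(polarity("Absolutely dreadful service"))
```

The last call shows the limitation mentioned above: "dreadful" is not in this lexicon, so a clearly negative review is labeled neutral.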
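Tracking customer attitudes over time can be as simple as averaging sentiment scores per month. The sketch below assumes each piece of feedback has already been scored between -1 and 1 by some sentiment model; the records and the alert threshold of -0.25 are hypothetical values chosen for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (date received, sentiment score between -1 and 1).
scored_feedback = [
    (date(2021, 1, 5), 0.6),
    (date(2021, 1, 20), 0.2),
    (date(2021, 2, 3), -0.4),
    (date(2021, 2, 17), -0.8),
    (date(2021, 3, 9), 0.1),
]

def monthly_average(records):
    """Group scores by (year, month) and return the mean per month, in date order."""
    buckets = defaultdict(list)
    for received, score in records:
        buckets[(received.year, received.month)].append(score)
    return {month: sum(scores) / len(scores) for month, scores in sorted(buckets.items())}

def months_needing_attention(averages, threshold=-0.25):
    """Return the months whose average sentiment is below the assumed alert threshold."""
    return [month for month, average in averages.items() if average < threshold]

averages = monthly_average(scored_feedback)
alerts = months_needing_attention(averages)
```

In this toy data the February average falls below the threshold, which would be the signal to look into what customers were complaining about that month.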
Summary
This article gave a brief outline of several NLP techniques and example applications. For SMEs, information extraction can help you deal with tax notices, letters, and long documents; it also helps you gather useful information from social media, including monitoring your partners, competitors, and industry trends. Text classification can be applied to customer relationship management, competitor analysis, website optimization, and search engine optimization. Sentiment analysis can help you prioritize which issues to solve and respond to clients quickly.
How can we help?
The NLP solutions and applications presented here are far from exhaustive, and many more challenges can be addressed using NLP. SOLVD can help you evaluate suitable NLP applications and techniques for your business, or collaborate with you on further NLP research and development. Please email us at solvd@wlv.ac.uk if you wish to be notified of future digital technologies events for SMEs, or to get in touch with our experts.
Blog by Ming Jing Yao, SOLVD project academic team member at The University of Wolverhampton.
References
[1] E. Cambria and B. White, "Jumping NLP Curves: A Review of Natural Language Processing Research [Review Article]," in IEEE Computational Intelligence Magazine, vol. 9, no. 2, pp. 48-57, May 2014, doi: 10.1109/MCI.2014.2307227.
[2] Piskorski, J., & Yangarber, R. (2013). Information extraction: Past, present and future. In Multi-source, multilingual information extraction and summarization (pp. 23-49). Springer, Berlin, Heidelberg.
[4] Singhal, A. (2001). Modern information retrieval: A brief overview. IEEE Data Eng. Bull., 24(4), 35-43.
[5] Fang, X., Zhan, J. Sentiment analysis using product review data. Journal of Big Data 2, 5 (2015). https://doi.org/10.1186/s40537-015-0015-2
[6] https://www.ibm.com/cloud/learn/text-mining#toc-text-minin-sFfDd5uU
[7] https://brand24.com/blog/how-to-do-media-monitoring/
[8] https://www.kdnuggets.com/2018/03/5-things-sentiment-analysis-classification.html
[10] https://www.agilitypr.com/media-monitoring-ultimate-guide/
[11] https://www.europeanbusinessreview.com/natural-language-processing-nlp-applications-in-business/
[12] https://www.brainkart.com/article/Customer-Relationship-Management-(CRM)-Structures_6048/
[13] https://towardsdatascience.com/text-classification-applications-and-use-cases-beab4bfe2e62
[14] Ikonomakis, M., Kotsiantis, S., & Tampakas, V. (2005). Text classification using machine learning techniques. WSEAS transactions on computers, 4(8), 966-974.