What is automatic text summarization?
Text summarization is one of the complex tasks in Natural Language Processing (NLP). It should produce a shorter version of a text and preserve the meaning and key ideas of the original text.
It involves several aspects of semantic and cognitive processing. We can divide summarization into two approaches: extractive and abstractive. Extractive summarization builds a summary from phrases or sentences in the source document: a subset of sentences that represent the most important points is pulled from the text and combined. For instance, when we produce an automatic summary of legal documents, extraction is preferable because it avoids any interpretation. Abstractive summarization expresses the ideas of the source document in different words; advanced deep learning techniques are applied to paraphrase and shorten the original document. This method is preferred for news documents, where summaries should be short, informative, and catchy.
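To make the extractive idea concrete, here is a minimal sketch in Python. It assumes a very simple scoring rule (average word frequency per sentence), a naive sentence splitter, and a tiny hypothetical stopword list; none of these come from a particular production system, and the function names are illustrative only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it",
             "that", "for", "on", "was", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Every sentence in the output is copied verbatim from the input, which is the property that makes extraction attractive for legal text.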
Could you give us a brief history of it?
Traditional NLP methods for text summarization, such as scoring based on term frequency (TF), inverse document frequency (IDF), and cosine similarity, are extractive and have been prevalent for a long time, with origins in the 1950s. They focus on estimating the importance of each sentence and the relationships between sentences rather than on understanding the context.
On the other hand, abstractive summarization is all about understanding the content of the text and then providing a summary based on that understanding. In traditional NLP approaches, this involves more complex linguistic models, because it creates new sentences using template-based summarization.
In the last few years, with the arrival of more modern NLP methods such as neural word embeddings (for example word2vec) and deep learning approaches such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks, interaction with machines through natural language and machine learning has enjoyed a lot of success. These modern approaches have become the go-to methods for automatic summarization because they capture the semantics of text. They have been highly successful thanks to improvements in computing and data storage.
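As a rough illustration of the TF, IDF, and cosine similarity scoring mentioned above, the sketch below treats each sentence as a small document, builds a TF-IDF vector for it, and ranks sentences by their average cosine similarity to the others. The smoothed IDF formula and the "centrality" ranking rule are assumptions chosen for brevity, not a reconstruction of any specific historical system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(sentences):
    """Build one sparse TF-IDF vector (a dict) per sentence."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF: one common variant, used here as an assumption.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_centrality(sentences):
    """Sort (score, sentence) pairs by mean similarity to the other sentences."""
    vectors = tfidf_vectors(sentences)
    scored = []
    for i, u in enumerate(vectors):
        sims = [cosine(u, v) for j, v in enumerate(vectors) if j != i]
        scored.append((sum(sims) / len(sims) if sims else 0.0, sentences[i]))
    return sorted(scored, reverse=True)
```

A sentence that shares no vocabulary with the rest of the text gets a score of zero and ends up at the bottom of the ranking, so it would be the first to be dropped from an extractive summary.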
What is the current state-of-the-art?
We can apply automatic summarization, often in combination with other techniques, to many tasks and applications. I can give you a couple of examples.
Update Summarization – this is a method for producing an update summary of a set of topic-related documents. The summarizer assumes that the reader’s prior knowledge is determined by a set of older documents on the same topic. When producing an update summary, the system has to decide which information is novel and which is redundant, that is, already contained in the set of earlier documents. These decisions are very important for maintaining a high update value. For example, we can apply this technique to news reports where there is an update to an older news story.
Network Activity Summarization – Monitoring and analyzing the rich and continuous flow of user-generated social network content can yield unprecedentedly valuable information which would not have been available from traditional media outlets. Summarization can play a key role in semantic analysis of social media and social media analytics.
Data summarization is an important concept in data mining for finding a compact representation of a dataset. In network activity summarization, we are given a network and a collection of activities and the goal is to find k shortest paths that summarize these activities. For example, we can generate the summary from Trending Twitter hashtags.
Opinion Summarization – The volume of data being collected by different apps is huge and keeps increasing. There is no sign of data collection slowing down. The need for efficient processing of this extensive information has resulted in increasing research in knowledge engineering. The main category in this field is opinion summarization. Sentiment analysis is a broad area that includes opinion mining, sentiment classification, and opinion summarization. Opinion summarization is the process of automatically summarizing many opinions that are related to the same topic. For example, we could collect reviews for a product and produce a summary of its strengths and weaknesses from the customers’ points of view.
Event Summarization – One line of event summarization research focuses on episode mining or frequent pattern discovery. These methods simply output a number of patterns that are independent of each other, so they fail to provide a brief and comprehensible event summary revealing the big picture the dataset embodies. Event summarization is more interesting as it covers every dimension of a particular event being mined. It could focus on a specific event, such as the launch of a new product, and find the relationship between the online information and product reviews to predict the stock market.
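Going back to update summarization, the novel-versus-redundant decision can be sketched as a simple overlap filter. This is a minimal illustration that assumes Jaccard similarity over word sets and a hypothetical threshold of 0.5; real update summarizers use richer representations, and the function names here are made up for the example.

```python
import re


def word_set(sentence):
    """Set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def novel_sentences(old_sentences, new_sentences, threshold=0.5):
    """Keep new sentences that overlap little with everything seen so far."""
    seen = [word_set(s) for s in old_sentences]
    kept = []
    for sentence in new_sentences:
        words = word_set(sentence)
        if all(jaccard(words, prior) < threshold for prior in seen):
            kept.append(sentence)
        # Remember the sentence either way, so later repeats of it
        # are also treated as redundant.
        seen.append(words)
    return kept
```

With an older report saying "The storm hit the coast on Monday." and a newer one containing a near-repeat of that sentence plus a sentence about power being restored, only the second sentence would pass the filter and reach the update summary.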
What are the key challenges?
For every NLP system, building applications using techniques suitable for an audience and making sure problems are solved is challenging. In summarization, finding redundant information is very important, as unnecessary storage of data already mined wastes time and memory. In some techniques like update summarization, detecting redundant data is key for good scoring. Detecting data of value or novel data is also key for the success of any model.
Once the key ideas are derived, presentation order is critical too. The summary has to read fluently and include all the selected information in order to be readable and meaningful. Once a meaningful summary is developed, the next challenge is to get the length right. The length of a summary should be based on the task at hand. For instance, a title generator or a headline predictor should produce a short summary, whereas an abstract-style summary is at least half a page long. Adapting different techniques and understanding the audience’s needs is a huge challenge.
How can we tell if an AI or machine learning product works as advertised? What are things to look for?
We can evaluate automatic summarization with intrinsic and extrinsic metrics.
Intrinsic – Look at the output itself: which important sentences the tool selected and how it summarized them, compared with a human summary. This can be measured by how many edits the generated sentences need before they read like sentences a human would write. ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, can be used to measure this. It is a set of metrics and a software package used for evaluating automatic summarization. The metrics compare the word sequences in an automatically produced summary against a human-produced reference or set of references.
Extrinsic – Based on the task, we check whether an appropriate, contextually meaningful summary has been produced. For example, for a task where a legal summary had to be generated from the long text of a judgment, we run a survey asking whether a legal expert considers the selection of phrases valid, or whether experts could answer the important questions by reading only the summary, without referring to the original text.
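For the intrinsic side, the core of ROUGE-N can be sketched in a few lines: count how many of the reference's n-grams also appear in the candidate summary. This is a simplified illustration, assuming lowercase regex tokenization and a single reference; the official ROUGE package offers more options (stemming, multiple references, precision and F-scores) that are left out here.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Multiset of word n-grams in a text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate."""
    ref = ngrams(reference, n)
    cand = ngrams(candidate, n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match at the candidate's count so repeats are not over-credited.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For the reference "the cat sat on the mat" and the candidate "the cat lay on the mat", five of the six reference unigrams are matched, so ROUGE-1 recall is 5/6, while three of the five reference bigrams are matched, giving a ROUGE-2 recall of 0.6.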
Where is it heading?
Huge volumes of text are produced every day, but few of us have time to read them. This data needs to be processed and put to the best use. Machines will take over these tasks and do them quickly. Collecting all the information overnight and giving an overall summary will be an easy job for the machine. With more work, this summarization can be made more accurate than human summarization: it never misses any data of value and, when combined with machine translation, is not limited by language barriers.
Thank you, Anna!