Automatic text summarization is a field that has attracted a great deal of interest and research lately because of its applications (automatically summarizing research papers, long reports, entire books, web pages, news, etc.).
There are two approaches to automatic text summarization. The extractive approach is the easier one: it builds a “collage” of sentences extracted from the original document. The abstractive approach, by contrast, aims to produce a shorter version of the original text using different words and sentences, rephrasing it the way a human would.
Abstractive summarization is a difficult task because it implies a sort of semantic “understanding” of the text, something that borders on the objectives of strong AI and that researchers have not yet fully achieved.
Deep learning for automatic summarization
Deep learning technologies have proved very promising for this particular task, because they try to mimic the way the human brain works, handling several levels of abstraction and non-linearly transforming an input into an output (in this process the output of one layer becomes the input of the next layer, and so on). The more layers there are, the deeper the network. Deep neural networks are widely used in NLP problems because their architecture works well with the complex structure of language: for example, each layer can handle a different task and then hand its output to the next layer.
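The idea of stacked layers, each one applying a non-linear transformation and passing its output on to the next, can be sketched in a few lines of Python. This is only a toy illustration, not a summarization model: the function names, the tanh non-linearity, and the weights below are all assumptions chosen for the example.

```python
import math


def dense_layer(inputs, weights, biases):
    """One layer: a weighted sum per unit, followed by a non-linear tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation


# Hypothetical toy network: 3 inputs -> 2 hidden units -> 1 output.
# The weights are made up; a real network would learn them from data.
toy_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0, 3.0], toy_layers)
```

Adding more entries to toy_layers makes the network deeper, which is all that "depth" means in this sketch.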
Usually when you think of summarizing something you think of articles, reports, scientific papers… texts that are not overly long.
But sometimes this is not the case and you have to deal with different kinds of texts: studies, thorough reports, even entire books. In these cases you have to adopt a different strategy and decide whether to summarize in the traditional way or with the aid of a machine.
Also, you should always keep in mind that summarizing is a very different activity from paraphrasing (stating something using different words), so with long texts you should also understand when to summarize and when you would rather paraphrase.
Long texts: paraphrasing or summarizing?
When you paraphrase, you state the same thing found in the original text but using different words. You choose to paraphrase when the meaning of the text is as important as the wording, when you want to express the same thing in simpler, more direct language, or when you want to insert that text into a longer text written in a different style.
Of course when you paraphrase you don’t aim to shorten the text, only to convey the same content using different words. So it’s not something you can do when brevity is important: that is when to summarize.
The summary provides a brief outline of a text, a condensed version of it that does not result in a loss of relevant information.
It’s obvious that when you deal with long texts such as books you must choose to summarize. Let’s see how to summarize long texts and books.
Summarizing long texts: choosing a strategy
The main difficulty when you summarize long texts or books is organizing the information effectively. You may choose among different strategies:
- Read the entire book or text, taking notes as you go, and then sum it up in a single step, writing a text that condenses the book’s contents in your own words
- Read chapter by chapter or paragraph by paragraph and write a mini summary of each one, then use this material to compose a general summary
- Read the book quickly to gather the most important idea or ideas the book revolves around. Then draw a conceptual map for the various sections or chapters to show how they connect to the main idea. Use this material to put together your summary.
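The second strategy (mini summaries per chapter, then a general summary) also maps naturally onto code. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the sentence splitter is naive, and the default summarizer simply keeps the first sentence of each chapter as a stand-in for whatever method you actually use.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def lead_summary(text, max_sentences=1):
    """Placeholder summarizer (an assumption): keep the first sentences."""
    return " ".join(split_sentences(text)[:max_sentences])


def summarize_by_chapter(chapters, summarize=lead_summary):
    """Summarize each chapter, then join the mini summaries in order."""
    mini_summaries = [summarize(chapter) for chapter in chapters]
    return " ".join(s for s in mini_summaries if s)


chapters = [
    "Whales are mammals. They breathe air through lungs.",
    "Whales migrate every year. Some travel thousands of miles.",
]
book_summary = summarize_by_chapter(chapters)
# book_summary == "Whales are mammals. Whales migrate every year."
```

Here the general summary is just the mini summaries joined together; for a real book you would probably condense that material once more, by hand or by passing it through the summarizer again.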
Remember that you don’t have to follow the book’s logical or chronological order: you can start with the most important topic or idea and then go into further detail, or you could group topics, or you could follow any order (chronological, narrative, logical) that allows you to sum up the book or the text without loss of relevant information.
Remember also that a summary is not a review: NEVER insert your ideas, personal opinions, or arguments into a summary. A summary should be neutral and impartial.
Benefits of summarizing
But you might ask: is this work worth the effort? Why bother?
There are several reasons why summarizing can be extremely useful for you and your business:
- It allows you to go deep into the text’s subject matter and to fix ideas and concepts in your mind
- It helps you organize and clarify your ideas about the book and structure your thinking
- If you sum up books or texts that others are interested in, it can build an audience of readers and help you build connections and expand your visibility
If you need to go fast…
As you can imagine, summarizing a long text or, even worse, a book is a cumbersome and time-consuming task that requires concentration and considerable effort. If you have to summarize more than one book, the situation is even more complicated. In that case you can turn to technology, in particular the most recent developments in natural language processing and machine learning.
Automatic summarization is an area of research that has made remarkable progress in the last few years: from being able to deal only with short texts, it is now mature enough to process longer and longer pieces of writing, even entire books.
There are many different algorithms, based on diverse approaches; for summarizing long texts, the ones that currently perform best are based on the extractive approach (extracting the most relevant sentences from the text and then compiling a summary by linking them together).
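To give a feeling for what "extracting the most relevant sentences" can mean in practice, here is a minimal sketch of one classic extractive technique: scoring sentences by word frequency. It is only one of many possible approaches, not a state-of-the-art method; the function names, the tiny stop-word list, and the scoring formula are assumptions made for illustration.

```python
import re
from collections import Counter

# Small hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "it", "that", "on"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words, with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Keep the sentences whose words are most frequent in the whole text."""
    sentences = split_sentences(text)
    frequencies = Counter(word for s in sentences for word in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(frequencies[w] for w in words) / len(words) if words else 0.0

    # Rank by score, then restore the original order of the chosen sentences.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


summary = extractive_summary("Cats sleep a lot. Cats hunt mice. Dogs bark.")
# summary == "Cats sleep a lot. Cats hunt mice."
```

Every sentence in the result is copied verbatim from the input, which is exactly what distinguishes the extractive approach from the abstractive one.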