Study in BUPT

An Introduction to Automatic Summarization

Speakers: Marina Litvak, Natalia Vanetik; Location: Teaching Building 3, Room 912; Start time: 2019-09-16 09:00; End time: 2019-09-16 11:30

Lecture content:

(1) HEvAS: Headline Evaluation System

Automatic headline generation is a sub-task of one-line summarization with many reported applications. Evaluating headline-generation systems is a challenging and underdeveloped area. We introduce the Headline Evaluation and Analysis System (HEvAS), which automatically evaluates systems in terms of the quality of the headlines they generate. HEvAS provides two types of metrics: one that measures the informativeness/relevancy of a headline, and another that measures its readability. The evaluation results can be compared with those of the baseline methods implemented in HEvAS. The system also performs statistical analysis of the evaluation results and provides charts that visualize them.

(2) Extractive summarization with MDL

We describe an approach to extractive summarization based on the Minimum Description Length (MDL) principle and relying on the Krimp dataset compression algorithm. We represent text as a transactional dataset, with sentences as transactions and normalized words as items, and then describe the dataset by frequent itemsets of different types that provide the best compressed representation. The summary is compiled from the sentences that best describe the document. The problem of extractive summarization is thereby reduced to the maximal coverage problem, under the assumption that a summary that best describes the original text should cover most of the itemsets describing the document.
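The reduction to maximal coverage can be illustrated with a small sketch. It is not the speakers' system and it does not implement Krimp: as a simplifying assumption, the "describing" itemsets are just the word sets that occur in at least `min_support` sentences, rather than the MDL-selected code table. The function names (`tokenize`, `frequent_itemsets`, `greedy_summary`) and parameters are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(sentence):
    """Turn a sentence into a set of lowercase word items.

    Assumption: lowercasing stands in for real normalization
    (no stemming, no stop-word removal).
    """
    return frozenset(re.findall(r"[a-z]+", sentence.lower()))


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Return word sets of size 1..max_size found in >= min_support sentences.

    Assumption: plain support counting replaces Krimp's MDL-based selection.
    """
    counts = Counter()
    for items in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[frozenset(combo)] += 1
    return {itemset for itemset, n in counts.items() if n >= min_support}


def greedy_summary(sentences, k=2, min_support=2):
    """Greedily pick up to k sentences that cover the most itemsets."""
    transactions = [tokenize(s) for s in sentences]
    itemsets = frequent_itemsets(transactions, min_support)
    # A sentence covers an itemset when it contains all of its words.
    covers = [{s for s in itemsets if s <= t} for t in transactions]

    chosen, covered = [], set()
    for _ in range(min(k, len(sentences))):
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        best = max(candidates, key=lambda i: len(covers[i] - covered))
        if not covers[best] - covered:
            break  # no remaining sentence adds new coverage
        chosen.append(best)
        covered |= covers[best]
    # Keep the selected sentences in document order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = [
        "The cat sat on the mat.",
        "The cat chased the mouse.",
        "Stocks fell sharply today.",
        "The mouse hid under the mat.",
    ]
    for line in greedy_summary(doc, k=2):
        print(line)
```

Greedy selection is the standard approximation for maximal coverage; the sentence about stocks shares no frequent itemset with the rest of the toy document, so it contributes no coverage and is never selected.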

Included: system demo

About the speakers: