A Scoring Model Assisted by Frequency for Multi-Document Summarization

Yue YU*, Mutong Wu, Weifeng Su*, Yiu-ming CHEUNG

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

Abstract

While position information plays a significant role in sentence scoring for single-document summarization, the repetition of content across documents greatly affects the salience scores of sentences in multi-document summarization. Introducing frequency information can help identify important sentences that are generally overlooked when only position information is considered. Therefore, in this paper, we propose a scoring model, SAFA (Self-Attention with Frequency Graph), which combines position information with frequency to identify the salience of sentences. The SAFA model constructs a frequency graph at the multi-document level based on the repetition of content among sentences, and assigns an initial score to each sentence based on the graph. The model then uses position-aware gold scores to train a self-attention mechanism, obtaining the significance of each sentence at the single-document level. The score of each sentence is then updated by combining position and frequency information. We train and test the SAFA model on the large-scale multi-document dataset Multi-News. Extensive experimental results show that the model incorporating frequency information in sentence scoring outperforms other state-of-the-art extractive models.
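The abstract does not say how the frequency graph is built or how initial scores are derived from it, so the following is only an illustrative sketch of the general idea and not the SAFA implementation. It assumes that "repetition of content" can be approximated by Jaccard word overlap between sentences from different documents, and that a sentence's initial score is its summed overlap with sentences in other documents, normalized by the maximum. The function names, the `threshold` parameter, and the tokenization are all hypothetical.

```python
from itertools import combinations


def tokenize(sentence):
    """Lowercase, split on whitespace, strip basic punctuation (assumed preprocessing)."""
    return {w.strip(".,;:!?\"'()").lower() for w in sentence.split()} - {""}


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def frequency_scores(documents, threshold=0.3):
    """Assign each sentence an initial score from cross-document repetition.

    documents: list of documents, each a list of sentence strings.
    Returns {(doc_index, sent_index): score}, with scores scaled to [0, 1].
    The graph is implicit: an edge links two sentences from different
    documents when their overlap reaches `threshold` (a hypothetical cutoff).
    """
    nodes = [
        (d, s, tokenize(text))
        for d, doc in enumerate(documents)
        for s, text in enumerate(doc)
    ]
    scores = {(d, s): 0.0 for d, s, _ in nodes}

    for (d1, s1, t1), (d2, s2, t2) in combinations(nodes, 2):
        if d1 == d2:
            continue  # only repetition across documents counts here
        sim = jaccard(t1, t2)
        if sim >= threshold:
            scores[(d1, s1)] += sim
            scores[(d2, s2)] += sim

    top = max(scores.values(), default=0.0)
    if top > 0:
        scores = {k: v / top for k, v in scores.items()}
    return scores
```

In this sketch, a sentence repeated across several documents accumulates a high score regardless of where it appears in its own document, which is the kind of signal the abstract says position information alone tends to miss. Combining these scores with position-aware scores from a trained self-attention model is left out, since the abstract gives no details of that step.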
Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2021
Subtitle of host publication: 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, September 14–17, 2021, Proceedings, Part V
Editors: Igor Farkaš, Paolo Masulli, Sebastian Otte, Stefan Wermter
Publisher: Springer Cham
Pages: 309-320
Number of pages: 12
Edition: 1st
ISBN (Electronic): 9783030863838
ISBN (Print): 9783030863821
Publication status: Published - 7 Sep 2021
Event: 30th International Conference on Artificial Neural Networks, ICANN 2021 - Bratislava, Slovakia
Duration: 14 Sep 2021 → 17 Sep 2021

Publication series

Name: Lecture Notes in Computer Science
Volume: 12895
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: Theoretical Computer Science and General Issues
Volume: 12895
Name: ICANN: International Conference on Artificial Neural Networks

Conference

Conference: 30th International Conference on Artificial Neural Networks, ICANN 2021
Country/Territory: Slovakia
City: Bratislava
Period: 14/09/21 → 17/09/21

User-Defined Keywords

  • Multiple document summarization
  • Position information
  • Frequency
  • Graph
