A Computational Measure for the Semantic Readability of Segmented Texts

Valentino Santucci*, Umberto Bartoccini, Paolo Mengoni, Fabio Zanda

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference contribution › peer-review

Abstract

In this paper we introduce a computational procedure for measuring the semantic readability of a segmented text. The procedure consists of three main steps. First, natural language processing tools and unsupervised machine learning techniques are used to obtain a vector representation of each section or segment of the input text, so that similar or semantically related segments are modeled by nearby points in a vector space. Second, the shortest and longest Hamiltonian paths passing through these points are computed. Lastly, the lengths of these two paths and the length of the path given by the original ordering of the segments are combined into an arithmetic expression to derive an index, which may be used to gauge the semantic difficulty a reader is expected to experience when reading the text. A preliminary experimental study is conducted on seven classic narrative texts written in English, obtained from Project Gutenberg. The experimental results appear to be in line with our expectations.
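The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract names neither the vectorization tools nor the exact arithmetic expression, so the sketch assumes that segment vectors are already available as plain coordinate tuples, uses Euclidean distance, finds the extreme Hamiltonian paths by brute force (feasible only for a handful of segments), and combines the three lengths with a hypothetical min-max normalization. The function names and the direction of the index (higher meaning a less coherent ordering) are likewise assumptions.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Sum of Euclidean distances between consecutive points visited in `order`."""
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def hamiltonian_extremes(points):
    """Brute-force the shortest and longest Hamiltonian path lengths.

    Enumerates every ordering of the points, so it is only usable for
    very small inputs.
    """
    lengths = [path_length(points, p) for p in permutations(range(len(points)))]
    return min(lengths), max(lengths)


def readability_index(points):
    """Hypothetical index (an assumption, not the paper's formula).

    Returns 0.0 when the original ordering is a shortest Hamiltonian path
    and 1.0 when it is a longest one.
    """
    original = path_length(points, list(range(len(points))))
    shortest, longest = hamiltonian_extremes(points)
    if longest == shortest:  # all orderings equally long
        return 0.0
    return (original - shortest) / (longest - shortest)


# Toy 2-D "segment embeddings": four segments drifting steadily along one axis.
segments = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
# The same segments read in a jumbled order.
shuffled = [segments[i] for i in (0, 3, 1, 2)]

print(readability_index(segments))
print(readability_index(shuffled))
```

In this toy case the in-order text already follows a shortest path, so the assumed index is 0.0, while the jumbled ordering scores 0.75. For realistic numbers of segments, exhaustive enumeration would have to be replaced by a heuristic or exact path solver.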

Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2022 Workshops
Subtitle of host publication: Malaga, Spain, July 4–7, 2022, Proceedings, Part I
Editors: Osvaldo Gervasi, Beniamino Murgante, Sanjay Misra, Ana Maria A. C. Rocha, Chiara Garau
Publisher: Springer Cham
Pages: 107-119
Number of pages: 13
Edition: 1st
ISBN (Electronic): 9783031105364
ISBN (Print): 9783031105357
DOIs
Publication status: Published - 23 Jul 2022
Event: 22nd International Conference on Computational Science and Its Applications, ICCSA 2022 - Malaga, Spain
Duration: 4 Jul 2022 → 7 Jul 2022
https://link.springer.com/book/10.1007/978-3-031-10536-4 (Conference proceedings)

Publication series

Name: Lecture Notes in Computer Science
Volume: 13377
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
Name: ICCSA: International Conference on Computational Science and Its Applications

Conference

Conference: 22nd International Conference on Computational Science and Its Applications, ICCSA 2022
Country/Territory: Spain
City: Malaga
Period: 4/07/22 → 7/07/22
Internet address

Scopus Subject Areas

  • Theoretical Computer Science
  • Computer Science (all)

User-Defined Keywords

  • Hamiltonian path
  • Natural Language Processing
  • Semantic readability of texts
  • Unsupervised machine learning
