Clustering facebook for biased context extraction

Valentina Franzoni*, Yuanxi Li, Paolo Mengoni, Alfredo Milani

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

15 Citations (Scopus)


Facebook comments and shared posts often convey human biases, which play a pivotal role in information spreading and content consumption, where short pieces of information can be quickly consumed and later ruminated on. Such bias nevertheless lies at the basis of human-generated content, and being able to extract contexts that do not amplify but represent such bias can be relevant to data mining and artificial intelligence, because this bias is what shapes the opinion of users through social media. Starting from the observation that a separation into topic clusters, i.e. sub-contexts, spontaneously occurs when content is evaluated by human common sense, especially in particular domains, e.g. politics and technology, this work introduces a process for automated context extraction by means of a class of path-based semantic similarity measures which, using third-party knowledge, e.g. WordNet and Wikipedia, can create a bag of words relating to relevant concepts present in Facebook comments on topic-related posts, thus reflecting the collective knowledge of a community of users. It is thus easy to create human-readable views, e.g. word clouds, or structured information readable by machines for further learning or content explanation, e.g. by augmenting information with the time stamps of posts and comments. Experimental evidence, obtained in the domain of information security and technology over a sample of 9M3k page users, where previous comments serve as a use case for forthcoming users, shows that a simple clustering on a frequency-based bag of words can identify the main context words contained in Facebook comments that are identifiable by human common sense. Group similarity measures, which are of great interest for many application domains since they can be used to evaluate the similarity of objects in terms of the similarity of the associated sets, can then be calculated on the extracted context words to reflect the collective notion of semantic similarity, providing additional insights on which to reason, e.g. in terms of cognitive factors and behavioral patterns.
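
To make the pipeline described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a frequency ranking as a stand-in for the clustering step, a tiny hand-made taxonomy (`TOY_PARENT`) in place of WordNet or Wikipedia, a common path-based similarity of the form 1 / (1 + path length), and a best-match average as one possible group similarity. The stop-word list and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and", "in", "for", "on", "my"}

def bag_of_words(comments):
    """Count how often each non-stop-word token occurs across all comments."""
    counts = Counter()
    for text in comments:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts

def context_words(comments, top_k=5):
    """Return the top_k most frequent words as candidate context words."""
    return [word for word, _ in bag_of_words(comments).most_common(top_k)]

# Toy taxonomy (child -> parent); an assumption standing in for WordNet.
TOY_PARENT = {
    "password": "credential",
    "token": "credential",
    "credential": "security",
    "malware": "threat",
    "phishing": "threat",
    "threat": "security",
    "security": "entity",
}

def ancestors(word):
    """Path from the word up to the taxonomy root, the word itself first."""
    path = [word]
    while path[-1] in TOY_PARENT:
        path.append(TOY_PARENT[path[-1]])
    return path

def path_similarity(a, b):
    """1 / (1 + number of edges between a and b via their lowest common ancestor)."""
    path_a, path_b = ancestors(a), ancestors(b)
    depth_b = {node: i for i, node in enumerate(path_b)}
    for i, node in enumerate(path_a):
        if node in depth_b:
            return 1.0 / (1 + i + depth_b[node])
    return 0.0  # no common ancestor in the toy taxonomy

def group_similarity(set_a, set_b):
    """Average, over both sets, of each word's best match in the other set."""
    if not set_a or not set_b:
        return 0.0
    best_a = [max(path_similarity(a, b) for b in set_b) for a in set_a]
    best_b = [max(path_similarity(a, b) for a in set_a) for b in set_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))
```

In this sketch, `context_words` would be applied to the comments of one topic-related post, and `group_similarity` would then compare the context-word sets extracted from two posts or two time windows.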

Original language: English
Title of host publication: 17th International Conference on Computational Science and Its Applications (ICCSA 2017)
Editors: Beniamino Murgante, Bernady O. Apduhan, Giuseppe Borruso, Elena Stankova, Osvaldo Gervasi, Sanjay Misra, David Taniar, Ana Maria A.C. Rocha, Alfredo Cuzzocrea, Carmelo M. Torre
Publisher: Springer Verlag
Number of pages: 13
ISBN (Print): 9783319623917
Publication status: Published - 6 Jul 2017
Event: 17th International Conference on Computational Science and Its Applications, ICCSA 2017 - Trieste, Italy
Duration: 3 Jul 2017 to 6 Jul 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 17th International Conference on Computational Science and Its Applications, ICCSA 2017

Scopus Subject Areas

  • Theoretical Computer Science
  • Computer Science (all)

User-Defined Keywords

  • Artificial intelligence
  • Collective knowledge
  • Data mining
  • Knowledge discovery
  • Semantic distance
  • Word similarity

