TY - GEN
T1 - Efficient Graph-Based Author Disambiguation by Topological Similarity in DBLP
AU - Franzoni, Valentina
AU - Lepri, Michele
AU - Li, Yuanxi
AU - Milani, Alfredo
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/9/26
Y1 - 2018/9/26
AB - In this work, we introduce a novel method for entity resolution (author disambiguation) in bibliographic networks. The method is based on a two-step network traversal that uses topological similarity measures to rate candidate nodes. Topological similarity is widely used in the Link Prediction application domain to assess the likelihood of an unknown link. A similarity function can be a good approximation of equality and can therefore be used to disambiguate, based on the hypothesis that authors with many common co-authors are similar. Our method has been tested on a graph-based representation of the public DBLP Computer Science database. The results obtained are extremely encouraging regarding Precision, Accuracy, and Specificity. Further advantages are the locality of the method for disambiguation assessment, which avoids the need to know the global network, and the use of only a small amount of data, e.g., author name and paper title (i.e., co-authorship data).
KW - Rewards margin
KW - Termination condition
KW - disambiguation
KW - entity resolution
KW - semantic networks
KW - topological similarity
KW - link prediction
KW - bibliometrics
KW - graph-based database
KW - Neo4j
KW - NoSQL Database
KW - co-authorship network
KW - social network analysis
UR - http://www.scopus.com/inward/record.url?scp=85058225352&partnerID=8YFLogxK
U2 - 10.1109/AIKE.2018.00054
DO - 10.1109/AIKE.2018.00054
M3 - Conference proceeding
AN - SCOPUS:85058225352
SN - 9781538695562
T3 - Proceedings - IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE
SP - 239
EP - 243
BT - Proceedings - 2018 1st IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2018
PB - IEEE
T2 - 1st IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2018
Y2 - 26 September 2018 through 28 September 2018
ER -