Snippet Comment Generation Based on Code Context Expansion

Hanyang Guo, Xiangping Chen, Yuan Huang*, Yanlin Wang*, Xi Ding, Zibin Zheng*, Xiaocong Zhou, Hong Ning Dai

*Corresponding author for this work

Research output: Contribution to journal, Journal article, peer-review

2 Citations (Scopus)

Abstract

Code commenting plays an important role in program comprehension, and automatic comment generation helps improve software maintenance efficiency. The comments that annotate a method are mainly header comments and snippet comments. A header comment describes the functionality of the entire method, providing a general comment at the beginning of the method. Snippet comments appear at multiple code segments in the body of a method, where each such segment is called a code snippet. Both kinds of comment help developers quickly understand code semantics, thereby improving code readability and maintainability. However, existing automatic comment generation models focus mainly on header comments, because public datasets are available to validate their performance. By contrast, it is challenging to collect datasets for snippet comments, because it is difficult to determine their scope. Even worse, code snippets are often too short to capture complete syntactic and semantic information. To address this challenge, we propose a novel Snippet Comment Generation approach called SCGen. First, we utilize the context of the code snippet to expand its syntactic and semantic information. Specifically, we collect 600,243 snippet code-comment pairs from 959 Java projects. Then, we capture variables from the code snippets and extract variable-related statements from the context. After that, we devise an algorithm to parse and traverse the abstract syntax tree (AST) information of the code snippets and their corresponding context. Finally, SCGen generates snippet comments by feeding the source code snippet and the corresponding AST information into a sequence-to-sequence-based model. We conducted extensive experiments on the collected dataset to evaluate SCGen. Our approach obtains 18.23 in BLEU-4, 18.83 in METEOR, and 23.65 in ROUGE-L, outperforming state-of-the-art comment generation models.
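
The context-expansion idea in the abstract (capture the variables used in a snippet, then pull variable-related statements from the surrounding method) can be illustrated with a minimal sketch. This is not SCGen's implementation: it assumes a method body already split into one statement per string, and it approximates variable detection with a regular expression instead of the AST parsing the abstract describes. The names `JAVA_KEYWORDS`, `identifiers`, and `expand_context` are hypothetical and chosen only for this example.

```python
import re

# Assumption: a deliberately small keyword list; a real tool would use a Java parser.
JAVA_KEYWORDS = {
    "int", "long", "double", "float", "boolean", "char", "byte", "short",
    "void", "if", "else", "for", "while", "do", "return", "new", "null",
    "true", "false", "break", "continue", "final", "static", "this",
}

# An identifier that is not preceded by a word character or '.', and is not
# followed by '(' (which would make it a method call).
IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*\b(?!\s*\()")
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def identifiers(statement):
    """Return a rough set of variable names appearing in one Java statement."""
    code = STRING_LITERAL.sub('""', statement)  # ignore text inside string literals
    return {
        name
        for name in IDENTIFIER.findall(code)
        if name not in JAVA_KEYWORDS and not name[0].isupper()  # skip type names
    }


def expand_context(statements, start, end):
    """Collect variables in statements[start:end] and the context statements
    (outside that range) that mention at least one of them."""
    variables = set()
    for statement in statements[start:end]:
        variables |= identifiers(statement)
    context = [
        statement
        for index, statement in enumerate(statements)
        if not (start <= index < end) and identifiers(statement) & variables
    ]
    return variables, context


method_body = [
    "int total = 0;",
    "List<String> names = loadNames();",
    "int limit = 10;",
    "for (int i = 0; i < limit; i++) {",
    "    total += i;",
    "}",
    "log(names);",
]

# Treat the loop (indices 3 to 5) as the snippet to be commented.
variables, context = expand_context(method_body, 3, 6)
print(sorted(variables))
print(context)
```

In this toy method body, the declarations of `total` and `limit` are selected as context for the loop, while the statements that only concern `names` are left out. A regex heuristic of this kind misclassifies many real-world cases (fields, lambdas, generics), which is one reason an AST-based traversal, as described in the abstract, is the more robust choice.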

Original language: English
Article number: 24
Number of pages: 30
Journal: ACM Transactions on Software Engineering and Methodology
Volume: 33
Issue number: 1
Early online date: 23 Nov 2023
Publication status: Published - Jan 2024

Scopus Subject Areas

  • Software

User-Defined Keywords

  • code summarization
  • contextual information
  • neural machine translation
  • Snippet comment generation
