Code summarization

Introduction

Code summarization methods automatically generate natural-language descriptions of source code and can thereby document programs. High-level summaries provided alongside code snippets help developers better comprehend source code. These summaries are also useful in other applications, such as code search. Summarization is a non-trivial problem in natural language processing; fortunately, source code is highly structured. Recently, researchers have tried to exploit this rich structure by using code-specific features such as abstract syntax trees and control flow.
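To make the idea of "structure" concrete, here is a minimal sketch using Python's standard `ast` module. It is only an illustration and is not taken from any of the papers listed below: the snippet `SOURCE` and the helper names `node_types` and `code_summary_pairs` are hypothetical. It shows the two ingredients mentioned above: the abstract syntax tree of a snippet, and a (code, summary) pair in which the docstring plays the role of the human-written summary.

```python
import ast
from typing import List, Tuple

# Hypothetical example snippet; the docstring stands in for a reference summary.
SOURCE = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''


def node_types(source: str) -> List[str]:
    """Return the AST node type names found in the source (order not guaranteed)."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


def code_summary_pairs(source: str) -> List[Tuple[str, str]]:
    """Collect (function name, docstring) pairs for every documented function."""
    tree = ast.parse(source)
    pairs = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            doc = ast.get_docstring(node)
            if doc:  # skip functions without a docstring
                pairs.append((node.name, doc))
    return pairs


if __name__ == "__main__":
    print(sorted(set(node_types(SOURCE))))
    print(code_summary_pairs(SOURCE))
```

A summarization model would be trained to produce the docstring from the function body. Structure-aware approaches feed the model tree nodes such as `FunctionDef`, `Return`, and `BinOp` rather than a flat token sequence.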

Reading material

The papers to be discussed in this session are [1] and [2].

Bibliography

[1]

P. Fernandes, M. Allamanis, and M. Brockschmidt, “Structured neural summarization,” arXiv preprint arXiv:1811.01824, 2018.

[2]

S. Iyer, I. Konstas, A. Cheung, and L. Zettlemoyer, “Summarizing source code using a neural attention model,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, pp. 2073–2083.

[3]

J. Fowkes, P. Chanthirasegaran, R. Ranca, M. Allamanis, M. Lapata, and C. Sutton, “Autofolding for source code summarization,” IEEE Trans. Softw. Eng., vol. 43, no. 12, pp. 1095–1109, Dec. 2017.

[4]

M. Allamanis, H. Peng, and C. Sutton, “A convolutional attention network for extreme summarization of source code,” in International Conference on Machine Learning, 2016, pp. 2091–2100.

[5]

U. Alon, S. Brody, O. Levy, and E. Yahav, “code2seq: Generating sequences from structured representations of code,” arXiv preprint arXiv:1808.01400, 2018.

[6]

Y. Wan et al., “Improving automatic source code summarization via deep reinforcement learning,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018, pp. 397–407.

[7]

S. Xu, S. Zhang, W. Wang, X. Cao, C. Guo, and J. Xu, “Method name suggestion with hierarchical attention networks,” in Proceedings of the 2019 ACM SIGPLAN Workshop on Partial Evaluation and Program Manipulation, 2019, pp. 10–21.

Copyright

The course contents are copyrighted (c) 2018,2019,2020 - onwards by TU Delft and their respective authors and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

Maliheh Izadi

06 October 2022
