The papers to read for this session are [1] and [2].


[1] V. Markovtsev and E. Kant, “Topic modeling of public repositories at scale using names in source code,” arXiv preprint arXiv:1704.00135, 2017.

[2] B. Gelman, B. Hoyle, J. Moore, J. Saxe, and D. Slater, “A language-agnostic model for semantic source code labeling,” in Proceedings of the 1st International Workshop on Machine Learning and Software Engineering in Symbiosis, 2018, pp. 36–44.

[3] W. E. Zhang, Q. Z. Sheng, E. Abebe, M. A. Babar, and A. Zhou, “Mining source code topics through topic model and words embedding,” in International Conference on Advanced Data Mining and Applications, 2016, pp. 664–676.

[4] Z. Chen and M. Monperrus, “A literature study of embeddings on source code,” arXiv preprint arXiv:1904.03061, 2019.

[5] Z. Zhang and Q. Lu, “NLP4SE: Evaluation of natural language processing techniques for software engineering tasks.”