During the last few years, GitHub has emerged as a popular platform for project hosting, mirroring, and collaboration. GitHub provides an extensive REST API, which enables researchers to retrieve both the commits to the projects’ repositories and the events generated through user actions on project resources. The GHTorrent project created an infrastructure to collect and process GitHub's event stream and all data linked from it. The purpose of this project is to perform socio-technical analysis on the data available from GitHub. Ideas include:

  • Project communities: How are project ecosystems being formed? How do they evolve?

  • Followers and Watchers: Under what circumstances do project watchers become project members? Do pull requests help in team formation?

  • Socio-technical congruence: Does team congruence translate to better software quality?
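As a starting point for questions like these, the sketch below shows one way to pull a page of repository events from GitHub's REST API and summarize it. It is a minimal illustration, not part of GHTorrent: the function names are hypothetical, the request is unauthenticated (and therefore heavily rate-limited), and the watcher-to-contributor check ignores event ordering and only sees the events contained in the page that was fetched.

```python
import json
import urllib.request
from collections import Counter

API_ROOT = "https://api.github.com"

# Event types treated here as "contributions" (an assumption made for this sketch).
CONTRIBUTION_TYPES = {"PushEvent", "PullRequestEvent"}


def fetch_repo_events(owner, repo, per_page=30):
    """Fetch one page of recent public events for a repository (hypothetical helper)."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/events?per_page={per_page}"
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "events-sketch",  # GitHub rejects requests without a User-Agent
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_event_types(events):
    """Count how often each event type occurs in a list of event objects."""
    return Counter(event["type"] for event in events)


def watchers_who_contributed(events):
    """Return logins that appear both as watchers and as contributors.

    Ordering is ignored, so this does not prove that watching came first.
    """
    watchers = {e["actor"]["login"] for e in events if e["type"] == "WatchEvent"}
    contributors = {
        e["actor"]["login"] for e in events if e["type"] in CONTRIBUTION_TYPES
    }
    return watchers & contributors


# Example usage (requires network access; owner and repo are placeholders):
#   events = fetch_repo_events("some-owner", "some-repo")
#   print(count_event_types(events))
#   print(watchers_who_contributed(events))
```

For analysis at any real scale, a pre-collected dataset such as the one GHTorrent provides is a better fit than live API calls, since the events endpoint only exposes a limited window of recent activity.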


  1. G. Gousios and D. Spinellis, “GHTorrent: GitHub’s Data from a Firehose,” in MSR ’12: Proceedings of the 9th Working Conference on Mining Software Repositories, 2012, pp. 12–21.
