Eclipse DataEggs

The Eclipse DataEggs project has been archived and is no longer maintained.

I developed and maintained Eclipse DataEggs, a project that provides datasets related to the development of Eclipse projects, mainly for software practitioners and researchers.

Datasets

The datasets include various pieces of data retrieved from the Eclipse forge: mailing lists, project development data, and AERI stacktraces, all in CSV and JSON formats. Each dataset comes with R Markdown documents that describe its content and give hints about how to use it. The examples provided mainly use the R statistical analysis software.
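Although the project's own examples use R, the CSV files can be read with any language. The following is a minimal Python sketch of loading a CSV dataset and counting rows per category. The sample data, the column names (`list`, `sender`, `sent_at`), and the helper functions are hypothetical and only illustrate the approach; they do not reflect the actual schema of the published datasets.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample standing in for a downloaded CSV dataset.
# The column names are assumptions, not the real DataEggs schema.
SAMPLE_CSV = """list,sender,sent_at
dev-list,a1b2c3,2018-01-03
dev-list,d4e5f6,2018-01-04
user-list,a1b2c3,2018-01-05
"""


def load_csv_rows(handle):
    """Read a CSV file object into a list of dicts keyed by header."""
    return list(csv.DictReader(handle))


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_csv_rows(io.StringIO(SAMPLE_CSV))
per_list = count_by(rows, "list")

# Print the counts as JSON, with keys sorted for stable output.
print(json.dumps(per_list, sort_keys=True))
```

To work with a real file, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with `open(path, newline="")`, where `path` points to a downloaded CSV file.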

The datasets provided include:

Privacy has been a major concern from the beginning. Once extracted, data is anonymised using data-anonymiser and published in the downloads section of the project. See our documentation for more details.

All data related to projects is retrieved from the Eclipse Alambic instance at https://eclipse.alambic.io. Alambic is an open-source framework for development data extraction and processing. For more information, see https://alambic.io.

Screenshots