Python scripts for investigating Open Source Hardware GitHub repositories

  • Jeremy Bonvoisin (Creator)

Dataset

Description

This dataset contains Python scripts that apply repository mining and social network analysis (SNA) techniques to investigate the transparency and workload distribution of open source hardware (OSH) product development projects hosted on GitHub. Starting from a list of projects and references to their corresponding repositories, the scripts extract file versioning metadata from the GitHub API and compute GraphML graphs depicting the full commit history of each project. Three types of graphs are computed: commit graphs, file co-edition graphs and file change graphs. The scripts then compute SNA indicators (size, centrality and clustering index) to characterize the topology of the file co-edition graphs. Finally, they apply k-means clustering to these indicators in order to identify different types of projects based on the topology of their co-edition graphs. These scripts were developed and applied to 105 OSH product development projects as part of a study published in the following open-access article: Bonvoisin, J., Buchert, T., Preidel, M., Stark, R. 2018. “How participative is open source hardware? Insights from online repository mining”. Design Science, 4, E19. doi:10.1017/dsj.2018.15
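
The following is a minimal sketch of the kind of computation described above, not the code contained in this dataset. It assumes that a file co-edition graph links two contributors whenever they have edited the same file, and it uses degree centrality and the average local clustering coefficient as stand-ins for the centrality and clustering indicators, which the description does not specify further. The commit records, contributor names, file names and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit records: (contributor, files touched in that commit).
# In practice these would come from file versioning metadata (assumption).
commits = [
    ("alice", ["frame.scad", "README.md"]),
    ("bob", ["frame.scad"]),
    ("carol", ["README.md", "bom.csv"]),
    ("dave", ["bom.csv"]),
    ("alice", ["bom.csv"]),
]


def build_coedition_graph(commits):
    """Return an adjacency dict linking contributors who edited a common file.

    Assumption: nodes are contributors, edges mean "edited the same file".
    """
    editors = defaultdict(set)  # file path -> contributors who edited it
    for author, files in commits:
        for path in files:
            editors[path].add(author)
    graph = {author: set() for author, _ in commits}
    for authors in editors.values():
        for a, b in combinations(sorted(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes each node is connected to."""
    n = len(graph)
    return {
        node: (len(nbrs) / (n - 1) if n > 1 else 0.0)
        for node, nbrs in graph.items()
    }


def local_clustering(graph, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


graph = build_coedition_graph(commits)
size = len(graph)  # number of contributors in the graph
centrality = degree_centrality(graph)
avg_clustering = sum(local_clustering(graph, n) for n in graph) / size
```

Computing one such indicator vector per project would give the input to a clustering step such as k-means. A graph library such as NetworkX could replace the hand-written functions and also export the graph to GraphML.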
Date made available: 27 Mar 2018
Publisher: Zenodo
