*Initial proof-of-concept of data-mining backend for an open source development status dashboard*
+
***Demonstrator** data-mining backend for an open source development status dashboard*
Targeted at hosters of version control platforms (such as [Wikifactory](https://wikifactory.com/), [GitLab](https://gitlab.com/), or [GitHub](https://github.com/)), this Python backend program mines open source hardware repositories for metadata and calculates metrics based on it. This backend exposes a representational state transfer ([REST](https://en.wikipedia.org/wiki/Representational_state_transfer)) application programming interface ([API](https://en.wikipedia.org/wiki/Web_API)) where requests for those metrics can be made.
***This software is not for general consumers to just "double click" on and install on their devices***.
-
**Please see the [Install](#install) and [Usage](#usage) sections to get up and running with this tool**. For more details on its background and design considerations, please see the [Background](#background), ~~[Design notes](#design-notes), and [Future work](#future-work) sections. There is also a detailed [step-by-step walkthrough](docs/usage-example.md).~~
+
**Please see the [Install](#install) and [Usage](#usage) sections to get up and running with this tool**.
+
## Table of Contents
- [OSD status dashboard _(wp2.2\_dev)_](#osd-status-dashboard-wp22_dev)
@@ -23,13 +24,12 @@ Targeted at hosters of version control platforms (such as [Wikifactory](https://
@@ -41,18 +41,12 @@ Targeted at hosters of version control platforms (such as [Wikifactory](https://
[OPENNEXT](https://opennext.eu/) is a collaboration between 19 industry and academic partners across Europe. Funded by the [European Union](https://europa.eu/)'s [Horizon 2020](https://ec.europa.eu/programmes/horizon2020/) programme, this project seeks to enable small and medium enterprises (SMEs) to work with consumers, makers, and other communities in rethinking how products are designed and produced. [Open source hardware](https://www.oshwa.org/definition/) is a key enabler of this goal where the design of a physical product is released with the freedoms for anyone to study, modify, share, and redistribute copies. These essential freedoms are based on those of [open source software](https://opensource.org/osd), which is itself derived from [free software](https://www.gnu.org/philosophy/free-sw.en.html) where the word free refers to freedom, *not* free-of-charge. When put in practice, these freedoms could potentially not only reduce proprietary vendor lock-in, planned obsolescence, or waste but also stimulate novel – even disruptive – business models. The SME partners in OPENNEXT are experimenting with producing open source hardware and even opening up the development process to wider community participation. They produce diverse products ranging from [desks](https://stykka.com/), [cargo bike modules](http://www.xyzcargo.com/), to a [digital scientific instrument platform](https://pslab.io/) (and [more](https://opennext.eu/project-team/#sme)).
-
Work package 2 of OPENNEXT is gathering theoretical and practical insights on best practices for company-community collaboration when developing open source hardware. This includes running [Delphi studies](https://www.edelphi.org/) to develop a maturity model to describe the collaboration and developing a precise definition for what the "source" is in open source hardware. In particular, task 2.2 in this work package is developing a project status dashboard with "health" indicators showing the evolution of a project within the maturity model; design activities; or progress towards success based on project goals.
-
-
~~To that end, the month 18 deliverable for task 2.2 is focused on establishing the underlying "behind the scenes" infrastructure to mine data about open source hardware projects from version control repositories that they are hosted on (`osmine`). The Python scripts in this repository currently query the public [application programming interfaces](https://en.wikipedia.org/wiki/API) (APIs) of [GitHub](https://www.github.com/) and [Wikifactory](https://www.wikifactory.com/). Both platforms host version control repositories with the latter having a focus on supporting open source hardware projects. There is also a barebones proof-of-concept user-facing demonstration dashboard (`osdash`) which computes core metrics from the mined data and presents interactive visualisations. This dashboard is only to show that the underlying data could be displayed, and is not meant to confer immediate usefulness at this time.~~
+
Work package 2 (WP2) of OPENNEXT is gathering theoretical and practical insights on best practices for company-community collaboration when developing open source hardware. This includes running [Delphi studies](https://www.edelphi.org/) to develop a maturity model to describe the collaboration and developing a precise definition for what the "source" is in open source hardware. In particular, task 2.2 in this work package is developing a demonstration project status dashboard with "health" indicators showing the evolution of a project within the maturity model; design activities; or progress towards success based on project goals. Details of the dashboard's technical architecture are described in the deliverable 2.5 (D2.5) report.
-
To be clear, this deliverable ***is***: Designed to be deployed on a server operated by version control platforms such as Wikifactory or GitHub.
+
This repository contains the backend code for D2.5. To be clear, this deliverable ***is***: Designed to be deployed on a server operated by version control platforms such as Wikifactory or GitHub.
This deliverable ***is not***: For general end-users to install on consumer devices and "double click" to open.
-
There is other excellent open source software for open source project analytics and data visualisation, with [Grimoirelab](https://chaoss.github.io/grimoirelab/) being a prime example. However, the full Grimoirelab pipeline requires a full server stack necessitating advanced skills in heavy-duty (but potentially complicated) web technologies such as [Kibana](https://www.elastic.co/products/kibana) or [Elasticsearch](https://www.elastic.co/products/elasticsearch). This project aims to create a lighter, more focused solution needing only the use of Python.
-
-
This documentation aims to demonstrate practices that facilitate design reuse, including of this repository. In addition to the [Install](#install) and [Usage](#usage) sections that increase reproducibility, ~~[Design notes](#design-notes) and [Future work](#future-work) communicate the thought process and lessons-learned while developing the dashboard. Together, they constitute an intangible body of "know-how" that is very often undocumented. For example, motivations for the internal data model or the approach to compressing data at the end of the section [Internal data structure](#internal-data-structure) which reduces disk usage are of practical benefit. But "snippets" of practical experience like these are seldom recorded.~~
-
In addition, this repository aims to follow international standards and good practices in open source development such as, but not limited to:
* [SPDX 3](https://spdx.dev/) compliance with a [LICENSE](./LICENSE) file (also see [License](#license) section)
@@ -64,7 +58,7 @@ In addition, this repository aims to follow international standards and good pra
## Install
-
This section assumes knowledge of Python, Git, and using a GNU/Linux-based server including installing software from package managers and running a terminal session.
+
This section assumes knowledge of [Python](https://www.python.org/), [Git](https://git-scm.com/), and using a GNU/Linux-based server including installing software from package managers and running a terminal session.
**Note:** This software is designed to be deployed on a server by system administrators or developers, not on generic consumer devices.
@@ -86,7 +80,7 @@ A [GitHub personal access token](https://docs.github.com/en/github/authenticatin
### Running from source
-
The code can be run from source and has been tested on updated versions of GNU/Linux server operating systems including [Red Hat Enterprise Linux](https://redhat.com/en/technologies/linux-platforms/enterprise-linux) 8.5. While effort has been made to keep the Python scripts platform-agnostic, they have not been tested under other operating systems such as [BSD](https://en.wikipedia.org/wiki/Berkeley_Software_Distribution)-derivatives, [Apple macOS](https://www.apple.com/macos/) or [Microsoft Windows](https://www.microsoft.com/windows/) as they are rarely used for hosting code such as this (especially the latter two).
+
The code can be run from source and has been tested on updated versions of GNU/Linux server operating systems including [Red Hat Enterprise Linux](https://redhat.com/en/technologies/linux-platforms/enterprise-linux) 8.5. While effort has been made to keep the Python scripts platform-agnostic, they have not been tested under other operating systems such as [BSD](https://en.wikipedia.org/wiki/Berkeley_Software_Distribution)-derivatives, [Apple macOS](https://www.apple.com/macos/) or [Microsoft Windows](https://www.microsoft.com/windows/), as these (especially the latter two) are rarely used for hosting code such as this.
On your server, with the tools [`git`](https://git-scm.com/) and [`pip`](https://pip.pypa.io/) installed, run the following commands in a terminal session to retrieve the latest version of this repository and prepare it for development and running locally (usually for testing):
@@ -113,7 +107,7 @@ This means the server API is up an running, and should be accessible on your loc
### Deploy as container
-
There is a [`Dockerfile`](./Dockerfile) in this repository that defines a [container](https://en.wikipedia.org/wiki/OS-level_virtualization) within which this program can run.
+
There is a [`Dockerfile`](./Dockerfile) in this repository that defines a [container](https://en.wikipedia.org/wiki/OS-level_virtualization) within which this code can run.
To build and use the container, you need to have programs like [Podman](https://podman.io) or [Docker](https://en.wikipedia.org/wiki/Docker_(software)) installed.
Where `token` is the 40 character alphanumeric string of your GitHub API personal access token. It is in the form of "ghp_2D5TYFikFsQ4U9KPfzHyvigMycePCPqkPgWc".
-
#### Heroku example
+
#### Heroku deployment example
The image built this way can be pushed to cloud hosting providers such as [Heroku](https://www.heroku.com/). With Heroku as an example:
@@ -165,11 +159,11 @@ A demo of this is hosted on Heroku with this API endpoint:
https://wp22dev.herokuapp.com/data
```
-
This demo instance will go into a sleep state after a period of inactivity. If an API call to this endpoint takes more than a few seconds, the demo may be waking from that state.
+
This demo instance will go into a sleep state after a period of inactivity (approximately 30 minutes at time of writing). If an API call to this endpoint takes more than a few seconds, the demo may be waking from that state.
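One way a client could cope with a sleeping demo instance is to retry with a more generous timeout. The following is only a sketch under assumptions: the helper name `call_with_wake_retry`, the timeout values, and the idea of passing in a `send` callable are all hypothetical and not part of this repository.

```python
import time


def call_with_wake_retry(send, timeouts=(10.0, 60.0), pause=0.0):
    """Call `send(timeout)` with increasingly generous timeouts.

    `send` is any callable that performs the request and raises OSError
    on failure (TimeoutError and urllib.error.URLError are both
    subclasses of OSError). The timeout values are illustrative guesses.
    """
    if not timeouts:
        raise ValueError("at least one timeout is required")
    last_error = None
    for timeout in timeouts:
        try:
            return send(timeout)
        except OSError as error:
            # The instance may still be waking up; try the next timeout.
            last_error = error
            time.sleep(pause)
    raise last_error


# Possible wiring with the standard library (performs a network call),
# where `request` is an already prepared urllib.request.Request:
# body = call_with_wake_retry(
#     lambda t: urllib.request.urlopen(request, timeout=t).read()
# )
```

Keeping the network call behind a callable makes the retry logic easy to exercise without contacting the demo server.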
#### Fly.io example
-
Similar to Heroku, the container image created above can be deployed to an app on [Fly.io](https://fly.io/). Assuming an account has already been created:
+
Similar to Heroku, the container image created above can be deployed to an app on [Fly.io](https://fly.io/). Assuming a Fly.io account has already been created:
1. Log in to Fly.io in a terminal session:
@@ -211,21 +205,21 @@ Where `token` is the 40 character alphanumeric string of your GitHub API persona
## Usage
-
The backend server listens to requests for information about a list of open source hardware (and software) repositories hosted on Wikifactory or GitHub. The GitHub backend is a placeholder for now, but the Wikifactory backend is now accessible.
+
The backend server listens to requests for information about a list of open source hardware (and software) repositories hosted on Wikifactory or GitHub.
### Making requests to the REST API
[GET requests](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods) to the API are sent to the `/data` endpoint with a [JSON](https://www.json.org/json-en.html) payload as the request body.
There are two components to each request:
-
1. `repo_urls`: An array of strings of repository [URL](https://en.wikipedia.org/wiki/URL)s, such as `https://wikifactory.com/+elektricworks/pikon-telescope`. Currently, metadata retrieval for Wikifactory project URLs is implemented. Each URL is composed of the Wikifactory domain (`wikifactory.com`), space (e.g. `+elektricworks`), and project (e.g. `pikon-telescope`).
+
1. `repo_urls`: An array of strings of repository [URL](https://en.wikipedia.org/wiki/URL)s, such as `https://wikifactory.com/+elektricworks/pikon-telescope`. Currently, metadata retrieval is implemented for Wikifactory project and GitHub repository URLs. Each Wikifactory URL is composed of the Wikifactory domain (`wikifactory.com`), space (e.g. `+elektricworks`), and project (e.g. `pikon-telescope`).
2. `requested_data`: An array of strings representing the types of repository metrics desired for each repository. Currently, the following are implemented for Wikifactory projects:
1. `files_info`: The numbers and proportions of mechanical and electronic computer-assisted design (CAD), image, data, document, and other file types in the repository.
2. `files_editability`: Basic information about how "editable" the CAD files are in this repository.
3. `license`: The license for the repository.
4. `tags`: Aggregated tags for the repository and any associated with the maintainers of that repository.
-
5. `commits_level`: The hash identifier (contribution `id` for Wikifactory projects) and timestamp of each commit to the repository. This can be used to graph the commit activity level in a frontend visualisation. **Note:** This will be based on commits from the first three detected branches in the repository, including the default branch. This is because requesting commits across many branches takes a long time, and APIs might time out.
+
5. `commits_level`: The hash identifier (contribution `id` for Wikifactory projects) and timestamp of each commit to the repository. This can be used to graph the commit activity level in a frontend visualisation. **Note:** This will be based on commits from the first three detected branches in the repository, including the default branch. This is because requesting commits across many branches takes a long time, and APIs might time out. Also note that Wikifactory does not implement branches, so a Wikifactory project behaves as if it has only one branch.
6. `issues_level`: Similar to `commits_level`, but for all issues in the repository.
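As a rough illustration of the two request components described above, the following Python sketch assembles a request body and wraps it in a GET request using only the standard library. The function name `build_data_request` and the base URL `http://localhost:8000` are assumptions made for this example and are not taken from the repository; adjust them to match your deployment. The sketch only constructs the request and does not send it.

```python
import json
import urllib.request


def build_data_request(base_url, repo_urls, requested_data):
    """Build (but do not send) a GET request to the /data endpoint.

    `base_url` is a placeholder for wherever the backend is deployed.
    """
    payload = {
        "repo_urls": list(repo_urls),
        "requested_data": list(requested_data),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/data",
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",
    )


request = build_data_request(
    "http://localhost:8000",  # assumed address, not from the repository
    ["https://wikifactory.com/+elektricworks/pikon-telescope"],
    ["files_info", "license", "commits_level"],
)
# To send it: urllib.request.urlopen(request, timeout=60)
```

Setting `method="GET"` explicitly matters here, because `urllib.request.Request` otherwise defaults to POST whenever a body is supplied.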
The following is an example request that could be sent to the API for three Wikifactory projects:
@@ -332,15 +326,11 @@ By default, this tool will:
1. Identify a repository URL provided in the JSON request body as a Wikifactory project if it is under the domain `wikifactory.com`
2. Use the public Wikifactory GraphQL API endpoint at `https://wikifactory.com/api/graphql`
-
Both can be customised with the following environment variables:
+
Both can be customised with the following environment variables during deployment:
1. `WIF_BASE_URL` - (default: `wikifactory.com`) The base domain, in the form of `example.com`, used for pattern-matching and identifying Wikifactory project URLs in the JSON request body. If this is customised, then the requested Wikifactory project URLs passed to this tool should also use that domain instead of `wikifactory.com`. Otherwise, a "Repository URL domain not supported" error will be returned.
2. `WIF_API_URL` - (default: `https://wikifactory.com/api/graphql`) The full URL, in the form of `https://example.com[:port]/foo/bar`, of the GraphQL API endpoint used for queries regarding Wikifactory projects.
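The way these two variables and their defaults interact can be sketched in Python as follows. This is an illustration only: the function names `load_wikifactory_settings` and `is_wikifactory_url` are hypothetical, and the exact-hostname comparison is an assumed simplification of whatever pattern-matching the tool actually performs.

```python
import os
from urllib.parse import urlparse

# Defaults as documented above.
DEFAULT_WIF_BASE_URL = "wikifactory.com"
DEFAULT_WIF_API_URL = "https://wikifactory.com/api/graphql"


def load_wikifactory_settings(environ=os.environ):
    """Return (base_domain, api_url), falling back to the defaults."""
    base_domain = environ.get("WIF_BASE_URL", DEFAULT_WIF_BASE_URL)
    api_url = environ.get("WIF_API_URL", DEFAULT_WIF_API_URL)
    return base_domain, api_url


def is_wikifactory_url(repo_url, base_domain):
    """Assumed check: the URL's host must equal the configured domain."""
    return urlparse(repo_url).hostname == base_domain


# With WIF_BASE_URL customised, only URLs under that domain are recognised.
base_domain, api_url = load_wikifactory_settings({"WIF_BASE_URL": "example.com"})
recognised = is_wikifactory_url("https://example.com/+space/project", base_domain)
```

Under this sketch, a request for `https://wikifactory.com/...` sent to a deployment with a customised `WIF_BASE_URL` would not be recognised, which corresponds to the "Repository URL domain not supported" case described above.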
-
## Design notes
-
-
[to be updated]
-
## Maintainers
Dr Pen-Yuan Hsing ([@penyuan](https://github.com/penyuan)) is the current maintainer.
@@ -363,6 +353,7 @@ The maintainer would like to gratefully acknowledge:
* OPENNEXT internal reviewers Dr Jean-François Boujut ([@boujut](https://github.com/boujut)) and Martin Häuer ([@moedn](https://github.com/moedn)) for constructive criticism.
* OPENNEXT project researchers Robert Mies ([@MIE5R0](https://github.com/MIE5R0)), Mehera Hassan ([@meherahassan](https://github.com/meherahassan)), and Sonika Gogineni ([@GoSFhg](https://github.com/GoSFhg)) for useful feedback and extensive administrative support.
* The Linux Foundation [CHAOSS](https://chaoss.community/) group for insights on open source community health metrics.
+
* The following people for their valuable feedback via a survey (see D2.5 report for details) (in alphabetical order of last name): Jean-François Boujut ([@boujut](https://github.com/boujut)), Martin Häuer ([@moedn](https://github.com/moedn)), James Jones (CubeSpawn), Max Kampik ([@mkampik](https://github.com/mkampik)), Johannes Střelka-Petz.