pyOpenSci Infrastructure Overview
This page will help you understand how we collect and process peer review and contributor data to:
Highlight pyOpenSci contributors
Track our peer review process
Showcase peer-reviewed Python packages
How it works
We use a Python package called pyosMeta to extract and transform contributor and peer review data into machine-readable formats (.yml and .csv).
This data allows us to automatically update our website and our peer review metrics dashboard with up-to-date contributor and review information, directly from GitHub.
Data collection and processing
We collect two types of data from GitHub:
Contributor data: parsed from the All Contributors bot config files found in each pyOpenSci repo.
Peer review submission data: extracted from issues in the software-submission repo, including:
package name and repo URL
editor and reviewers
maintainers and authors
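To make the peer review fields listed above more concrete, here is a minimal sketch of pulling them out of an issue body. The issue text, the field labels (`Package Name`, `Repository Link`, `Editor`, `Reviewer 1`), and the helper names `parse_issue_header` and `collect_reviewers` are all assumptions made for illustration; they are not taken from pyosMeta or from the real submission template.

```python
import re

# Hypothetical issue body. The field labels are illustrative assumptions,
# not a guaranteed match for the real software-submission template.
ISSUE_BODY = """\
Package Name: examplepkg
Repository Link: https://github.com/example-org/examplepkg
Editor: @editor-handle
Reviewer 1: @reviewer-one
Reviewer 2: @reviewer-two
"""

# Match "Key: value" lines; the key ends at the first colon on the line,
# so values such as URLs may contain further colons.
FIELD_PATTERN = re.compile(
    r"^(?P<key>[^:\n]+):[ \t]*(?P<value>.+)$", re.MULTILINE
)


def parse_issue_header(body: str) -> dict[str, str]:
    """Return a dict of normalized field names to raw string values."""
    fields = {}
    for match in FIELD_PATTERN.finditer(body):
        key = match.group("key").strip().lower().replace(" ", "_")
        fields[key] = match.group("value").strip()
    return fields


def collect_reviewers(fields: dict[str, str]) -> list[str]:
    """Gather every 'reviewer_*' field as a GitHub handle without the '@'."""
    return [
        value.lstrip("@")
        for key, value in sorted(fields.items())
        if key.startswith("reviewer")
    ]


review = parse_issue_header(ISSUE_BODY)
print(review["package_name"])
print(collect_reviewers(review))
```

A real parser would also need to handle missing fields, Markdown formatting, and issues that do not follow the template.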
This data is processed by pyosMeta, which generates:
_data/contributors.yml
_data/packages.yml
.csv files for metrics
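The contributor side of this processing step can be sketched as follows. All Contributors config files are JSON documents with a `contributors` list, so the sketch merges two such documents and removes duplicates by GitHub login. The repo names, the sample people, the output keys (`github_username`, `contribution_types`, `repos`), and the function `merge_contributors` are hypothetical; the actual pyosMeta logic and output schema may differ.

```python
import json

# Two hypothetical All Contributors configs, keyed by an illustrative repo name.
REPO_CONFIGS = {
    "example-repo-a": json.dumps(
        {"contributors": [
            {"login": "octo", "name": "Octo Cat", "contributions": ["code"]},
        ]}
    ),
    "example-repo-b": json.dumps(
        {"contributors": [
            {"login": "octo", "name": "Octo Cat", "contributions": ["doc", "code"]},
            {"login": "hubot", "name": "Hu Bot", "contributions": ["review"]},
        ]}
    ),
}


def merge_contributors(configs: dict[str, str]) -> list[dict]:
    """Combine contributors from several repos into one record per login."""
    merged: dict[str, dict] = {}
    for repo, raw in configs.items():
        for entry in json.loads(raw).get("contributors", []):
            login = entry["login"].lower()
            person = merged.setdefault(
                login,
                {
                    "github_username": login,
                    "name": entry.get("name", ""),
                    "contribution_types": set(),
                    "repos": set(),
                },
            )
            person["contribution_types"].update(entry.get("contributions", []))
            person["repos"].add(repo)
    # Convert sets to sorted lists so the result serializes deterministically.
    return [
        {
            **person,
            "contribution_types": sorted(person["contribution_types"]),
            "repos": sorted(person["repos"]),
        }
        for _, person in sorted(merged.items())
    ]


contributors = merge_contributors(REPO_CONFIGS)
for person in contributors:
    print(person["github_username"], person["contribution_types"])
```

If PyYAML is installed, a list like `contributors` could be written out with `yaml.safe_dump`; that step is left out here to keep the sketch dependent on the standard library only.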
Where the data goes
The processed data files are used in two main parts of our website:
Website GitHub Repo
A cron job reads the .yml files to populate our:
👉 Contributors page
👉 Packages page
Metrics GitHub Repo
A cron job reads the .csv files to generate the 👉 Peer review status dashboard
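As a rough illustration of what a metrics job might do with the CSV output, the sketch below counts reviews by status. The column names (`package_name`, `status`, `date_opened`), the sample rows, and the function `count_by_status` are assumptions for illustration only; the real CSV schema and dashboard code are not described on this page.

```python
import csv
import io
from collections import Counter

# Hypothetical CSV content; the columns and rows are illustrative only.
REVIEWS_CSV = """\
package_name,status,date_opened
examplepkg,accepted,2024-01-15
otherpkg,under-review,2024-03-02
thirdpkg,accepted,2024-05-20
"""


def count_by_status(csv_text: str) -> Counter:
    """Count how many review rows fall under each status value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["status"] for row in reader)


status_counts = count_by_status(REVIEWS_CSV)
for status, total in status_counts.most_common():
    print(f"{status}: {total}")
```

A scheduled job could feed counts like these into a plotting library to regenerate the dashboard figures on each run.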
Workflow diagram
The diagram below explains the basic workflow that we use.
graph TD
subgraph Sources
A1[All Contributors Bot]
A2[Peer Review Submissions *GitHub Issues*]
end
subgraph pyosmeta
A3[pyosmeta]
end
A1 --> A3
A2 --> A3
A3 -->|DATA: _data/contributors.yml, _data/packages.yml| B1[Website GitHub Repo]
A3 -->|DATA: _/*.CSV| B2[Metrics GitHub Repo]
B1 -->|Cron job reads YAML| C1[🔗 Contributor listing page]
B2 -->|Cron job reads CSV| C2[Generate metric plots]
click C1 "https://www.pyopensci.org/our-community/index.html#pyopensci-community-contributors" "View pyOpenSci Contributor Page"