# pyOpenSci Infrastructure Overview
This page will help you understand how we collect and process peer review and contributor data to:
- Highlight pyOpenSci contributors
- Track our peer review process
- Showcase peer-reviewed Python packages
## How it works
We use a Python package called `pyosMeta` to extract and transform contributor and peer review data into machine-readable formats (`.yml` and `.csv`).
This data allows us to automatically update our website and metrics dashboard with up-to-date contributor and review information, directly from GitHub.
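The sketch below illustrates that extract-and-transform idea with plain `requests` and PyYAML: it pulls open issues from the software-submission repo via the GitHub REST API and writes a small YAML summary. The output field names (`package_name`, `issue_url`, `labels`) are illustrative assumptions, not pyosMeta's actual schema or API.

```python
"""A minimal sketch of the extract-and-transform step (not pyosMeta's API).

It pulls open issues from the software-submission repo via the GitHub REST
API and writes a small machine-readable YAML summary. The output field names
(package_name, issue_url, labels) are illustrative assumptions.
"""
import requests
import yaml  # pip install pyyaml

ISSUES_URL = "https://api.github.com/repos/pyOpenSci/software-submission/issues"

response = requests.get(ISSUES_URL, params={"state": "open", "per_page": 30}, timeout=30)
response.raise_for_status()

records = [
    {
        "package_name": issue["title"],
        "issue_url": issue["html_url"],
        "labels": [label["name"] for label in issue["labels"]],
    }
    for issue in response.json()
    if "pull_request" not in issue  # the issues endpoint also returns pull requests
]

with open("packages_summary.yml", "w") as f:
    yaml.safe_dump(records, f, sort_keys=False)
```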
## Data collection and processing
We collect two types of data from GitHub:
- **Contributor data**: parsed from the All Contributors bot config files found in each pyOpenSci repo.
- **Peer review submission data**: extracted from issues in the software-submission repo, including:
  - package name and repo URL
  - editor and reviewers
  - maintainers and authors
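As a hedged illustration of parsing these two sources, the snippet below reads the standard All Contributors JSON config (`.all-contributorsrc`) and pulls GitHub handles out of labelled lines in an issue body. The `Editor:`/`Reviewer:` line format is an assumption about the submission template, not pyosMeta's actual parsing logic.

```python
"""Illustrative parsing of the two data sources above (not pyosMeta's code).

The .all-contributorsrc structure (JSON with a "contributors" list) is the
standard All Contributors bot format; the Editor/Reviewer line format below
is an assumption about the submission issue template.
"""
import json
import re

# 1. Contributor data: read an All Contributors bot config file.
with open(".all-contributorsrc") as f:
    config = json.load(f)
logins = [person["login"] for person in config["contributors"]]

# 2. Peer review data: pull @handles from labelled lines in an issue body.
issue_body = """Editor: @an-editor
Reviewer: @reviewer-one
Reviewer: @reviewer-two"""

roles = {"editor": [], "reviewer": []}
for line in issue_body.splitlines():
    match = re.match(r"(Editor|Reviewer):\s*@([\w-]+)", line)
    if match:
        roles[match.group(1).lower()].append(match.group(2))

print(logins)
print(roles)
```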
This data is processed by `pyosMeta`, which generates:
- `_data/contributors.yml`
- `_data/packages.yml`
- `.csv` files for metrics
## Where the data goes
The processed data files are used in two main parts of our website:
- **Website GitHub repo**: a cron job reads the `.yml` files to populate our 👉 Contributors page and 👉 Packages page.
- **Metrics GitHub repo**: a cron job reads the `.csv` files to generate the 👉 Peer review status dashboard.
## Workflow diagram
The diagram below shows the basic workflow that we use.
```mermaid
graph TD
    subgraph Sources
        A1[All Contributors Bot]
        A2[Peer Review Submissions *GitHub Issues*]
    end
    subgraph pyosmeta
        A3[pyosmeta]
    end
    A1 --> A3
    A2 --> A3
    A3 -->|DATA: _data/contributors.yml, _data/packages.yml| B1[Website GitHub Repo]
    A3 -->|DATA: _/*.CSV| B2[Metrics GitHub Repo]
    B1 -->|Cron job reads YAML| C1[🔗 Contributor listing page]
    B2 -->|Cron job reads CSV| C2[Generate metric plots]
    click C1 "https://www.pyopensci.org/our-community/index.html#pyopensci-community-contributors" "View pyOpenSci Contributor Page"
```