pyOpenSci Infrastructure

pyOpenSci uses GitHub to manage almost all of its infrastructure, from community processes to website rendering. This page provides a high-level overview of our infrastructure, focusing on how our core repositories work together and contribute to the website and community operations.

For detailed information about specific infrastructure components, see the Learn more section below.

What is pyOpenSci infrastructure?

pyOpenSci infrastructure encompasses:

  • GitHub repositories: All code, content, and documentation repositories

  • Website and documentation: Main website and sub-sites (handbook, guides, lessons)

  • Data processing: Automated collection and processing of contributor and peer review data

  • Continuous Integration (CI): GitHub Actions workflows for testing, building, and deploying

  • Access and permissions: Repository access management and team structures

  • Issue and pull request workflows: Processes for managing contributions and reviews

Infrastructure overview diagrams

The diagrams below illustrate two key aspects of our infrastructure:

Data flow and processing

The first diagram shows how peer review data is extracted from GitHub issues and passed through our automated processing system to update the website:

../../_images/website-diagram.svg

pyOpenSci infrastructure data flow diagram showing how peer review issues are processed through pyosMeta to update the website.

This diagram illustrates the automated workflow: peer review happens in GitHub issues, which are parsed by scripts in the pyosMeta package to generate YAML files that automatically update the website’s package and contributor pages.

Website structure

The second diagram shows how the main pyOpenSci website connects to its sub-sites:

../../_images/website-repositories-structure.svg

pyOpenSci website structure diagram showing the main website and its sub-sites (Handbook, Python Package Guide, Software Peer Review Guide, Lessons, and Metrics).

All sub-sites are built separately but served under the pyopensci.org domain, with the main website (pyopensci.github.io) serving as the central hub.

Data flow and continuous integration

In simple terms, pyOpenSci uses automated workflows to collect data from GitHub and update our website.

pyOpenSci uses a set of Continuous Integration (CI) jobs (GitHub Actions) to:

  • Collect data from our open peer review process

  • Collect contributor data from across all of our GitHub repositories

The pyosMeta package is a Python package that parses review and contributor data and transforms it into machine-readable YAML files used by our website.

How data flows through our system

  • pyosMeta parses the Markdown data within review issues in the software-review GitHub repository. It:

    • Gathers the GitHub usernames of review editors, reviewers, and maintainers, then uses the GitHub API to retrieve contributor names, emails, and other public GitHub profile information

    • Extracts the GitHub URL of each reviewed package and retrieves basic repository statistics (number of forks, stars, contributors)

    • Stores this peer review information in packages.yml

  • pyosMeta also parses contributor data from across all pyOpenSci repositories. It:

    • Parses all-contributors bot files to compile a list of contributors and their associated repositories/projects

    • Parses peer review metadata to record each contributor's roles within our organization, such as reviewer or editor

    • Stores this contributor information in contributors.yml

  • The packages.yml and contributors.yml files generated by pyosMeta are updated daily by a scheduled (cron) GitHub Actions workflow in the pyopensci.github.io repository. This data is used to populate:

    • The Our Community page

    • The Packages page
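The review-parsing step described above can be sketched roughly as follows. This is a minimal illustration and not pyosMeta's actual API: the function name `parse_review_issue`, the field labels in the sample issue body, and the sample usernames are all assumptions made for the example. The sketch only covers extracting fields and usernames from the issue text; the GitHub API lookups and the YAML serialization are left out.

```python
import re

# Hypothetical excerpt of a review issue body. The field labels are
# assumptions for illustration, not the real issue template.
ISSUE_BODY = """\
Package Name: examplepkg
Repository Link: https://github.com/example-org/examplepkg
Editor: @editor-a
Reviewers: @reviewer-a, @reviewer-b
"""

# Matches "Label: value" lines, where the label is letters and spaces only.
FIELD_RE = re.compile(r"^(?P<key>[A-Za-z ]+):\s*(?P<value>.+)$", re.MULTILINE)
# Matches @mentions (GitHub usernames use letters, digits, and hyphens).
USERNAME_RE = re.compile(r"@([A-Za-z0-9-]+)")


def parse_review_issue(body: str) -> dict:
    """Turn 'Label: value' lines into a dict keyed by a normalized label.

    Values containing @mentions become a list of usernames; all other
    values are kept as plain strings.
    """
    record = {}
    for match in FIELD_RE.finditer(body):
        key = match.group("key").strip().lower().replace(" ", "_")
        value = match.group("value").strip()
        usernames = USERNAME_RE.findall(value)
        record[key] = usernames if usernames else value
    return record


record = parse_review_issue(ISSUE_BODY)
print(record["package_name"])
print(record["reviewers"])
```

A record shaped like this could then be enriched with profile and repository data and written out as one entry in a file such as packages.yml.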

For more detailed information about data collection and processing, see the Data Workflows page.
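As a rough illustration of the contributor step, the sketch below merges all-contributors data from several repositories into one entry per person. It assumes each repository's all-contributors file is JSON with a `contributors` list whose entries carry `login`, `name`, and `contributions` keys; the repository names, the sample people, and the `merge_contributors` helper are hypothetical and are not taken from pyosMeta.

```python
import json

# Hypothetical all-contributors file contents for two repositories.
REPO_FILES = {
    "example-repo-a": json.dumps({"contributors": [
        {"login": "user-a", "name": "User A", "contributions": ["code"]},
        {"login": "user-b", "name": "User B", "contributions": ["doc"]},
    ]}),
    "example-repo-b": json.dumps({"contributors": [
        {"login": "user-a", "name": "User A", "contributions": ["review"]},
    ]}),
}


def merge_contributors(repo_files: dict) -> dict:
    """Combine per-repository contributor lists into one record per login."""
    merged = {}
    for repo, raw in repo_files.items():
        for entry in json.loads(raw).get("contributors", []):
            person = merged.setdefault(
                entry["login"],
                {"name": entry.get("name", ""), "repos": [], "contributions": []},
            )
            # Record each repository and contribution type once per person.
            if repo not in person["repos"]:
                person["repos"].append(repo)
            for kind in entry.get("contributions", []):
                if kind not in person["contributions"]:
                    person["contributions"].append(kind)
    return merged


contributors = merge_contributors(REPO_FILES)
print(contributors["user-a"]["repos"])
```

Keying the merged data by GitHub login keeps one record per person even when the same contributor appears in several repositories.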

Website publishing

  • The Python Package Guide, Peer Review Guide, and Handbook are all Sphinx books. They are built separately but served under the pyopensci.org domain.

  • All Sphinx books use the pyos-sphinx-theme, a Sphinx theme built on top of pydata_sphinx_theme.

  • The final site is published at pyopensci.org using GitHub Pages.
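For orientation, a Sphinx book selects its theme in its `conf.py`. The fragment below is a minimal sketch, not the configuration of any actual pyOpenSci book: the project title and sub-path are hypothetical, and the module name `pyos_sphinx_theme` is an assumption based on the theme's package name.

```python
# conf.py (sketch): minimal Sphinx settings for one sub-site.

# Hypothetical project title, used only for illustration.
project = "Example pyOpenSci guide"

# Assumed module name for the pyos-sphinx-theme package.
html_theme = "pyos_sphinx_theme"

# Hypothetical sub-path under the shared domain; html_baseurl tells
# Sphinx the public URL the built pages are served from.
html_baseurl = "https://pyopensci.org/example-guide/"
```

Because each book carries its own configuration, the books can be built independently while still sharing one look and one domain.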

Learn more

This page provides a high-level overview. For detailed information about specific infrastructure components, see:


Last updated: Dec 17, 2025