Home
This page gives an overview of the most important aspects of ClimHub.
ClimHub is a scalable, open-source infrastructure for managing climate data access and processing pipelines. It streamlines spatio-temporal data workflows in alignment with FAIR principles. ClimHub provides a standardised framework, currently as an R package with planned cross-language support, for discovering, accessing, harmonising, processing, and analysing climate data in a reproducible and efficient way.
Originally developed as a prototype at CICERO, ClimHub has evolved into a shared research infrastructure that underpins multiple projects across research disciplines. It is designed to reduce duplication of effort, improve data consistency, and enable collaboration across teams by offering a common toolbox and workflow paradigm. Beyond software, ClimHub represents an institutional shift toward reproducible science, open development practices, and integrated data engineering within climate research.
ClimHub compartmentalises the lifecycle of climate (and other spatio-temporally explicit) data into four distinct pillars around which its functionality is organised:
Figure 1: ClimHub Functionality Flowchart. ClimHub is oriented around four pillars of the data lifecycle with a selection of functions belonging to each pillar. Function naming follows conventions outlined in the documentation guidelines for ClimHub. Note that this flowchart is non-exhaustive and focuses on some of the most important functions at this point in time.
The first pillar contains functionality that interacts with the metadata we have curated for different climate data products. Functions belonging to this pillar show you which data products ClimHub currently supports, their important characteristics such as spatial and temporal extent and resolution, which variables each product contains, and how to attribute the contents of each product.
The second pillar contains one function for each data product supported by ClimHub. Using standardised argument naming and backbone helper functions, the functions of this pillar streamline and harmonise data retrieval. They also augment downloaded data with informative metadata fields to adhere to FAIR principles and preserve data provenance throughout your work.
The third pillar offers a number of functions that package up the reproducible and efficient workflows repeated across most data handling efforts for spatial or spatio-temporal data. This includes spatially explicit operations such as limiting data to specific areas via cropping or masking and reprojecting data into a shared coordinate reference system. More complex spatial operations for aggregation and for disaggregation via interpolation are also supported by this pillar. In addition, ClimHub provides functionality for temporal aggregation. Finally, taking these operations to their logical conclusion, ClimHub provides functions for the standardised calculation of important metrics derived from climate data, such as the ETCCDI indices or bioclimatic variables.
This final pillar is a crucial one. Data you prepare, no matter how well handled and processed, is only as good as its applications and its adherence to FAIR principles. Functions in this pillar aim to improve contemporary data dissemination practices in three ways: easy-to-use and customisable visualisation functionality, a data provenance string that tracks the lifecycle of the data through the ClimHub framework, and checks that the resulting NetCDF data on your hard drive complies with the CF metadata convention, which streamlines data archiving and publication.
At present, the ClimHub team is made up of:
Erik Kusch - Lead Developer
Erik is a senior researcher at CICERO, the Center for International Climate Research in Oslo, Norway, where he focuses on data science and the analysis of climate and nature risk. During his PhD in quantitative macroecology, Erik developed the KrigR R package, the spiritual predecessor of ClimHub. Building on this experience with lifecycle framework development and downstream analyses, Erik leads the developer team.
Richard Davy - Developer
Richard is a senior researcher at the Nansen Environmental and Remote Sensing Center in Bergen, Norway. With a solid background in atmospheric sciences and downscaling approaches, Richard expands the selection of climate data products that ClimHub supports and contributes important interpolation algorithms.
Taimur Khan - Developer
Taimur is a Data Scientist in the Vegetation and Macroecology working group at the Helmholtz Centre for Environmental Research - UFZ in Germany. He has a background in Software Engineering, Earth System Data Science and Remote Sensing. Previously, Taimur developed several Open Source Software libraries in projects such as Deeptrees, Biodiversity Digital Twin, and Biodiversity Meets Data. In ClimHub, Taimur contributes to DevOps, data validation and real-time data streaming features.
Patrick Van Laake - Developer
Patrick is the developer of the ncdfCF and CFtime R packages, which make up an important part of the backbone of ClimHub.
Sam Mason - Developer
Description of Sam's work here.
You too can join the team by contributing to ClimHub.