Groundwater science in Australia and internationally faces enormous challenges in the years and decades to come. Increasing water demand for agriculture, industry, domestic supply and the environment will put pressure on already stressed groundwater systems. To manage our groundwater resources in the face of competing interests, regulators require transparent, reproducible and defensible science, delivered within political timeframes, to underpin decision making and investment. Competing demands for groundwater resources will continue to politicise groundwater science, as demonstrated by controversies surrounding fracking and coal mining, and the science will remain under public scrutiny as a result.
The current scientific landscape presents enormous opportunities. Decreasing data acquisition, storage and computing costs and advances in computational sciences allow access to a larger volume and greater variety of data, which may greatly increase our ability to understand groundwater systems. However, processing, analysing and visualising these data are currently complicated by the proprietary and black-box components in many groundwater research workflows.
We propose open-source science, where data and methodologies are freely available, as the best approach to ensuring science is sufficiently transparent and reproducible to withstand both professional and public scrutiny. Open-source science also allows greater collaboration and sharing of ideas between groups, which reduces duplication of effort and frees scientists to focus on their specific research. This is particularly advantageous in code development, which is important for extracting information from large data collections in realistic timeframes.
We present hydrogeol_utils (https://github.com/GeoscienceAustralia/hydrogeol_utils), a GitHub repository of Python-based tools and workflows for processing, analysis, integration and visualisation of hydrogeological data. This toolkit aims to provide a user-friendly Application Programming Interface (API) for accessing analysis-ready hydrogeological and geophysical datasets stored in efficient, standardised and open formats (netCDF4 and SpatiaLite). It applies common scientific processes such as plotting, interpolating, filtering, exploratory analysis and modelling. A major focus of this package is the integration of a range of datasets, including airborne electromagnetics (AEM), surface nuclear magnetic resonance (SNMR) and borehole information (including wireline logs, water levels and lithology), to create maps and models and to assess groundwater systems against management objectives.
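As an illustration of the kind of dataset integration described above, the following minimal sketch pairs a borehole with its nearest AEM sounding and reads off the modelled conductivity at the screen depth. It does not use the hydrogeol_utils API; the function names, the layered conductivity model and all data values are hypothetical and chosen only to show the idea with plain numpy.

```python
import numpy as np


def nearest_sounding(bore_xy, sounding_xy):
    """Return (index, distance) of the AEM sounding closest to a bore.

    Hypothetical helper: assumes projected coordinates in metres.
    """
    d = np.hypot(sounding_xy[:, 0] - bore_xy[0],
                 sounding_xy[:, 1] - bore_xy[1])
    i = int(np.argmin(d))
    return i, float(d[i])


def conductivity_at_depth(layer_top, conductivity, depth):
    """Conductivity of the layer that contains `depth`.

    Assumes a piecewise-constant layered model, with `layer_top`
    ascending (metres below ground) and one conductivity per layer.
    """
    # Index of the last layer whose top is at or above the query depth.
    k = np.searchsorted(layer_top, depth, side="right") - 1
    k = int(np.clip(k, 0, len(conductivity) - 1))
    return float(conductivity[k])


# Made-up example data (not from any real survey).
sounding_xy = np.array([[0.0, 0.0], [100.0, 0.0], [200.0, 0.0]])
layer_top = np.array([0.0, 5.0, 15.0, 40.0])      # m below ground
conductivity = np.array([                          # S/m, one row per sounding
    [0.02, 0.05, 0.20, 0.10],
    [0.03, 0.08, 0.30, 0.12],
    [0.01, 0.04, 0.15, 0.09],
])

bore_xy = (120.0, 10.0)
screen_depth = 20.0                                # m below ground

idx, dist = nearest_sounding(bore_xy, sounding_xy)
sigma = conductivity_at_depth(layer_top, conductivity[idx], screen_depth)
print(f"sounding {idx} at {dist:.1f} m, conductivity {sigma:.2f} S/m")
```

In practice a maximum search distance would normally be applied so that bores far from any flight line are not paired with an unrepresentative sounding.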
The package uses mature and powerful scientific computing packages, including numpy, pandas and scipy for data analysis, matplotlib for visualisation, scikit-learn for machine learning techniques, and rasterio, shapely, xarray and gdal for spatial analysis. Workflows for calculating hydrological parameters, including aquifer properties, groundwater salinity and water table depth, are contained within Jupyter Notebooks, which document each workflow alongside runnable code.
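A water table depth calculation of the kind mentioned above might look like the following sketch. It is not taken from the repository notebooks; it assumes that each bore has a surveyed ground elevation, a reference-point (top of casing) elevation and a depth to water measured from that reference point, and all names and values are hypothetical.

```python
import numpy as np


def water_table_elevation(ref_elev_m, depth_to_water_m):
    """Water level elevation = reference point elevation - measured depth to water."""
    return np.asarray(ref_elev_m, dtype=float) - np.asarray(depth_to_water_m, dtype=float)


def depth_to_water_table(ground_elev_m, ref_elev_m, depth_to_water_m):
    """Depth of the water table below ground surface, in metres."""
    wt_elev = water_table_elevation(ref_elev_m, depth_to_water_m)
    return np.asarray(ground_elev_m, dtype=float) - wt_elev


# Made-up measurements for three bores (elevations in m above datum).
ground_elev = np.array([50.0, 48.5, 47.0])
casing_elev = np.array([50.6, 49.0, 47.8])   # reference point: top of casing
dip_reading = np.array([5.6, 4.0, 3.3])      # depth to water below casing top

wt_elev = water_table_elevation(casing_elev, dip_reading)
wt_depth = depth_to_water_table(ground_elev, casing_elev, dip_reading)
print(wt_elev, wt_depth)
```

Subtracting the casing stick-up in this way keeps depths comparable between bores whose reference points sit at different heights above ground.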