blueOceanSustainableSolutions/DATAz


GitHub repository for the DATAz DTO project.

Public project number 2025.00265.DT4ST financed by ARTE.

Developments were carried out by the DATAz consortium, which includes the following organisations: blueOASIS, IDMEC (Instituto de Engenharia Mecânica), IMAR (Instituto do Mar), Instituto Hidrográfico, Centro de Experimentação Operacional da Marinha, Associação para o Desenvolvimento e Formação do Mar dos Açores, and Direção Regional de Políticas Marítimas.

Numerical Modelling Tools

The DATAz DTO relies on a suite of numerical modelling tools to simulate the ocean environment around the Azores site. These include:

  • MOHID: An open-source water modelling system developed by Instituto Superior Técnico to simulate hydrodynamic, environmental, and water quality processes across rivers, estuaries, coastal zones, and oceans.
  • RAINDROP: A Python-based framework developed by blueOASIS to generate real-time holistic underwater acoustic maps, automating workflows from environmental data collection and simulation setup to execution and post-processing.
  • REEF3D: An open-source phase-resolved wave model developed by the Norwegian University of Science and Technology to simulate nearshore hydrodynamics over complex bathymetry and coastlines.
  • SWAN: An open-source phase-averaged wave model developed by Delft University of Technology to simulate wind-generated waves in coastal regions and inland waters.
  • WW3: A community-driven wave modelling framework providing global-to-regional spectral wave forecasts, used here for boundary forcing and validation.

AI-based Surrogate Models

The surrogate_models directory provides data preprocessing and training pipelines for AI-based surrogates of the numerical models. For example, the SWAN surrogate uses Jupyter notebooks to handle boundary condition preprocessing, input/output preparation, and model training, enabling rapid approximation of wave conditions without running the full numerical solver.

DTO Visualisation

The DATAz DTO site is openly accessible on blueOASIS' DTO dashboard framework, which exposes it through the site-centric API (/api/site/:id/*).
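As a rough sketch of how the route pattern /api/site/:id/* expands into a concrete URL, the helper below fills in the `:id` placeholder and the trailing wildcard. The host `https://dashboard.example`, the site identifier `dataz`, and the `sensors` sub-path are hypothetical placeholders; the real dashboard host and resource names are not given here.

```python
from urllib.parse import quote

# Route pattern quoted from the text above.
SITE_API_TEMPLATE = "/api/site/:id/*"


def site_api_url(base_url: str, site_id: str, resource: str) -> str:
    """Expand the site-centric route into a full URL.

    `base_url` and `resource` are caller-supplied; nothing in this sketch
    knows which resources the real API actually serves.
    """
    # Percent-encode the id so it cannot inject extra path segments.
    path = SITE_API_TEMPLATE.replace(":id", quote(str(site_id), safe=""))
    # Substitute the wildcard with the requested sub-path.
    path = path.replace("*", resource.lstrip("/"))
    return base_url.rstrip("/") + path


if __name__ == "__main__":
    # Hypothetical host, site id, and resource name.
    print(site_api_url("https://dashboard.example", "dataz", "sensors"))
```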

Figure: Hydrotwin dashboard homepage.

The dashboard provides:

  • Live status panel: Real-time Hydrotwin environmental readings, sensor status, and environmental measurement availability.
  • Detection timeline: Chronological view of all detection events across modalities.
  • Scenario explorer (future feature): Interactive what-if interface backed by the surrogate model.
  • Alert management console: Active alerts, historical alert log, and response action tracking.

The Hydrotwin dashboard encapsulates site-specific views, where:

  • A single site (such as DATAz) shows a unified view of all sensors.
  • Individual sensors can be selected for deeper analysis.
  • In-situ sensors tied to an acoustic deployment can be viewed through individual sensor tabs.
  • HT-C units offer a live listening mode.

Getting the Data

Large datasets and model checkpoints (bathymetry, AIS, ERA5/CMEMS/FES forcing, surrogate weights, etc.) are not stored in git. They are tracked with DVC and hosted in public Azure Blob Storage, so cloning the repo gives you small pointer files (*.dvc) rather than the data itself. To fetch the actual files, run dvc pull.

1) Install DVC

pip install "dvc[http]"

2) Clone the repository

git clone https://github.com/blueOceanSustainableSolutions/DATAz.git
cd DATAz

3) Pull the data

The data lives in a public, read-only container, so no credentials are needed. Add the public remote over HTTPS, then pull:

pip install dvc-azure
dvc remote add --local public https://datazstorage.blob.core.windows.net/dvcstore
dvc pull -r public

This downloads all tracked datasets into their expected locations.

To fetch only part of it, pass the path to a tracked file (or its .dvc pointer), not a folder:

dvc pull -r public zlt_numerical_models/MOHID/GeneralData/Meteo/ERA5/era5.hdf5
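Because a folder cannot be passed directly, one way to pull everything below a folder is to collect its pointer files first and hand them to dvc pull as individual targets. The sketch below does that with the standard library only. The helper names `pointers_under` and `pull_command` are invented for this example and are not part of DVC or of this repository; the remote name `public` matches the one added above.

```python
from pathlib import Path


def pointers_under(folder: str) -> list[str]:
    """Return the *.dvc pointer files below `folder`, sorted, as POSIX paths."""
    root = Path(folder)
    return sorted(
        p.as_posix()
        for p in root.rglob("*.dvc")
        # Keep pointer files only; skip anything named exactly `.dvc`,
        # such as DVC's own internal directory.
        if p.is_file() and ".dvc" not in p.parts
    )


def pull_command(folder: str, remote: str = "public") -> list[str]:
    """Build (but do not run) a `dvc pull` command for the pointers in `folder`."""
    targets = pointers_under(folder)
    if not targets:
        # Without targets, `dvc pull` would fetch every tracked dataset.
        raise ValueError(f"no .dvc pointer files found under {folder!r}")
    return ["dvc", "pull", "-r", remote, *targets]
```

From the repository root, the result can be executed with, for example, `subprocess.run(pull_command("zlt_numerical_models/MOHID"), check=True)`.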

Note: pull over HTTPS, not the azure:// remote. The container allows anonymous reads but not the operations the Azure client performs, so azure:// + allow_anonymous_login hangs at Collecting ... 0.00 entry/s and then fails with AuthorizationPermissionMismatch. The HTTPS remote above avoids this entirely.

macOS / SSL: if you see CERTIFICATE_VERIFY_FAILED, your Python has no CA bundle. Conda/most venvs are fine; a python.org install needs its Install Certificates.command run once (or pip install certifi).
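To check whether a given Python installation can find CA certificates before running dvc pull, a small diagnostic like the one below can help. It is a minimal sketch using only the standard library's `ssl.get_default_verify_paths()`; the function name `ca_bundle_report` is invented for this example. If both `cafile_exists` and `capath_exists` are False, that is a hint (not proof) that you are in the situation described above.

```python
import os
import ssl


def ca_bundle_report() -> dict:
    """Summarise where this Python's OpenSSL looks for CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "cafile_exists": bool(paths.cafile and os.path.isfile(paths.cafile)),
        "capath_exists": bool(paths.capath and os.path.isdir(paths.capath)),
    }


if __name__ == "__main__":
    for key, value in ca_bundle_report().items():
        print(f"{key}: {value}")
```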
