Public project number 2025.00265.DT4ST financed by ARTE.
Developments were carried out by the DATAz consortium, which includes the following organisations: blueOASIS, IDMEC (Instituto de Engenharia Mecânica), IMAR (Instituto do Mar), Instituto Hidrográfico, Centro de Experimentação Operacional da Marinha, Associação para o Desenvolvimento e Formação do Mar dos Açores, and Direção Regional de Políticas Marítimas.
The DATAz DTO relies on a suite of numerical modelling tools to simulate the ocean environment around the Azores site. These include:
- MOHID: An open-source water modelling system developed by Instituto Superior Técnico to simulate hydrodynamic, environmental, and water quality processes across rivers, estuaries, coastal zones, and oceans.
- RAINDROP: A Python-based framework developed by blueOASIS to generate real-time holistic underwater acoustic maps, automating workflows from environmental data collection and simulation setup to execution and post-processing.
- REEF3D: An open-source phase-resolved wave model developed by the Norwegian University of Science and Technology to simulate nearshore hydrodynamics over complex bathymetry and coastlines.
- SWAN: An open-source phase-averaged wave model developed by Delft University of Technology to simulate wind-generated waves in coastal regions and inland waters.
- WW3: A community-driven wave modelling framework providing global-to-regional spectral wave forecasts, used here for boundary forcing and validation.
The surrogate_models directory provides data preprocessing and training pipelines for AI-based surrogates of the numerical models. For example, the SWAN surrogate uses Jupyter notebooks to handle boundary condition preprocessing, input/output preparation, and model training, enabling rapid approximation of wave conditions without running the full numerical solver.
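To make the surrogate idea concrete, here is a deliberately tiny sketch. It is not the DATAz SWAN surrogate: `toy_solver` is a made-up stand-in for an expensive wave model run, the two inputs (offshore significant wave height and wind speed) are assumptions chosen for illustration, and a nearest-neighbour lookup is swapped in for whatever model the notebooks actually train.

```python
import math

def toy_solver(offshore_hs, wind_speed):
    """Hypothetical stand-in for an expensive numerical wave model run."""
    return 0.6 * offshore_hs + 0.02 * wind_speed ** 1.5

def build_training_set():
    """Run the 'solver' on a coarse grid of boundary conditions (offline step)."""
    samples = []
    for hs10 in range(5, 65, 5):        # offshore Hs from 0.5 m to 6.0 m
        for wind in range(0, 31, 5):    # wind speed from 0 to 30 m/s
            hs = hs10 / 10
            samples.append(((hs, float(wind)), toy_solver(hs, wind)))
    return samples

def surrogate_predict(samples, offshore_hs, wind_speed, k=4):
    """Inverse-distance-weighted average of the k nearest training runs."""
    def dist(x):
        # Scale each input by its range so both contribute comparably.
        return math.hypot((x[0] - offshore_hs) / 5.5, (x[1] - wind_speed) / 30.0)
    nearest = sorted(samples, key=lambda s: dist(s[0]))[:k]
    weights = [1.0 / (dist(x) + 1e-9) for x, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

samples = build_training_set()
approx = surrogate_predict(samples, 2.3, 12.0)
exact = toy_solver(2.3, 12.0)
print(f"surrogate={approx:.3f} m, solver={exact:.3f} m")
```

The expensive step (`build_training_set`) runs once offline; each later query only touches the stored samples, which is where the speed-up over the full solver comes from.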
The DATAz DTO site is openly accessible through blueOASIS' DTO dashboard framework, which serves it via the site-centric API (/api/site/:id/*).
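As a rough illustration of the site-centric route shape, the sketch below builds URLs of the form /api/site/<id>/<resource>. Only the route pattern comes from the text above; the host name, the site id `dataz`, and the `sensors` sub-resource are hypothetical placeholders, not documented endpoints.

```python
from urllib.parse import quote, urljoin

API_PREFIX = "/api/site/"   # route shape taken from the text: /api/site/:id/*

def site_endpoint(base_url, site_id, *resource):
    """Build a URL under /api/site/<site_id>/ for a given sub-resource."""
    # Percent-encode each path segment so ids with spaces or slashes stay valid.
    segments = [quote(str(site_id), safe="")]
    segments += [quote(str(part), safe="") for part in resource]
    return urljoin(base_url, API_PREFIX + "/".join(segments))

# Hypothetical host, site id, and sub-resource name, for illustration only.
print(site_endpoint("https://dashboard.example.org", "dataz", "sensors"))
```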
The dashboard provides:
- Live status panel: Real-time Hydrotwin environmental readings, sensor status, and environmental measurement availability.
- Detection timeline: Chronological view of all detection events across modalities.
- Scenario explorer (future feature): Interactive what-if interface backed by the surrogate model.
- Alert management console: Active alerts, historical alert log, and response action tracking.
The Hydrotwin dashboard encapsulates site-specific views, where:
- A single site (such as DATAz) shows a unified view of all sensors.
- Individual sensors can be selected for deeper analysis.
- In-situ sensors tied to an acoustic deployment can be viewed through individual sensor tabs.
- HT-C units offer a live listening mode.
Large datasets and model checkpoints (bathymetry, AIS, ERA5/CMEMS/FES forcing, surrogate weights, etc.) are not stored in git. They are tracked with DVC and hosted in public Azure Blob Storage, so cloning the repo gives you small pointer files (*.dvc) rather than the data itself — you fetch the actual files with dvc pull.
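To see which datasets a fresh clone expects, you can list the pointer files before pulling anything. The sketch below is a small helper, not part of the repository; it assumes DVC's default naming, where the pointer for `some/file.ext` is `some/file.ext.dvc`.

```python
from pathlib import Path

def find_dvc_pointers(repo_root):
    """Return all *.dvc pointer files under repo_root, as sorted relative paths."""
    root = Path(repo_root)
    pointers = []
    for path in root.rglob("*.dvc"):
        rel = path.relative_to(root)
        # Skip the .dvc/ config directory itself and anything inside .git/.
        if path.is_file() and ".git" not in rel.parts:
            pointers.append(rel.as_posix())
    return sorted(pointers)

def tracked_target(pointer):
    """Map a pointer such as 'a/b/era5.hdf5.dvc' to the data path it stands for."""
    return pointer[: -len(".dvc")]

if __name__ == "__main__":
    for pointer in find_dvc_pointers("."):
        print(f"{pointer} -> {tracked_target(pointer)}")
```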
1) Install DVC
pip install "dvc[http]"
2) Clone the repository
git clone https://github.com/blueOceanSustainableSolutions/DATAz.git
cd DATAz
3) Pull the data
The data lives in a public, read-only container, so no credentials are needed. Add the public remote over HTTPS, then pull:
pip install dvc-azure
dvc remote add --local public https://datazstorage.blob.core.windows.net/dvcstore
dvc pull -r public
This downloads all tracked datasets into their expected locations.
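If you would rather drive the pull from a Python script or notebook, one option is to shell out to the same CLI command. This is a minimal sketch, not a helper shipped with the repository: the function names are invented here, and it assumes the `public` remote has already been added as shown above.

```python
import shutil
import subprocess

def dvc_pull_command(remote="public", targets=()):
    """Build the argv for `dvc pull -r <remote> [targets...]`."""
    return ["dvc", "pull", "-r", remote, *targets]

def dvc_pull(remote="public", targets=(), cwd="."):
    """Run dvc pull in the repository at cwd; raises if dvc is missing or fails."""
    if shutil.which("dvc") is None:
        raise RuntimeError('dvc not found on PATH; install it with: pip install "dvc[http]"')
    subprocess.run(dvc_pull_command(remote, targets), cwd=cwd, check=True)

# Show the command that would run for one tracked file (nothing is executed here).
print(" ".join(dvc_pull_command(
    targets=["zlt_numerical_models/MOHID/GeneralData/Meteo/ERA5/era5.hdf5"])))
```

Calling `dvc_pull()` with no arguments mirrors the full pull; passing `targets` restricts it to individual tracked files.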
To fetch only part of it, pass the path to a tracked file (or its .dvc pointer) — not a folder:
dvc pull -r public zlt_numerical_models/MOHID/GeneralData/Meteo/ERA5/era5.hdf5
Note: pull over HTTPS, not the azure:// remote. The container allows anonymous reads but not the operations the Azure client performs, so azure:// + allow_anonymous_login hangs at Collecting ... 0.00 entry/s and then fails with AuthorizationPermissionMismatch. The HTTPS remote above avoids this entirely.
macOS / SSL: if you see CERTIFICATE_VERIFY_FAILED, your Python has no CA bundle. Conda/most venvs are fine; a python.org install needs its Install Certificates.command run once (or pip install certifi).
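A quick way to check whether your interpreter can find a CA bundle is to ask the standard `ssl` module where it looks. The sketch below is a diagnostic aid only; the `ca_bundle_report` name and the shape of its result are choices made here. Note that certificates in a CA directory are loaded lazily, so `loaded_ca_certs` can be 0 on a working setup.

```python
import os
import ssl

def ca_bundle_report():
    """Summarise where Python's ssl module looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    # cafile/capath are None when the default locations do not exist.
    cafile_ok = bool(paths.cafile) and os.path.isfile(paths.cafile)
    capath_ok = bool(paths.capath) and os.path.isdir(paths.capath)
    # Number of CA certificates already loaded into a default context.
    loaded = ssl.create_default_context().cert_store_stats()["x509_ca"]
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "has_ca_source": cafile_ok or capath_ok or loaded > 0,
        "loaded_ca_certs": loaded,
    }

report = ca_bundle_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `has_ca_source` is False, the fixes described above (running Install Certificates.command or installing certifi) are the place to start.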
