Underwater Stereo Images Dataset

Stereo images and generated point clouds of divers' poses and hand gestures in different underwater scenarios.

The images were collected with a Bumblebee XB3 FireWire Stereo Vision System during different research trials carried out within the EU-FP7 CADDY project (Cognitive Autonomous Diving Buddy). This dataset was created to support research in two specific areas: diver body pose estimation and hand gesture recognition.

Diver body pose estimation

Biograd - October 2016

Stereo images: 132 MB
Bagfile: 0.92 GB


Brodarski - August 2016

Stereo images: 1429 MB
Bagfile: 14.1 GB


Brodarski - May 2016

Stereo images: 1450 MB
Bagfile: 10.22 GB

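The bagfiles listed above are standard ROS bags. The following is a minimal sketch of how the stereo frames might be extracted, assuming a ROS 1 environment with the rosbag and cv_bridge packages; the topic and file names are placeholders, not the dataset's actual ones, so check them first with rosbag info:

```python
# Minimal sketch: extract stereo frames from a bagfile to PNG files.
# Assumptions: ROS 1 with rosbag, cv_bridge and OpenCV installed; the
# topic names and the bag file name are hypothetical placeholders.
import cv2
import rosbag
from cv_bridge import CvBridge

bridge = CvBridge()
topics = ["/stereo/left/image_raw", "/stereo/right/image_raw"]  # placeholders

with rosbag.Bag("biograd_october_2016.bag") as bag:  # placeholder name
    for topic, msg, t in bag.read_messages(topics=topics):
        # Convert the sensor_msgs/Image message to an OpenCV array.
        img = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        side = "left" if "left" in topic else "right"
        cv2.imwrite("%s_%d.png" % (side, t.to_nsec()), img)
```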

The AUV (Autonomous Underwater Vehicle) needs to face the diver from the front in order to communicate with them through an attached tablet. This is also the optimal position from which to monitor the diver's behavior and overall well-being (e.g., breathing pattern, equipment position).

To keep the AUV in front of the diver at all times, the diver wears a system of inertial sensors in the suit that transmits their pose acoustically (DiverNet) [1]. However, the bandwidth and transmission rate (~5 s) of acoustic communication prevent the system from receiving this information in real time. For this reason, diver pose estimation methods based on stereo images were developed [2], and this database was created.

To collect the data, divers were asked to perform three tasks in front of the AUV: turn 360° horizontally (chest pointing downwards), turn 360° vertically, both clockwise and anticlockwise, and swim freely. For the latter, the AUV was operated manually to follow the diver. Ground truth is provided by the inertial sensor located on the diver's chest, which logs the data either to an underwater tablet directly connected to the DiverNet or to an on-land computer through an optical fiber cable. Three data collection experiments were carried out: in an open-sea environment in Biograd na Moru, Croatia, and in an indoor pool at the Brodarski Institute in Zagreb, Croatia.

We describe the files provided for each dataset as follows:

It is important to mention that the bagfile contains the recording of the complete experiment, whereas the provided stereo images are the ones specifically used for the developed algorithms. They only include frames in which the diver appears in the field of view and from which sufficiently dense point clouds could be generated. For a more detailed explanation of our methodology, please refer to the Publications section [2].
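For illustration only, a point cloud can be generated from a rectified stereo pair along these lines with OpenCV. This is a minimal sketch, not the exact pipeline of [2]; the file names, block-matching parameters, and reprojection matrix Q are placeholders that would come from the Bumblebee XB3 calibration:

```python
# Minimal sketch: disparity map and point cloud from a rectified stereo pair.
# The file names, SGBM parameters and Q matrix are illustrative placeholders.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching; numDisparities must be divisible by 16.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
# compute() returns fixed-point disparities scaled by 16.
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0

# Q is the 4x4 disparity-to-depth matrix from the stereo calibration;
# the identity below is only a placeholder.
Q = np.eye(4, dtype=np.float32)
points = cv2.reprojectImageTo3D(disparity, Q)

# Keep only pixels with a valid (positive) disparity, in the spirit of the
# "sufficiently dense point cloud" filtering described above.
cloud = points[disparity > 0]
print("point cloud size:", cloud.shape[0])
```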


© 2018 Robotics Group, Jacobs University Bremen.
This website includes software developed and used by the Systems, Robotics and Vision Group, University of the Balearic Islands, and is based on Bootstrap, Three.js and PapaParse.
