Data

Tasmanian Orange Roughy Stereo Image Machine Learning Dataset

Commonwealth Scientific and Industrial Research Organisation
Scoulding, Ben; Maguire, Kylie; Orenstein, Eric; Jackett, Chris; CSIRO
DOI: https://doi.org/10.25919/a90r-4962
Edition: v1
Related resource: https://doi.org/10.1093/icesjms/fsac166
Subjects: orange roughy; Hoplostethus atlanticus; machine learning; deep learning; object detection; fisheries science; stereo imagery; benthic imagery; image annotation; underwater imaging; automated fish detection; biomass survey; acoustic-optical system; AOS; marine ecology; artificial intelligence; information and computing sciences

Licence & Rights:

Non-Commercial Licence
CC-BY-NC-SA

Creative Commons Attribution Noncommercial-Share Alike 4.0 Licence
https://creativecommons.org/licenses/by-nc-sa/4.0/

Data is accessible online and may be reused in accordance with licence conditions

All Rights (including copyright) CSIRO 2025.

Access:

Open

Accessible for free

Brief description

The Tasmanian Orange Roughy Stereo Image Machine Learning Dataset is a collection of annotated stereo image pairs collected by a net-attached Acoustic and Optical System (AOS) during orange roughy (Hoplostethus atlanticus) biomass surveys off the northeast coast of Tasmania, Australia, in July 2019. The dataset consists of expertly annotated imagery from six AOS deployments (OP12, OP16, OP20, OP23, OP24, and OP32), covering a range of fish densities, benthic substrates, and altitudes above the seafloor. Each image was manually annotated with bounding boxes identifying orange roughy and other marine species. For all annotated images, the paired stereo image from the opposite camera has been included where available to enable stereo vision analysis. The dataset was developed to test how well machine learning-based object detection automates fish detection under variable real-world conditions, and to support further work on automated image processing in fisheries science.

Lineage: Data were obtained onboard the 32 m Fishing Vessel Saxon Onward during an orange roughy acoustic biomass survey off the northeast coast of Tasmania in July 2019. Stereo image pairs were collected using a net-attached Acoustic and Optical System (AOS), a self-contained autonomous system with multi-frequency acoustic and optical capabilities mounted on the headline of a standard commercial orange roughy demersal trawl. Images were acquired by a pair of Prosilica GX3300 Gigabit Ethernet cameras with Zeiss F2.8 lenses (25 mm focal length), separated by 90 cm and angled inward at 7° to provide 100% overlap at a 5 m range. Illumination was provided by two synchronised Quantum Trio strobes. Stereo pairs were recorded at 1 Hz in JPG format at a resolution of 3296 × 2472 pixels and 24-bit colour depth.

Human experts manually annotated images from the six deployments using both the CVAT annotation tool (producing COCO format annotations) and the LabelImg tool (producing XML format annotations). In all deployments, only port camera views were annotated. Annotations comprise bounding boxes for "orange roughy" and "orange roughy edge" (partially visible fish), as well as for other marine species (brittle star, coral, eel, miscellaneous fish, etc.). Prior to annotation, under-exposed images were enhanced based on altitude above the seafloor using a Dark Channel Prior (DCP) approach, and images taken above 10 m altitude were discarded due to poor visibility.
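
Combining the two annotation sources means expressing the XML boxes in COCO's [x, y, width, height] convention. The following is a minimal sketch, assuming the LabelImg files use the Pascal VOC layout that LabelImg writes by default (corner coordinates xmin, ymin, xmax, ymax under each object/bndbox). The dataset's actual XML schema is not described in this record, and the function name voc_to_coco_boxes and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def voc_to_coco_boxes(xml_text):
    """Return (label, [x, y, width, height]) pairs from Pascal VOC-style XML.

    Assumption: every <object> holds a <name> and a <bndbox> with corner
    coordinates, as in LabelImg's default Pascal VOC output.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bnd = obj.find("bndbox")
        xmin = float(bnd.findtext("xmin"))
        ymin = float(bnd.findtext("ymin"))
        xmax = float(bnd.findtext("xmax"))
        ymax = float(bnd.findtext("ymax"))
        # COCO stores the top-left corner plus width and height.
        boxes.append((label, [xmin, ymin, xmax - xmin, ymax - ymin]))
    return boxes


# Hypothetical annotation for one image; not taken from the dataset.
SAMPLE_XML = """<annotation>
  <object>
    <name>orange roughy</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>350</ymax></bndbox>
  </object>
  <object>
    <name>orange roughy edge</name>
    <bndbox><xmin>0</xmin><ymin>10</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

for label, box in voc_to_coco_boxes(SAMPLE_XML):
    print(label, box)
```

Keeping the label as a string leaves the mapping to numeric COCO category ids as a separate step, since those ids depend on the category list of the target file.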

For all annotated images, the paired stereo images (from the opposite camera) have been included where available to enable stereo vision applications. The dataset spans a range of fish densities (1-59 fish per image), substrate types (light vs. dark), and altitudes (2.0-10.0 m above the seafloor), which makes it suitable for training and evaluating object detection models under variable real-world conditions.
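
The per-image density range can be checked against a COCO annotation file. Below is a minimal sketch, assuming the standard COCO keys (images, annotations, categories, image_id, category_id) and that the fish class is named "orange roughy". Whether the quoted range also counts "orange roughy edge" boxes is not stated, so the category names are a parameter. The helper name fish_per_image and the toy records are illustrative, not taken from the dataset.

```python
from collections import Counter


def fish_per_image(coco, fish_names=("orange roughy",)):
    """Count annotations of the given categories for every image id."""
    fish_ids = {c["id"] for c in coco["categories"] if c["name"] in fish_names}
    # Start every image at zero so images without fish are still reported.
    counts = Counter({img["id"]: 0 for img in coco["images"]})
    for ann in coco["annotations"]:
        if ann["category_id"] in fish_ids:
            counts[ann["image_id"]] += 1
    return dict(counts)


# Toy COCO-style dictionary (hypothetical values).
toy = {
    "images": [{"id": 1}, {"id": 2}],
    "categories": [
        {"id": 1, "name": "orange roughy"},
        {"id": 2, "name": "eel"},
    ],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 2},
        {"image_id": 2, "category_id": 2},
    ],
}

density = fish_per_image(toy)
print("min:", min(density.values()), "max:", max(density.values()))
```

With the real file, the same call would take the dictionary returned by json.load on the annotation JSON.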

The final standardised COCO dataset contains 1051 annotated port-side images, 849 paired images (without annotations), and 14414 annotations across 17 categories. The annotations per category are: orange roughy (9887), orange roughy edge (2928), mollusc (453), cnidaria (359), misc fish (337), sea anemone (136), sea star (105), sea feather (100), sea urchin (45), coral (22), eel (15), oreo (10), brittle star (8), whiptail (4), chimera (2), siphonophore (2), and shark (1).
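
Because the distribution is dominated by the two orange roughy classes, it can help to look at each category's share of the annotations. The short sketch below recomputes the total and the shares from the counts quoted in this record; the variable names are illustrative.

```python
# Annotation counts per category, copied from the record above.
CATEGORY_COUNTS = {
    "orange roughy": 9887,
    "orange roughy edge": 2928,
    "mollusc": 453,
    "cnidaria": 359,
    "misc fish": 337,
    "sea anemone": 136,
    "sea star": 105,
    "sea feather": 100,
    "sea urchin": 45,
    "coral": 22,
    "eel": 15,
    "oreo": 10,
    "brittle star": 8,
    "whiptail": 4,
    "chimera": 2,
    "siphonophore": 2,
    "shark": 1,
}

total = sum(CATEGORY_COUNTS.values())
shares = {name: count / total for name, count in CATEGORY_COUNTS.items()}
roughy_share = shares["orange roughy"] + shares["orange roughy edge"]

print(f"categories: {len(CATEGORY_COUNTS)}, annotations: {total}")
for name, share in shares.items():
    print(f"{name:>20}: {share:7.2%}")
print(f"orange roughy classes combined: {roughy_share:.1%}")
```

The per-category counts sum to the stated 14414 annotations, and the two orange roughy classes together account for roughly 89% of them.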

Available: 2025-04-07

Data time period: 2019-07-11 to 2019-07-18

This dataset is part of a larger collection

Spatial coverage (WGS84; polygon as longitude,latitude pairs): 148.77067,-41.21198 148.77067,-41.51313 148.72807,-41.51313 148.72807,-41.21198 148.77067,-41.21198

Centre point (longitude,latitude): 148.74936666667,-41.362558333333