Brief description
This dataset contains cloud free, low tide composite satellite images for the tropical Australia region based on 10 m resolution Sentinel 2 imagery from 2018 – 2023. This image collection was created as part of the NESP MaC 3.17 project and is intended to allow mapping of the reef features in tropical Australia. This collection contains composite imagery for 200 Sentinel 2 tiles around the tropical Australian coast. This dataset uses two styles: 1. a true colour contrast and colour enhancement style (TrueColour) using the bands B2 (blue), B3 (green), and B4 (red) 2. a near infrared false colour style (Shallow) using the bands B5 (red edge), B8 (near infrared), and B12 (short wave infrared). These styles are useful for identifying shallow features along the coastline. The Shallow false colour styling is optimised for viewing the first 3 m of the water column, providing an indication of water depth. This is because the different far red and near infrared bands used in this styling have limited penetration of the water column. In clear waters the maximum penetrations of each of the bands is 3-5 m for B5, 0.5 - 1 m for B8 and < 0.05 m for B12. As a result, the image changes in colour with the depth of the water with the following colours indicating the following different depths: - White, brown, bright green, red, light blue: dry land - Grey brown: damp intertidal sediment - Turquoise: 0.05 - 0.5 m of water - Blue: 0.5 - 3 m of water - Black: Deeper than 3 m In very turbid areas the visible limit will be slightly reduced. Change log: This dataset will be progressively improved and made available for download. These additions will be noted in this change log. 2024-07-24 - Add tiles for the Great Barrier Reef 2024-05-22 - Initial release for low-tide composites using 30th percentile (Git tag: "low_tide_composites_v1") Methods: The satellite image composites were created by combining multiple Sentinel 2 images using the Google Earth Engine. The core algorithm was: 1. For each Sentinel 2 tile filter the "COPERNICUS/S2_HARMONIZED" image collection by - tile ID - maximum cloud cover 0.1% - date between '2018-01-01' and '2023-12-31' - asset_size > 100000000 (remove small fragments of tiles) 2. Remove high sun-glint images (see "High sun-glint image detection" for more information). 3. Split images by "SENSING_ORBIT_NUMBER" (see "Using SENSING_ORBIT_NUMBER for a more balanced composite" for more information). 4. Iterate over all images in the split collections to predict the tide elevation for each image from the image timestamp (see "Tide prediction" for more information). 5. Remove images where tide elevation is above mean sea level to make sure no high tide images are included. 6. Select the 10 images with the lowest tide elevation. 7. Combine SENSING_ORBIT_NUMBER collections into one image collection. 8. Remove sun-glint (true colour only) and apply atmospheric correction on each image (see "Sun-glint removal and atmospheric correction" for more information). 9. Duplicate image collection to first create a composite image without cloud masking and using the 30th percentile of the images in the collection (i.e. for each pixel the 30th percentile value of all images is used). 10. Apply cloud masking to all images in the original image collection (see "Cloud Masking" for more information) and create a composite by using the 30th percentile of the images in the collection (i.e. for each pixel the 30th percentile value of all images is used). 11. Combine the two composite images (no cloud mask composite and cloud mask composite). This solves the problem of some coral cays and islands being misinterpreted as clouds and therefore creating holes in the composite image. These holes are "plugged" with the underlying composite without cloud masking. (Lawrey et al. 2022) 12. The final composite was exported as cloud optimized 8 bit GeoTIFF Note: The following tiles were generated with different settings as they did not have enough images to create a composite with the standard settings: - 51KWA: no high sun-glint filter - 54LXP: maximum cloud cover set to 1% - 54LXP: maximum cloud cover set to 1% - 54LYK: maximum cloud cover set to 2% - 54LYM: maximum cloud cover set to 5% - 54LYN: maximum cloud cover set to 1% - 54LYQ: maximum cloud cover set to 5% - 54LYP: maximum cloud cover set to 1% - 54LZL: maximum cloud cover set to 1% - 54LZM: maximum cloud cover set to 1% - 54LZN: maximum cloud cover set to 1% - 54LZQ: maximum cloud cover set to 5% - 54LZP: maximum cloud cover set to 1% - 55LBD: maximum cloud cover set to 2% - 55LBE: maximum cloud cover set to 1% - 55LCC: maximum cloud cover set to 5% - 55LCD: maximum cloud cover set to 1% High sun-glint image detection: Images with high sun-glint can lead to lower quality composite images. To determine high sun-glint images, a mask is created for all pixels above a high reflectance threshold for the near-infrared and short-wave infrared bands. Then the proportion of this is calculated and compared against a sun-glint threshold. If the image exceeds this threshold, it is filtered out of the image collection. As we are only interested in the sun-glint on water pixels, a water mask is created using NDWI before creating the sun-glint mask. Sun-glint removal and atmospheric correction: Sun-glint was removed from the images using the infrared B8 band to estimate the reflection off the water from the sun-glint. B8 penetrates water less than 0.5 m and so in water areas it only detects reflections off the surface of the water. The sun-glint detected by B8 correlates very highly with the sun-glint experienced by the visible channels (B2, B3 and B4) and so the sun-glint in these channels can be removed by subtracting B8 from these channels. Eric Lawrey developed this algorithm by fine tuning the value of the scaling between the B8 channel and each individual visible channel (B2, B3 and B4) so that the maximum level of sun-glint would be removed. This work was based on a representative set of images, trying to determine a set of values that represent a good compromise across different water surface conditions. This algorithm is an adjustment of the algorithm already used in Lawrey et al. 2022 Tide prediction: To determine the tide elevation in a specific satellite image, we used a tide prediction model to predict the tide elevation for the image timestamp. After investigating and comparing a number of models, it was decided to use the empirical ocean tide model EOT20 (Hart-Davis et al., 2021). The model data can be freely accessed at https://doi.org/10.17882/79489 and works with the Python library pyTMD (https://github.com/tsutterley/pyTMD). In our comparison we found this model was able to predict accurately the tide elevation across multiple points along the study coastline when compared to historic Bureau of Meteorolgy and AusTide data. To determine the tide elevation of the satellite images we manually created a point dataset where we placed a central point on the water for each Sentinel tile in the study area . We used these points as centroids in the ocean models and calculated the tide elevation from the image timestamp. Using "SENSING_ORBIT_NUMBER" for a more balanced composite: Some of the Sentinel 2 tiles are made up of different sections depending on the "SENSING_ORBIT_NUMBER". For example, a tile could have a small triangle on the left side and a bigger section on the right side. If we filter an image collection and use a subset to create a composite, we could end up with a high number of images for one section (e.g. the left side triangle) and only few images for the other section(s). This would result in a composite image with a balanced section and other sections with a very low input. To avoid this issue, the initial unfiltered image collection is divided into multiple image collections by using the image property "SENSING_ORBIT_NUMBER". The filtering and limiting (max number of images in collection) is then performed on each "SENSING_ORBIT_NUMBER" image collection and finally, they are combined back into one image collection to generate the final composite. Cloud Masking: Each image was processed to mask out clouds and their shadows before creating the composite image. The cloud masking uses the COPERNICUS/S2_CLOUD_PROBABILITY dataset developed by SentinelHub (Google, n.d.; Zupanc, 2017). The mask includes the cloud areas, plus a mask to remove cloud shadows. The cloud shadows were estimated by projecting the cloud mask in the direction opposite the angle to the sun. The shadow distance was estimated in two parts. A low cloud mask was created based on the assumption that small clouds have a small shadow distance. These were detected using a 35% cloud probability threshold. These were projected over 400 m, followed by a 150 m buffer to expand the final mask. A high cloud mask was created to cover longer shadows created by taller, larger clouds. These clouds were detected based on an 80% cloud probability threshold, followed by an erosion and dilation of 300 m to remove small clouds. These were then projected over a 1.5 km distance followed by a 300 m buffer. The parameters for the cloud masking (probability threshold, projection distance and buffer radius) were determined through trial and error on a small number of scenes. As such there are probably significant potential improvements that could be made to this algorithm. Erosion, dilation and buffer operations were performed at a lower image resolution than the native satellite image resolution to improve the computational speed. The resolution of these operations was adjusted so that they were performed with approximately a 4 pixel resolution during these operations. This made the cloud mask significantly more spatially coarse than the 10 m Sentinel imagery. This resolution was chosen as a trade-off between the coarseness of the mask verse the processing time for these operations. With 4-pixel filter resolutions these operations were still using over 90% of the total processing resulting in each image taking approximately 10 min to compute on the Google Earth Engine. (Lawrey et al. 2022) Format: GeoTiff - LZW compressed, 8 bit channels, 0 as NoData, Imagery as values 1 - 255. Internal tiling and overviews. Average size: 12500 x 11300 pixels and 300 MB per image. The images in this dataset are all named using a naming convention. An example file name is `AU_AIMS_MARB-S2-comp_low-tide_p30_TrueColour_51KUA.tif`. The name is made up from: - Dataset name (`AU_AIMS_MARB-S2-comp`) - Tide elevation, here "low-tide" - An algorithm descriptor (`p30` for 30th percentile), - Colour and contrast enhancement applied (`TrueColour` or `NearInfraredFalseColour`), - Sentinel 2 tile ID (example: `51KUA`), References: Google (n.d.) Sentinel-2: Cloud Probability. Earth Engine Data Catalog. Accessed 10 April 2021 from https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_CLOUD_PROBABILITY Zupanc, A., (2017) Improving Cloud Detection with Machine Learning. Medium. Accessed 10 April 2021 from https://medium.com/sentinel-hub/improving-cloud-detection-with-machine-learning-c09dc5d7cf13 Lawrey, E., & Hammerton, M. (2022). Coral Sea features satellite imagery and raw depth contours (Sentinel 2 and Landsat 8) 2015 – 2021 (AIMS) [Data set]. eAtlas. https://doi.org/10.26274/NH77-ZW79 Hart-Davis, M., Piccioni, G., Dettmering, D., Schwatke, C., Passaro, M., and Seitz, F.: EOT20 – A global Empirical Ocean Tide model from multi-mission satellite altimetry, SEANOE [data set], https://doi.org/10.17882/79489, 2021. Data Location: This dataset is filed in the eAtlas enduring data repository at: data\custodian\2023-2026-NESP-MaC-3\3.17_Northern-Aus-reef-mapping The source code is available on GitHub: https://github.com/eatlas/AU_NESP-MaC-3-17_AIMS_S2-compNotes
CreditTeam members: Marc Hammerton (AIMS), Eric Lawrey (AIMS)
National Environmental Science Program (NESP) Marine and Coastal Hub
Department of Climate Change, Energy, the Environment and Water (DCCEEW), Australian Government
In addition to NESP (DCCEEW) funding, this project is matched by an equivalent amount of in-kind support and co-investment from project partners and collaborators.
This image collection is intended to allow mapping of the reef and island features of tropical Australia. This version of the imagery was developed to facilitate exploration of what could be seen and mapped from the imagery.
Data time period: 2018-01-01 to 2023-12-31
User Contributed Tags
Login to tag this record with meaningful keywords to make it easier to discover
(True colour imagery data 200 GeoTiff and preview images [68.5 GB total])
(Near infrared false colour imagery data 200 GeoTiff and preview images [49.5 GB total])
(Source code - Python Google Earth Engine (GitHub))
uri :
https://github.com/eatlas/AU_NESP-MaC-3-17_AIMS_S2-comp
global : 58f3a091-2463-4963-a908-2a5505e2baf9
- DOI : 10.26274/2BFV-E921
- global : ec1db691-3729-4635-a01c-378263e539b6