Data

Spectral Centroid Images for Multi-class Human Action Analysis: A Benchmark Dataset

The University of Western Australia
Shaikh, Muhammad Bilal; Chai, Douglas; Islam, Syed Mohammed Shamsul; Akhtar, Naveed
DOI: 10.17632/yfvv3crnpy.1
Publisher: Mendeley Data
Published: 2023
Type: Dataset
Language: English

Access:

Open

Full description

This dataset contains a collection of spectral centroid images representing various human actions. Spectral centroid images are time-frequency representations of audio signals that capture the distribution of frequency components over time. Here, the audio signals correspond to different human actions, such as walking, running, jumping, and dancing. The images were generated from the short-time Fourier transform (STFT) of the audio signals, with each image representing a segment of a signal. The dataset is designed for tasks such as human action recognition, classification, segmentation, and detection, and can be used to train and evaluate machine learning models that analyse human actions from audio. It is intended for researchers and practitioners in signal processing, computer vision, and machine learning who are developing algorithms for audio-based human action analysis. Each spectral centroid image is annotated with a label indicating the type of human action it represents.
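The dataset's exact generation pipeline is not documented on this page; as a sketch of the underlying idea, the spectral centroid of each STFT frame is the magnitude-weighted mean frequency of that frame. A minimal NumPy version is below (the frame length, hop size, and Hann window are illustrative assumptions, not the dataset's published parameters):

```python
import numpy as np

def spectral_centroid(signal, sr, frame_len=1024, hop=512):
    """Compute the spectral centroid of each STFT frame of an audio signal.

    For frame t with magnitude spectrum |X[t, k]| and bin frequencies f[k]:
        C[t] = sum_k f[k] * |X[t, k]| / sum_k |X[t, k]|
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroids = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        mag = np.abs(np.fft.rfft(frame))
        total = mag.sum()
        # Guard against silent frames to avoid division by zero.
        centroids.append((freqs * mag).sum() / total if total > 0 else 0.0)
    return np.array(centroids)

# Sanity check: a pure 440 Hz tone should have a centroid near 440 Hz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
c = spectral_centroid(tone, sr)
```

Plotting the resulting centroid trajectory over time (or rendering it as an image, as this dataset does) turns a one-dimensional audio signal into a two-dimensional representation that standard image-based action classifiers can consume.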

Notes

External Organisations
Edith Cowan University
Associated Persons
Douglas Chai (Contributor); Muhammad Bilal Shaikh (Creator)

Issued: 2023-05-22

This dataset is part of a larger collection



Other Information
MAiVAR: Multimodal Audio-Image and Video Action Recognizer

URL: http://research-repository.uwa.edu.au/en/publications/320adc19-a686-4612-ac40-2216742198bc

Conference paper

Identifiers

DOI: 10.17632/yfvv3crnpy.1