Data

MFCCs for Multi-class Human Action Analysis: A Benchmark Dataset

The University of Western Australia
Shaikh, Muhammad Bilal; Chai, Douglas; Islam, Syed Mohammed Shamsul; Akhtar, Naveed

Access: Open

Full description

This dataset contains a diverse collection of Mel Frequency Cepstral Coefficients (MFCCs) extracted from audio recordings of various human actions. MFCCs are compact time-frequency representations that capture the short-term power spectrum of an audio signal after it has been passed through a Mel-scale filter bank, which mimics human auditory perception. Here, the audio signals correspond to different human actions, including walking, running, jumping, and dancing. Each MFCC matrix is computed through a standard sequence of signal-processing stages: a Fourier transform, mapping onto the Mel scale through a filter bank, logarithmic compression, and a Discrete Cosine Transform. Each MFCC representation covers one segment of the corresponding audio signal. The dataset is designed for tasks such as human action recognition, classification, segmentation, and detection, and supports the training and evaluation of machine learning models that interpret human actions from audio. It is aimed at researchers and practitioners in signal processing, computer vision, and machine learning who build algorithms for audio-based human action analysis. Importantly, each MFCC is annotated with a label specifying the type of human action it represents, enabling the supervised learning workflows needed to develop and evaluate predictive models.
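For illustration, the extraction pipeline described above can be reproduced in a few lines of Python. The following is a minimal sketch assuming the librosa and SciPy libraries; the file name and the parameter choices (FFT size, number of Mel bands, number of coefficients) are placeholders, not values documented for this dataset.

import librosa
import numpy as np
import scipy.fftpack

# Load one clip; "walking_001.wav" is a hypothetical file name.
y, sr = librosa.load("walking_001.wav", sr=None)

# 1. Short-time Fourier transform -> power spectrum.
S = np.abs(librosa.stft(y, n_fft=2048, hop_length=512)) ** 2

# 2. Mel-scale filter bank, mimicking human auditory perception.
mel_fb = librosa.filters.mel(sr=sr, n_fft=2048, n_mels=40)
mel_spec = mel_fb @ S

# 3. Logarithmic compression, then a Discrete Cosine Transform;
#    keep the first 13 cepstral coefficients.
log_mel = np.log(mel_spec + 1e-10)
mfcc = scipy.fftpack.dct(log_mel, axis=0, norm="ortho")[:13]

# Equivalent one-liner: librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfcc.shape)  # (13, number_of_frames)

Because every MFCC matrix carries an action label, the data drops directly into standard supervised pipelines. A hypothetical scikit-learn example, with random placeholders standing in for the real features and labels:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder features: one flattened 13 x 100 MFCC matrix per clip.
X = np.random.rand(200, 13 * 100)
y = np.random.choice(["walking", "running", "jumping", "dancing"], 200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))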

Notes

External Organisations
Edith Cowan University
Associated Persons
Douglas Chai (Contributor); Syed Mohammed Shamsul Islam (Contributor); Muhammad Bilal Shaikh (Creator)

Issued: 2023-07-26

This dataset is part of a larger collection

Subjects

Multimodality; Benchmarking; Computer Vision Representation; Action Recognition; Image Analysis


Other Information
MAiVAR: Multimodal Audio-Image and Video Action Recognizer

URL: http://research-repository.uwa.edu.au/en/publications/320adc19-a686-4612-ac40-2216742198bc

Conference paper

Identifiers

DOI: 10.17632/6ng2kgvnwk.1