Full description
The BenAV dataset contains a lexicon of 50 words from 128 speakers (107 male and 21 female), totalling 26,300 utterances. The average number of speakers for each word is 18 (max 20, min 12, standard deviation 1.826). The total duration of the dataset is 7.3 hours. This is the first Bengali audio-visual dataset that can be used for a range of research tasks, including acoustic speech recognition and audio-visual speech recognition.
Notes
External Organisations
Chittagong University of Engineering and Technology (CUET)
Associated Persons
Ashish Pondit (Creator); Muhammad Eshaque Ali Rukon (Creator); Anik Das (Creator)
Created: 2021
Issued: 2021-03-10
Other Information
BenAV: A Bengali audio-visual corpus for visual speech recognition
url: http://researchoutput.csu.edu.au/en/publications/76c343ad-5f12-4ed7-8d9b-5b8ccf770a3f
Conference paper
Identifiers
- global : 59a9e2de-7efa-4037-830c-b227ffb265d3