Data

Foreign Accented Australian English

Western Sydney University
Estival, Dominique
Publication year: 2023

Subjects: Accented Australian English (Arabic accent); Language; Audio recordings with transcriptions; Phonemic lexicon; The MARCS Institute; Audio processing; Computer vision and multimedia computation; Information and computing sciences; Machine learning not elsewhere classified; Machine learning; Psycholinguistics (incl. speech production and comprehension); Cognitive and computational psychology; Psychology; Emerging defence technologies; Defence; Command, control and communications; Expanding knowledge in psychology; Expanding knowledge

Type: dataset

Language: English

Licence & Rights:

Copyright Western Sydney University

Access:

Conditions apply

Conditional

Full description

Foreign accent is a distinct attribute of second language (L2) speech production. Foreign-accented speech is characterized by deviations from target-language pronunciation caused by the influence of the learner's native language (L1) (Best, 1995; Best & Tyler, 2007; Flege, 1995), specifically by differences in and contact between L1 and L2 phonology and phonotactics (i.e., consonant and vowel acoustics and articulation, syllable structure), as well as prosody (i.e., intonation, rhythm, stress, pitch). The degree or strength of a foreign accent varies among L2 learners and depends on a number of factors, such as the age of L2 acquisition (Piske, MacKay, & Flege, 2001), language experience (i.e., length of residence in an L2-speaking country; Bohn & Flege, 1990), amount of use (Flege, Frieda, & Nozawa, 1997), and motivation and language learning aptitude (Piske et al., 2001).

Automatic speech recognition (ASR) systems are predominantly trained on native speech data. Native speakers have shown flexibility in adapting to L2 speech despite variation in L2 speakers' accents, proficiency, and fluency, but automatic recognition of foreign-accented speech degrades considerably in comparison with recognition of native speech (Derwing, Munro, & Carbonaro, 2000). Thus, the key challenges in ASR for accented speech are to ensure fast model adaptation on limited (and often homogeneous) speech data and to facilitate recognition of unseen (untrained) accents (He & Zhao, 2002).

Standard Australian English (AusE) is a distinct regional variety of the English language (Cox & Palethorpe, 2007). To enable accurate ASR of AusE speech, models need to be specifically trained on AusE speech data, in addition to American English and British English data (Chengalvarayan, 2001).

This dataset contains:
  • Audio data for 226 speakers of Australian English with an Arabic accent (150 males and 76 females).
  • Demographic information for all the speakers.
  • Transcription of the audio data.
  • Lexicon extracted from the data.

Comprehensive documentation is publicly available. This dataset contains sensitive information. To discuss the data, please contact d.estival@westernsydney.edu.au (ORCID: 0000-0002-6178-3825).

Created: 2023-09-14

Data time period: December 2018 to 31 March 2019

This dataset is part of a larger collection

Spatial coverage (polygon, longitude,latitude): 150.88438,-33.97278 150.88438,-33.90613 151.06286,-33.90613 151.06286,-33.97278 150.88438,-33.97278

Centre point: 150.973623,-33.939455

Locations: Bankstown, Liverpool
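
The coordinate pairs above can be read as a closed rectangle in longitude,latitude order, with the single pair being its centre. The following is a minimal sketch that checks this reading; the function name `bounding_box_centre` and the variable names are illustrative assumptions, not part of the record or of any catalogue API.

```python
# Corner coordinates copied from the record's spatial coverage,
# interpreted (as an assumption) as (longitude, latitude) pairs.
polygon = [
    (150.88438, -33.97278),
    (150.88438, -33.90613),
    (151.06286, -33.90613),
    (151.06286, -33.97278),
    (150.88438, -33.97278),  # closing point repeats the first corner
]


def bounding_box_centre(points):
    """Return the midpoint of the axis-aligned box enclosing the points."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return ((min(lons) + max(lons)) / 2, (min(lats) + max(lats)) / 2)


centre = bounding_box_centre(polygon)
print(f"{centre[0]:.5f}, {centre[1]:.5f}")
```

To within the rounding of the listed corners, the result agrees with the centre point given in the record (150.973623,-33.939455).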

Identifiers
  • DOI : 10.26183/WEVE-ZG60
  • Local : research-data.westernsydney.edu.au/published/bc102c004e0a11ee8c0aab8bb9294302