Spoken Language

Linguistics: Speech Accent Archive

Great source!

Everyone who speaks a language speaks it with an accent. A particular accent essentially reflects a person's linguistic background. When people listen to someone who speaks with an accent different from their own, they notice the difference, and they may even make biased social judgments about the speaker.

The Speech Accent Archive is established to uniformly exhibit a large set of speech accents from a variety of language backgrounds.

Speech Accent Archive

Modern Times Corpus

In 2012 and 2013, at the Hamburg Center for Language Corpora (HZSK), I compiled the Hamburg Modern Times Corpus (HaMoTiC). It consists of transcribed audio recordings of learners of German at different proficiency levels who renarrate a few scenes from the silent film “Modern Times” (USA 1936, Charles Chaplin). The main objective was to create a linguistic resource that is both based on and comparable to previous Modern Times corpora (esp. Perdue 1993), and that makes use of the tools and methods for transcription, annotation and analysis of spoken language corpora that were implemented at the HZSK, in order to demonstrate their functionality (EXMARaLDA). In terms of content, the Hamburg Map Task Corpus (HAMATAC) and HaMoTiC complement each other with respect to the authenticity and controllability of learner language. See more details on HaMoTiC at the Virtual Language Observatory by CLARIN.

Image: „Charlie Chaplin“ by P.D Jankens - Fred Chess. Wikimedia Commons.

IBM: Big Data, Speech Processing and Machine Translation

For a machine to truly process speech data, it needs cognitive computing: a system whose architecture imitates how the human brain understands information. IBM Watson’s ability to understand natural language is just the first piece of a complex cognitive computing puzzle. But as cognitive computing is applied to Big Data, it will also revolutionize speech recognition and speech translation.

IBM Research: Dimitri Kanevsky Translating Big Data