
Speech and Language Processing Technical Committee Newsletter

November 2013

Welcome to the Winter 2013-2014 edition of the IEEE Speech and Language Processing Technical Committee's Newsletter! This issue of the newsletter includes 7 articles and announcements from 25 contributors, including our own staff reporters and editors. Thank you all for your contributions! This issue includes news about IEEE journals and recent workshops, SLTC call for nominations, and individual contributions.

We believe the newsletter is an ideal forum for updates, reports, announcements and editorials which don't fit well with traditional journals. We welcome your contributions, as well as calls for papers, job announcements, comments and suggestions. You can submit job postings here, and reach us at speechnewseds [at] listserv (dot) ieee [dot] org.

To subscribe to the Newsletter, send an email with the command "subscribe speechnewsdist" in the message body to listserv [at] listserv (dot) ieee [dot] org.

Dilek Hakkani-Tür, Editor-in-chief
William Campbell, Editor
Haizhou Li, Editor
Patrick Nguyen, Editor

From the SLTC and IEEE

From the IEEE SLTC chair

Douglas O'Shaughnessy

CFPs, Jobs, and Announcements

Calls for papers, proposals, and participation

Edited by William Campbell

Job advertisements

Edited by William Campbell

The SLaTE 2013 Workshop

Pierre Badin, Thomas Hueber, Gérard Bailly, Martin Russell, Helmer Strik

SLaTE 2013 was the 5th workshop organised by the ISCA Special Interest Group on Speech and Language Technology for Education. It took place from 30th August to 1st September 2013 in Grenoble, France, as a satellite workshop of Interspeech 2013. The workshop was attended by 68 participants from 20 countries. Thirty-eight submitted papers and 14 demonstrations were presented in oral and poster sessions.

The INTERSPEECH 2013 Computational Paralinguistics Challenge - A Brief Review

Björn Schuller, Stefan Steidl, Anton Batliner, Alessandro Vinciarelli, Klaus Scherer, Fabien Ringeval, Mohamed Chetouani

The INTERSPEECH 2013 Computational Paralinguistics Challenge was held in conjunction with INTERSPEECH 2013 in Lyon, France, 25-29 August 2013. This Challenge was the fifth in a series held at INTERSPEECH since 2009 as an open evaluation of speech-based speaker state and trait recognition systems. Four tasks were addressed, namely social signals (such as laughter), conflict, emotion, and autism. Sixty-five teams participated, and participants were able to exceed the baseline provided by the organisers. A new reference feature set extracted with the openSMILE feature extractor and the four corpora used are publicly available at the repository of the series.

An Overview of the Base Period of the Babel Program

Tara N. Sainath, Brian Kingsbury, Florian Metze, Nelson Morgan, Stavros Tsakalidis

The goal of the Babel program is to rapidly develop speech recognition capability for keyword search in previously unstudied languages, working with speech recorded in a variety of conditions with limited amounts of transcription. Several issues and observations frame the challenges driving the Babel Program. The speech recognition community has spent years improving the performance of English automatic speech recognition (ASR) systems. However, applying techniques commonly used for English ASR to other languages has often resulted in huge performance gaps for those other languages. In addition, there is an increasing number of languages for which there is a vital need for speech recognition technology but few existing training resources [1]. It is easy to envision a situation where a large amount of recorded data in some language contains important information, but there are very few people able to analyze that language and no existing speech recognition technology for it. Having keyword search in that language to pick out important phrases would be extremely beneficial.

MSR Identity Toolbox v1.0: A MATLAB Toolbox for Speaker-Recognition Research

Seyed Omid Sadjadi, Malcolm Slaney, and Larry Heck

We are happy to announce the release of the MSR Identity Toolbox: A MATLAB toolbox for speaker-recognition research. This toolbox contains a collection of MATLAB tools and routines that can be used for research and development in speaker recognition. It provides researchers with a test bed for developing new front-end and back-end techniques, allowing replicable evaluation of new advancements. It will also help newcomers in the field by lowering the "barrier to entry," enabling them to quickly build baseline systems for their experiments. Although the focus of this toolbox is on speaker recognition, it can also be used for other speech-related applications such as language, dialect, and accent identification. Additionally, it provides many of the functionalities available in other open-source speaker recognition toolkits (e.g., ALIZE [1]) but with a simpler design, which makes it easier for users to understand and modify the algorithms.

The REAL Challenge

Maxine Eskenazi

The Dialog Research Center at Carnegie Mellon (DialRC) is organizing the REAL Challenge. The goal of the REAL Challenge is to build speech systems that are used regularly by real users to accomplish real tasks. These systems will give the speech and spoken dialog communities steady streams of research data as well as platforms they can use to carry out studies. The Challenge will engage both seasoned researchers and high school and undergraduate students in an effort to find the next great speech applications.

SPASR workshop brings together speech production and its use in speech technologies

Karen Livescu

The Workshop on Speech Production in Automatic Speech Recognition (SPASR) was recently held as a satellite workshop of Interspeech 2013 in Lyon on August 30.

Speaker Identification: Screaming, Stress and Non-Neutral Speech, is there speaker content?

John H.L. Hansen, Navid Shokouhi

The field of speaker recognition has evolved significantly over the past twenty years, with great efforts worldwide from many groups, laboratories, and universities, especially those participating in the biennial U.S. NIST SRE - Speaker Recognition Evaluation [1]. Recently, there has been great interest in the ability to perform effective speaker identification when speech is not produced in "neutral" conditions. Effective speaker recognition requires knowledge and careful signal processing and modeling strategies to address any mismatch that could exist between the training and testing conditions. This article considers some past and recent efforts in speaker recognition, as well as suggested directions, for cases where subjects move from a "neutral" speaking style, through increased vocal effort, and ultimately to pure "screaming". In the United States recently, there has been discussion in the news regarding the ability to accurately perform speaker recognition when the audio stream consists of a subject screaming. Here, we illustrate a probe experiment, preceded by some background on speech under non-neutral conditions.
