IEEE SPEECH TECHNICAL COMMITTEE NEWSLETTER

July 1, 2004

INTRODUCTION:

Welcome to the IEEE Signal Processing Society Speech Technical Committee (STC) newsletter. We hope all recipients of the STC newsletter have had the opportunity to celebrate Canada Day today. As usual, we invite contributions of events, publications, workshops, and career information to the newsletter (rose@ece.mcgill.ca).

STC NEWS:
Minutes of May 2004 STC Meeting (S. Parthasarathy)
The IEEE Speech Technical Committee (M. Rahim)

NEW WORKSHOP ANNOUNCEMENTS:
COST Tutorial Research Workshop on Nonlinear Speech Processing (Isabel Trancoso)

CALL FOR PAPERS:
IEEE SAP Trans. Special Issue: Statistical and Perceptual Audio Processing (Isabel Trancoso)
IEEE SAP Trans. Special Issue: Expressive Speech Synthesis (Isabel Trancoso)
IEEE SAP Trans. Special Issue: Data Mining in Speech, Audio, and Dialog  (Mazin Rahim)

CAREERS:
R&D Positions at Toshiba Research Europe Limited in Cambridge, UK (K. M. Knill)

LINKS TO WORKSHOPS AND CONFERENCES:
Links to conferences and workshops organized by date  (Rick Rose)


Edited Minutes of the STC Meeting

ICASSP 2004, Montreal, Quebec

12:00 PM  Wednesday, May 20

  

Note: The IEEE SPS Speech Technical Committee meets twice each year. The main order of business at the fall meeting is to organize sessions in the speech area for the upcoming ICASSP conference; the spring meeting, usually held at the ICASSP conference itself, covers a wider set of topics. Attendees of both meetings include the STC members and the associate editors of the Transactions on Speech and Audio Processing. This meeting took place at the ICASSP 2004 conference in Montreal and was presided over by the incoming STC Chairman, Mazin Rahim. The following is an edited version of the minutes of this meeting recorded by S. Parthasarathy.


Attending Members and Associate Editors:


Not recorded

Order of Business:

   
1. Initial introduction of each member and SAP associate editor
2. Thanks to outgoing STC Chair Michael Picheny
3. Congratulations to incoming STC Chair Mazin Rahim

Old business:

New Business:

1. The number of nominations is often less than the number of awards; in that case, all nominees receive awards. If the number of nominations is greater than the number of awards, the awards committee votes and the top 4 receive awards.
2. The awards committee consists of 13 members (9 from the board, a chair, the head of the fellow committee, and 2 others). Only 1-2 members will be from speech. To win, you have to impress the non-speech people on the committee as well as the speech folks.
3. There are good and bad Technical Committees (TCs).  Some TCs don't send in any nominations.
4. Area editors can recommend papers for awards. These nominations may or may not go through the TC, but going through the TC is better because it has a higher success rate. Nominations from individuals must go through the TC.
5. Magazine awards are decided by general voting and go through the TCs.
6. Nominations for the awards board are sought from the TCs; the Board of Governors picks 2. EiCs can also nominate.
7. One idea to reduce the mystery in the process is to let more people get involved. This can be done by having a higher turnover on the awards board by reducing the term from 6 years to 3 years.
8. Is speech being discriminated against? There were 10 nominations for the Technical Achievement Award, so speech cannot complain that it did not win. Should there be one award for each of the IEEE SPS journals? That may not be a great idea: should we give awards to papers that are not great just because we have to give one award to every journal?

The IEEE Speech Technical Committee (STC)

Mazin Rahim - STC Chair

The purpose of the Speech Technical Committee is to promote the advancement of technologies and applications in speech and language processing and to enhance interaction between the IEEE Signal Processing Society (SPS) and other similar organizations around the world. The committee coordinates all speech-related activities within the SPS, including the annual paper review and technical sessions for ICASSP, nominations for major Society Awards, IEEE speech-related workshops and symposia, and the STC Newsletter. The committee consists of a Chair, General Members, and Associate Editors (AEs). The Chair is elected by the STC and appointed by the Society Executive Committee to serve a two-year term. General Members are elected annually during the fall to serve a three-year term. Associate Editors are selected by the Editor-in-Chief of the IEEE Transactions on Speech and Audio Processing. There are currently 22 General Members and 20 AEs serving on the STC.

The STC operates through several smaller subcommittees that include among others the Awards nominations, Education, and Workshop and Conference Services. All STC members are current IEEE and SPS members of good standing. STC members play an active role in the technical and professional activities of the SPS. This includes conducting conference reviews as well as being actively involved in the operation of the subcommittees of the STC.

If you wish to be nominated to serve on the STC, please contact the STC Election Committee. If you wish to nominate someone for any of the SPS awards, please contact the STC Award Committee. Information about the STC and its various subcommittees is available at our website at http://ewh.ieee.org/soc/sps/stc/.


TUTORIAL RESEARCH WORKSHOP

Nonlinear Speech Processing: Algorithms and Analysis

September 13-18 2004

Vietri sul Mare (Salerno), ITALY

CALL FOR PARTICIPATION

COST Action 277 (http://www.cordis.lu/cost/src/277_indivpage.htm), jointly with the INTERNATIONAL INSTITUTE FOR ADVANCED SCIENTIFIC STUDIES (IIASS) (www.iiass.it) and the ETTORE MAJORANA FOUNDATION AND CENTER FOR SCIENTIFIC CULTURE (EMFCSC), Erice (TR), Italy (http://www.ccsem.infn.it/), organizes the annual INTERNATIONAL SUMMER SCHOOL "NEURAL NETS E. R. CAIANIELLO", IX COURSE, as a TUTORIAL RESEARCH WORKSHOP on Nonlinear Speech Processing: Algorithms and Analysis, September 13-18, 2004, Vietri sul Mare (Salerno), ITALY.

SUPPORTED BY:

The Management Committee (MC) Members of COST Action 277: Nonlinear Speech Processing.

Additional information is available on the web site:

http://www.iiass.it/school2004/index.htm

Tutorials and contributions are accepted from any speech-related field (signal processing, linguistics, acoustics, etc.). Contributions by lecturers and participants will be published by Springer-Verlag in the Lecture Notes in Computer Science (LNCS) series.

LIST OF SPEAKERS:

1) Professor Amir Hussain, Department of Computing Science, University of Stirling, Stirling FK9 4LA, UK, http://www.cs.stir.ac.uk/~ahu. Title of talk: “Non-linear Adaptive Speech Enhancement Inspired by Early Auditory Processing Modelling”;

2) Professor Gerard Chollet, CNRS URA-820, ENST, Dept. TSI, 46 rue Barrault, 75634 Paris Cedex 13, FR, http://tsi.enst.fr/~chollet. Title of talk: “Phone rate speech compression by indexation of ALISP segments”;

3) Professor Anna Esposito, Department of Psychology, Second University of Naples, Via Vivaldi 43, Caserta, and International Institute for Advanced Scientific Studies, Vietri, Italy. Title of talk: “Text Independent Speech Segmentation Methods”;

4) Professor Marcos Faundez-Zanuy, Escola Universitaria Politecnica de Mataro, Avda. Puig i Cadafalch 101-111, 08303, MATARO (BARCELONA) SPAIN. Title of the talk: “Nonlinear speech processing: Overview and possibilities”;

5) Professor Simon Haykin, Communications Research Laboratory, McMaster University, 1280 Main Street West, Hamilton, Ontario L8S 4K1, Canada. Titles of talks: “Regularized strict interpolation networks: Theory, design, and applications to speech processing” and “The cocktail party problem”;

6) Professor Eric Keller, Laboratoire d'analyse informatique de la parole (LAIP), Section d'informatique et de méthodes mathématiques (IMM), BFSH2 4096, Faculté des lettres, Université de Lausanne, CH-1015 Lausanne, Switzerland. Title of talk: “On Spectral Analyses of Voice Quality”;

7) Professor Gernot Kubin, Signal Processing & Speech Communication Laboratory, and Christian Doppler Laboratory for Nonlinear Signal Processing, Graz University of Technology, Inffeldgasse 16c/4420, A-8010 Graz, Austria, E.U. Title of talk: “Identification of Nonlinear Oscillator Models for Speech Analysis and Synthesis”;

8) Professor Eric Moulines, Laboratoire de Traitement et de Communication de l'Information, Ecole Nationale Supérieure des Télécommunications, 46 rue Barrault, 75634 Paris Cedex 13, FR. Title of talk: “Survey on Non-Linear State Space Models and Particle Sequential Monte-Carlo Method”;

9) Professor Bojan Petek, Faculty of Natural Sciences and Engineering, University of Ljubljana, Snezniska 5, 1000 Ljubljana, Slovenia. Title of talk: “Predictive Connectionist Approach to Speech Processing”;

10) Professor Jose C. Principe, Computational Neuro Engineering Laboratory, EB 451, Bldg #33, University of Florida, Gainesville, FL 32611, USA. Title of talk: “Neural Computing with a Dynamical Perspective”;

11) Professor Jean Rouat, Dépt de Génie Elect. et de Génie Info, Institut des Matériaux et Systèmes Intelligents (IMSI), Université de Sherbrooke, 2500 Boul. de l'Université, Sherbrooke, Québec, Canada, J1K 2R1. Title of talk: “Auditory scene analysis in link with non-linear speech processing and spiking Neural Networks”;

12) Professor Jean Schoentgen, Laboratory of Experimental Phonetics, CP 110, Université Libre de Bruxelles, 50 Av. F.-D. Roosevelt, B-1050 Brussels, Belgium. Title of talk: “Speech Modeling based on Acoustic-to-Articulatory Mapping”.



Call for Papers

The IEEE Transactions on Speech and Audio Processing

Special Issue on Statistical and Perceptual Audio Processing


Current trends in audio analysis are strongly founded either in statistical principles or in approaches influenced by empirically derived, perceptually motivated rules of auditory perception. These two approaches are largely orthogonal, and new ideas that draw upon both perceptual and statistical principles are likely to result in superior performance. However, how the two approaches relate to each other has not been thoroughly explored.

In this special issue we invite researchers to submit papers on original and previously unpublished work on both approaches, and especially on hybrid techniques that combine perceptual and statistical principles, as applied to speech, music and audio analysis. Papers describing relevant research and new concepts are solicited on, but not limited to, the following topics:

SUBMISSION PROCEDURE

Prospective authors should prepare manuscripts according to the Information for Authors as published in any recent issue of the Transactions and as available on the web at http://www.ieee.org/organizations/society/sp/infotsa.html. Note that all rules will apply with regard to submission lengths, mandatory overlength page charges, and color charges.

Manuscripts should be submitted electronically through the online IEEE manuscript submission system at http://sps-ieee.manuscriptcentral.com/.
When selecting a manuscript type, authors must click on “Special Issue of T-SA on Statistical and Perceptual Audio Processing.” Authors should follow the instructions for the IEEE Transactions on Speech and Audio Processing and indicate in the Comments to the Editor-in-Chief that the manuscript is submitted for publication in the Special Issue on Statistical and Perceptual Audio Processing. We require a completed copyright form to be signed and faxed to 1-732-562-8905 at the time of submission. Please indicate the manuscript number on the top of the page.

SCHEDULE

Submission deadline: 31 January 2005
Notification of acceptance: 30 July 2005
Final manuscript due: 1 September 2005
Tentative publication date: January 2006

GUEST EDITORS

Dr. Bhiksha Raj, Mitsubishi Electric Research Labs, Cambridge, MA, bhiksha@merl.com
Dr. Malcolm Slaney, IBM Almaden Research Center, Almaden, CA, malcolm@ieee.org
Dr. Daniel Ellis, Columbia University, New York, NY, dpwe@ee.columbia.edu
Dr. Paris Smaragdis, Mitsubishi Electric Research Labs, Cambridge, MA, paris@merl.com
Dr. Judith Brown, Wellesley College, Visiting Scientist at MIT, brown@media.mit.edu



Special Issue of
The IEEE Transactions on Speech and Audio Processing
Expressive Speech Synthesis

Expressive Speech Synthesis (ESS) is a multidisciplinary research area that addresses one of the most complex problems in speech and language processing. The challenges posed by ESS have been the subject of several collaborative research projects across universities and laboratories around the world. Over the last decade ESS has benefited from advances in speech and language processing as well as from the availability of large conversational-speech databases. These advances have spurred research on the expressiveness of speech and on conveying paralinguistic information including emotion, speaker-state, and speaker-listener relationships. There have also been substantial efforts towards automating database creation and evaluating the quality of speech synthesised for a variety of tasks that require not just the transmission of information, but also the expression of affect.

The purpose of this special issue is to present recent advances in Expressive Speech Synthesis. Original, previously unpublished research is sought in all areas relevant to the field. In particular, submissions on theory and methods for the following areas are encouraged:

 

Submission procedure:

Prospective authors should prepare manuscripts according to the Information for Authors as published in any recent issue of the Transactions and as available on the web at http://www.ieee.org/organizations/society/sp/infotsa.html. Note that all rules will apply with regard to submission lengths, mandatory overlength page charges, and color charges.

Manuscripts should be submitted electronically through the online IEEE manuscript submission system at http://sps-ieee.manuscriptcentral.com/. When selecting a manuscript type, authors must click on "Special Issue of T-SA on Expressive Speech Synthesis." Authors should follow the instructions for the IEEE Transactions on Speech and Audio Processing and indicate in the Comments to the Editor-in-Chief that the manuscript is submitted for publication in the Special Issue on Expressive Speech Synthesis. We require a completed copyright form to be signed and faxed to 1-732-562-8905 at the time of submission. Please indicate the manuscript number on the top of the page.

Schedule:
Submission deadline: 1 June 2005
Notification of acceptance: 1 December 2005
Final manuscript due: 28 February 2006
Tentative publication date: May 2006

Guest Editors:
Dr. Nick Campbell ATR Network Informatics Research Labs, Kyoto, Japan nick@atr.jp
Dr. Wael Hamza IBM T.J. Watson Research Center, Yorktown Heights, USA hamzaw@us.ibm.com
Dr. Harald Hoge SIEMENS AG Central Technology, Germany harald.hoege@siemens.com
Dr. Tao Jianhua Pattern Recognition Laboratory, the Chinese Academy of Sciences jhtao@nlpr.ia.ac.cn
Dr. Gerard Bailly Institut de la Communication Parlee, France bailly@icp.inpg.fr


Call for Papers Special Issue of
The IEEE Transactions on Speech and Audio Processing
Data Mining of Speech, Audio and Dialog

Data mining methods are used to discover patterns and extract potentially useful or interesting information automatically or semi-automatically from data. As a result of the recent advances in machine learning and data mining algorithms, along with the availability of inexpensive storage space and faster processing, data mining has become practical in new areas including speech, audio, and spoken language dialog. Data mining research in these areas is growing rapidly given the influx of speech, audio, and dialog data that are becoming more widely available. Fundamental research in areas of prediction, explanation, learning, and language understanding of speech and audio data is becoming increasingly important in revolutionizing business processes by providing essential sales and marketing information about services, customers, and product offerings. This research is also enabling a new class of learning conversational systems that can infer knowledge and trends automatically from data, analyze and report application performance, and adapt and improve over time with minimal or zero human involvement.

The purpose of this special issue is to present recent advances in Data Mining Research for Speech, Audio, and Spoken Language Dialog.  Original, previously unpublished submissions for the following areas are encouraged:

Guest Editors:
Dr. Mazin Rahim AT&T Research, Florham Park, USA mazin@research.att.com
Dr. Usama M. Fayyad DMX Group, Seattle, USA fayyad@dmxgroup.com
Dr. Roger Moore 20/20 Speech Ltd., Malvern, U.K. r.moore@2020speech.com
Dr. Geoff Zweig IBM Research, Yorktown Heights, USA gzweig@us.ibm.com

Schedule:
Submission deadline: 1 July 2004 (early submission is encouraged)
Notification of acceptance: 1 January 2005
Final manuscript due: 31 March 2005
Tentative publication date: 1 July 2005

Submission procedure:

Prospective authors should follow the regular guidelines of the IEEE Transactions on Speech and Audio Processing for electronic submission via Manuscript Central. Authors must enter the title of the special issue into the field labeled “Please enter any additional keywords related to this submitted manuscript in order for the paper to be properly assigned to a Guest Editor.” In addition, the title of the special issue should be referenced again in the field marked “Comments to Editor-in-Chief” along with any other pertinent information. You are required to provide a properly executed copyright form to be faxed to the IEEE Signal Processing Society Publications Office (via +1 732-562-8905) at the time of submission. An 8-page limit will be enforced on papers published in the special issue and all papers are subject to the published policy for overlength page charges and color charges.


Speech Recognition and Text-to-Speech Synthesis R&D Positions

Would you like to do research on real speech problems and see your work incorporated into products worldwide?

Would you like to work in a small multinational team of dynamic individuals, based in the centre of a beautiful university town with a large amount of high-tech activity?

If the answer to the above is yes, then the embedded speech technology posts we have open at Toshiba Research Europe Limited may interest you. We are looking for a speech recognition engineer to work on statistical language modelling. Knowledge of the state of the art and experience with compact representation of SLMs are desirable.

We are also looking for a TTS engineer to initially focus on the design and development of prosody modules for European/American TTS systems. The successful candidate will have a good knowledge and understanding of TTS systems, particularly prosodic stages. Experience with algorithm design for state-of-the-art systems is preferred in each case.

For both positions, candidates should have a PhD in Electronic Engineering, Computer Science or a related discipline, followed by 1-3 years industrial/post-doctoral experience, and demonstrated achievement in the area of ASR or TTS (or equivalent industrial experience). Strong software skills using C and/or C++, Linux/Unix, and Perl and/or Python are required. Industrial coding experience is preferred. Knowledge of more than one major European language is desirable.

An attractive salary and benefit package will be offered. TREL carries out work in collaboration with Toshiba’s R&D Center in Japan and the University of Cambridge.

Applicants should send a CV, the names and addresses of three referees and a covering letter to: stg-jobs@crl.toshiba.co.uk


Dr K M Knill, Group Leader, Speech Technology Group,
Cambridge Research Laboratory, Toshiba Research Europe Limited
St George House, 1 Guildhall St
Cambridge, CB2 3NH, UK.

Closing date for applications: 21 May 2004 (applications after the closing date will be considered while the post is unfilled)

Links to Upcoming Conferences and Workshops

(Organized by Date)

Odyssey2004 - ISCA Workshop on Speaker and Language Recognition
Toledo, Spain, May 31 - June 1, 2004
http://www.odyssey04.org/

3rd International Conference MESAQIN 2004
Czech Republic, June 10-11, 2004
http://wireless.feld.cvut.cz/mesaqin/

IEEE2004 Workshop on Signal Processing Advances in Wireless Communications
Lisbon Portugal, July 11 - 14, 2004
http://spawc2004.isr.ist.utl.pt

SCI2004 - 8th World Conference on Systemics, Cybernetics, and Informatics
Orlando, Florida, July 18 - 21, 2004
http://www.iisci.org/sci2004


Robust 2004: COST278 Workshop on Robustness Issues in Conversational Interaction
University of East Anglia, Norwich, U.K., August 30 - 31, 2004
http://www.cmp.uea.ac.uk/robust04/

EUSIPCO2004
Vienna, Austria, Sept. 7-10, 2004
http://www.nt.tuwien.ac.at/eusipco2004/


COST277 Tutorial Workshop on Nonlinear Speech Processing - Algorithms and Analysis
Vietri sul Mare (Salerno), Italy, September 13-18, 2004
http://www.iiass.it/school2004/index.htm

MMSP'04 IEEE Workshop on Multimedia Signal Processing
Siena, Italy, September 29 - October 1, 2004
http://mmsp.unisi.it

MLSP'04 IEEE Workshop on Machine Learning for Signal Processing
Sao Luis, Brazil, September 29 - October 1, 2004
http://isp.imm.dtu.dk/mlsp2004

ICSLP2004 - INTERSPEECH 8th Biennial International Conference on Spoken Language Processing
Jeju Island, Korea, October 4-8, 2004
http://www.icslp2004.org

4th International Conference on Chinese Spoken Language Processing
Hong Kong, China, December 15-18, 2004
http://www.se.cuhk.edu.hk/~iscslp/index.html

ICASSP2005
Philadelphia, Pennsylvania, May, 2005
http://www.icassp2005.org/

EUROSPEECH 2005 9th European Conference on Speech Communication and Technology
Lisbon, Portugal, September 4-8, 2005
http://www.interspeech2005.org/
