Introducing VoicePrivacy

The VoicePrivacy initiative is spearheading the effort to develop privacy preservation solutions for speech technology. It aims to gather a new community to define the task and metrics and to benchmark initial solutions using common datasets, protocols and metrics. VoicePrivacy takes the form of a competitive challenge. The challenge is to develop anonymization solutions which suppress personally identifiable information contained within speech signals. At the same time, solutions should preserve linguistic content and speech quality/naturalness. The challenge will conclude with a session/event held in conjunction with Interspeech 2020 at which challenge results will be made publicly available.


Participants are encouraged to subscribe to the VoicePrivacy 2020 mailing list by sending an email to:

with “subscribe 2020” as the subject line. Successful subscriptions are confirmed by return email.

To post messages to the mailing list itself, emails should be addressed to:


Participants are requested to register for the evaluation. Registration should be performed only once per participating team, by sending an email to:

with “VoicePrivacy 2020 registration” as the subject line.

The mail body should include:

(i) the name of the team; (ii) the name of the contact person; (iii) their country; (iv) their status (academic/non-academic).


All deadlines are 23:59, Anywhere on Earth (AoE).

Release of evaluation plan 6th February 2020
Release of training and development data 8th February 2020
Release of evaluation data 15th February 2020
Deadline-1 for participants to submit objective evaluation results 8th May 2020
Interspeech-2020 paper submission deadline 8th May 2020
Anonymized development and evaluation speech data upload 12th May 2020
Submission of system descriptions-1 15th May 2020
Deadline-2 for participants to submit objective evaluation results 16th June 2020
Anonymized development and evaluation speech data upload (for deadline-2) 19th June 2020
Submission of system descriptions-2 23rd June 2020
Announcement of requirements for submission of additional data (optional) 7th July 2020
Submission of additional data (optional) 19th July 2020
Interspeech paper acceptance/rejection notification 24th July 2020
Interspeech paper camera ready 7th August 2020
System description camera ready 10th August 2020
Organizers return subjective evaluation results to participants October 2020
VoicePrivacy special session/event at Interspeech 2020 26th–29th October 2020
Journal special paper issue submission deadline 8th January 2021

Participants may submit to one or both deadlines. Interspeech paper submission is encouraged but optional. All participants will be invited to present their work at the VoicePrivacy session/event.


Several publicly available corpora will be used for the training, development and evaluation of speaker anonymization systems. The development and evaluation subsets are described in detail in the evaluation plan. They comprise subsets of the following corpora:


  1. VoxCeleb-1,2
  2. Librispeech (train-clean-100, train-other-500)
  3. LibriTTS (train-clean-100, train-other-500)


  1. The LibriSpeech subset libri_dev can be downloaded from the server:
  2. The VCTK subset vctk_dev can be downloaded from the server:


We provide software for two different anonymization baselines:

  1. Baseline-1: Anonymization using x-vectors and neural waveform models
  2. Baseline-2: Anonymization using McAdams coefficient
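To give an intuition for Baseline-2, the sketch below illustrates the core idea of McAdams-coefficient anonymization: the angles of the complex LPC poles, which encode formant positions, are warped as φ → φ^α while the pole radii are kept. This is a minimal pure-Python illustration of the principle, not the actual baseline code; the function name and toy pole values are assumptions for the example.

```python
import cmath

def mcadams_shift_poles(poles, alpha=0.8):
    """Warp the angles of complex LPC poles: phi -> phi ** alpha.

    Illustrative sketch of the McAdams-coefficient idea (Baseline-2):
    formant positions encoded in pole angles are shifted, changing the
    perceived speaker identity while keeping the overall spectral
    envelope shape.  Real poles are left unchanged.
    """
    shifted = []
    for p in poles:
        if abs(p.imag) < 1e-12:           # real pole: keep as-is
            shifted.append(p)
            continue
        r = abs(p)                        # keep the pole radius (bandwidth)
        phi = abs(cmath.phase(p))         # pole angle in (0, pi)
        new_phi = phi ** alpha            # McAdams warping of the angle
        sign = 1.0 if p.imag > 0 else -1.0
        shifted.append(cmath.rect(r, sign * new_phi))
    return shifted

# Toy example: one conjugate pole pair at angle 0.3 rad, radius 0.95.
poles = [cmath.rect(0.95, 0.3), cmath.rect(0.95, -0.3)]
warped = mcadams_shift_poles(poles, alpha=0.8)
# With alpha < 1 and phi < 1 rad, the warped angle (0.3 ** 0.8) is larger
# than 0.3, so the corresponding formant moves up in frequency.
```

In the full baseline, the warped poles are converted back to LPC coefficients and the signal is resynthesized frame by frame; the sketch only shows the pole-warping step.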


The following are examples of original and anonymized versions (primary baseline, Baseline-1).

  • LibriSpeech utterances:

  • VCTK utterances:

Evaluation metrics

We propose to use both objective and subjective metrics to assess speaker verification/re-identification ability. In addition to preserving privacy, the privacy-driven transformation should preserve speech intelligibility and naturalness when used in human communication scenarios, and automatic speech recognition (ASR) training and/or testing performance when used in human-machine communication scenarios.

Objective measures

  1. Speaker verifiability:
    • Equal error rate (EER)
    • Log-likelihood-ratio cost function Cllr / minCllr
  2. Speech intelligibility:
    • Word error rate (WER)
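As a rough illustration of the primary privacy metric, the EER is the operating point at which the false acceptance rate equals the false rejection rate of a speaker verification system. The threshold-sweep implementation below is a self-contained sketch, not the challenge's official scoring tool; the score values are made up for the example.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Compute the EER by sweeping a threshold over all observed scores.

    At the EER threshold, the false acceptance rate (non-target trials
    scoring at or above the threshold) equals the false rejection rate
    (target trials scoring below it).  Returns the EER as a fraction.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = (1.0, 1.0)  # (|far - frr|, eer)
    for t in thresholds:
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2)
    return best[1]

# Well-separated scores give an EER near 0; overlapping scores raise it.
# For anonymization, a HIGHER EER means better privacy protection.
tar = [2.0, 2.5, 3.0, 3.5]
non = [-1.0, -0.5, 0.0, 0.5]
print(equal_error_rate(tar, non))  # -> 0.0
```

A successful anonymization system pushes the EER of the attacking verification system toward 50%, i.e. chance-level speaker discrimination.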

Subjective measures

  1. Subjective speaker verifiability
  2. Subjective speaker linkability
  3. Subjective speech intelligibility
  4. Subjective speech naturalness

Submission of results (deadline-2)

Each participant may make up to 5 different submissions. If several systems are submitted, participants should designate one of them as the primary system and mark the rest as contrastive. Only primary systems will be considered in the subjective evaluation.

Submissions consist of three parts:

  1. Results and scores.
  2. Anonymized speech data.
  3. System description.

1. Results and scores.

Deadline: June 16, 2020, 23:59 Anywhere on Earth (AoE)

Submission: a gzipped TAR archive sent as an email attachment to:

The name of the archive file should correspond to the name of the team used in registration.

Archive structure: the TAR archive should include the directories primary, contrastive1, contrastive2, …, where each directory contains the full results directory generated by a run of the evaluation system (in exp/results-<date>-<time>). See an example results directory: results-example.
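As a rough sketch of the packaging step (the team name myteam and the dummy result files are placeholders, not actual evaluation outputs), the archive could be assembled and checked like this:

```shell
# Placeholder team name and dummy result files, for illustration only.
mkdir -p myteam/primary myteam/contrastive1
echo "dummy results" > myteam/primary/results.txt
echo "dummy results" > myteam/contrastive1/results.txt

# Package everything into a single gzipped TAR named after the team.
tar czf myteam.tar.gz myteam

# Inspect the archive contents before sending.
tar tzf myteam.tar.gz
```

Listing the archive before submission is a cheap way to confirm that the top-level directory names match the required primary/contrastive layout.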

Instructions to run evaluation with your anonymization:

2. Anonymized speech data.

Deadline: June 19, 2020, AoE

Submission: a gzipped TAR archive uploaded to the SFTP challenge server. Each registered team will receive an email with a personal login and password for uploading the data. The name of the archive file should correspond to the name of the team used in registration.

Archive structure: the TAR archive should include the directories primary, contrastive1, contrastive2, …, with the same names as used for the “results and scores” submission. Each directory should contain the corresponding anonymized speech data (WAV files, 16 kHz, with the same names as in the original corpus) generated from the evaluation and development datasets. For evaluation, WAV files will be converted to 16-bit signed integer PCM format, and this format is recommended for submission. These data will be used by the challenge organizers to verify the submitted scores, to perform post-evaluation analysis with other metrics, and to run listening tests for subjective evaluation.
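To sanity-check the recommended audio format, a 16 kHz, 16-bit signed PCM mono WAV can be written and verified with the Python standard library alone. The sketch below is illustrative: the sine tone stands in for an actual anonymized utterance, and the file name is a placeholder.

```python
import math
import struct
import wave

def write_pcm16_wav(path, samples, rate=16000):
    """Write integer samples as a mono 16 kHz, 16-bit signed PCM WAV,
    the format recommended for submission."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)        # mono
        w.setsampwidth(2)        # 2 bytes per sample = 16-bit signed PCM
        w.setframerate(rate)     # 16 kHz sampling rate
        frames = b"".join(
            struct.pack("<h", max(-32768, min(32767, int(s))))
            for s in samples
        )
        w.writeframes(frames)

# Stand-in signal: 0.1 s of a 440 Hz tone instead of real anonymized speech.
tone = [int(10000 * math.sin(2 * math.pi * 440 * n / 16000))
        for n in range(1600)]
write_pcm16_wav("example.wav", tone)
```

Reopening the file with wave.open and checking getsampwidth() == 2 and getframerate() == 16000 confirms the submission format before packaging.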

deadline-2/<TEAM NAME USED IN REGISTRATION>/primary/

3. System description.

For challenge participants, there are two types of paper submission: an “Interspeech paper” and/or a “system description”.

  • Interspeech paper:

Papers should be submitted through the Interspeech-2020 submission system, choosing the topic “13.8 Voice Privacy Challenge”. For authors who submitted a paper to Interspeech and whose final system has not undergone major changes w.r.t. that paper, the system description is considered identical to that paper.

Deadline: May 8, 2020, AoE. Updates to the PDF and media files, the title, the abstract and LREC resources will be permitted until May 15, 2020, 23:59 AoE. Updates to authors and topics cannot be made.

  • System description:

Teams that submit results by deadline-1 or deadline-2 but do not submit an Interspeech paper related to their challenge entry, or whose system has undergone major changes w.r.t. that paper, should email their system description to:

For the system description, the Interspeech-2020 paper template should be used, but the rules on length are less strict: the description may be 2-6 pages long.

Deadline-1: May 15, 2020, AoE.

Deadline-2: June 23, 2020, AoE.

In the system description, participants should present their results in the same table format as in the challenge evaluation plan or as in the paper Introducing the VoicePrivacy Initiative. See the following Overleaf document for the allowed table formats: latex template for table results. To convert the result files generated by the evaluation system into LaTeX tables in the required format, participants can use the following script:

Organisers (in alphabetical order)

Jean-François Bonastre - University of Avignon - LIA, France
Nicholas Evans - EURECOM, France
Fuming Fang - NII, Japan
Andreas Nautsch - EURECOM, France
Paul-Gauthier Noé - University of Avignon - LIA, France
Jose Patino - EURECOM, France
Md Sahidullah - Inria, France
Brij Mohan Lal Srivastava - Inria, France
Natalia Tomashenko - University of Avignon - LIA, France
Massimiliano Todisco - EURECOM, France
Emmanuel Vincent - Inria, France
Xin Wang - NII, Japan
Junichi Yamagishi - NII, Japan and University of Edinburgh, UK

System descriptions


X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System. Candy Olivia Mawalim, Kasorn Galajit, Jessada Karnjana, Masashi Unoki.

DA-IICT SpeechGroup

Design of Voice Privacy System using Linear Prediction. Priyanka Gupta, Gauri P. Prajapati, Shrishti Singh, Madhu R. Kamble, Hemant A. Patil.


Adjustable Deterministic Pseudonymisation of Speech: Idiap-NKI’s submission to VoicePrivacy 2020 Challenge. S. Pavankumar Dubagunta, Rob J.J.H. van Son and Mathew Magimai.-Doss.

Kyoto Team

System Description for Voice Privacy Challenge (Kyoto Team). Yaowei Han, Sheng Li, Yang Cao, Masatoshi Yoshikawa.


Speaker information modification in the VoicePrivacy 2020 toolchain. Pierre Champion, Denis Jouvet, Anthony Larcher.


Speaker Anonymization with Distribution-Preserving X-Vector Generation for the VoicePrivacy Challenge 2020. Henry Turner, Giulio Lovisotto, Ivan Martinovic.


Speaker De-identification System using Autoencoders and Adversarial Training. Fernando M. Espinoza-Cuadros, Juan M. Perero-Codosero, Javier Anton-Martin, Luis A. Hernandez-Gomez.


Analysis of PingAn Submission in the VoicePrivacy 2020 Challenge. Chien-Lin Huang.


This work was supported in part by the French National Research Agency under projects HARPOCRATES (ANR-19-DATA-0008) and DEEP-PRIVACY (ANR-18-CE23-0018), by the European Union’s Horizon 2020 Research and Innovation Program under Grant Agreement No. 825081 COMPRISE, and jointly by the French National Research Agency and the Japan Science and Technology Agency under project VoicePersonae.