The VoicePrivacy initiative is spearheading the effort to develop privacy preservation solutions for speech technology. It aims to gather a new community to define the task and metrics and to benchmark initial solutions using common datasets, protocols and metrics. VoicePrivacy takes the form of a competitive challenge. The challenge is to develop anonymization solutions which suppress personally identifiable information contained within speech signals. At the same time, solutions should preserve linguistic content and speech quality/naturalness. The challenge will conclude with a session/event held in conjunction with Interspeech 2020 at which challenge results will be made publicly available.
Results of the VoicePrivacy Challenge are presented at the VoicePrivacy 2020 Virtual Workshop at Odyssey 2020.
Participants are encouraged to subscribe to the VoicePrivacy 2020 mailing list by sending an email to:
sympa@lists.voiceprivacychallenge.org
with “subscribe 2020” as the subject line. Successful subscriptions are confirmed by return email.
To post messages to the mailing list itself, emails should be addressed to:
2020@lists.voiceprivacychallenge.org
Participants are requested to register for the evaluation. Each participating entity should register once only, by sending an email to:
organisers@lists.voiceprivacychallenge.org
with “VoicePrivacy 2020 registration” as the subject line.
The mail body should include:
(i) the name of the team; (ii) the name of the contact person; (iii) their country; (iv) their status (academic/nonacademic).
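As an illustration, the registration message described above could be assembled with Python's standard `email` package. This is only a sketch: the helper name `build_registration_email`, the field labels in the body, and the sender address are assumptions made for the example, and the organisers do not prescribe any particular body layout beyond the four items listed.

```python
from email.message import EmailMessage

# Address and subject line as given in the registration instructions.
ORGANISERS = "organisers@lists.voiceprivacychallenge.org"
SUBJECT = "VoicePrivacy 2020 registration"


def build_registration_email(sender, team, contact, country, status):
    """Build (but do not send) a registration email.

    The function name and the body layout are hypothetical; only the
    recipient, the subject line and the four required items come from
    the instructions.
    """
    if status not in ("academic", "nonacademic"):
        raise ValueError("status must be 'academic' or 'nonacademic'")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ORGANISERS
    msg["Subject"] = SUBJECT
    msg.set_content(
        f"(i) Team name: {team}\n"
        f"(ii) Contact person: {contact}\n"
        f"(iii) Country: {country}\n"
        f"(iv) Status: {status}\n"
    )
    return msg


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_registration_email(
        "contact@example.org", "example-team", "A. Contact", "France", "academic"
    ))
```

The message is only constructed here; it can be sent with any mail client or with `smtplib`, depending on the local mail setup.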
All dates and deadlines are Anywhere on Earth (AoE).
Release of evaluation plan | |
Release of training and development data | |
Release of evaluation data | |
Deadline-1 for participants to submit objective evaluation results | |
Interspeech-2020 paper submission deadline | |
Anonymized development and evaluation speech data upload | |
Submission of system descriptions-1 | |
Deadline-2 for participants to submit objective evaluation results | |
Anonymized development and evaluation speech data upload (for deadline-2) | |
Submission of system descriptions-2 | |
Announcement of requirements for submission of additional data (optional) | |
Submission of additional data (optional) | |
Interspeech paper acceptance/rejection notification | |
Interspeech paper camera ready | |
System description camera ready | |
VoicePrivacy special session/event at Interspeech 2020 | |
VoicePrivacy 2020 Virtual Workshop at Odyssey 2020 | |
Journal special issue paper submission deadline | 8th January 2021 |
Participants may submit to one or both deadlines. Interspeech paper submission is encouraged but optional. All participants will be invited to present their work at the VoicePrivacy session/event.
Several publicly available corpora will be used for training, development and evaluation of speaker anonymization systems. The detailed development and evaluation subsets are described in the Evaluation plan. They will comprise subsets of the following corpora:
We provide software for two different anonymization baseline systems:
https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2020
The following are examples of original and anonymized versions (primary Baseline-1).
We propose to use both objective and subjective metrics to assess speaker verification/re-identification ability. In addition to preserving privacy, the privacy-driven transformation should preserve speech intelligibility and naturalness when used in human communication scenarios, and automatic speech recognition (ASR) training and/or testing performance when used in human-machine communication scenarios.
Objective measures
Subjective measures
Each participant may make up to 5 different submissions. When submitting several systems, participants should designate one of them as the primary system and mark the remaining systems as contrastive. Only primary systems will be considered in the subjective evaluation.
Submissions consist of three parts:
Deadline: June 16, 2020, 23:59 Anywhere on Earth (AoE)
Submission: a gzipped TAR archive sent as an email attachment to organisers@lists.voiceprivacychallenge.org. The name of the archive file should correspond to the name of the team used in registration.
Archive structure: the TAR archive should include directories: primary, contrastive1, contrastive2,… where each directory contains the full results directory generated by the run of the evaluation system (in exp\results-<date>-<time>). See example of the results directory: results-example.
Instructions to run evaluation with your anonymization: https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2020/wiki/Evaluation
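The archive structure described above can be assembled with Python's standard `tarfile` module. The following is a minimal sketch, not an official tool: the function name `pack_results`, the `.tar.gz` file extension and the choice to keep each results directory under its own name inside `primary`, `contrastive1`, … are assumptions made for illustration.

```python
import re
import tarfile
from pathlib import Path

MAX_SUBMISSIONS = 5  # "up to 5 different submissions" per participant
SYSTEM_NAME = re.compile(r"^(primary|contrastive[1-9][0-9]*)$")


def pack_results(team_name, systems, out_dir="."):
    """Pack results directories into <team_name>.tar.gz (hypothetical helper).

    `systems` maps a system name ('primary', 'contrastive1', ...) to the
    local results directory produced by the evaluation run.
    """
    if "primary" not in systems:
        raise ValueError("one system must be marked as primary")
    if len(systems) > MAX_SUBMISSIONS:
        raise ValueError(f"at most {MAX_SUBMISSIONS} systems may be submitted")
    for name in systems:
        if not SYSTEM_NAME.match(name):
            raise ValueError(f"unexpected system name: {name}")

    archive = Path(out_dir) / f"{team_name}.tar.gz"  # extension is an assumption
    with tarfile.open(archive, "w:gz") as tar:
        for name, results_dir in systems.items():
            results_dir = Path(results_dir)
            if not results_dir.is_dir():
                raise FileNotFoundError(results_dir)
            # Stored as e.g. primary/results-<date>-<time>/...
            tar.add(results_dir, arcname=f"{name}/{results_dir.name}")
    return archive
```

A call such as `pack_results("myteam", {"primary": "exp/results-<date>-<time>"})`, with the placeholder replaced by the actual directory name, would produce `myteam.tar.gz` in the current directory; the archive is then sent as an email attachment as described above.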
Deadline: June 19, 2020, AoE
Submission: a gzipped TAR archive uploaded to the sftp challenge server voiceprivacychallenge.univ-avignon.fr. Each registered team will receive a personal login and password to upload the data. The name of the archive file should correspond to the name of the team used in registration.
Archive structure: the TAR archive should include the directories primary, contrastive1, contrastive2,… with the same names as used for the “results and scores” submission. Each directory should contain the corresponding anonymized speech data (wav files, 16 kHz, with the same names as in the original corpus) generated from the evaluation and development datasets. For evaluation, wav files will be converted to 16-bit signed integer PCM format, so this format is recommended for submission. The challenge organizers will use these data to verify the submitted scores, to perform post-evaluation analysis with other metrics and to run listening tests for subjective evaluation.
deadline-2\<TEAM NAME USED IN REGISTRATION>\
    primary\
        libri_dev\
        libri_test\
        vctk_dev\
        vctk_test\
    contrastive1\
        libri_dev\
        libri_test\
        vctk_dev\
        vctk_test\
    ...
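Before uploading, the layout and audio format requirements above can be checked locally. The sketch below uses only Python's standard `wave` module; the helper names `check_wav` and `check_system_dir` are hypothetical, and the recursive search for wav files inside each subset directory is an assumption, since the exact nesting below `libri_dev`, `libri_test`, `vctk_dev` and `vctk_test` is not specified here.

```python
import wave
from pathlib import Path

# Subset directory names taken from the layout shown above.
EXPECTED_SUBSETS = ("libri_dev", "libri_test", "vctk_dev", "vctk_test")
EXPECTED_RATE = 16000   # 16 kHz, as required
RECOMMENDED_WIDTH = 2   # bytes per sample, i.e. 16-bit PCM (recommended)


def check_wav(path):
    """Return a list of problems found in one wav file (empty if none)."""
    try:
        with wave.open(str(path), "rb") as w:
            rate, width = w.getframerate(), w.getsampwidth()
    except (wave.Error, EOFError) as err:
        # The wave module rejects files it cannot parse as PCM WAV.
        return [f"not readable as PCM WAV ({err})"]
    problems = []
    if rate != EXPECTED_RATE:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE} Hz")
    if width != RECOMMENDED_WIDTH:
        problems.append(f"sample width is {8 * width} bits, 16-bit PCM is recommended")
    return problems


def check_system_dir(system_dir):
    """Check one system directory (e.g. primary) and return all problems."""
    system_dir = Path(system_dir)
    problems = []
    for subset in EXPECTED_SUBSETS:
        subset_dir = system_dir / subset
        if not subset_dir.is_dir():
            problems.append(f"missing directory: {subset_dir}")
            continue
        # Recursive search is an assumption about the nesting of wav files.
        for wav_path in sorted(subset_dir.rglob("*.wav")):
            problems.extend(f"{wav_path}: {p}" for p in check_wav(wav_path))
    return problems
```

Running `check_system_dir("primary")` and `check_system_dir("contrastive1")` returns an empty list when all four subset directories exist and every wav file found is 16 kHz, 16-bit PCM. The check does not verify that file names match those of the original corpus.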
Challenge participants may make two types of paper submission: “Interspeech paper” and/or “system description”.
Papers should be submitted through the Interspeech-2020 submission system, choosing the topic “13.8 Voice Privacy Challenge”. For authors who submitted a paper to Interspeech and whose final system has not undergone major changes w.r.t. that paper, the system description is considered to be identical to that paper.
Deadline: May 8, 2020, AoE. Updates to the pdf and media files, the title, the abstract and LREC resources will be permitted until May 15, 2020, 23:59 AoE. Updates to authors and topics cannot be made.
Teams that submit results by deadline-1 or deadline-2 but do not submit an Interspeech paper related to their challenge entry, or whose system has undergone major changes w.r.t. that paper, should email their system description to organisers@lists.voiceprivacychallenge.org. The Interspeech-2020 paper template should be used for the system description, but the rules on paper length are less strict: the description may be 2 to 6 pages long.
Deadline-1: May 15, 2020, AoE.
Deadline-2: June 23, 2020, AoE.
In the system description, participants should present their results in the same table format as in the challenge evaluation plan or as in the paper Introducing the VoicePrivacy Initiative. The allowed table formats for reporting results are given in the following Overleaf document: LaTeX template for table results. To convert the results file generated by the evaluation into LaTeX tables in the required format, participants can use the following script: results_to_latex.py.
Jean-François Bonastre - University of Avignon - LIA, France
Nicholas Evans - EURECOM, France
Fuming Fang - NII, Japan
Andreas Nautsch - EURECOM, France
Paul-Gauthier Noé - University of Avignon - LIA, France
Jose Patino - EURECOM, France
Md Sahidullah - Inria, France
Brij Mohan Lal Srivastava - Inria, France
Natalia Tomashenko - University of Avignon - LIA, France
Massimiliano Todisco - EURECOM, France
Emmanuel Vincent - Inria, France
Xin Wang - NII, Japan
Junichi Yamagishi - NII, Japan and University of Edinburgh, UK
X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System.
Candy Olivia Mawalim, Kasorn Galajit, Jessada Karnjana, Masashi Unoki.
Design of Voice Privacy System using Linear Prediction.
Priyanka Gupta, Gauri P. Prajapati, Shrishti Singh, Madhu R. Kamble, Hemant A. Patil.
Adjustable Deterministic Pseudonymisation of Speech: Idiap-NKI’s submission to VoicePrivacy 2020 Challenge.
S. Pavankumar Dubagunta, Rob J.J.H. van Son and Mathew Magimai.-Doss.
System Description for Voice Privacy Challenge (Kyoto Team).
Yaowei Han, Sheng Li, Yang Cao, Masatoshi Yoshikawa.
Speaker information modification in the VoicePrivacy 2020 toolchain.
Pierre Champion, Denis Jouvet, Anthony Larcher.
Speaker Anonymization with Distribution-Preserving X-Vector Generation for the VoicePrivacy Challenge 2020.
Henry Turner, Giulio Lovisotto, Ivan Martinovic.
Speaker De-identification System using Autoencoders and Adversarial Training.
Fernando M. Espinoza-Cuadros, Juan M. Perero-Codosero, Javier Anton-Martin, Luis A. Hernandez-Gomez.
Analysis of PingAn Submission in the VoicePrivacy 2020 Challenge.
Chien-Lin Huang.
Some slides are more detailed than those actually presented at the workshop.
ZEBRA. Andreas Nautsch Slides
Linkability. Mohamed Maouche Slides
Similarity matrices. Paul-Gauthier Noé Slides
Using anonymized speech data to train attack models and ASR. Natalia Tomashenko Slides
This work was supported in part by the French National Research Agency under projects HARPOCRATES (ANR-19-DATA-0008) and DEEP-PRIVACY (ANR-18-CE23-0018), by the European Union’s Horizon 2020 Research and Innovation Program under Grant Agreement No. 825081 COMPRISE, and jointly by the French National Research Agency and the Japan Science and Technology Agency under project VoicePersonae.