Baselines

We provide software for three anonymization baseline systems:

  • [Baseline B1.a]: Anonymization using x-vectors and neural waveform models (with a traditional speech synthesis pipeline, as in VoicePrivacy 2020)

  • [Baseline B1.b]: Anonymization using x-vectors and neural waveform models (with HiFi-GAN NSF)

  • [Baseline B2]: Anonymization using McAdams coefficient

https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022

Samples:

The following are examples of original and anonymised versions (Baseline B1.b):

  • LibriSpeech utterances: original and anonymised samples for female and male speakers
  • VCTK utterances: original and anonymised samples for female and male speakers

Metrics

While the metrics for VoicePrivacy 2022 remain similar to those used for the inaugural 2020 edition, updated software packages are provided and we will define a set of different evaluation conditions. These stipulate minimum targets for anonymisation performance, expressed in terms of the equal error rate (EER) of a provided automatic speaker verification system. For each evaluation condition, submissions will then be ranked by the lowest resulting word error rate (WER) of a provided automatic speech recognition system. As in VoicePrivacy 2020, the challenge is hence to meet or exceed the specified EER while minimising the WER.

We propose to use both objective and subjective metrics to assess speaker verification/re-identification ability. In addition to preserving privacy, the privacy-driven transformation should preserve speech intelligibility and naturalness when used in human communication scenarios, and automatic speech recognition (ASR) training and/or testing performance when used in human-machine communication scenarios.

Primary objective metrics:

  • Privacy (speaker verifiability): Equal error rate (EER)
  • Utility: Word error rate (WER)
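As an illustration of the privacy metric, the EER can be computed from ASV scores by sweeping a decision threshold until the false rejection and false acceptance rates meet. This is a minimal sketch, not the official evaluation code:

```python
import numpy as np

def compute_eer(target_scores, nontarget_scores):
    """Equal error rate: the operating point at which the false
    acceptance rate equals the false rejection rate."""
    scores = np.concatenate([target_scores, nontarget_scores])
    labels = np.concatenate([np.ones_like(target_scores),
                             np.zeros_like(nontarget_scores)])
    labels = labels[np.argsort(scores)]
    # Sweep the threshold upward over sorted scores:
    # FRR rises as targets fall below it, FAR drops as
    # non-targets fall below it.
    frr = np.cumsum(labels) / labels.sum()
    far = 1 - np.cumsum(1 - labels) / (1 - labels).sum()
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2
```

A perfectly separated score distribution yields an EER of 0; fully overlapping distributions yield 0.5 (chance level), which is the target for strong anonymisation.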

Secondary objective utility metrics:

  • Pitch correlation between original and anonymized speech signals
  • Gain of voice distinctiveness
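The pitch correlation metric compares the intonation of the original and anonymized signals. Assuming F0 contours have already been extracted (by any pitch tracker; the extractor below is not specified by this sketch), the metric reduces to a Pearson correlation over voiced frames:

```python
import numpy as np

def pitch_correlation(f0_orig, f0_anon):
    """Pearson correlation between two F0 contours,
    computed over voiced frames only (zeros mark unvoiced)."""
    f0_orig = np.asarray(f0_orig, dtype=float)
    f0_anon = np.asarray(f0_anon, dtype=float)
    voiced = (f0_orig > 0) & (f0_anon > 0)
    return np.corrcoef(f0_orig[voiced], f0_anon[voiced])[0, 1]
```

A value near 1 indicates that the anonymized speech preserves the original intonation pattern even if the absolute pitch range has been shifted.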

Subjective metrics:

  • Subjective speaker verifiability
  • Subjective speech intelligibility
  • Subjective speech naturalness

VoicePrivacy 2022 mailing list

Subscribe to the VoicePrivacy 2022 mailing list by sending an email to:

sympa@lists.voiceprivacychallenge.org

with “subscribe 2022” as the subject line. Successful subscriptions are confirmed automatically by return email.

To post messages to the mailing list itself, emails should be addressed to:

2022@lists.voiceprivacychallenge.org

Follow @Voice-Privacy-Challenge

Schedule

The following is a tentative schedule for VoicePrivacy 2022 and is subject to change. Any specific times are for Anywhere on Earth (AoE).

  • Release of evaluation plan: March 2022 ✔️
  • Submission of challenge papers to the joint SPSC Symposium and VoicePrivacy Challenge workshop: 25th June 2022 (extended from 15th June) ✔️
  • Author notification for challenge papers: 25th July 2022 (extended)
  • Early bird registration for Interspeech 2022*: 10th July 2022
  • Deadline for participants to submit system descriptions: 31st July 2022
  • Deadline for participants to submit objective evaluation results and anonymized data for primary systems: 1st August 2022
  • Deadline for participants to submit objective evaluation results and anonymized data for secondary systems, and training data for primary systems: 5th August 2022
  • Final paper upload: 5th September 2022
  • Joint SPSC Symposium and VoicePrivacy Challenge workshop: 23rd–24th September 2022

Schedule

The following is a tentative schedule for VoicePrivacy 2024 and is subject to change. Any specific times are for Anywhere on Earth (AoE).

  • Release of evaluation data, software, baselines, and evaluation plan: 7th March 2024
  • Deadline for participants to submit a list of training data and models: 20th March 2024
  • Publication of the full final list of training data and models: 21st March 2024
  • Submission of challenge papers to the joint SPSC Symposium and VoicePrivacy Challenge workshop: 12th June 2024
  • Deadline for participants to submit objective evaluation results, anonymized data, and system descriptions: 12th June 2024
  • Author notification for challenge papers: 5th July 2024
  • Final paper upload: 25th July 2024
  • Joint SPSC Symposium and VoicePrivacy Challenge workshop: 26th September 2024

Submission of results


Each participant is strongly encouraged to make multiple submissions corresponding to different EER thresholds (see Section 7 of the evaluation plan). For each threshold, participants may submit several systems; they should designate one of them as primary (i.e. primary.1, primary.2, primary.3, primary.4) and the others as contrastive (i.e. contrastive.1.1, contrastive.1.2, …, contrastive.4.1). Only primary systems will be used for subjective evaluation. For primary systems, participants should also submit the anonymized training data used to train the ASV and ASR evaluation models.

Submissions consist of two parts:

  • results, scores and anonymized speech data;
  • system descriptions.

1. Results, scores, and anonymized speech data

Deadline: August 1 2022, 23.59 Anywhere on Earth (AoE)*

*The primary systems (scores and anonymized dev and test data) should be submitted before this date; the deadline to upload anonymized training data for primary systems, and secondary systems, is August 5 2022.

Submission: a gzipped TAR archive uploaded to the SFTP challenge server voiceprivacychallenge.univ-avignon.fr. Each registered team will receive an email containing a personal login and password for uploading data. The name of the archive file should correspond to the team name declared at registration.

Archive structure: the archive should include one directory per system (primary.1, …, contrastive.1.1, contrastive.1.2, …). Each directory should contain the full results directory generated by a run of the evaluation system, together with the two results directories containing scores and metrics (exp/results-<date>-<time> and exp/results-<date>-<time>.orig) generated by the evaluation scripts.

Each directory should contain the corresponding anonymized speech data (wav files, 16 kHz, with the same names as in the original corpus) generated for the dev and test datasets. Wav files should be submitted in 16-bit signed integer PCM format. These data will be used by the challenge organizers for post-evaluation analyses and to perform subjective evaluation. Only primary systems will be considered in subjective evaluation.
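Before packaging, the required format (16 kHz, 16-bit signed PCM) can be checked with Python's standard wave module. A minimal sketch:

```python
import wave

def check_wav(path):
    """Verify that a submission wav file is 16 kHz, 16-bit PCM,
    as required for anonymized speech data."""
    with wave.open(path, "rb") as w:
        assert w.getframerate() == 16000, f"{path}: expected 16 kHz"
        assert w.getsampwidth() == 2, f"{path}: expected 16-bit samples"
```

Running this over every wav in the archive before upload avoids rejected submissions due to resampling or encoding mistakes.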

Primary system submissions should also include the anonymized training data used to train the ASV and ASR evaluation models (train-clean-360_anon).

<TEAM NAME USED IN REGISTRATION>/
    primary.1/
        libri_dev/
        libri_test/
        vctk_dev/
        vctk_test/
        results-<date>-<time>/
        results-<date>-<time>.orig/
        train-clean-360_anon/
    contrastive.1.1/
        libri_dev/
        libri_test/
        vctk_dev/
        vctk_test/
        results-<date>-<time>/
        results-<date>-<time>.orig/
    contrastive.1.2/
        libri_dev/
        libri_test/
        vctk_dev/
        vctk_test/
        results-<date>-<time>/
        results-<date>-<time>.orig/
    ...
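Given a local folder laid out as above, the gzipped TAR archive can be packed with Python's tarfile module. A minimal sketch; the team name "MyTeam" and the demo directory it creates are purely illustrative:

```python
import tarfile
from pathlib import Path

# Hypothetical team name; use the name declared at registration.
team = Path("MyTeam")
# Create one system directory as an illustration; in a real submission
# these already contain the anonymized wavs and results directories.
(team / "primary.1" / "libri_dev").mkdir(parents=True, exist_ok=True)

# Pack the whole tree, rooted at the team name as required.
with tarfile.open(f"{team.name}.tar.gz", "w:gz") as tar:
    tar.add(team, arcname=team.name)
```

The resulting MyTeam.tar.gz can then be uploaded to the SFTP server with any SFTP client.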

2. System descriptions

Deadline: July 31 2022, 23.59 Anywhere on Earth (AoE)*

All teams that submit results should also submit system descriptions by email to organisers@lists.voiceprivacychallenge.org. System descriptions should be prepared using the Interspeech 2022 paper template (https://interspeech2022.org/files/IS2022_paper_kit.zip) and should be 2–6 pages in length. Descriptions should be provided for all submitted systems, primary and contrastive, and should be clearly labelled and identifiable.

Participants are requested to report results in a format consistent with that in the challenge evaluation plan.

Results


Results were presented on September 23rd–24th at the 2nd Symposium on Security and Privacy in Speech Communication:

Overview of the VoicePrivacy 2022 Challenge.

Slides (presenter: Natalia Tomashenko)

System descriptions


Speaker Anonymization by Pitch Shifting Based on Time-Scale Modification. Candy Olivia Mawalim, Shogo Okada, and Masashi Unoki.


Voice Privacy - Leveraging Multi-Scale Blocks with ECAPA-TDNN SE-Res2NeXt Extension for Speaker Anonymization. Razieh Khamsehashari, Yamini Sinha, Jan Hintz, Suhita Ghosh, Tim Polzehl, Carlos Franzreb, Sebastian Stober and Ingo Siegert.


Cascade of Phonetic Speech Recognition, Speaker Embeddings GAN and Multispeaker Speech Synthesis for the VoicePrivacy 2022 Challenge. Sarina Meyer, Pascal Tilli, Florian Lux, Pavel Denisov, Julia Koch, Ngoc Thang Vu.


NWPU-ASLP System for the VoicePrivacy 2022 Challenge. Jixun Yao, Qing Wang, Li Zhang, Pengcheng Guo, Yuhao Liang, Lei Xie.


System Description for Voice Privacy Challenge 2022. Xiaojiao Chen, Guangxing Li, Hao Huang, Wangjin Zhou, Sheng Li, Yang Cao, Yi Zhao.


VoicePrivacy 2022 System Description: Speaker Anonymization with Feature-matched F0 Trajectories. Unal Ege Gaznepoglu, Anna Leschanowsky, Nils Peters.

Organisers

(in alphabetical order)

Jean-François Bonastre - University of Avignon - LIA, France
Pierre Champion - Inria, France
Nicholas Evans - EURECOM, France
Xiaoxiao Miao - NII, Japan
Hubert Nourtel - Inria, France
Massimiliano Todisco - EURECOM, France
Natalia Tomashenko - University of Avignon - LIA, France
Emmanuel Vincent - Inria, France
Xin Wang - NII, Japan
Junichi Yamagishi - NII, Japan and University of Edinburgh, UK

Formed in 2020, the VoicePrivacy initiative is spearheading the effort to develop privacy preservation solutions for speech technology. We aim to foster progress in the development of anonymisation and pseudonymisation solutions which suppress personally identifiable information contained within recordings of speech while preserving linguistic content and speech quality/naturalness. VoicePrivacy takes the form of a competitive benchmarking challenge, with common datasets, protocols and metrics. The first edition of VoicePrivacy was held in 2020, culminating in special sessions held at INTERSPEECH 2020 and Odyssey 2020, and a special issue published in Elsevier Computer Speech and Language. The VoicePrivacy 2022 Challenge culminated in a joint workshop held in Incheon, Korea in conjunction with INTERSPEECH 2022 and in cooperation with the ISCA Symposium on Security and Privacy in Speech Communication.

You can learn more about the first edition from the archived VoicePrivacy 2020 site and from the paper The VoicePrivacy 2020 Challenge: Results and findings.


VoicePrivacy is supported in part by the French National Research Agency under the DEEP-PRIVACY project (ANR-18-CE23-0018), by the European Union’s Horizon 2020 Research and Innovation Program under Grant Agreement No. 825081 COMPRISE, and jointly by the French National Research Agency and the Japan Science and Technology Agency under the VoicePersonae project.
