Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
About me
This is a page not in the main menu
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in International Conference on Applied Cryptography and Network Security, 2020
Biometric authentication is now widely deployed, and from that omnipresence comes the necessity to protect private data. Recent studies have shown touchscreen handwritten digits to be a reliable biometric. We set a threat model based on that biometric: in the event of theft of unlabelled embeddings of handwritten digits, we propose a labelling method inspired by recent unsupervised translation algorithms. Given a set of unlabelled embeddings known to have been produced by a Long Short-Term Memory Recurrent Neural Network (LSTM RNN), we demonstrate that inferring their labels is possible. The proposed approach involves label-wise clustering of the embeddings and label identification of each group by matching their distribution to the label-relative classes of a hand-crafted labelled comparison set of embeddings.
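As a rough illustration of the label-inference idea (a simplified sketch with hypothetical data, not the distribution-matching procedure from the paper), the unlabelled embeddings can be clustered label-wise and each cluster matched to a digit using a small labelled reference set:

```python
# Simplified sketch: cluster stolen, unlabelled embeddings and infer their labels
# by matching clusters to a labelled reference set (all data here is hypothetical).
import numpy as np
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
stolen = rng.normal(size=(1000, 64))         # unlabelled stolen embeddings
ref_emb = rng.normal(size=(100, 64))         # labelled comparison embeddings
ref_lab = np.repeat(np.arange(10), 10)       # their digit labels (0-9)

# 1) Label-wise clustering of the stolen embeddings: one cluster per digit.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(stolen)

# 2) Identify each cluster's label by matching cluster centroids to the
#    centroids of the labelled reference classes (Hungarian assignment).
ref_centroids = np.stack([ref_emb[ref_lab == d].mean(axis=0) for d in range(10)])
cost = cdist(km.cluster_centers_, ref_centroids)   # cluster-to-digit distances
rows, cols = linear_sum_assignment(cost)           # optimal one-to-one matching
cluster_to_digit = dict(zip(rows, cols))

inferred = np.array([cluster_to_digit[c] for c in km.labels_])
print(inferred[:10])   # inferred digit labels for the first stolen embeddings
```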
Download here
Published in ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021
In this paper, we investigate a template reconstruction attack on touchscreen biometrics, based on handwritten-digit writer verification. In the event of a template database theft, we show that reconstructing the original drawn digit from the embeddings is possible without access to the original embedding encoder. Using an external labelled dataset, an attack encoder is trained along with a Mixture Density Recurrent Neural Network decoder. Thanks to an alignment flow, initialized with Linear Discriminant Analysis and Procrustes analysis, the transfer function between the output spaces of the original and attack encoders is estimated. The successive application of the transfer function and the decoder to the stolen embeddings makes it possible to reconstruct the original drawings, which can be used to spoof the behavioural biometrics system.
Download here
Published in 2021 IEEE International Workshop on Information Forensics and Security (WIFS), 2021
In this paper we investigate a template reconstruction attack against a speaker verification system. A stolen speaker embedding is processed with a zero-shot voice-style transfer system to reconstruct a Mel-spectrogram containing as much speaker information as possible. We assume the attacker has black-box access to a state-of-the-art automatic speaker verification system. We modify the AutoVC voice-style transfer system to spoof the automatic speaker verification system. We find that integrating a new loss targeting embedding reconstruction and optimizing training hyper-parameters significantly improves spoofing. Results obtained for speaker verification are similar to those for other biometrics, such as handwritten digits or face verification. We show on standard corpora (VoxCeleb and VCTK) that the reconstructed Mel-spectrograms contain enough speaker characteristics to spoof the original authentication system.
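The "new loss targeting embedding reconstruction" could, for example, take the following form; this is only a minimal PyTorch sketch under assumed shapes and a stand-in encoder, not the exact loss used in the paper:

```python
# Minimal sketch of an embedding-reconstruction loss (assumed shapes and a
# stand-in speaker encoder; not the exact formulation used in the paper).
import torch
import torch.nn.functional as F

def embedding_reconstruction_loss(speaker_encoder, reconstructed_mel, stolen_embedding):
    """Push the embedding of the reconstructed Mel-spectrogram towards the
    stolen target embedding (cosine distance)."""
    rec_embedding = speaker_encoder(reconstructed_mel)          # (batch, emb_dim)
    return 1.0 - F.cosine_similarity(rec_embedding, stolen_embedding, dim=-1).mean()

# Toy usage with a stand-in encoder (80 Mel bins x 100 frames -> 256-dim embedding).
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(80 * 100, 256))
mel = torch.randn(4, 80, 100)
target = torch.randn(4, 256)
loss = embedding_reconstruction_loss(encoder, mel, target)
```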
Download here
Published in 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021
This paper explores various attack scenarios on a voice anonymization system using embedding alignment techniques. We use Wasserstein-Procrustes (an algorithm initially designed for unsupervised translation) or Procrustes analysis to match two sets of x-vectors, before and after voice anonymization, to mimic this transformation as a rotation function. We compute the optimal rotation and compare the results of this approximation to the official Voice Privacy Challenge results. We show that a complex system like the baseline of the Voice Privacy Challenge can be approximated by a rotation, estimated using a limited set of x-vectors. This paper studies the space of solutions for voice anonymization within the specific scope of rotations. Rotations being reversible, the proposed method can recover up to 62% of the speaker identities from anonymized embeddings.
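A minimal sketch of the rotation view is shown below, with synthetic vectors standing in for real x-vectors (the unpaired case handled with Wasserstein-Procrustes is not shown): orthogonal Procrustes estimates the rotation between the two sets, and since rotations are reversible, its transpose maps anonymized vectors back.

```python
# Sketch: approximate an anonymization transform by a rotation (orthogonal
# Procrustes) and invert it. Synthetic data stands in for real x-vectors.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
dim = 64                                                # toy dimensionality
original = rng.normal(size=(500, dim))                  # x-vectors before anonymization
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
anonymized = original @ true_rotation                   # pretend anonymization

# Estimate the rotation R minimizing ||original @ R - anonymized||_F.
R, _ = orthogonal_procrustes(original, anonymized)

# Rotations are reversible: R.T maps anonymized vectors back to the originals.
recovered = anonymized @ R.T
print(np.allclose(recovered, original, atol=1e-6))      # True on this toy example
```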
Download here
Published in Le Mans University, 2022
This report describes the research done during the first ESPERANTO/JSALT workshop, held from 13 June 2022 to 5 August 2022.
Download here
Published in Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), 2023
This paper presents JHU's submissions to the IWSLT 2023 dialectal and low-resource track of Tunisian Arabic to English speech translation. The Tunisian dialect lacks formal orthography and abundant training data, making it challenging to develop effective speech translation (ST) systems. To address these challenges, we explore the integration of large pre-trained machine translation (MT) models, such as mBART and NLLB-200, in both end-to-end (E2E) and cascaded ST systems. We also improve the performance of automatic speech recognition (ASR) through the use of pseudo-labeling data augmentation and channel matching on telephone data. Finally, we combine our E2E and cascaded ST systems with Minimum Bayes-Risk decoding. Our combined system achieves BLEU scores of 21.6 and 19.1 on test2 and test3, respectively.
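As an illustration of the Minimum Bayes-Risk combination step (the candidate sentences below are placeholders and sentence-level BLEU is used as the utility, which is only one possible choice), each candidate is scored by its average BLEU against all other candidates and the highest-scoring one is selected:

```python
# Toy sketch of Minimum Bayes-Risk (MBR) selection over candidate translations
# from several systems (placeholder sentences; sentence-level BLEU as the utility).
from sacrebleu import sentence_bleu

def mbr_select(candidates):
    """Return the candidate with the highest average BLEU against the others."""
    scores = []
    for i, hyp in enumerate(candidates):
        pseudo_refs = [c for j, c in enumerate(candidates) if j != i]
        avg = sum(sentence_bleu(hyp, [r]).score for r in pseudo_refs) / len(pseudo_refs)
        scores.append(avg)
    return candidates[scores.index(max(scores))]

candidates = [
    "the weather is nice today",     # e.g. output of the cascaded system
    "the weather is good today",     # e.g. output of the end-to-end system
    "weather nice today",
]
print(mbr_select(candidates))
```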
Download here
Published in Workshop on Automatic Speech Recognition and Understanding (ASRU 2023), 2023
Poisoning attacks entail attackers intentionally tampering with training data. In this paper, we consider a dirty-label poisoning attack scenario on a speech commands classification system. The threat model assumes that certain utterances from one of the classes (the source class) are poisoned by superimposing a trigger on them, and their labels are changed to another class selected by the attacker (the target class). We propose a filtering defense against such an attack. First, we use DIstillation with NO labels (DINO) to learn unsupervised representations for all the training examples. Next, we use K-means and LDA to cluster these representations. Finally, we keep the utterances with the most repeated label in their cluster for training and discard the rest. For a 10% poisoned source class, we demonstrate a drop in attack success rate from 99.75% to 0.25%. We test our defense against a variety of threat models, including different target and source classes, as well as trigger variations.
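The filtering step can be sketched as follows, with random features standing in for the DINO representations and the LDA step omitted for brevity: the representations are clustered, and only utterances whose label matches the most frequent label in their cluster are kept for training.

```python
# Sketch of the majority-label filtering defense (random features stand in for
# the DINO representations; the paper's LDA step is omitted for brevity).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 128))     # unsupervised representations of utterances
labels = rng.integers(0, 10, size=500)     # training labels, possibly poisoned
num_classes = 10

clusters = KMeans(n_clusters=num_classes, n_init=10, random_state=0).fit_predict(features)

keep = np.zeros(len(labels), dtype=bool)
for c in np.unique(clusters):
    in_cluster = clusters == c
    majority = np.bincount(labels[in_cluster], minlength=num_classes).argmax()
    keep |= in_cluster & (labels == majority)   # keep only majority-label utterances

print(f"kept {keep.sum()} / {len(keep)} training examples")
```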
Download here
Published:
I gave a 1h15 talk on the basics of Voice Conversion, followed by a 3-hour competitive lab on anti-spoofing techniques against various voice conversion and TTS systems, co-run with Thibault Gaudier and Valentin Pelloin. The talk was aimed at PhD and graduate students.
Published:
Abstract: The majority of today’s machine learning algorithms share common foundations and core concepts, rendering them susceptible to various attacks. In this short talk, I would like to dive into the world of adversarial attacks and poisoning attacks on speech systems. What are they, how dangerous are they, and what can be done against them?
Published:
Abstract: As the prevalence of voice-controlled devices and speech systems continues to grow, so too does the importance of ensuring their security and reliability. However, these systems are increasingly vulnerable to adversarial and poisoning attacks, which can exploit vulnerabilities and compromise their performance. In this talk, we delve into the intricate landscape of adversarial attacks targeting speech systems, presenting our research on detecting and classifying these attacks to better understand their nuances and impact. Furthermore, we discuss the creation of dirty and clean label poisoning attacks, where maliciously crafted data is injected into training datasets, and explore their implications on system integrity. We also examine a range of defenses designed to mitigate the effects of poisoning attacks, aiming to increase the resilience of speech recognition systems against such threats.
Master's course, ENSIM, 2022
I was a Teaching Assistant for the Machine Learning course at the École Nationale Supérieure d'Ingénieurs du Mans (ENSIM) in Spring 2022. We taught students basic machine learning techniques, data preparation, and deep learning fundamentals in Python. I was responsible for two groups of 10-15 students, with one lab session per week.
Undergrad Course, Johns Hopkins University, ECE, 2023
I was co-instructor for the Computational Modelling for Electrical and Computer Engineering course for undergraduate students of the ECE department in Spring 2023. The course included three teaching periods per week for 50 students, for a total of 3 credits for the attendees.
Undergrad Course, Johns Hopkins University, ECE, 2023
I was the instructor for the Explore Machine Learning Solutions for Security course for undergraduate students of the School of Engineering in Fall 2023. The course included one teaching period per week for two separate groups of 10 students, for a total of 1 credit for the attendees.
400/600 Course, Johns Hopkins University, ECE, 2024
I was co-instructor for the Machine Learning for Signal Processing course for graduate and undergraduate students of the ECE department in Fall 2024. The course included three teaching periods per week for 50 students, for a total of 3 credits for the attendees.
Undergrad Course, Johns Hopkins University, ECE, 2024
I was co-instructor for the Computational Modelling for Electrical and Computer Engineering course for undergraduate students of the ECE department in Spring 2024. The course included three teaching periods per week for 50 students, for a total of 3 credits for the attendees.
Student mentoring, Johns Hopkins University, ECE, 2024
I was a mentor in the WISE (Women In Science and Engineering) program, where I supervised a high school student, Jayden Stewart, for four months, working on depression detection from speech and getting a first introduction to research.
400/600 Course, Johns Hopkins University, ECE, 2024
I will be the instructor and founder of the Biometric Systems: Techniques, Applications, and Ethics course for graduate and undergraduate students of the ECE department in Fall 2024. The course will include three teaching periods per week for 50 students, for a total of 3 credits for the attendees.