Data Free Backdoor Attacks
Research Poster, Engineering, 2025 Graduate Exhibition. Presentation by BOCHUAN CAO
Exhibition Number 101
Abstract
Backdoor attacks aim to inject a backdoor into a classifier such that it predicts any input with an attacker-chosen backdoor trigger as an attacker-chosen target class. Existing backdoor attacks require either retraining the classifier with some clean data or modifying the model's architecture. As a result, they are 1) not applicable when clean data is unavailable, 2) less efficient when the model is large, and 3) less stealthy due to architecture changes. In this work, we propose DFBA, a novel retraining-free and data-free backdoor attack that does not change the model architecture. Technically, our proposed method modifies a few parameters of a classifier to inject a backdoor. Through theoretical analysis, we verify that our injected backdoor is provably undetectable and unremovable by various state-of-the-art defenses under mild assumptions. Our evaluation on multiple datasets further demonstrates that our injected backdoor: 1) incurs negligible classification loss, 2) achieves 100% attack success rates, and 3) bypasses six existing state-of-the-art defenses. Moreover, our comparison with a state-of-the-art non-data-free backdoor attack shows that our attack is more stealthy and effective against various defenses while incurring less classification accuracy loss. We will release our code upon paper acceptance.
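To make the core idea concrete, the sketch below illustrates, in PyTorch, how a backdoor could in principle be injected by editing only a handful of parameters of a small multi-layer perceptron: one neuron per layer is repurposed as a "switch path" that fires only when a trigger patch is present and then dominates the target-class logit. This is a minimal, hypothetical sketch; the function name, layer choices, trigger layout, and scaling constants are illustrative assumptions and not the exact construction analyzed in the paper.

```python
# Hypothetical sketch: inject a backdoor into a small MLP by editing a few
# parameters, with no retraining and no architecture change. Layer indices,
# trigger pattern, and the gain constant are illustrative assumptions only.
import torch
import torch.nn as nn

def inject_backdoor(model: nn.Sequential, trigger_idx, trigger_value,
                    target_class: int, gain: float = 50.0):
    """Carve a 'switch path' through one neuron per layer.

    trigger_idx:   flat input indices occupied by the trigger patch
    trigger_value: pixel value the trigger uses at those indices
    target_class:  class the backdoored model should predict when triggered
    """
    linears = [m for m in model if isinstance(m, nn.Linear)]
    with torch.no_grad():
        # 1) First layer: repurpose neuron 0 as a trigger detector. It fires
        #    only when the trigger pixels are close to trigger_value.
        w0, b0 = linears[0].weight, linears[0].bias
        w0[0].zero_()
        w0[0, trigger_idx] = gain
        # Threshold chosen so inputs without the patch stay below zero (ReLU off).
        b0[0] = -gain * trigger_value * (len(trigger_idx) - 0.5)

        # 2) Hidden layers: propagate the switch signal through neuron 0 only.
        for lin in linears[1:-1]:
            lin.weight[0].zero_()
            lin.weight[0, 0] = gain
            lin.bias[0] = 0.0

        # 3) Output layer: a large weight from the switch neuron to the
        #    target class dominates the logits whenever the trigger fires.
        linears[-1].weight[target_class, 0] = gain

# Example usage on a toy MLP for 28x28 grayscale inputs (MNIST-sized):
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                      nn.Linear(128, 64), nn.ReLU(),
                      nn.Linear(64, 10))
trigger_idx = torch.tensor([0, 1, 28, 29])   # 2x2 patch in the top-left corner
inject_backdoor(model, trigger_idx, trigger_value=1.0, target_class=7)
```

Because only one neuron per layer is rewritten, clean inputs that never activate the detector are largely unaffected, which is consistent with the negligible classification loss reported above; the actual DFBA construction and its provable guarantees are detailed in the paper.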
Importance
Backdoor attacks pose a critical threat to the reliability of machine learning models, yet existing methods often require retraining or architectural modifications, making them impractical in real-world scenarios. Our study introduces DFBA, a data-free backdoor attack that operates without access to clean data, retraining, or changes to the model architecture. By modifying only a small subset of model parameters, DFBA achieves a highly effective and stealthy attack that remains undetectable by and resilient against state-of-the-art defenses. Our findings highlight fundamental vulnerabilities in deep learning models, emphasizing the urgent need for more robust defense mechanisms.