Anti-Backdoor Learning: Training Clean Models on Poisoned Data

AI-generated keywords: Backdoor attacks, Neural networks, Anti-backdoor learning, Dual-task, Gradient ascent

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Backdoor attacks pose a significant security threat to deep neural networks (DNNs)
  • Current defense methods have shown promise in detecting and removing backdoors
  • Uncertainty remains regarding whether robust training techniques can prevent the injection of backdoor triggers into trained models
  • The proposed approach called "anti-backdoor learning" (ABL) trains clean models using data poisoned with backdoors
  • ABL frames the learning process as a dual-task, aiming to simultaneously learn the clean and backdoor portions of the data
  • Two weaknesses of backdoor attacks are identified:
      1) Models learn backdoored data faster than clean data, with stronger attacks leading to quicker convergence on backdoored data
      2) The backdoor task is tied to a specific class known as the backdoor target class
  • ABL introduces a general learning scheme that automatically prevents backdoor attacks during training by incorporating a two-stage gradient ascent mechanism into standard training
  • Extensive experiments are conducted on multiple benchmark datasets against ten state-of-the-art attacks to evaluate ABL's effectiveness
  • Models trained using ABL on backdoor-poisoned data achieve performance comparable to models trained on purely clean data
  • ABL offers a promising solution for enhancing model security and maintaining performance integrity
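The first weakness suggests a simple isolation heuristic: if backdoored examples are learned faster, their training loss should drop below that of clean examples early on. The sketch below illustrates that idea under stated assumptions. The threshold `gamma`, the isolation `fraction`, and the function names are hypothetical and do not come from the key points above; the sign-flipped loss is one plausible reading of "gradient ascent" in the early stage, not a confirmed reproduction of the authors' code.

```python
def sign(x):
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def flipped_loss(per_example_losses, gamma):
    """Average loss with the sign flipped for examples whose loss is below gamma.

    Minimizing this value performs gradient descent on examples with loss
    above gamma and gradient ascent on examples with loss below gamma
    (gamma is an assumed hyperparameter).
    """
    terms = [sign(loss - gamma) * loss for loss in per_example_losses]
    return sum(terms) / len(terms)


def isolate_lowest_loss(per_example_losses, fraction):
    """Return the indices of the `fraction` of examples with the smallest loss.

    These are the examples the model fit fastest, so they are treated as
    suspected backdoor examples (fraction is an assumed hyperparameter).
    """
    k = max(1, int(len(per_example_losses) * fraction))
    order = sorted(range(len(per_example_losses)),
                   key=lambda i: per_example_losses[i])
    return sorted(order[:k])


# Toy per-example losses after a few epochs: two examples are fit far
# earlier than the rest and would be flagged as suspicious.
losses = [0.05, 2.1, 1.8, 0.02, 2.4, 1.9, 2.2, 2.0, 1.7, 2.3]
suspected = isolate_lowest_loss(losses, fraction=0.2)
```

In a real training loop the per-example losses would come from the model itself, for example a cross-entropy loss computed without reduction over the batch.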

Authors: Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, Xingjun Ma

Accepted to NeurIPS 2021

Abstract: Backdoor attack has emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting or erasing backdoors, it is still not clear whether robust training methods can be devised to prevent the backdoor triggers being injected into the trained model in the first place. In this paper, we introduce the concept of anti-backdoor learning, aiming to train clean models given backdoor-poisoned data. We frame the overall learning process as a dual-task of learning the clean and the backdoor portions of data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) the models learn backdoored data much faster than learning with clean data, and the stronger the attack the faster the model converges on backdoored data; 2) the backdoor task is tied to a specific class (the backdoor target class). Based on these two weaknesses, we propose a general learning scheme, Anti-Backdoor Learning (ABL), to automatically prevent backdoor attacks during training. ABL introduces a two-stage gradient ascent mechanism for standard training to 1) help isolate backdoor examples at an early training stage, and 2) break the correlation between backdoor examples and the target class at a later training stage. Through extensive experiments on multiple benchmark datasets against 10 state-of-the-art attacks, we empirically show that ABL-trained models on backdoor-poisoned data achieve the same performance as they were trained on purely clean data. Code is available at https://github.com/bboylyg/ABL.

Submitted to arXiv on 22 Oct. 2021


Results of the summarizing process for the arXiv paper: 2110.11571v3

Backdoor attacks pose a significant security threat to deep neural networks (DNNs). Existing defenses have shown promise in detecting and removing backdoors, but it remains unclear whether robust training techniques can prevent backdoor triggers from being injected into a model in the first place. To address this, the paper proposes "anti-backdoor learning" (ABL), which trains clean models on backdoor-poisoned data. ABL frames learning as a dual task: learning the clean portion and the backdoor portion of the data.

The authors identify two inherent weaknesses of backdoor attacks: 1) models learn backdoored data faster than clean data, with stronger attacks leading to quicker convergence on backdoored data; and 2) the backdoor task is tied to a specific class, the backdoor target class. Based on these weaknesses, ABL adds a two-stage gradient ascent mechanism to standard training. In the early stage, the mechanism helps isolate backdoor examples; in the later stage, it breaks the correlation between backdoor examples and the target class.

ABL is evaluated on multiple benchmark datasets against ten state-of-the-art attacks. The results indicate that models trained with ABL on backdoor-poisoned data perform comparably to models trained on purely clean data. Code is available at https://github.com/bboylyg/ABL.
Created on 26 Oct. 2023
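
The summary describes the later stage only as breaking the correlation between backdoor examples and the target class. One plausible way to express that as an objective, offered here as a minimal sketch rather than the authors' implementation, is to keep minimizing the loss on examples believed clean while maximizing it on the isolated examples. The function name and the equal weighting of the two terms are assumptions.

```python
def unlearning_objective(clean_losses, isolated_losses):
    """Mean loss on presumed-clean examples minus mean loss on isolated ones.

    Minimizing this value is gradient descent on the clean portion and
    gradient ascent on the isolated (suspected backdoor) portion.
    The equal weighting of the two terms is an assumption of this sketch.
    """
    clean_term = sum(clean_losses) / len(clean_losses) if clean_losses else 0.0
    isolated_term = (sum(isolated_losses) / len(isolated_losses)
                     if isolated_losses else 0.0)
    return clean_term - isolated_term


# Toy values: the objective decreases as the model fits the clean examples
# better (smaller clean loss) or fits the isolated examples worse
# (larger isolated loss).
before = unlearning_objective([1.0, 3.0], [0.5, 1.5])
after = unlearning_objective([1.0, 3.0], [2.5, 3.5])
```

With an empty isolated set the objective reduces to the ordinary mean training loss, so the sketch degrades to standard training when nothing has been flagged.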
