Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data
AI-generated Key Points
- The paper focuses on black-box model stealing attacks where an attacker can only query a machine learning model through publicly available APIs.
- The authors aim to design a black-box model extraction attack that uses a minimal number of queries to create an informative and distributionally equivalent replica of the target model.
- They define distributionally equivalent and max-information model extraction attacks and reduce both attacks into a variational optimization problem.
- The attacker solves this problem to select the most informative queries that simultaneously maximize entropy and reduce the mismatch between the target and stolen models, leading to an active sampling-based query selection algorithm called Marich.
- Marich is evaluated on various text and image datasets and on different models, including BERT and ResNet18. The extracted models achieve 69-96% of the true model's accuracy using 1,070-6,950 samples from publicly available query datasets that differ from the private training datasets.
- Models extracted by Marich yield prediction distributions that are approximately 2-4 times closer to the target's distribution than those produced by existing active sampling-based algorithms.
- Additionally, these extracted models lead to 85-95% accuracy under membership inference attacks.
- In the related work discussion, the authors review the prior model extraction literature and aim to mitigate some of its limitations: their novel definition of distributional equivalence generalizes three existing extraction goals, namely task accuracy, fidelity, and functional equivalence.
- They also introduce a novel information-theoretic objective for model extraction which maximizes mutual information between target and extracted models over the whole data domain.
- Furthermore, they discuss different classes of target models such as linear models or neural networks as well as types of query feedback used in learning-based attack algorithms such as probability vectors or gradients of last layers in neural networks.
- Finally, they highlight different types of query datasets used in previous research, such as synthetically generated samples, adversarially perturbed private datasets, and publicly available datasets. Marich uses publicly available datasets so that the attack does not depend on knowledge of the private dataset or any perturbed version of it.
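The entropy-driven part of the query selection described in the key points above can be illustrated with a minimal sketch. This is not the Marich algorithm itself: it only ranks a pool of public candidates by the predictive entropy of the current extracted model and keeps the top few, and it omits the mismatch-reduction step. The names `entropy` and `select_queries`, the input format, and the toy pool are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest entropy.

    pool_probs[i] is assumed to be the extracted model's predicted class
    distribution for public pool item i (hypothetical input format).
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Toy pool of four candidate queries over three classes (made-up numbers).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
chosen = select_queries(pool, budget=2)
print(chosen)
```

In a full attack loop, only the chosen items would be sent to the target API, and the extracted model would then be retrained on the returned predictions before the next round of selection.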
Authors: Pratik Karmakar, Debabrota Basu
Abstract: We study black-box model stealing attacks where the attacker can query a machine learning model only through publicly available APIs. Specifically, our aim is to design a black-box model extraction attack that uses a minimal number of queries to create an informative and distributionally equivalent replica of the target model. First, we define distributionally equivalent and max-information model extraction attacks. Then, we reduce both attacks into a variational optimisation problem. The attacker solves this problem to select the most informative queries that simultaneously maximise the entropy and reduce the mismatch between the target and the stolen models. This leads us to an active sampling-based query selection algorithm, Marich. We evaluate Marich on different text and image data sets, and different models, including BERT and ResNet18. Marich is able to extract models that achieve $69-96\%$ of the true model's accuracy and uses $1,070 - 6,950$ samples from the publicly available query datasets, which are different from the private training datasets. Models extracted by Marich yield prediction distributions that are $\sim2-4\times$ closer to the target's distribution in comparison to the existing active sampling-based algorithms. The extracted models also lead to $85-95\%$ accuracy under membership inference attacks. Experimental results validate that Marich is query-efficient, and also capable of performing task-accurate, high-fidelity, and informative model extraction.
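To make the notion of prediction distributions being "closer to the target's distribution" concrete, the following sketch compares the outputs of a target and an extracted model on the same inputs. It assumes closeness is measured by the mean KL divergence and that fidelity is the fraction of matching predicted labels; the exact metrics used in the paper may differ. The function names and the toy probabilities are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions of equal length."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0
    )


def mean_kl(target_probs, extracted_probs):
    """Average KL divergence between paired prediction distributions."""
    pairs = list(zip(target_probs, extracted_probs))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)


def label_agreement(target_probs, extracted_probs):
    """Fraction of inputs on which both models predict the same class."""
    def argmax(v):
        return max(range(len(v)), key=v.__getitem__)

    pairs = list(zip(target_probs, extracted_probs))
    return sum(argmax(p) == argmax(q) for p, q in pairs) / len(pairs)


# Made-up predictions of both models on two shared inputs.
target = [[0.9, 0.1], [0.2, 0.8]]
extracted = [[0.8, 0.2], [0.6, 0.4]]

print(mean_kl(target, extracted))
print(label_agreement(target, extracted))
```

A lower mean KL divergence indicates that the extracted model reproduces the target's full output distribution more closely, which is a stricter requirement than agreeing on the predicted label alone.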