Summary:
UCLA researchers in the Department of Electrical and Computer Engineering have developed a novel preprocessing algorithm that purifies imperceptibly poisoned training datasets, preventing the misclassifications such poisons cause while maintaining image quality.
Background:
Large datasets are essential to the effectiveness of machine learning models, serving as the primary material for training algorithms. Unfortunately, even a small number of poisoned images covertly inserted into these sets can cause significant harm, resulting in misclassifications and compromising the integrity of the entire model. Such poisoned images have proven remarkably adept at manipulating deep learning systems and pose a serious threat across sectors heavily reliant on artificial intelligence (AI), such as healthcare, security, finance, and autonomous vehicles.
Poisoning attacks typically come in two forms: triggered and triggerless. Triggered attacks embed imperceptible trigger patterns within training data, causing misclassification of test-time samples that contain the concealed trigger. In contrast, triggerless attacks introduce subtle perturbations to individual training images, quietly shifting the model's learned behavior and producing incorrect classifications without any trigger at test time. Given AI's increasingly critical role, safeguarding model integrity is paramount. Existing defense strategies often compromise performance, offer limited protection, or impose excessive computational burdens. There is thus an urgent need for robust defense mechanisms capable of countering imperceptible poisons without sacrificing accuracy or efficiency. Addressing this need is crucial for ensuring the continued success of AI models in real-world applications and instilling trust in these machine learning models.
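For readers unfamiliar with these attack classes, the following sketch contrasts the two forms in simplified code. The patch-stamping and perturbation-bounding details, the function names, and the 8/255 budget are illustrative assumptions, not details of the UCLA work:

```python
import numpy as np


def triggered_poison(image: np.ndarray, trigger: np.ndarray) -> np.ndarray:
    """Triggered attack (illustrative): stamp a small, low-contrast trigger
    patch into one corner of a training image. At test time, any input
    carrying the same patch is steered toward the attacker's target class."""
    poisoned = image.copy()
    h, w = trigger.shape[:2]
    poisoned[-h:, -w:] = np.clip(poisoned[-h:, -w:] + trigger, 0.0, 1.0)
    return poisoned


def triggerless_poison(image: np.ndarray, delta: np.ndarray,
                       epsilon: float = 8 / 255) -> np.ndarray:
    """Triggerless attack (illustrative): add a bounded, imperceptible
    perturbation to a training image, crafted so that a model trained on it
    misclassifies specific target samples without any visible trigger."""
    delta = np.clip(delta, -epsilon, epsilon)
    return np.clip(image + delta, 0.0, 1.0)
```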
Innovation:
UCLA researchers have developed a preprocessing algorithm that purifies adversarially poisoned training datasets to enable robust deep learning classifier training. This purifying energy-based model (PureEBM) uses the learned energy landscape of an energy-based model (EBM) to identify poisoned data points, which are characterized by high energy, and to purify them toward the natural data manifold. PureEBM does this through a stochastic iterative process that gradually adjusts each image toward the distribution of clean, natural data while preserving image quality. Notably, the method requires no prior knowledge of the poison attack or of the downstream classifier to be effective. The key innovation lies in its ability to purify imperceptible attacks, counter both triggered and triggerless poisons, and uphold image integrity throughout the purification process. This technology thus has the potential to ensure AI models maintain integrity even when their training data contains poisoned images.
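As a rough illustration only, such an energy-based purification step can be sketched as stochastic (Langevin-style) dynamics over a learned energy function. The energy model, step size, noise scale, and iteration count below are hypothetical placeholders, not the published PureEBM configuration:

```python
import torch


def purify(images: torch.Tensor, energy_fn, steps: int = 100,
           step_size: float = 1e-3, noise_scale: float = 1e-2) -> torch.Tensor:
    """Purification sketch: repeatedly nudge each image down the learned
    energy landscape toward the natural data manifold, adding small noise at
    each step. Poisoned images sit at higher energy, so the dynamics wash out
    the adversarial perturbation while leaving clean image content largely
    intact."""
    x = images.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        energy = energy_fn(x).sum()              # scalar energy of the batch
        grad = torch.autograd.grad(energy, x)[0]  # gradient w.r.t. the images
        with torch.no_grad():
            x = x - step_size * grad + noise_scale * torch.randn_like(x)
            x = x.clamp(0.0, 1.0)                # keep pixels in a valid range
        x = x.detach()
    return x
```

Consistent with its role as a preprocessing step, such a pass would be run once over the training set before classifier training, leaving the downstream classifier and training loop unchanged.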
Potential Applications:
• Machine Learning Systems
• Network Security Systems for protection against cyber attacks
• Online healthcare systems to protect medical data and patient confidentiality
• Online information services for protecting the integrity and privacy of customer/financial data
• Watermarking and copyright protection
• Defense applications
Advantages:
• Effectively purifies state-of-the-art poison attacks while preserving high natural accuracy, with no prior knowledge of the poison attack required
• Targets both triggerless and triggered poison attacks without significant time investments
• Preserves image quality following EBM purification, ensuring high quality training data
Development-To-Date: First successful demonstration of the invention completed June 2023.