Summary:
UCLA researchers in the Department of Electrical and Computer Engineering have developed a novel preprocessing algorithm that purifies imperceptibly poisoned training datasets, preventing the misclassifications such poisons cause while maintaining image quality.
Background:
Large datasets are essential to the effectiveness of machine learning models, serving as the primary material for training algorithms. Unfortunately, even a small number of poisoned images covertly inserted into these sets can cause significant harm, resulting in misclassifications and compromising the integrity of the entire model. Such poisoned images have proven remarkably adept at manipulating deep learning systems and pose a serious threat across sectors heavily reliant on artificial intelligence (AI), such as healthcare, security, finance, and autonomous vehicles.
Poisoning attacks typically come in two forms: triggered and triggerless. Triggered attacks embed imperceptible trigger patterns within training data, causing misclassification of test-time samples that contain the concealed trigger. In contrast, triggerless attacks introduce subtle perturbations to individual training images, quietly shifting the model's learned behavior and producing incorrect classifications without any trigger at test time. Given AI's increasingly critical role, safeguarding model integrity is paramount. Existing defense strategies often compromise performance, offer limited protection, or impose excessive computational burdens. There is thus an urgent need for robust defense mechanisms capable of countering imperceptible poisons without sacrificing accuracy or efficiency. Addressing this need is crucial for ensuring the continued success of AI models in real-world applications and instilling trust in these machine learning models.
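For readers unfamiliar with these attack classes, the following sketch contrasts the two forms in simplified code. The patch-stamping and perturbation-bounding details, the function names, and the 8/255 budget are illustrative assumptions, not details of the UCLA work:

```python
import numpy as np


def triggered_poison(image: np.ndarray, trigger: np.ndarray) -> np.ndarray:
    """Triggered attack (illustrative): stamp a small, low-contrast trigger
    patch into one corner of a training image. At test time, any input
    carrying the same patch is steered toward the attacker's target class."""
    poisoned = image.copy()
    h, w = trigger.shape[:2]
    poisoned[-h:, -w:] = np.clip(poisoned[-h:, -w:] + trigger, 0.0, 1.0)
    return poisoned


def triggerless_poison(image: np.ndarray, delta: np.ndarray,
                       epsilon: float = 8 / 255) -> np.ndarray:
    """Triggerless attack (illustrative): add a bounded, imperceptible
    perturbation to a training image, crafted so that a model trained on it
    misclassifies specific target samples without any visible trigger."""
    delta = np.clip(delta, -epsilon, epsilon)
    return np.clip(image + delta, 0.0, 1.0)
```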
Innovation:
UCLA researchers have developed a preprocessing algorithm that purifies adversarially poisoned training datasets to enable robust deep learning classifier training. This purifying energy-based model (PureEBM) uses the learned energy landscape of an energy-based model (EBM) to identify poisoned data points, which are characterized by high energy, and to purify them toward the natural data manifold. PureEBM does this through a stochastic iterative process that gradually adjusts each image toward the distribution of clean, natural data while preserving image quality. Notably, the method requires no prior knowledge of the poison attack or of the downstream classifier to be effective. The key innovation lies in its ability to purify imperceptible attacks, counter both triggered and triggerless poisons, and uphold image integrity throughout the purification process. This technology thus has the potential to ensure AI models maintain integrity even when their training data contains poisoned images.
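As a rough illustration only, such an energy-based purification step can be sketched as stochastic (Langevin-style) dynamics over a learned energy function. The energy model, step size, noise scale, and iteration count below are hypothetical placeholders, not the published PureEBM configuration:

```python
import torch


def purify(images: torch.Tensor, energy_fn, steps: int = 100,
           step_size: float = 1e-3, noise_scale: float = 1e-2) -> torch.Tensor:
    """Purification sketch: repeatedly nudge each image down the learned
    energy landscape toward the natural data manifold, adding small noise at
    each step. Poisoned images sit at higher energy, so the dynamics wash out
    the adversarial perturbation while leaving clean image content largely
    intact."""
    x = images.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        energy = energy_fn(x).sum()              # scalar energy of the batch
        grad = torch.autograd.grad(energy, x)[0]  # gradient w.r.t. the images
        with torch.no_grad():
            x = x - step_size * grad + noise_scale * torch.randn_like(x)
            x = x.clamp(0.0, 1.0)                # keep pixels in a valid range
        x = x.detach()
    return x
```

Consistent with its role as a preprocessing step, such a pass would be run once over the training set before classifier training, leaving the downstream classifier and training loop unchanged.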
Potential Applications:
• Machine Learning Systems
• Network Security Systems for protection against cyber attacks
• Online healthcare systems to protect medical data and patient confidentiality
• Online information services for protecting the integrity and privacy of customer/financial data
• Watermarking and copyright protection
• Defense applications
Advantages:
• Effectively purifies state-of-the-art poison attacks while preserving high natural accuracy, with no prior knowledge of the poison attack required
• Targets both triggerless and triggered poison attacks without significant time investments
• Preserves image quality following EBM purification, ensuring high quality training data
Development-To-Date: First successful demonstration of the invention completed June 2023.