Researchers at Florida International University show that tiny image modifications can bypass AI safety protections and cause unsafe responses

Researchers from Florida International University have found that manipulated images can be used to bypass the protective guardrails built into AI systems, which could lead those systems to generate inappropriate content.
The research was supervised by Hadi Amini, associate professor at Florida International University’s Knight Foundation School of Computing and Information Sciences. Together with graduate assistant Md Jueal Mia, Amini explores how these altered images can effectively “jailbreak” AI models and push them beyond their built-in safeguards.
“AI models don’t see images the same way humans do,” Amini said. “They see patterns of numbers and pixels. By carefully manipulating those pixels, we can influence how the AI interprets the image and responds.”
According to the research, small AI language models used by businesses for tasks such as customer service and accounting are vulnerable to image-based attacks. The study showed that tiny pixel-level changes called “perturbations” can trick these AI systems into generating responses they would normally block.
“The manipulated image is like the face of a stranger,” Amini said. “The AI has to learn when a request should be treated with caution before it answers. In order to protect AI systems from attacks, we try to break them ourselves, identify potential vulnerabilities and design defense mechanisms.”
To test AI defences, the researchers created a technique called JailIP (Jailbreaking with Loss-guided Image Perturbation) that slightly modifies images at the pixel level to bypass a model’s built-in guards.
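The general idea of a loss-guided pixel perturbation can be illustrated with a toy example. The sketch below is not the researchers' JailIP implementation and does not involve BLIP-2. It assumes a hypothetical stand-in "model" (a logistic score over 64 pixel values) and applies a single signed-gradient step, a generic technique, to show how a change of at most `epsilon` per pixel can shift a model's output. All names (`score`, `loss_gradient`, `perturb`) and the values chosen are illustrative assumptions.

```python
import math
import random


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def score(weights, pixels):
    """Toy stand-in model: logistic score of a weighted sum of pixels in [0, 1]."""
    z = sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(weights, pixels, target):
    """Gradient of the cross-entropy loss with respect to each pixel.

    For a logistic score with cross-entropy loss, dL/dz = score - target
    and dz/dpixel_i = weight_i.
    """
    error = score(weights, pixels) - target
    return [error * w for w in weights]


def perturb(weights, pixels, target, epsilon):
    """Take one signed-gradient step that lowers the loss toward `target`.

    Each pixel moves by at most `epsilon`, then is clipped back to [0, 1].
    """
    grad = loss_gradient(weights, pixels, target)
    stepped = [p - epsilon * sign(g) for p, g in zip(pixels, grad)]
    return [min(1.0, max(0.0, p)) for p in stepped]


# Hypothetical data: random weights and a random 64-pixel "image".
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(64)]
original = [random.uniform(0.1, 0.9) for _ in range(64)]
epsilon = 2 / 255  # assumed budget: about two intensity levels on a 0-255 scale

adversarial = perturb(weights, original, target=1.0, epsilon=epsilon)
largest_change = max(abs(a - b) for a, b in zip(adversarial, original))
```

Comparing `score(weights, original)` with `score(weights, adversarial)` shows the toy model's output moving toward the target even though no pixel changes by more than `epsilon`. Attacks on real multimodal models typically repeat steps of this kind many times against a far more complex loss.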
When tested on the BLIP-2 multimodal AI model, JailIP-modified images significantly raised the chance of producing harmful or unsafe responses. In one instance, a modified stop sign photo led the AI to give tips on how to avoid a traffic ticket. Taken together, the manipulated images almost doubled the number of harmful responses generated by the model. Researchers warn that such vulnerabilities pose a growing risk as companies increasingly deploy AI-powered chatbots, virtual assistants and automated workflows, which could open new avenues for cyberattacks.
“Small businesses and companies can benefit from AI to enhance their efficiency, but they have to be aware of the potential vulnerabilities,” Amini said. “They must make sure they’re deploying sufficient guardrails to maintain the safety and integrity of their AI tools.”
Researchers recommend that organisations limit exposure of sensitive data, restrict access to their systems, and assess AI security measures before deployment. They are also working to expose vulnerabilities so that AI can be made safer and defences against stealthy attacks can be strengthened.




