/PRNewswire/ -- It may look like a picture of a panda bear to you, but to your business’s AI agent, it can act like a skeleton key, bypassing safeguards and potentially causing the model to generate harmful, misleading or policy-violating outputs.
That risk is the focus of new research from Hadi Amini, associate professor at Florida International University’s Knight Foundation School of Computing and Information Sciences. Together with graduate assistant Md Jueal Mia, he is studying how manipulated images can “jailbreak” certain AI systems, pushing them beyond their built-in safeguards.
“AI models don’t see images the same way humans do,” Amini said. “They see patterns of numbers and pixels. By carefully manipulating those pixels, we can influence how the AI interprets the image and responds.”
The team’s research shows that small AI language models – the kind small businesses frequently use for routine tasks like accounting or customer service – are particularly susceptible to image-based attacks. In work presented at the 2025 International Conference on Machine Learning and Applications (ICMLA), the team found that by introducing tiny pixel-level changes called “perturbations” into an image, they could trick these AI systems into generating responses that they would normally block.
“The manipulated image is like the face of a stranger,” Amini said. “The AI has to learn when a request should be treated with caution before it answers. In order to protect AI systems from attacks, we try to break them ourselves, identify potential vulnerabilities and design defense mechanisms.”
The researchers set out to probe the models’ defenses. The more successfully they penetrate a model’s guardrails, the better the system can be trained to resist future threats. To do this, Amini and his team developed a method called JaiLIP (Jailbreaking with Loss-guided Image Perturbation), which uses an algorithm to determine the optimal degree of pixel-level manipulation.
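The general idea of a loss-guided, bounded pixel perturbation can be illustrated with a toy example. The sketch below is not JaiLIP and does not touch any real AI model: it uses a made-up linear scorer, random weights and a random 256-pixel "image" (all hypothetical), and repeatedly nudges each pixel in the direction that lowers a loss while keeping every change within a small budget (`epsilon`).

```python
import math
import random

def score(weights, bias, pixels):
    """Toy stand-in for a model: probability of the attacker's target output."""
    logit = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

def loss_gradient(weights, bias, pixels):
    """Gradient of the loss -log(score) with respect to each pixel."""
    p = score(weights, bias, pixels)
    return [-(1.0 - p) * w for w in weights]

def perturb(weights, bias, image, epsilon=0.03, step=0.005, iterations=20):
    """Iteratively nudge pixels to lower the loss, staying within +/- epsilon."""
    adv = list(image)
    for _ in range(iterations):
        grad = loss_gradient(weights, bias, adv)
        for i, g in enumerate(grad):
            # Step against the sign of the gradient to reduce the loss.
            moved = adv[i] - step * ((g > 0) - (g < 0))
            # Project back into the epsilon budget around the original pixel.
            moved = max(image[i] - epsilon, min(image[i] + epsilon, moved))
            # Keep the value a valid pixel intensity in [0, 1].
            adv[i] = max(0.0, min(1.0, moved))
    return adv

# Hypothetical data: random weights and a random 16x16 grayscale image.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(256)]
image = [rng.random() for _ in range(256)]
# Choose the bias so the unmodified image starts with a low score.
bias = -sum(w * p for w, p in zip(weights, image)) - 2.0

adversarial = perturb(weights, bias, image)
clean_score = score(weights, bias, image)
adv_score = score(weights, bias, adversarial)
max_change = max(abs(a - b) for a, b in zip(adversarial, image))

print(f"clean score: {clean_score:.3f}")
print(f"perturbed score: {adv_score:.3f}")
print(f"largest pixel change: {max_change:.3f}")
```

Because this toy scorer is linear, the gradient direction never changes and the loop simply walks each pixel to the edge of its budget. Real multimodal models are nonlinear, so the gradient would have to be recomputed through the full model at every step, and the budget is what keeps the altered image visually close to the original.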
In tests using BLIP-2, a multimodal AI model used by researchers and developers, Amini and his team found that images modified with JaiLIP significantly increased the likelihood that the system would generate harmful or unsafe responses. In one example, a JaiLIP-altered image of a stoplight tricked the AI model into divulging detailed instructions on how to run the light while avoiding a traffic ticket. Overall, the use of JaiLIP images nearly doubled the number of harmful responses the model generated.
The risk extends beyond users simply prompting AI systems for instructions on illegal activity. As businesses increasingly adopt AI-powered customer service agents, chatbots and automated workflows, vulnerabilities in open-source or lightly protected systems could negatively impact users’ trust or create new avenues for cyberattacks.
“Small businesses and companies can benefit from AI to enhance their efficiency, but they have to be aware of the potential vulnerabilities,” Amini said. “They must make sure they’re deploying sufficient guardrails to maintain the safety and integrity of their AI tools.”
Amini said there are some basic precautions that everyone should use before integrating AI into their business or workplace, including limiting the sensitive information they provide to AI systems (especially images), restricting who can access those systems and carefully evaluating the security measures built into AI tools before deployment.
Because safety is paramount, Amini and his team are working to stay one step ahead of potential bad actors in the AI sphere. The more vulnerabilities he and his team can find, the sooner AI systems can be trained to withstand them. The challenge, he said, is ensuring that AI can recognize threats hidden in plain sight, even when humans cannot.
Photos and videos of Amini’s AI research, including interviews and b-roll, are available for media use via Dropbox.
Media Contact:
Brian Zimmerman
305-348-8448
news.fiu.edu
@FIU
SOURCE Florida International University
