Attacking Vision-Language Computer Agents via Pop-ups

Failed to add items

Sorry, we are unable to add the item because your shopping cart is already at capacity.

Add to Cart failed.

Please try again later

Add to Wish List failed.

Please try again later

Remove from wishlist failed.

Please try again later

Adding to library failed

Please try again

Follow podcast failed

Please try again

Unfollow podcast failed

Please try again

Attacking Vision-Language Computer Agents via Pop-ups

Listen for free

View show details

About this listen

😈 Attacking Vision-Language Computer Agents via Pop-ups

This research paper examines vulnerabilities in vision-language models (VLMs) that power autonomous agents performing computer tasks. The authors show that these VLM agents can be easily tricked into clicking on carefully crafted malicious pop-ups, which humans would typically recognize and avoid. These deceptive pop-ups mislead the agents, disrupting their task performance and reducing success rates. The study tests various pop-up designs across different VLM agents and finds that even simple countermeasures, such as instructing the agent to ignore pop-ups, are ineffective. The authors conclude that these vulnerabilities highlight serious security risks and call for more robust safety measures to ensure reliable agent performance.

📎 Link to paper

No reviews yet

Get Started

Popular Lists

Explore Audible

Attacking Vision-Language Computer Agents via Pop-ups

Failed to add items

Add to Cart failed.

Add to Wish List failed.

Remove from wishlist failed.

Adding to library failed

Follow podcast failed

Unfollow podcast failed

Attacking Vision-Language Computer Agents via Pop-ups

About this listen