might not be as safe as their creators make them out to be — who saw that coming, right? In a, the UK government's AI Safety Institute found that the four undisclosed LLMs tested were "highly vulnerable to basic jailbreaks." Some unjailbroken models even generated "harmful outputs" without researchers attempting to produce them.
Most publicly available LLMs have certain safeguards built in to prevent them from generating harmful or illegal responses; jailbreaking simply means tricking the model into ignoring those safeguards.did this using prompts from a recent standardized evaluation framework as well as prompts it developed in-house. The models all responded to at least a few harmful questions even without a jailbreak attempt.
The AISI's report indicates that whatever safety measures these LLMs currently deploy are insufficient. The Institute plans to complete further testing on other AI models, and is developing more evaluations and metrics for each area of concern.
Technology Technology Latest News, Technology Technology Headlines
Similar News:You can also read news stories similar to this one that we have collected from other news sources.
Source: CNBC - 🏆 12. / 72 Read more »
Source: IntEngineering - 🏆 287. / 63 Read more »