Pliny the Prompter says it typically takes him about 30 minutes to break the world’s most powerful artificial intelligence models. The pseudonymous hacker has manipulated Meta’s Llama 3 into sharing instructions for making napalm. He made Elon Musk’s Grok gush about Adolf Hitler. His own hacked version of OpenAI’s latest GPT-4o model, dubbed “Godmode GPT”, was banned by the start-up after it started advising on illegal activities.
Other variations have emerged, such as EscapeGPT, BadGPT, DarkGPT and Black Hat GPT, according to AI security group SlashNext. Some hackers use “uncensored” open-source models. For others, jailbreaking attacks — or getting around the safeguards built into existing LLMs — represent a new craft, with perpetrators often sharing tips in communities on social media platforms such as Reddit or Discord.