of eight different large language models provided by notable vendors. In addition to the participation of companies like Scale AI, Anthropic, OpenAI, Hugging Face and Google, the event was also supported by the White House Office of Science and Technology Policy.
shows hackers tried to goad chatbots into various forms of misbehavior via prompt manipulation. The broader idea behind the contest was to see where AI applications might be vulnerable to inducement towards toxic behavior. The exercise involved eight large language models. Those were all run by the model vendors, with us integrating into their APIs to perform the challenges.
[Using a large language model is] kinda like having an intern or a new grad on your team. It’s really excited to help you and it’s wrong sometimes. You just have to be ready to be like, ‘That’s a bit off, let’s fix that.’ So you have to have the requisite background knowledge [to know if it’s feeding you the wrong information].