A team of computer scientists, engineers, mathematicians and cognitive scientists has developed an open-source evaluation platform called CheckMate, which allows human users to interact with and evaluate the performance of large language models (LLMs).
The researchers suggest that models which communicate uncertainty, respond well to user corrections, and can provide a concise rationale for their recommendations make better assistants. They also caution that human users of LLMs should verify outputs carefully, given the models' current shortcomings. The findings could be useful both in informing AI literacy training and in helping developers improve LLMs for a wider range of uses.
"When talking to mathematicians about LLMs, many of them fall into one of two main camps: either they think that LLMs can produce complex mathematical proofs on their own, or that LLMs are incapable of simple arithmetic," said co-first author Katie Collins from the Department of Engineering."Of course, the truth is probably somewhere in between, but we wanted to find a way of evaluating which tasks LLMs are suitable for and which they aren't.
"One of the things we found is the surprising fallibility of these models," said Collins."Sometimes, these LLMs will be really good at higher-level mathematics, and then they'll fail at something far simpler. It shows that it's vital to think carefully about how to use LLMs effectively and appropriately."