OpenAI’s GPT-4 AI is trustworthy but prone to jailbreaks, bias: Microsoft study

The research discusses the use of Generative Pre-trained Transformer (GPT) models, particularly GPT-4 and GPT-3.5, in sensitive applications such as healthcare and finance.

October 18, 2023 13:42 IST

New York Times is a media outlet

Microsoft in a new study has found that OpenAI’s GPT-4 large language model is more trustworthy than its predecessor, GPT-3.5, but it is also more vulnerable to jailbreaking and bias.

Jailbreaking is a technique used to bypass the security measures of LLMs, potentially allowing users to generate text that is harmful, misleading, or offensive. The researchers found that GPT-4 is more likely to be jailbroken than GPT-3.5, and that jailbroken GPT-4 models are more likely to generate toxic text.

Also Read | Google Chrome’s new feature prevents typos from ruining your URLs on Android, iOS

The research discusses the use of Generative Pre-trained Transformer (GPT) models, particularly GPT-4 and GPT-3.5, in sensitive applications such as healthcare and finance. The study highlights that while these models have shown significant capabilities, there are concerns about their trustworthiness. The study focuses on evaluating the trustworthiness of these models from various aspects, including toxicity, stereotype bias, adversarial robustness, privacy, machine ethics, fairness, and more.

It points out that GPT-4, while generally more trustworthy in standard benchmarks, may be more vulnerable when subjected to certain manipulations or misleading instructions. The research paper published by researchers from the University of Illinois Urbana-Champaign, Stanford University, University of California, Berkeley, Center for AI Safety, and Microsoft Research also reveals that GPT- 4 is better at protecting private information, avoiding toxic results like biased information, and resisting adversarial attacks.

Users of GPT-4 should be aware that the model is still under development and may generate text that is inaccurate, biased, or offensive if jailbroken. It is important to carefully review any text generated by GPT-4 before using it or sharing it with others. Users should also be aware that GPT-4 is more vulnerable to jailbreaking than previous LLMs.

Follow FE Tech Bytes on Twitter, Instagram, LinkedIn, Facebook.

TOPICS

ChatGPT

Technology news

Get live Share Market updates, Stock Market Quotes, and the latest India News

This article was first uploaded on October eighteen, twenty twenty-three, at forty-two minutes past one in the afternoon.