OpenAI introduces new benchmark to ensure language models produce more accurate answers

OpenAI has introduced a new measurement benchmark to ensure that language models provide more accurate answers based on verified facts.

The company in an announcement on 30 October said the new benchmark known as SimpleQA will aid in measuring the factuality of language models, with a focus on getting models to correctly answer short, fact-seeking questions.

Solving the problem of factuality

Anyone who has used generative AI chatbots such as ChatGPT knows that they often give inaccurate or factually incorrect answers.

This is because training models that produce factually correct responses is challenging in the AI space. 

As a result, current language models often produce false outputs or answers unsubstantiated by evidence, a problem known as “hallucinations”. 

It is also difficult to measure the factuality of such outputs, but OpenAI seeks to address this with SimpleQA. The company is focusing on measuring the factuality of answers to short, fact-seeking questions rather than long-form responses.

While this narrows the scope of the new benchmark, it makes the factuality of responses easier to track.

The dataset, according to OpenAI, has high correctness and diversity, a design that is challenging for frontier models, and a good researcher UX.

The process

To build SimpleQA, OpenAI hired AI trainers to browse the web and create short, fact-seeking questions and corresponding answers. 

Questions included in the dataset must meet a strict set of criteria, one of which is that a second, independent AI trainer answered each question without seeing the original response. Only questions where both AI trainers’ answers agreed were included.

To verify the quality of the dataset, a third AI trainer answered a random sample of 1,000 questions; those answers matched the original agreed answers 94.4% of the time, a 5.6% disagreement rate, indicating a high level of factuality.
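The two-stage quality check described above can be sketched in code. The snippet below is a minimal, hypothetical illustration of agreement-based filtering and sample verification; the function names, normalization, and sample data are the author's assumptions, not OpenAI's actual pipeline.

```python
# Hypothetical sketch of agreement-based dataset filtering and
# sample verification; illustrative only, not OpenAI's pipeline.
import random


def normalize(answer: str) -> str:
    """Crude normalization so trivially different answers still match."""
    return answer.strip().lower()


def filter_by_agreement(candidates: list[dict]) -> list[dict]:
    """Keep only questions where two independent trainers agreed."""
    return [
        q for q in candidates
        if normalize(q["trainer1_answer"]) == normalize(q["trainer2_answer"])
    ]


def verify_sample(dataset: list[dict], third_answers: dict,
                  sample_size: int = 1000, seed: int = 0) -> float:
    """Estimate dataset quality from a third trainer's random sample.

    Returns the fraction of sampled questions where the third trainer's
    answer matches the agreed answer (analogous to the 94.4% figure).
    """
    rng = random.Random(seed)
    sample = rng.sample(dataset, min(sample_size, len(dataset)))
    matches = sum(
        normalize(third_answers[q["question"]])
        == normalize(q["trainer1_answer"])
        for q in sample
    )
    return matches / len(sample)


# Illustrative candidate questions from two trainers.
candidates = [
    {"question": "What year was OpenAI founded?",
     "trainer1_answer": "2015", "trainer2_answer": "2015"},
    {"question": "Who wrote Hamlet?",
     "trainer1_answer": "Shakespeare", "trainer2_answer": "Marlowe"},
]

dataset = filter_by_agreement(candidates)  # only the agreed question survives
```

The key design point is that the second trainer answers without seeing the first trainer's response, so agreement is evidence of correctness rather than copying.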
