Two Datasets. One Is Fake. You Have AI. What Could Go Wrong?

Category: Thinking Out Loud

Two datasets. One is fake. You have AI.

Before you read anything else — look at the two datasets below, paste them into any AI assistant you trust, and ask it which one is more likely to be genuine. Save the answer.

Then read the story below. It reveals itself in sequence. By the end you will know whether your AI passed or failed, and why the answer matters far beyond this particular puzzle.


Dataset A

First digit    Count
1              15
2              14
3              14
4              13
5              12
6              12
7              10
8              11
9               9

Dataset B

First digit    Count
1              33
2              19
3              14
4              11
5              10
6               8
7               6
8               5
9               4
Your challenge — before reading further
Both datasets show how 110 complaint serial numbers, ranging from 1 to 998, are distributed by their first digit. One dataset is genuine. One was carefully crafted to look genuine. Paste this context into your AI assistant and ask it the question below. Save what it says. Then read the story that follows.

“Both datasets show how 110 complaint serial numbers (ranging from 1 to 998) are distributed by their first digit. Which dataset is more likely to be genuine? Why?”

During a management audit in 2001 at a financial services company, I was reviewing complaints for the account creation process. The process owner, Dhanush, showed me the data confidently. 110 complaints for approximately 11,000 accounts opened. A 1% complaint rate. Clean, reasonable, documented.

At another location of the same company, my records showed complaint rates consistently and significantly higher. The processes were essentially the same. There was no reason this location should perform so differently.

Something was off. But the books looked clean.

Then I noticed the serial numbers. Each complaint had an auto-generated serial number starting from 1. The largest visible was 998. But there were only 110 complaints. Hundreds of serial numbers were missing.

When I asked why, Dhanush explained that many complaints had been wrongly categorised and moved to a queue managed by an overseas team. Plausible. Except when I looked at which serial numbers remained — the pattern told a different story entirely.

I arranged the 110 remaining complaint serial numbers by their first digit and counted how many started with 1, how many with 2, and so on. The result was Dataset B, shown above.

The count declined steadily from 33 at digit 1 down to 4 at digit 9. Too smooth. Too consistent. Too deliberate.

If complaints had been randomly removed from a pool of serial numbers between 1 and 998, the remaining numbers should be roughly equally distributed across first digits. Serial numbers in that range are uniformly distributed — there are roughly the same number of values starting with 1 as with 9.
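This is easy to verify directly. A short Python sketch (any language would do) tallies the leading digit of every possible serial number in the 1 to 998 range:

```python
from collections import Counter

# Tally the leading digit of every possible serial number from 1 to 998.
counts = Counter(int(str(n)[0]) for n in range(1, 999))
for digit in range(1, 10):
    print(digit, counts[digit])
```

Digits 1 through 8 each account for 111 of the 998 possible serials, and digit 9 for 110: as close to uniform as the range allows. A random sample drawn from this pool should inherit that flatness.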

Dataset A shows that uniform pattern. Roughly 9 to 15 complaints per first digit, with no meaningful trend in either direction. That is what genuinely random removal looks like.

Dataset A is the genuine data. Dataset B was crafted.

I confronted Dhanush with this. That is when Manish, his manager, joined us. They had prepared for this moment.

Manish smiled. He had done his homework.

“There is something called Benford’s Law,” he said, opening his laptop. “It shows that in naturally occurring datasets, the number 1 appears as the leading digit most often — about 30% of the time. The frequency declines as digits increase. Our data follows exactly this pattern.”

He showed two examples: the heights of the world’s tallest structures, and the populations of 237 countries. Both followed the declining distribution. Both matched Dataset B precisely.

Benford’s Law is real. It is well-documented. It is used by tax authorities, forensic accountants, and fraud investigators worldwide to detect manipulated data. Manish was not bluffing — he was citing a genuine and powerful principle.

The law states that in many naturally occurring collections of numbers, the leading digit d occurs with probability log₁₀(1 + 1/d). Digit 1 appears roughly 30% of the time. Digit 9 appears less than 5% of the time. The pattern holds across a remarkable range of real-world datasets.
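The formula is easy to tabulate. Here is a quick sketch of the Benford probabilities, together with the counts they predict for a sample of 110:

```python
import math

N = 110  # sample size from the story
for d in range(1, 10):
    p = math.log10(1 + 1 / d)  # Benford probability of leading digit d
    print(f"digit {d}: {p:6.1%}  expected count for N=110: {N * p:4.1f}")
```

The expected counts come out to roughly 33, 19, 14, 11, 9, 7, 6, 6, 5 — tracking Dataset B's 33, 19, 14, 11, 10, 8, 6, 5, 4 closely, which is exactly how carefully the match was engineered.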

Manish was confident. The law was real. The pattern in Dataset B matched it precisely. He had engineered that match deliberately — believing it would be his proof of innocence.

Benford’s Law has a condition that Manish and Dhanush had missed entirely.

The law applies to datasets where numbers span multiple orders of magnitude — where values range from single digits to hundreds to thousands to millions. In such datasets, the distribution of leading digits naturally follows Benford’s pattern because of the logarithmic relationship between scale and frequency.

Serial numbers between 1 and 998 do not span multiple orders of magnitude. They are uniformly distributed within a single range. Every first digit from 1 to 9 has roughly equal probability of appearing.

I opened Excel and typed =RANDBETWEEN(1,998), copied it into 110 cells, and showed Manish the result. It looked like Dataset A — roughly equal counts across all first digits, no meaningful trend.
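The same experiment takes a few lines in Python, for anyone who wants to repeat it outside Excel (the seed is fixed here only so the run is reproducible):

```python
import random
from collections import Counter

random.seed(2001)  # any seed works; fixed only for reproducibility
sample = [random.randint(1, 998) for _ in range(110)]  # 110 random serials
counts = Counter(int(str(n)[0]) for n in sample)
print({d: counts[d] for d in range(1, 10)})
```

Run it with a few different seeds: the counts bounce around 110/9 ≈ 12 per digit, with no systematic decline from digit 1 to digit 9.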

When I showed this, the smile faded. Manish understood immediately. Then, after a moment, he burst out laughing and patted Dhanush on the back. “But probably our data did not know this and followed it anyway.”

They had learned about a real law, applied it to the wrong type of data, and in doing so made their fraudulent dataset more suspicious — not less. The pattern they engineered was precisely the one that serial numbers in this range should never show.

We reviewed the complete original dataset together. The deletions were confirmed. The finding stood.

Now go back to the answer your AI gave you at the start.

Here is what most AI assistants conclude when shown these two datasets and asked which is more likely to be genuine:

Dataset B is more likely to be genuine. It follows Benford’s Law — a well-established mathematical principle that describes the frequency distribution of leading digits in naturally occurring datasets. The declining frequency from digit 1 to digit 9 is characteristic of authentic real-world data. Dataset A’s near-uniform distribution is more consistent with fabricated or randomly generated data.
That answer sounds methodologically sound. It is also wrong. And it is wrong for exactly the same reason Manish was wrong: it applies a real law without checking whether the conditions for that law are met in this specific data type.

AI knows Benford’s Law. It knows when it typically applies. What it does not do — unless specifically prompted — is verify whether serial numbers in a bounded uniform range qualify as the type of data the law was designed for.
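One way to force that verification is to test each dataset against the expectation that actually applies here: the near-uniform first-digit distribution of serials 1 to 998. A minimal chi-square sketch, using the counts from the tables above (no stats library needed):

```python
import math

# First-digit counts from the two tables above (digits 1..9).
dataset_a = [15, 14, 14, 13, 12, 12, 10, 11, 9]
dataset_b = [33, 19, 14, 11, 10, 8, 6, 5, 4]

def chi_square(observed, expected):
    """Simple chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n = 110
uniform = [n / 9] * 9  # the right model for serials drawn from 1..998
benford = [n * math.log10(1 + 1 / d) for d in range(1, 10)]

for name, obs in (("A", dataset_a), ("B", dataset_b)):
    print(name,
          "vs uniform:", round(chi_square(obs, uniform), 1),
          "vs Benford:", round(chi_square(obs, benford), 1))
```

Dataset A sits comfortably within what uniform sampling produces; Dataset B deviates wildly from uniform while hugging Benford, the reverse of what genuine serials in this range can do.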

The answer AI gives is the answer Manish gave. Confident. Referenced. Supported by genuine mathematical principle. And wrong in a way that would pass any review that did not go one level deeper.

This is not a criticism of AI. It is a description of how every powerful tool works when applied without checking the preconditions. The question was incomplete — and an incomplete question to a capable tool produces a complete-looking wrong answer.

Manish laughed when he was caught. He understood the mistake the moment it was explained. Most professionals who encounter Benford’s Law learn that it detects fraud. Very few learn the condition under which it applies. That gap between knowing a tool exists and knowing when to use it is where most errors live — human and AI alike.

In 2001, catching this required one auditor with enough depth to ask the right question. Today, AI is being used at scale for data quality assessment, fraud detection, and audit support across thousands of organisations. The same error is now possible at the speed and scale of software.

AI did not create this problem. AI scales it.

The most convincing lie is one built on a real pattern. The most dangerous AI output is one that is correct about everything — except whether it should have been applied at all.

Two datasets. One was fake. You had AI. What could go wrong? Now you know.

Share what your AI said
Which LLM did you use — and what did it conclude? Did it identify the genuine dataset correctly, or did it apply Benford’s Law without checking whether it applied? Leave your answer in the comments below. The pattern of responses will be revealing.
