How To Not Be Wrong About AI

Greg Wilson

June 2026

http://third-bit.com/notwrong/

What This Talk Is About

Why Media Coverage Fails You

Empirical Software Engineering

The Question You Actually Need to Answer

For Example

Claims, Studies, and Evidence

Why "Productivity" Is Hard to Define

Construct Validity and Proxy Metrics

What You Can and Cannot Measure

The Big Three Mistakes

Bias and Baselines

Metrics That Mislead

Qualitative Methods: When and Why

Designing Good Interviews and Surveys

Thematic Analysis

Controlled Experiments

p-Values: What They Are and Are Not

Effect Size Matters

Most SE Experiments Are Underpowered

Observational Studies: Watching the World

Looking Where the Light Is

Reading Studies Critically

HARKing and p-Hacking

A Checklist for Evaluating a Study

Goal-Question-Metric

Starting Cheaply

Formative vs. Summative Evaluation

Finding Participants Without a Budget

Running a Think-Aloud Session

The RITE Method and What You Can Claim

So, What Do We Know?

What to Do Next

Sharing Results Responsibly

Thank You

start where you are · use what you have · help who you can