SQL
Wrong Sort Column Order
Run this query and look at the first several rows. Is the output ordered by island first, or by species first?
-- Show the species, island, and sex of all penguins,
-- sorted alphabetically by island and then by species within each island.
select species, island, sex
from penguins
order by species, island;
Show explanation
A query meant to sort penguins by island and then by species within each island
produces species-first ordering instead; the bug is listing the columns in the
wrong order in the order by clause. Teaches how order by applies columns from
left to right and why column order in that clause matters.
Off-by-One in OFFSET Paging
Run this query and note which rows are returned. Count from the beginning of the full table to check whether rows 11–15 are actually what you got.
-- Retrieve rows 11 through 15 from the penguins table (the third page of 5).
select *
from penguins
limit 5 offset 11;
Show explanation
A query meant to retrieve the third page of five rows returns rows 12–16 instead
of rows 11–15; the bug is using offset 11 (which skips 11 rows) instead of
offset 10. Teaches how limit and offset interact and how to calculate the
correct offset for a given page number.
Integer Division Truncation
Run this query and compare the mass_kg column to the body_mass_g column.
Do the kilogram values look right for a penguin that weighs, say, 3750 g?
-- Show body mass in kilograms for the first 10 penguins.
select species, body_mass_g,
body_mass_g / 1000 as mass_kg
from penguins
limit 10;
Show explanation
A query that converts body mass from grams to kilograms silently truncates the
result because SQLite performs integer division when both operands are integers;
the bug is dividing by 1000 instead of 1000.0. Teaches how to force
floating-point arithmetic in SQL and why checking computed columns against known
values catches this class of error.
DISTINCT Over Too Many Columns
Run this query and count the rows returned. How many distinct species are in the dataset? Does the row count match that number?
-- List the distinct species found in the penguins dataset (one row per species).
select distinct species, sex
from penguins;
Show explanation
A query intended to list the distinct species in the dataset returns seven rows
instead of three because distinct is applied to both species and sex,
producing one row per unique (species, sex) combination; the bug is including sex
in the select clause. Teaches that distinct operates on the entire set of
selected columns, not on individual ones.
NULL Equality Test
Run this query. How many rows does it return? Are there penguins in the table whose sex was not recorded?
-- Find all penguins whose sex was not recorded.
select *
from penguins
where sex = null;
Show explanation
A query that tries to find penguins with no recorded sex returns zero rows because
comparing any value to null with = produces null (not true) in SQL's
ternary logic, so the where clause never passes; the bug is using = null instead
of is null. Teaches SQL's three-valued logic and the two special tests is null
and is not null.
Impossible AND Condition
Run this query. How many rows does it return? Choose a penguin you know is on Biscoe island from a previous query and check whether it appears here.
-- Find all penguins from either Biscoe island or Dream island.
select species, island, sex
from penguins
where island = 'Biscoe' and island = 'Dream';
Show explanation
A query that should return penguins from either Biscoe or Dream island returns
zero rows because and requires both conditions to be true at the same time, but
a penguin can only be on one island; the bug is using and where or is needed.
Teaches the difference between and and or and how to spot conditions that can
never simultaneously be true.
Case-Sensitive String Value
Run this query and count the rows returned. Then run a query to find the distinct
values stored in the sex column. Do any of them match the literal used in the
where clause?
-- Find all female penguins.
select *
from penguins
where sex = 'female';
Show explanation
A query that should return all female penguins returns zero rows because the sex
column stores the value 'FEMALE' in uppercase but the query filters on
'female' in lowercase; SQLite string comparisons are case-sensitive for non-ASCII
characters. Teaches how to inspect the actual stored values before writing a filter
and how to use functions like upper() or lower() to normalise comparisons.
WHERE References a SELECT Alias
Run this query. Does it raise an error or return wrong results? Add a second
select that uses the full expression body_mass_g / 1000.0 in the where
clause and compare the two queries' output.
-- Find all penguins heavier than 4 kg, displaying mass in kilograms.
select species, body_mass_g / 1000.0 as mass_kg
from penguins
where mass_kg > 4.0;
Show explanation
A query that tries to filter using a column alias defined in the select clause
fails with a "no such column" error because SQL evaluates where before select,
so the alias does not yet exist at that point; the bug is referencing mass_kg in
where instead of repeating the full expression body_mass_g / 1000.0. Teaches
the order of SQL clause evaluation.
AND/OR Operator Precedence
Run this query. Which species and islands appear in the result? Is the output what you expected based on the comment describing the intent?
-- Find penguins that are on Biscoe island and are either Adelie or Chinstrap.
select species, island
from penguins
where species = 'Adelie' or species = 'Chinstrap' and island = 'Biscoe';
Show explanation
A query intended to find Adelie or Chinstrap penguins that are on Biscoe island
instead returns all Adelie penguins from any island plus Chinstraps on Biscoe,
because and binds more tightly than or; the bug is missing parentheses around
the or sub-expression. Teaches SQL operator precedence and why parentheses are
needed to make complex where conditions unambiguous.
Non-Aggregated Column in GROUP BY
Run this query and examine the sex column. Pick one species and check a few rows
against the original data. Are the sex values consistent with what you expected
for that species group?
-- Show the average body mass for each species along with the sex of each penguin.
select species, sex, avg(body_mass_g) as avg_mass
from penguins
group by species;
Show explanation
A query that groups by species but also selects sex without aggregating it
produces an arbitrary, unpredictable sex value for each species group because sex
is not in the group by clause; the bug is selecting a column that is neither
aggregated nor part of the grouping. Teaches why every column in select must
either appear in group by or be wrapped in an aggregation function.
Aggregate Function in WHERE Clause
Run this query. Does it return a result or raise an error? Check whether moving
the condition to a having clause after group by fixes the problem.
-- Find species whose average body mass exceeds 4000 g.
select species, avg(body_mass_g) as avg_mass
from penguins
where avg(body_mass_g) > 4000.0
group by species;
Show explanation
A query that tries to filter groups by their average value using where raises an
error because aggregate functions like avg() cannot appear in where; where
filters individual rows before grouping, while having filters groups after
aggregation; the bug is placing the condition in where instead of having.
Teaches the difference in timing between where and having in SQL's execution
order.
GROUP BY Missing a Column
Run this query and examine the island column. Do all the rows for a given species
show the same island? Check a few values against the original table.
-- Count the number of penguins observed on each island, broken down by species.
select species, island, count(*) as num_penguins
from penguins
group by species;
Show explanation
A query that counts penguins per species per island produces unreliable island
values because island appears in select but not in group by, so the database
picks an arbitrary island for each species group; the bug is omitting island from
the group by clause. Teaches that every non-aggregated column in select must
also appear in group by.
count(column) Skips NULL Rows
Run this query. Then run a second query using count(*) and compare the two
results. If the numbers differ, find the rows that account for the difference.
-- Report the total number of penguins in the dataset.
select count(sex) as total_penguins
from penguins;
Show explanation
A query meant to count all penguins in the dataset underreports the total because
count(sex) only counts rows where sex is not null, omitting the 11 penguins
whose sex was not recorded; the bug is using count(sex) instead of count(*).
Teaches the difference between count(column) (excludes nulls) and count(*)
(counts every row).
Manual Average With Wrong Denominator
Run this query and note the result. Then run select avg(body_mass_g) from penguins
and compare. Are the two values the same?
-- Calculate the average body mass of all penguins.
select sum(body_mass_g) / count(*) as avg_mass
from penguins;
Show explanation
A manual average calculation divides the sum of body masses by the total row count,
but sum() skips the two rows where body_mass_g is null while count(*) still
counts them, making the denominator too large and the result too small; the bug is
dividing by count(*) instead of count(body_mass_g) or using avg() directly.
Teaches how aggregation functions treat null values and why the built-in avg()
should be preferred over hand-rolled sum/count averages.
NULL Inequality Test
Run this query. How many rows does it return? Then run a query using is not null
and compare the row counts.
-- Find all penguins whose sex has been recorded.
select *
from penguins
where sex != null;
Show explanation
A query that should return only penguins with a recorded sex returns zero rows
because comparing any value to null with != produces null (not true) in SQL's
ternary logic, so the where clause never passes any row; the bug is using
!= null instead of is not null. Teaches that null comparisons with = or !=
always yield null, and that is null / is not null are the only reliable null
tests.
INNER JOIN Drops Unmatched Rows
Run this query and list the people in the result. Then run select distinct person
from work and compare. Does every person from the work table appear in the
query output?
-- List every person and the total credits they have earned.
-- People who only did jobs not listed in the job table should appear with 0 credits.
select work.person, sum(job.credits) as total
from work inner join job
on work.job = job.name
group by work.person;
Show explanation
A query that should list total credits for every person silently omits people whose
jobs do not appear in the job table (because their work has no matching credit
value to join to); the bug is using inner join instead of left join. Teaches
the difference between inner and left joins and when each is appropriate.
Missing ON Clause (Cartesian Product)
Run this query. Count the rows returned. How many rows are in the job table?
How many are in the work table? How do those numbers relate to the row count
you observed?
-- Match each job entry to the work records for that job.
select *
from job join work
limit 10;
Show explanation
A query that should pair each job with its matching work records instead returns
every possible combination of rows from both tables because the on clause is
missing; a join without an on condition (or equivalent where condition) is
a Cartesian product, producing rows equal to the product of the two table sizes.
Teaches that a join without a matching condition is almost never correct and how to
recognise a Cartesian product by its unexpectedly large row count.
NULL Sum Needs COALESCE
Run this query and check the total column for each person. Which person has a
null total? What jobs did that person do, according to the work table?
-- Show total credits earned per person.
-- People who only did jobs not in the job table should show 0, not null.
select work.person, sum(job.credits) as total
from work left join job
on work.job = job.name
group by work.person
order by work.person;
Show explanation
A query that uses a left join to include all people shows null instead of 0 for
a person whose jobs produce no credit values, because sum() of an all-null group
returns null; the bug is not wrapping the sum in coalesce(..., 0). Teaches how
coalesce provides a fallback value for null results and why left joins often
require it.
JOIN ON Condition References Same Table Twice
Run this query. Does every person appear? Compare the num_surveys counts to what
you get from select count(*) from survey group by person_id. Do the numbers match?
-- List each person's full name and the number of surveys they have done.
select
person.personal || ' ' || person.family as full_name,
count(*) as num_surveys
from person join survey
on survey.person_id = survey.person_id
group by person.person_id
order by person.family, person.personal;
Show explanation
A query that should join people to their surveys instead produces a Cartesian
product because the on clause compares survey.person_id to itself (which is
always true) rather than to person.person_id; the bug is a copy-paste error
that left the table name on the left side of on as survey instead of person.
Teaches how to carefully check on conditions when the same column name appears in
both tables.
Self-Join Aliases Confused
Run this query and look at a few rows. Use the person table to verify which
people are supervisors and which are subordinates. Are the labels in the output
correct?
-- For each person, show their full name and their supervisor's full name.
select
pa.personal || ' ' || pa.family as person_name,
pb.personal || ' ' || pb.family as supervisor_name
from person pa join person pb
on pa.person_id = pb.supervisor_id
order by pa.family, pa.personal;
Show explanation
A self-join query that should display each person's name alongside their
supervisor's name has the labels reversed: the on condition pa.person_id =
pb.supervisor_id makes pa the supervisor and pb the subordinate, but the
aliases label pa as person_name and pb as supervisor_name; the bug is
swapping the alias names. Teaches how to reason about the direction of a self-join
and how to verify the result against the underlying data.