SQL
Display records sorted by last name then first name
Run this query and look at the first several rows. Is the output ordered by island first, or by species first?
-- Show the species, island, and sex of all penguins,
-- sorted alphabetically by island and then by species within each island.
select species, island, sex
from penguins
order by species, island;
Show explanation
A query meant to sort penguins by island and then by species within
each island produces species-first ordering instead; the bug is
listing the columns in the wrong order in the order by clause.
Shows: how order by applies columns from left to right and why
column order in that clause matters.
To find it: run the query and examine the first few rows. If all
'Adelie' rows appear before any 'Chinstrap' rows regardless of
island, the sort is species-first. Compare the order by column
sequence to the intended priority.
Retrieve the second page of search results
Run this query and note which rows are returned. Count from the beginning of the full table to check whether rows 11-15 are actually what you got.
-- Retrieve rows 11 through 15 from the penguins table (the third page of 5).
select *
from penguins
limit 5 offset 11;
Show explanation
A query meant to retrieve the third page of five rows returns rows 12-16 instead
of rows 11-15; the bug is using offset 11 (which skips 11 rows) instead of
offset 10.
Shows: how limit and offset interact and how to calculate the
correct offset for a given page number.
To find it: add rowid to the select clause and run the query. If the first
returned row has rowid = 12 but the page should start at row 11, the offset is
off by one. The formula is (page_number - 1) * page_size, so page 3 with
page size 5 gives offset 10.
Calculate the average score per student
Run this query and compare the mass_kg column to the body_mass_g column.
Do the kilogram values look right for a penguin that weighs, say, 3750 g?
-- Show body mass in kilograms for the first 10 penguins.
select species, body_mass_g,
body_mass_g / 1000 as mass_kg
from penguins
limit 10;
Show explanation
A query that converts body mass from grams to kilograms silently truncates the
result because SQLite performs integer division when both operands are integers;
the bug is dividing by 1000 instead of 1000.0.
Shows: how to force floating-point arithmetic in SQL and why checking computed columns against known values catches this class of error.
To find it: run select body_mass_g, body_mass_g / 1000 as mass_kg from penguins
limit 5. A penguin weighing 3750 g should show 3.75 kg; if it shows 3, integer
division is truncating the decimal. Change 1000 to 1000.0 and rerun.
List unique participants in a study
Run this query and count the rows returned. How many distinct species are in the dataset? Does the row count match that number?
-- List the distinct species found in the penguins dataset (one row per species).
select distinct species, sex
from penguins;
Show explanation
A query intended to list the distinct species in the dataset returns seven rows
instead of three because distinct is applied to both species and sex,
producing one row per unique (species, sex) combination; the bug is including sex
in the select clause.
Shows: that distinct operates on the entire set of selected columns,
not on individual ones.
To find it: run select count(*) from (select distinct species from penguins). If
that returns 3 but the original query returns 7, the extra column is what is
producing the extra rows.
Find records with no assigned reviewer
Run this query. How many rows does it return? Are there penguins in the table whose sex was not recorded?
-- Find all penguins whose sex was not recorded.
select *
from penguins
where sex = null;
Show explanation
A query that tries to find penguins with no recorded sex returns zero rows because
comparing any value to null with = produces null (not true) in SQL's
ternary logic, so the where clause never passes; the bug is using = null instead
of is null.
Shows: SQL's three-valued logic and the two special tests is null
and is not null.
To find it: run select count(*) from penguins where sex is null. If that returns
a nonzero count but where sex = null returns zero, the = test is failing because
null comparisons always produce null, never true.
Filter for records in two different categories
Run this query. How many rows does it return? Choose a penguin you know is on Biscoe island from a previous query and check whether it appears here.
-- Find all penguins from either Biscoe island or Dream island.
select species, island, sex
from penguins
where island = 'Biscoe' and island = 'Dream';
Show explanation
A query that should return penguins from either Biscoe or Dream island returns
zero rows because and requires both conditions to be true at the same time, but
a penguin can only be on one island; the bug is using and where or is needed.
Shows: the difference between and and or and how to spot
conditions that can never simultaneously be true.
To find it: ask whether a single penguin can be on Biscoe island and Dream island
at the same time. It cannot, so AND can never be true. Whenever a zero-row result
is unexpected, check whether the conditions are mutually exclusive.
Query records by status label
Run this query and count the rows returned. Then run a query to find the distinct
values stored in the sex column. Do any of them match the literal used in the
where clause?
-- Find all female penguins.
select *
from penguins
where sex = 'female';
Show explanation
A query that should return all female penguins returns zero rows because the sex
column stores the value 'FEMALE' in uppercase but the query filters on
'female' in lowercase; SQLite string comparisons are case-sensitive for non-ASCII
characters.
Shows: how to inspect the actual stored values before writing a filter
and how to use functions like upper() or lower() to normalise
comparisons.
To find it: run select distinct sex from penguins before writing the filter. If
the results show 'FEMALE' in uppercase, a where sex = 'female' will return zero
rows because the literal does not match.
Filter on a calculated column
Run this query. Does it raise an error or return wrong results? Add a second
select that uses the full expression body_mass_g / 1000.0 in the where
clause and compare the two queries' output.
-- Find all penguins heavier than 4 kg, displaying mass in kilograms.
select species, body_mass_g / 1000.0 as mass_kg
from penguins
where mass_kg > 4.0;
Show explanation
A query that tries to filter using a column alias defined in the select clause
fails with a "no such column" error because SQL evaluates where before select,
so the alias does not yet exist at that point; the bug is referencing mass_kg in
where instead of repeating the full expression body_mass_g / 1000.0.
Shows: the order of SQL clause evaluation.
To find it: run the query and read the error: no such column: mass_kg. SQL
evaluates where before select, so the alias defined in select does not yet
exist when where runs. Replace the alias in where with the full expression
body_mass_g / 1000.0.
Select records matching one of two conditions
Run this query. Which species and islands appear in the result? Is the output what you expected based on the comment describing the intent?
-- Find penguins that are on Biscoe island and are either Adelie or Chinstrap.
select species, island
from penguins
where species = 'Adelie' or species = 'Chinstrap' and island = 'Biscoe';
Show explanation
A query intended to find Adelie or Chinstrap penguins that are on Biscoe island
instead returns all Adelie penguins from any island plus Chinstraps on Biscoe,
because and binds more tightly than or; the bug is missing parentheses around
the or sub-expression.
Shows: SQL operator precedence and why parentheses are needed to make
complex where conditions unambiguous.
To find it: run the query and check whether any Adelie penguins from non-Biscoe
islands appear in the result. If they do, and is binding more tightly than or
and the condition is being read as species = 'Adelie' or (species = 'Chinstrap' and
island = 'Biscoe'). Add parentheses around the or sub-expression.
Summarize sales by region and product
Run this query and examine the sex column. Pick one species and check a few rows
against the original data. Are the sex values consistent with what you expected
for that species group?
-- Show the average body mass for each species along with the sex of each penguin.
select species, sex, avg(body_mass_g) as avg_mass
from penguins
group by species;
Show explanation
A query that groups by species but also selects sex without aggregating it
produces an arbitrary, unpredictable sex value for each species group because sex
is not in the group by clause; the bug is selecting a column that is neither
aggregated nor part of the grouping.
Shows: why every column in select must either appear in group by
or be wrapped in an aggregation function.
To find it: pick one species in the result and verify its sex value against the
original table. If the grouped query shows a single sex for the group but the raw
data contains both sexes for that species, the database is picking an arbitrary
value.
Keep only groups above a minimum size
Run this query. Does it return a result or raise an error? Check whether moving
the condition to a having clause after group by fixes the problem.
-- Find species whose average body mass exceeds 4000 g.
select species, avg(body_mass_g) as avg_mass
from penguins
where avg(body_mass_g) > 4000.0
group by species;
Show explanation
A query that tries to filter groups by their average value using where raises an
error because aggregate functions like avg() cannot appear in where; where
filters individual rows before grouping, while having filters groups after
aggregation; the bug is placing the condition in where instead of having.
Shows: the difference in timing between where and having in SQL's
execution order.
To find it: run the query and read the error message — something like misuse of
aggregate function avg(). Aggregate functions cannot appear in where because rows
haven't been grouped yet at that point. Move the condition to a having clause after
the group by.
Count entries per category and subcategory
Run this query and examine the island column. Do all the rows for a given species
show the same island? Check a few values against the original table.
-- Count the number of penguins observed on each island, broken down by species.
select species, island, count(*) as num_penguins
from penguins
group by species;
Show explanation
A query that counts penguins per species per island produces unreliable island
values because island appears in select but not in group by, so the database
picks an arbitrary island for each species group; the bug is omitting island from
the group by clause.
Shows: that every non-aggregated column in select must also appear
in group by.
To find it: run the query and note the island value for one species. Then run
select island from penguins where species = 'Adelie' to see all islands that
species actually inhabits. If the grouped result shows only one island but the raw
data shows three, the database is picking arbitrarily.
Count how many records have a phone number
Run this query. Then run a second query using count(*) and compare the two
results. If the numbers differ, find the rows that account for the difference.
-- Report the total number of penguins in the dataset.
select count(sex) as total_penguins
from penguins;
Show explanation
A query meant to count all penguins in the dataset underreports the total because
count(sex) only counts rows where sex is not null, omitting the 11 penguins
whose sex was not recorded; the bug is using count(sex) instead of count(*).
Shows: the difference between count(column) (excludes nulls) and
count(*) (counts every row).
To find it: run select count(*), count(sex) from penguins. If the two numbers
differ, the difference is the number of penguins count(sex) is missing because
their sex is null.
Compute a class average including absent students
Run this query and note the result. Then run select avg(body_mass_g) from penguins
and compare. Are the two values the same?
-- Calculate the average body mass of all penguins.
select sum(body_mass_g) / count(*) as avg_mass
from penguins;
Show explanation
A manual average calculation divides the sum of body masses by the total row count,
but sum() skips the two rows where body_mass_g is null while count(*) still
counts them, making the denominator too large and the result too small; the bug is
dividing by count(*) instead of count(body_mass_g) or using avg() directly.
Shows: how aggregation functions treat null values and why the
built-in avg() should be preferred over hand-rolled sum/count
averages.
To find it: run select avg(body_mass_g) from penguins and compare it to the manual
calculation. If they differ, check whether count(*) and count(body_mass_g) return
the same number — if not, some rows have a null mass and the manual denominator is
too large.
Find records that don't match a given value
Run this query. How many rows does it return? Then run a query using is not null
and compare the row counts.
-- Find all penguins whose sex has been recorded.
select *
from penguins
where sex != null;
Show explanation
A query that should return only penguins with a recorded sex returns zero rows
because comparing any value to null with != produces null (not true) in SQL's
ternary logic, so the where clause never passes any row; the bug is using
!= null instead of is not null.
Shows: that null comparisons with = or != always yield null, and
that is null / is not null are the only reliable null tests.
To find it: run select count(*) from penguins where sex is not null. If that
returns rows but the original query returns zero, the != null test is failing
because null comparisons always produce null, not true or false.
List all customers with or without orders
Run this query and list the people in the result. Then run select distinct person
from work and compare. Does every person from the work table appear in the
query output?
-- List every person and the total credits they have earned.
-- People who only did jobs not listed in the job table should appear with 0 credits.
select work.person, sum(job.credits) as total
from work inner join job
on work.job = job.name
group by work.person;
Show explanation
A query that should list total credits for every person silently omits people whose
jobs do not appear in the job table (because their work has no matching credit
value to join to); the bug is using inner join instead of left join.
Shows: the difference between inner and left joins and when each is appropriate.
To find it: run select distinct person from work and compare the names to those in
the join result. Any name present in work but absent from the join result was
dropped because inner join excludes rows with no matching partner in the other
table.
Join two tables to combine related records
Run this query. Count the rows returned. How many rows are in the job table?
How many are in the work table? How do those numbers relate to the row count
you observed?
-- Match each job entry to the work records for that job.
select *
from job join work
limit 10;
Show explanation
A query that should pair each job with its matching work records instead returns
every possible combination of rows from both tables because the on clause is
missing; a join without an on condition (or equivalent where condition) is
a Cartesian product, producing rows equal to the product of the two table sizes.
Shows: that a join without a matching condition is almost never correct and how to recognise a Cartesian product by its unexpectedly large row count.
To find it: count the rows in job and the rows in work, then multiply. If the
query result has exactly that many rows, the join produced a Cartesian product,
which means the on clause is missing.
Total donations where some records are empty
Run this query and check the total column for each person. Which person has a
null total? What jobs did that person do, according to the work table?
-- Show total credits earned per person.
-- People who only did jobs not in the job table should show 0, not null.
select work.person, sum(job.credits) as total
from work left join job
on work.job = job.name
group by work.person
order by work.person;
Show explanation
A query that uses a left join to include all people shows null instead of 0 for
a person whose jobs produce no credit values, because sum() of an all-null group
returns null; the bug is not wrapping the sum in coalesce(…, 0).
Shows: how coalesce provides a fallback value for null results and
why left joins often require it.
To find it: look for null values in the total column of the result. Then run
select person, job from work where person = '<name with null total>' to confirm
that person has work records. The null total comes from sum() over an all-null
group — wrap it in coalesce(..., 0) to return zero instead.
Match employees to their department records
Run this query. Does every person appear? Compare the num_surveys counts to what
you get from select count(*) from survey group by person_id. Do the numbers match?
-- List each person's full name and the number of surveys they have done.
select
person.personal || ' ' || person.family as full_name,
count(*) as num_surveys
from person join survey
on survey.person_id = survey.person_id
group by person.person_id
order by person.family, person.personal;
Show explanation
A query that should join people to their surveys instead produces a Cartesian
product because the on clause compares survey.person_id to itself (which is
always true) rather than to person.person_id; the bug is a copy-paste error
that left the table name on the left side of on as survey instead of person.
Shows: how to carefully check on conditions when the same column
name appears in both tables.
To find it: compare num_surveys in the join result to select count(*) from
survey group by person_id. If every person shows the same num_surveys equal to
the total number of survey rows, the on clause is always true and is joining every
row to every other row.
Find pairs of employees in the same department
Run this query and look at a few rows. Use the person table to verify which
people are supervisors and which are subordinates. Are the labels in the output
correct?
-- For each person, show their full name and their supervisor's full name.
select
pa.personal || ' ' || pa.family as person_name,
pb.personal || ' ' || pb.family as supervisor_name
from person pa join person pb
on pa.person_id = pb.supervisor_id
order by pa.family, pa.personal;
Show explanation
A self-join query that should display each person's name alongside their
supervisor's name has the labels reversed: the on condition pa.person_id =
pb.supervisor_id makes pa the supervisor and pb the subordinate, but the
aliases label pa as person_name and pb as supervisor_name; the bug is
swapping the alias names.
Shows: how to reason about the direction of a self-join and how to verify the result against the underlying data.
To find it: pick one row from the result and look up both IDs in the person table.
If the person labeled supervisor_name is actually listed as a subordinate in the
raw data, the alias names in the query are reversed.