The Unix Shell

Numeric Sort

Run this script and look at the order of the numbers it prints. Is the smallest number first? What order would you expect?

#!/usr/bin/env bash
# Print the numbers in numericsort.txt in ascending order.

sort numericsort.txt
10
9
2
100
7
1
50

The bug is calling sort without the -n flag, which causes lexicographic (alphabetical) ordering so 10 appears before 2. Teaches the difference between alphabetical and numeric sort and when to use sort -n.
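
A minimal sketch of the comparison, assuming a scratch directory created with mktemp (the workdir, lexical, and numeric names are illustrative and not part of the lesson files). LC_ALL=C is set on the first sort only to make the character-by-character order reproducible:

```shell
#!/usr/bin/env bash
# Illustrative sketch: recreate numericsort.txt in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 10 9 2 100 7 1 50 > "$workdir/numericsort.txt"

# Without -n, lines are compared character by character.
lexical=$(LC_ALL=C sort "$workdir/numericsort.txt" | paste -sd ' ' -)

# With -n, lines are compared by numeric value.
numeric=$(sort -n "$workdir/numericsort.txt" | paste -sd ' ' -)

echo "sort:    $lexical"   # 1 10 100 2 50 7 9
echo "sort -n: $numeric"   # 1 2 7 9 10 50 100
```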

Redirect Overwrites

Run this script and then look at summary.txt. How many lines does it contain? How many did you expect?

#!/usr/bin/env bash
# Collect the first (header) line of each .dat file into summary.txt.

for f in overwrite_a.dat overwrite_b.dat overwrite_c.dat
do
    head -n 1 "$f" > summary.txt
done

The bug is using > inside the loop, which overwrites the file on every iteration instead of appending to it, so the summary contains only one entry. Teaches the difference between > (overwrite) and >> (append).
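
One possible fix is sketched below, assuming three small stand-in .dat files whose contents are invented for illustration. The old summary is removed once before the loop, and >> is used inside it:

```shell
#!/usr/bin/env bash
# Illustrative sketch: build three small .dat files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
for name in a b c
do
    printf 'header_%s\ndata_%s\n' "$name" "$name" > "overwrite_$name.dat"
done

# Remove any old summary so that re-runs start from an empty file.
rm -f summary.txt

for f in overwrite_a.dat overwrite_b.dat overwrite_c.dat
do
    # >> appends, so each header is added after the previous one.
    head -n 1 "$f" >> summary.txt
done

cat summary.txt
```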

Wildcard Too Broad

Look at the list of .txt files in the directory. Which files does *.txt match? Are all of them files you want to copy?

#!/usr/bin/env bash
# Copy the monthly notes files to the archive directory.
# Monthly notes are notes_jan.txt and notes_feb.txt.
# notes_archive.txt is the running archive and should NOT be copied.

cp *.txt archive/

The bug is that *.txt matches every .txt file in the directory, including the running archive file itself, so the script copies more than the monthly notes files. Teaches how to check what a wildcard matches before using it in a destructive command, and how to narrow a pattern (e.g., notes_???.txt) to match only the intended files.
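
A short sketch of previewing a pattern with echo before copying, assuming empty stand-in files in a scratch directory:

```shell
#!/usr/bin/env bash
# Illustrative sketch: empty stand-in files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch notes_jan.txt notes_feb.txt notes_archive.txt
mkdir archive

# echo shows what the shell expands each pattern to, without copying anything.
echo *.txt           # notes_archive.txt notes_feb.txt notes_jan.txt
echo notes_???.txt   # notes_feb.txt notes_jan.txt

# Each ? matches exactly one character, so the seven-letter "archive" is excluded.
cp notes_???.txt archive/
ls archive
```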

Unsorted Input to uniq

Run this script and check the count it reports. Then look at the species column in species.csv. How many distinct species are there, and does the count match?

#!/usr/bin/env bash
# Count the number of distinct species recorded in species.csv.

cut -d, -f2 species.csv | uniq | wc -l
date,species,count
2024-03-01,sparrow,12
2024-03-01,robin,5
2024-03-02,sparrow,8
2024-03-02,crow,3
2024-03-03,robin,7
2024-03-03,sparrow,10
2024-03-04,crow,4
2024-03-04,robin,6
2024-03-05,sparrow,9
2024-03-05,crow,2

The bug is piping directly to uniq without sorting first. Only adjacent identical lines are collapsed, so non-adjacent occurrences of the same species are counted separately and the reported count is too high. Teaches that uniq only removes adjacent duplicates and that sort | uniq is the correct pattern for counting distinct values. Note that the header line (species) is also counted as a value unless it is skipped first, for example with tail -n +2.
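
The sketch below recreates species.csv in a scratch directory (the workdir, buggy, and fixed names are illustrative) and compares the original pipeline with one that skips the header and sorts before uniq:

```shell
#!/usr/bin/env bash
# Illustrative sketch: recreate species.csv in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/species.csv" <<'EOF'
date,species,count
2024-03-01,sparrow,12
2024-03-01,robin,5
2024-03-02,sparrow,8
2024-03-02,crow,3
2024-03-03,robin,7
2024-03-03,sparrow,10
2024-03-04,crow,4
2024-03-04,robin,6
2024-03-05,sparrow,9
2024-03-05,crow,2
EOF

# Original pipeline: no two adjacent lines are equal, so uniq removes nothing.
buggy=$(cut -d, -f2 "$workdir/species.csv" | uniq | wc -l)

# tail -n +2 skips the header; sort puts equal values next to each other for uniq.
fixed=$(tail -n +2 "$workdir/species.csv" | cut -d, -f2 | sort | uniq | wc -l)

echo "without sort: $buggy"               # 11
echo "sorted, header skipped: $fixed"     # 3
```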

Unquoted Loop Variable

Run this script with a filename that contains a space (e.g., "field notes.txt"). Does it process the file correctly?

#!/usr/bin/env bash
# Count the lines in each text file passed to this script.

for f in "$@"
do
    wc -l $f
done

The bug is using $f without quotes, so the shell splits the filename on the space and passes the two halves as separate arguments to wc, causing the loop to fail for any filename that contains a space. Teaches why loop variables should always be quoted as "$f" and how spaces in filenames require consistent quoting throughout a script.
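
The effect of the missing quotes can be seen without any files. In this sketch printf prints one line per argument it receives, so counting its output lines shows how many arguments the shell produced (the variable names are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative sketch: count how many arguments a command receives.
f="field notes.txt"

# Unquoted: the shell splits the value at the space into two arguments.
unquoted=$(printf '<%s>\n' $f | wc -l)

# Quoted: the value is passed as a single argument.
quoted=$(printf '<%s>\n' "$f" | wc -l)

echo "unquoted: $unquoted argument(s)"   # 2
echo "quoted:   $quoted argument(s)"     # 1
```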

Wrong Positional Parameter

Run the script with a small input file and the name of an output file. Look at which file was created and which file was modified.

#!/usr/bin/env bash
# Usage: wrongarg.sh input_file output_file
# Copy the first 20 lines of input_file into output_file.

head -n 20 "$2" > "$1"

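The bug is that the arguments $1 and $2 are swapped, so the script reads from the output path and writes to the input path. Because the shell sets up the > redirection before head runs, the input file is emptied first, and head then fails if the output file does not exist yet. Teaches how to verify which argument is which by reading the usage comment, and how to use echo to print argument values before acting on them.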
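
A sketch of the corrected command with the arguments echoed first. The copy_head function is a hypothetical stand-in for wrongarg.sh, and the 30-line input file is invented for illustration:

```shell
#!/usr/bin/env bash
# Illustrative sketch: copy_head is a hypothetical stand-in for wrongarg.sh.
copy_head() {
    # Print the arguments to standard error first, so a swap is easy to spot.
    echo "input:  $1" >&2
    echo "output: $2" >&2
    # Read from the first argument, write to the second.
    head -n 20 "$1" > "$2"
}

workdir=$(mktemp -d)
seq 1 30 > "$workdir/input.txt"   # 30-line sample input
copy_head "$workdir/input.txt" "$workdir/output.txt"
wc -l < "$workdir/output.txt"
```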

head Instead of tail

Run this script and note which rows are printed. Are they from the beginning or end of the file?

#!/usr/bin/env bash
# Print the last 5 data rows of headtail.txt (excluding the header line).

head -n 5 headtail.txt
experiment_id,temperature,pressure
E001,22.1,101.3
E002,23.4,100.8
E003,21.9,102.1
E004,24.0,99.5
E005,22.8,101.7
E006,23.1,100.2
E007,21.5,102.8
E008,24.3,99.1
E009,22.6,101.4
E010,23.9,100.5

The bug is using head (which prints the first N lines) instead of tail (which prints the last N lines), so the script prints the header and the first four data rows instead of the last five data rows. Teaches the difference between head and tail and how to combine them: for example, tail -n 5 for the last 5 lines, or head -n 10 | tail -n 5 for lines 6–10.
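
The two combinations can be tried on a copy of headtail.txt. This is a sketch in a scratch directory, with illustrative variable names:

```shell
#!/usr/bin/env bash
# Illustrative sketch: recreate headtail.txt in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/headtail.txt" <<'EOF'
experiment_id,temperature,pressure
E001,22.1,101.3
E002,23.4,100.8
E003,21.9,102.1
E004,24.0,99.5
E005,22.8,101.7
E006,23.1,100.2
E007,21.5,102.8
E008,24.3,99.1
E009,22.6,101.4
E010,23.9,100.5
EOF

# Last five lines of the file: E006 to E010.
last_five=$(tail -n 5 "$workdir/headtail.txt")

# Lines 6 to 10 of the file (the header is line 1): E005 to E009.
middle=$(head -n 10 "$workdir/headtail.txt" | tail -n 5)

echo "$last_five"
echo "---"
echo "$middle"
```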

Too Many ..

Read the comment at the top of the script. Map out the expected directory structure on paper. How many levels up does ../.. go? Is that where shared/ lives?

#!/usr/bin/env bash
# Run this script from inside results/2024/.
# Save a sorted copy of measurements.txt to the shared/ directory,
# which is one level above the current directory (i.e., results/shared/).

sort -n measurements.txt > ../../shared/sorted.txt
depth_m,temp_c,salinity
0,18.4,35.1
10,17.9,35.2
20,17.1,35.4
50,14.3,35.7
100,10.8,36.0
200,7.2,36.3

The bug is writing ../.. when the target is only one level up (..), so the script tries to write to a shared/ directory two levels up (alongside results/) instead of to results/shared/. Teaches how to trace relative paths by counting directory levels, and how to use pwd and ls .. to verify the directory structure before running a script.

Wrong wc Flag

Run this script and look at the numbers it prints. Do they match the number of lines (records) in the file?

#!/usr/bin/env bash
# Report the number of observation records in wcflag.txt.
# Each line is one record.

wc -w wcflag.txt
sparrow observed at grid B4
robin observed at grid A2
crow observed at grid C1
sparrow observed at grid D3
robin observed at grid B1
crow observed at grid A4
sparrow observed at grid C3

The bug is using wc -w (count words) instead of wc -l (count lines), so the script prints a much larger number than the number of records. Teaches the difference between wc flags and how to use wc --help (or man wc on systems where --help is not supported) to check which flag produces which count.

Unquoted Glob in find

Run this script from a directory that contains at least one .csv file. What arguments does find actually receive? Use echo in place of find to check.

#!/usr/bin/env bash
# Find all CSV files anywhere under the data/ directory.

find data -name *.csv

The bug is passing *.csv without quotes, so the shell expands the glob against the current directory before find runs. find then searches data/ for the already-expanded filename rather than for the pattern, and if several .csv files match, it receives extra arguments and typically reports an error. Teaches that the shell expands unquoted wildcards before passing them to any command, and that patterns given to find -name must be quoted.
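
A sketch of the echo check and the quoted fix, assuming a scratch directory that holds one .csv file next to data/ (all file names are invented for illustration):

```shell
#!/usr/bin/env bash
# Illustrative sketch: one .csv file beside data/, two more underneath it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p data/raw
touch local.csv data/a.csv data/raw/b.csv

# echo shows the arguments find would receive in each case.
echo find data -name *.csv     # find data -name local.csv
echo find data -name "*.csv"   # find data -name *.csv

# Quoted, the pattern reaches find intact and is matched against every name under data/.
found=$(find data -name "*.csv")
echo "$found"
```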

Wrong cut Delimiter

Run this script and examine the output. Does each output line contain just the family name, or does it contain the whole row?

#!/usr/bin/env bash
# Extract the family name (first column) from the tab-delimited roster.

cut -d, -f1 cutdelim.txt
Smith	Jane	Biology
Jones	Tom	Chemistry
Garcia	Maria	Physics
Chen	Wei	Biology
Okafor	Amara	Chemistry

The bug is specifying -d, (comma delimiter) when the file uses tabs. Since there are no commas, cut treats each line as a single field and returns it whole. Teaches how to identify the actual delimiter in a file (for example, using cat -A with GNU cat to show invisible characters) and how to specify a tab with -d$'\t', or to omit -d altogether, since tab is cut's default delimiter.
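
Both ways of selecting a tab-delimited column are sketched below on the first two rows of the roster, written with printf so that the tabs are explicit (variable names are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative sketch: write two roster rows with explicit tab characters.
workdir=$(mktemp -d)
printf 'Smith\tJane\tBiology\nJones\tTom\tChemistry\n' > "$workdir/cutdelim.txt"

# Tab is the default delimiter for cut, so -d can be left out.
by_default=$(cut -f1 "$workdir/cutdelim.txt")

# $'\t' is bash syntax for a literal tab character.
explicit=$(cut -d$'\t' -f1 "$workdir/cutdelim.txt")

# The original command finds no commas and prints each line whole.
buggy=$(cut -d, -f1 "$workdir/cutdelim.txt")

echo "$by_default"
```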

Inverted grep

Run this script and read the output. Do the lines shown contain the word you were searching for?

#!/usr/bin/env bash
# Print all lines in the log that contain the word "ERROR".

grep -v "ERROR" grepinvert.txt
2024-03-01 08:02 INFO  server started
2024-03-01 08:15 ERROR disk usage above 90%
2024-03-01 09:00 INFO  backup complete
2024-03-01 09:47 ERROR connection timeout after 30s
2024-03-01 10:30 INFO  request processed
2024-03-01 11:15 ERROR out of memory in worker 3
2024-03-01 12:00 INFO  daily report sent

The bug is the -v flag, which inverts the match so grep shows lines that do not contain the pattern. The script prints everything except the error lines instead of just the error lines. Teaches what -v does and how to check the result of a grep command against a small known file to confirm it is filtering in the right direction.

Missing Script Argument

Run this script with no arguments. Does it print useful output, produce an error, or do something else? Use Ctrl-C to stop it if it appears to hang.

#!/usr/bin/env bash
# Usage: missingarg.sh filename
# Print the filename and its line count.

echo "File: $1"
wc -l $1

The bug is that $1 is never checked. When no argument is given, the unquoted $1 expands to nothing, so wc -l runs with no filename and reads from standard input: the script appears to hang while it waits for keyboard input. Teaches how positional parameters expand to empty strings when omitted, and how to test for a missing argument (for example, by checking $#) and print a "Usage: ..." message before using it.
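
One way to write the check is sketched here. The body is wrapped in a hypothetical function named report so that it can be tried in an interactive shell; inside the script itself, exit 1 would take the place of return 1:

```shell
#!/usr/bin/env bash
# Illustrative sketch: report is a hypothetical stand-in for missingarg.sh.
report() {
    # $# holds the number of arguments; stop early if none was given.
    if [ "$#" -lt 1 ]
    then
        echo "Usage: missingarg.sh filename" >&2
        return 1
    fi
    echo "File: $1"
    # Quoting "$1" keeps the filename as one argument even if it contains spaces.
    wc -l "$1"
}

# With no argument, the usage message is printed instead of waiting for input.
report || echo "report stopped with a non-zero status"
```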

Wrong cat Order

Run this script and read the resulting report.txt. Does the report begin with the introduction, or with a different section?

#!/usr/bin/env bash
# Assemble the three sections of the report in order: introduction, methods, results.

cat section3.txt section1.txt section2.txt > report.txt
Introduction
============
This report summarises findings from the 2024 field season.
Three sites were surveyed between January and March.
Methods
=======
Samples were collected using standard protocols.
Each site was visited on three consecutive days.
Temperature and salinity were recorded at each visit.
Results
=======
Site A showed elevated salinity compared to baseline.
Site B was within normal range on all measures.
Site C had anomalously low temperatures on day 2.

The bug is that the filenames are listed in the wrong sequence on the cat command line, so the sections appear in the wrong order. Teaches that cat concatenates files in the order they are given, and how to verify the result with head before treating the output as correct.

Loop Clobbers on Re-run

Run this script once, then run it again. How many lines does summary.txt contain after the first run? After the second run?

#!/usr/bin/env bash
# Collect word counts for all .dat files into summary.txt.
# Running this script a second time should produce the same summary,
# not a file with doubled entries.

for f in *.dat
do
    wc -w "$f" >> summary.txt
done

The bug is using >> inside the loop without clearing the file first, so each run appends to whatever the previous run left and the file contains double the expected entries after a second run. Teaches how to decide between > and >>, and the pattern of redirecting the first write with > or removing the output file before the loop begins.
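
Another option is to redirect the output of the whole loop once with >, which makes re-runs produce the same file. A sketch, assuming two invented .dat files and a hypothetical function named summarise that stands in for the script:

```shell
#!/usr/bin/env bash
# Illustrative sketch: two small .dat files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a b c\n' > one.dat
printf 'd e\n' > two.dat

# summarise is a hypothetical stand-in for the script body.
summarise() {
    for f in *.dat
    do
        wc -w "$f"
    done > summary.txt   # opened once with >, so each run starts from an empty file
}

summarise
summarise   # second run: summary.txt still has one line per .dat file
wc -l < summary.txt
```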

Numeric Reverse Sort

Run this script in a directory that has files of different sizes. Compare the output to ls -s | sort -rn | head -n 3. Are the results the same?

#!/usr/bin/env bash
# List the three largest files in the current directory by block size,
# largest first.

ls -s | sort -r | head -n 3

The bug is using sort -r, which reverses text (character-by-character) order rather than numeric order. A size of 100 blocks can sort below a size of 8 because "1" comes before "8" as a character, so the three lines printed are not necessarily the three largest files. Teaches that -r alone reverses whatever order sort is already using and that -rn is needed to sort numbers in descending order.

head Off by One

Run this script and count the data rows in the output. Then reread the comment at the top of the script. Does the output contain the number of data rows the comment asks for?

#!/usr/bin/env bash
# Extract the header row plus the first 10 data rows from headcount.csv
# (11 lines total: 1 header + 10 data).

head -n 10 headcount.csv
site,reading,value
A,1,42.3
A,2,38.9
A,3,55.1
B,1,31.7
B,2,29.4
B,3,33.8
C,1,47.2
C,2,51.0
C,3,44.6
D,1,28.3
D,2,30.1

The bug is using head -n 10 when eleven lines are needed: the header counts as one of the ten lines, so only nine data rows are printed instead of ten. Teaches how to account for header lines when counting with head and how to use wc -l to verify the actual line count of the output.
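
A sketch of the corrected count and the wc -l check, run on a copy of headcount.csv in a scratch directory (subset.csv and the variable names are illustrative):

```shell
#!/usr/bin/env bash
# Illustrative sketch: recreate headcount.csv in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/headcount.csv" <<'EOF'
site,reading,value
A,1,42.3
A,2,38.9
A,3,55.1
B,1,31.7
B,2,29.4
B,3,33.8
C,1,47.2
C,2,51.0
C,3,44.6
D,1,28.3
D,2,30.1
EOF

# 1 header line + 10 data rows = 11 lines.
head -n 11 "$workdir/headcount.csv" > "$workdir/subset.csv"

# Verify: total lines, then data rows with the header skipped.
total_lines=$(wc -l < "$workdir/subset.csv")
data_rows=$(tail -n +2 "$workdir/subset.csv" | wc -l)

echo "lines: $total_lines, data rows: $data_rows"
```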

Unquoted $@

Run this script with a filename that contains a space. Does it process the file, or does it report an error about a non-existent file?

#!/usr/bin/env bash
# Usage: quotedall.sh file1 file2 ...
# Count the lines in each file passed as an argument.

for f in $@
do
    wc -l "$f"
done

The bug is using $@ without quotes, which causes word-splitting so each space-separated token is treated as a separate argument and the script fails for any argument that contains a space. Teaches the difference between $@ and "$@": the quoted form preserves each argument as a single token, even if it contains spaces.
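
The difference can be checked by counting loop iterations. In this sketch two hypothetical functions stand in for the script, since "$@" behaves the same way inside a function as it does in a script:

```shell
#!/usr/bin/env bash
# Illustrative sketch: count loop iterations with and without quotes around $@.
count_unquoted() {
    local n=0 f
    for f in $@          # each argument is split again at spaces
    do
        n=$((n + 1))
    done
    echo "$n"
}

count_quoted() {
    local n=0 f
    for f in "$@"        # each argument stays in one piece
    do
        n=$((n + 1))
    done
    echo "$n"
}

count_unquoted "field notes.txt" data.txt   # 3
count_quoted "field notes.txt" data.txt     # 2
```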

find Missing -type f

Create a directory whose name ends in .log (e.g., mkdir debug.log). Now run this script. Does the count include the directory?

#!/usr/bin/env bash
# Count all log files (and only log files) under the current directory.

find . -name "*.log" | wc -l

The bug is that -name "*.log" matches any filesystem entry, not just regular files, so directories whose names end in .log are included in the count. Teaches the use of -type f to restrict find results to regular files and -type d to restrict to directories.
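
A sketch of the two counts side by side, assuming a scratch directory with one log file, one directory named debug.log, and one log file inside that directory (all names invented for illustration):

```shell
#!/usr/bin/env bash
# Illustrative sketch: a directory and two regular files whose names end in .log.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir debug.log
touch app.log debug.log/old.log

# -name alone matches the directory debug.log as well as the two files.
all_entries=$(find . -name "*.log" | wc -l)

# -type f keeps only regular files.
files_only=$(find . -name "*.log" -type f | wc -l)

echo "all entries: $all_entries"   # 3
echo "files only:  $files_only"    # 2
```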

cp Without -r

Run this script. Does it copy the directory, or does it produce an error message? What does the error message say?

#!/usr/bin/env bash
# Back up the entire results/ directory to backup/results/.

cp results/ backup/results/

The bug is calling cp without the -r flag, which is required to copy a directory and its contents recursively, so the script copies nothing; GNU cp, for example, reports that -r was not specified and that it is omitting the directory. Teaches the difference between copying a file and copying a directory, and how to read cp --help (or man cp) to find the right flag.
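
A sketch of the recursive copy, assuming a small invented results/ tree in a scratch directory and a backup/ directory that does not yet contain results/. The trailing slashes are left off here to keep the behaviour the same across cp implementations:

```shell
#!/usr/bin/env bash
# Illustrative sketch: a small results/ tree and an empty backup/ directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p results/run1 backup
echo "sample output" > results/run1/out.txt

# -r copies the directory and everything beneath it.
cp -r results backup/results

# List the regular files that arrived in the backup.
find backup -type f
```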