Basic Python

Calculate a running average over a time series

Run this code with a small list (for example, five elements and a window size of three) and count the windows by hand. Does the number of windows the code returns match the number you counted?

def sliding_windows(data, k):
    """Return all sliding windows of size k over data."""
    return [data[i : i + k] for i in range(len(data) - k)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    k = 3
    windows = sliding_windows(data, k)
    print(f"Windows: {windows}")
    print(f"Got {len(windows)}, expected {len(data) - k + 1}")

Show explanation

The bug is range(len(data) - k) instead of range(len(data) - k + 1), so the last window is never produced.

Shows: how to identify off-by-one errors in index arithmetic and how to verify boundary conditions with small, hand-checkable examples.

To find it: call the function with data = [1, 2, 3, 4, 5] and k = 3, then list every expected window by hand — [1,2,3], [2,3,4], [3,4,5] — giving three windows. Print len(range(len(data) - k)) to see it returns 2. The mismatch between 3 and 2 reveals the off-by-one.
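
The hand check above can be turned into a short script. This is a minimal sketch of one possible correction; the name sliding_windows_fixed is hypothetical and not part of the original exercise.

```python
def sliding_windows_fixed(data, k):
    """Return all sliding windows of size k over data (hypothetical corrected version)."""
    # A list of n items has n - k + 1 windows of size k, so the range needs the + 1.
    return [data[i : i + k] for i in range(len(data) - k + 1)]


data = [1, 2, 3, 4, 5]
k = 3
windows = sliding_windows_fixed(data, k)
print(windows)                            # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(len(windows) == len(data) - k + 1)  # True
```

Checking the boundary case where k equals len(data) is also worthwhile: there should be exactly one window, the whole list.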

Add up totals from a text file

Run this script with the provided input file. Does it print an average, or does it stop with an error? If it stops, read the error message carefully.

import sys


def average_scores(filename):
    """Return the average of numeric scores stored one per line."""
    total = 0
    count = 0
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line:
                total = total + line
                count += 1
    return total / count


if __name__ == "__main__":
    filename = sys.argv[1] if len(sys.argv) > 1 else "catadd.txt"
    print(f"Average: {average_scores(filename)}")

Show explanation

The bug is adding each line to the total without converting it to a number first (total = total + line). Because total starts as the integer 0 and line is a string, Python raises a TypeError on the first non-blank line, so the script never prints an average.

Shows: the difference between string + and numeric +, and how to check the type of a value at runtime using type() or isinstance().

To find it: read the traceback, which points at total = total + line and reports unsupported operand types for +: 'int' and 'str'. Then print type(line) just before that statement. You will see <class 'str'>, not <class 'int'>, which shows that the text read from the file was never converted to a number.
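
A minimal sketch of the conversion step, assuming the scores may be decimals (hence float). To stay self-contained it takes a list of lines rather than a filename; the name average_scores_fixed and the sample values are hypothetical.

```python
def average_scores_fixed(lines):
    """Return the average of numeric scores given as text lines (hypothetical helper)."""
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()
        if line:
            total += float(line)  # convert the text to a number before adding
            count += 1
    return total / count


sample_lines = ["90\n", "80\n", "\n", "70\n"]  # made-up scores, one per line
print(type(sample_lines[0].strip()))           # <class 'str'>
print(average_scores_fixed(sample_lines))      # 80.0
```

Like the original, this sketch divides by zero if the input contains no scores.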

Validate a user registration form

Call the validation function with several passwords, including some you expect to be rejected. Does it reject all of them?

def is_valid_password(password):
    """Return True if password is at least 8 characters and contains a digit."""
    has_length = len(password) >= 8
    has_digit = any(c.isdigit() for c in password)
    if not has_length and not has_digit:
        return False
    return True


if __name__ == "__main__":
    tests = [
        ("abc",      False),  # too short, no digit      — should be rejected
        ("abcdefgh", False),  # long enough but no digit — should be rejected
        ("abc1",     False),  # has digit but too short  — should be rejected
        ("abcdefg1", True),   # valid: long enough and has a digit
    ]
    for password, expected in tests:
        result = is_valid_password(password)
        status = "OK  " if result == expected else "FAIL"
        print(f"{status} is_valid_password({password!r}) = {result} (expected {expected})")

Show explanation

The bug is joining the two conditions with and instead of or, so a password is rejected only when both checks fail at once. A password that fails just one check is accepted even though it is invalid.

Shows: how Boolean logic errors cause silent misbehavior, and how a small truth table reveals which operator is correct.

To find it: write a small truth table covering a password that satisfies one condition but not the other, say, correct length but no digit. With and, the test not has_length and not has_digit evaluates to False and True, which is False, so the function skips the rejection and returns True. With or, the same row evaluates to True and the password is rejected as intended.
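
The truth table can also be generated rather than written by hand. The sketch below compares the rejection test joined with and against the same test joined with or; the helper name truth_table is hypothetical.

```python
from itertools import product


def truth_table():
    """Return rows of (has_length, has_digit, rejected_with_and, rejected_with_or)."""
    rows = []
    for has_length, has_digit in product([False, True], repeat=2):
        rejected_with_and = not has_length and not has_digit
        rejected_with_or = not has_length or not has_digit
        rows.append((has_length, has_digit, rejected_with_and, rejected_with_or))
    return rows


for row in truth_table():
    print(row)
```

The two operators disagree only on the rows where exactly one condition holds, which are the rows the test cases in the script mark as failures.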

Convert coordinates between units

Call the conversion function with a coordinate you can verify by hand—for example, 1 degree, 30 minutes, 0 seconds should equal 1.5 decimal degrees. Does the function return the correct value?

def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 60


if __name__ == "__main__":
    tests = [
        (1, 30, 0,    1.5),   # 1°30'0"  = 1.5°
        (0, 0,  3600, 1.0),   # 0°0'3600" = 1.0°
        (45, 15, 30,  45.2583333),
    ]
    for deg, mins, secs, expected in tests:
        result = dms_to_decimal(deg, mins, secs)
        print(f"{deg}°{mins}'{secs}\" = {result:.7f}  (expected {expected})")

Show explanation

The bug is dividing seconds by 60 instead of 3600 (a misremembered formula), so any input with a nonzero seconds value is converted incorrectly.

Shows: how to verify formulas against known values (e.g., 1°30′0″ = 1.5°) and how to add assertion checks for values that must fall within a known range.

To find it: call dms_to_decimal(0, 0, 3600), which represents exactly one degree expressed entirely in seconds. The correct result is 1.0; if the function divides seconds by 60, it returns 60.0, making the wrong constant visible without needing a reference table.
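
The known-value checks can be written as assertions. This is a sketch of one possible correction; dms_to_decimal_fixed is a hypothetical name, and math.isclose is used because floating-point results should not be compared with ==.

```python
import math


def dms_to_decimal_fixed(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to decimal degrees (hypothetical corrected version)."""
    # There are 60 minutes in a degree and 3600 seconds in a degree.
    return degrees + minutes / 60 + seconds / 3600


# Known values that can be verified by hand.
assert math.isclose(dms_to_decimal_fixed(1, 30, 0), 1.5)
assert math.isclose(dms_to_decimal_fixed(0, 0, 3600), 1.0)
print(f"{dms_to_decimal_fixed(45, 15, 30):.7f}")  # 45.2583333
```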

Sort a list of squared numbers

Run this script and examine what it prints. Is the list what you expected?

def squared_sorted(numbers):
    """Return a sorted list of squares of the input numbers."""
    squared = [x**2 for x in numbers]
    squared = squared.sort()
    return squared


if __name__ == "__main__":
    result = squared_sorted([3, 1, 4, 1, 5, 9, 2, 6])
    print(f"Result: {result}")

Show explanation

The bug is calling list.sort() (which sorts in place and returns None) and assigning its return value back to the variable, so the function always returns None.

Shows: that many list methods mutate in place and return None.

To find it: print squared immediately after the assignment squared = squared.sort(). The output None is the return value of list.sort(), not the sorted list.
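
The difference between the in-place method and the built-in sorted() can be seen side by side. A small sketch with made-up input; the variable names are illustrative only.

```python
numbers = [3, 1, 4]

squared = [x**2 for x in numbers]
in_place_result = squared.sort()           # sorts squared itself and returns None
new_list = sorted(x**2 for x in numbers)   # builds and returns a new sorted list

print(in_place_result)  # None
print(squared)          # [1, 9, 16]
print(new_list)         # [1, 9, 16]
```

Either call squared.sort() without assigning its result, or use sorted() and keep the list it returns.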

Count runs of repeated lines in a file

Run this script with the provided input file and examine the run-length counts. Then trace through the loop by hand with a short example—track what the "previous line" variable holds at each step.

import sys


def count_runs(filename):
    """Return a list of (value, count) pairs for runs of identical lines."""
    with open(filename) as f:
        lines = [line.rstrip("\n") for line in f]

    runs = []
    prev = ""
    run_count = 0
    i = 0
    while i < len(lines):
        line = lines[i]
        if line != prev:
            if run_count > 0:
                runs.append((prev, run_count))
            run_count = 0
        run_count += 1
        i += 1
    prev = line

    if run_count > 0:
        runs.append((prev, run_count))
    return runs


if __name__ == "__main__":
    filename = sys.argv[1] if len(sys.argv) > 1 else "indent.txt"
    for value, count in count_runs(filename):
        print(f"{count} x {value!r}")

apple
apple
apple
banana
banana
cherry

Show explanation

The bug is that the variable storing the previous line is updated outside (after) the while loop body due to a missing level of indentation, so every line is counted as starting a new run.

Shows: how indentation governs control flow in Python and how to step through a loop mentally to find where state is updated at the wrong time.

To find it: add print(f"prev={prev!r}") as the first line inside the loop body. Run with the sample file and watch the printed value — it never changes, which means the update is happening outside the loop rather than at the start of each iteration.
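
For comparison, here is a sketch in which the previous-line variable is updated inside the loop. It is not the original script: it uses a for loop over an in-memory list instead of reading a file, and the name count_runs_fixed is hypothetical.

```python
def count_runs_fixed(lines):
    """Return (value, count) pairs for runs of identical items (hypothetical corrected version)."""
    runs = []
    prev = ""
    run_count = 0
    for line in lines:
        if line != prev:
            if run_count > 0:
                runs.append((prev, run_count))
            run_count = 0
        run_count += 1
        prev = line  # updated on every iteration, inside the loop body

    if run_count > 0:
        runs.append((prev, run_count))
    return runs


sample = ["apple", "apple", "apple", "banana", "banana", "cherry"]
print(count_runs_fixed(sample))  # [('apple', 3), ('banana', 2), ('cherry', 1)]
```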

Filter negative numbers from a list

Call the function and print its return value. Is it what you expected?

def filter_negatives(numbers):
    """Return a new list containing only the non-negative values."""
    result = []
    for n in numbers:
        if n >= 0:
            result.append(n)


if __name__ == "__main__":
    data = [-3, 1, -1, 4, -1, 5, -9, 2, -6]
    filtered = filter_negatives(data)
    print(f"Result:   {filtered}")
    print("Expected: [1, 4, 5, 2]")

Show explanation

The bug is a missing return statement: the function builds the result but does not return it, so it returns None.

Shows: that Python functions return None by default and how to spot missing return in control flow.

To find it: print the function's return value directly, for example print(filter_negatives([1, -2, 3])). Seeing None identifies a missing return; read the function body to confirm that it builds result but never returns it.
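
A short sketch of both behaviours: a function that falls off the end returns None, while one with an explicit return hands back the list it built. The names builds_but_does_not_return and filter_negatives_fixed are hypothetical.

```python
def builds_but_does_not_return(numbers):
    """Build a list but forget to return it, so the caller receives None."""
    result = [n for n in numbers if n >= 0]


def filter_negatives_fixed(numbers):
    """Return a new list containing only the non-negative values."""
    result = []
    for n in numbers:
        if n >= 0:
            result.append(n)
    return result  # the statement that was missing


data = [-3, 1, -1, 4, -1, 5, -9, 2, -6]
print(builds_but_does_not_return(data))  # None
print(filter_negatives_fixed(data))      # [1, 4, 5, 2]
```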

Count word frequencies in a document

Run this script with the provided input file. Read the full traceback carefully. Which line raises the error, and what does the error message tell you about what is missing?

import sys


def count_words(filename):
    """Return a dictionary mapping each word to its frequency."""
    counts = {}
    with open(filename) as f:
        for line in f:
            for word in line.split():
                counts[word] += 1
    return counts


if __name__ == "__main__":
    filename = sys.argv[1] if len(sys.argv) > 1 else "nokey.txt"
    counts = count_words(filename)
    for word, count in sorted(counts.items(), key=lambda x: -x[1])[:5]:
        print(f"{word}: {count}")

to be or not to be that is the question
whether tis nobler in the mind to suffer
the slings and arrows of outrageous fortune
or to take arms against a sea of troubles

Show explanation

The bug is incrementing counts[word] without first checking whether the key exists, so the function crashes with a KeyError on the first new word it encounters.

Shows: defensive dictionary access using dict.get(key, 0) or collections.defaultdict, and how to read a KeyError traceback to identify the missing key.

To find it: run the script on the sample file and read the traceback from bottom to top. The last line shows KeyError: 'to', naming the exact key that was missing ('to' is the first word in the sample file). The lines just above it show counts[word] += 1, which is where the crash happened: the key was incremented before it had been created.
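
The two defensive approaches named above can be sketched as follows. Both helpers take a list of words rather than a filename so the example is self-contained; their names are hypothetical.

```python
from collections import defaultdict


def count_words_get(words):
    """Count words using dict.get with a default of 0."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_words_defaultdict(words):
    """Count words using defaultdict(int), which creates missing keys with value 0."""
    counts = defaultdict(int)
    for word in words:
        counts[word] += 1
    return dict(counts)


words = "to be or not to be that is the question".split()
print(count_words_get(words)["to"])                             # 2
print(count_words_get(words) == count_words_defaultdict(words)) # True
```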

Match user IDs loaded from a config file

Run this script with a user ID taken directly from the JSON file. Does it grant access? Use type() to examine the types of the two values being compared.

import json
import sys


def is_allowed(user_id, allowed_file):
    """Return True if user_id appears in the allowed list in allowed_file."""
    with open(allowed_file) as f:
        data = json.load(f)
    return user_id in data["allowed_ids"]


if __name__ == "__main__":
    user_id = sys.argv[1] if len(sys.argv) > 1 else "42"
    result = is_allowed(user_id, "streq.json")
    print("Access granted." if result else "Access denied.")

{
    "allowed_ids": [1, 42, 100, 256]
}

Show explanation

The bug is that the JSON file stores IDs as integers but the login ID arrives as a string from user input, and "42" != 42 in Python, so the script always reports "access denied" even for valid users.

Shows: how JSON types map to Python types and why type conversion must happen explicitly at system boundaries.

To find it: print type(user_id) and type(data["allowed_ids"][0]) side by side. You will see <class 'str'> and <class 'int'>; that mismatch explains why the in test, which compares elements with ==, never finds a match.
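
A minimal sketch of converting at the boundary, assuming IDs are always integers in the JSON file. It parses the JSON from a string instead of a file so it runs on its own, and the name is_allowed_fixed is hypothetical.

```python
import json


def is_allowed_fixed(user_id, allowed_ids):
    """Return True if user_id, given as text, matches one of the integer IDs."""
    try:
        return int(user_id) in allowed_ids  # convert once, where the text enters
    except ValueError:
        return False  # text that is not a number cannot match an integer ID


data = json.loads('{"allowed_ids": [1, 42, 100, 256]}')
user_id = "42"
print(type(user_id), type(data["allowed_ids"][0]))    # <class 'str'> <class 'int'>
print(user_id in data["allowed_ids"])                 # False
print(is_allowed_fixed(user_id, data["allowed_ids"])) # True
```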

Compute factorials recursively

Call the function with the argument 0. Does it return the correct result?

def factorial(n):
    """Return n! for non-negative n."""
    if n > 0:
        return n * factorial(n - 1)


if __name__ == "__main__":
    for n in [5, 3, 1, 0]:
        print(f"{n}! = {factorial(n)}")

Show explanation

The bug is a missing base case: the function only handles n > 0, so when n is 0 it falls off the end and returns None. Calling it with zero therefore yields None instead of 1, and calling it with any positive number raises a TypeError when the recursion reaches zero and tries to multiply an integer by None.

Shows: how to identify missing or incorrect base cases in recursion.

To find it: call factorial(0) directly and print the result, which is None. Then read the condition: n > 0 is False when n is 0, and there is no other return statement, so nothing is returned. Adding an explicit base case that returns 1 when n is 0 fixes it.
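
One possible corrected version, as a sketch: it gives zero an explicit base case that returns 1. The name factorial_fixed and the check for negative input are additions for illustration, not part of the original exercise.

```python
def factorial_fixed(n):
    """Return n! for non-negative n (hypothetical corrected version)."""
    if n < 0:
        raise ValueError("n must be non-negative")  # added guard, an assumption
    if n == 0:
        return 1  # explicit base case
    return n * factorial_fixed(n - 1)


for n in [5, 3, 1, 0]:
    print(f"{n}! = {factorial_fixed(n)}")
```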

Collect results with a helper function

Call the function twice in a row without passing the result argument and compare the two return values. Does the second call start from an empty list?

def collect_items(new_items, result=[]):
    """Append new_items to result and return it."""
    for item in new_items:
        result.append(item)
    return result


if __name__ == "__main__":
    first = collect_items(["a", "b"])
    print(f"First call:  {first}")    # ['a', 'b']

    second = collect_items(["c"])
    print(f"Second call: {second}")

    print(f"First again: {first}")

Show explanation

The bug is a mutable default argument (def f(result=[])), so every call starts with leftover items from previous calls.

Shows: Python's mutable default argument trap and why None is the correct default.

To find it: call collect_items() twice without passing result and print both return values on the same line, for example print(collect_items(["a"]), collect_items(["b"])). Both values contain the items from both calls, proving that the two calls share the same underlying list.
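
The usual way around the trap is to default to None and create the list inside the function. A minimal sketch, with collect_items_fixed as a hypothetical name:

```python
def collect_items_fixed(new_items, result=None):
    """Append new_items to result and return it, using a fresh list by default."""
    if result is None:
        result = []  # a new list is created on every call that omits result
    for item in new_items:
        result.append(item)
    return result


first = collect_items_fixed(["a", "b"])
second = collect_items_fixed(["c"])
print(first)   # ['a', 'b']
print(second)  # ['c']
```

Default values are evaluated once, when the def statement runs, which is why a list written directly in the signature is shared between calls.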

Compare lines read from a file

Run this script with the provided input file. Use repr() on a field value that fails to match its expected string. Does the repr() output reveal anything that was not visible before?

import sys


def find_value(filename, target):
    """Return the value for the matching region name in a pipe-delimited file."""
    with open(filename) as f:
        for line in f:
            parts = line.rstrip("\n").split("|")
            if len(parts) >= 2:
                region = parts[0]
                value = parts[1].strip()
                if region == target:
                    return value
    return None


if __name__ == "__main__":
    filename = sys.argv[1] if len(sys.argv) > 1 else "trailing.txt"
    for target in ["North", "South", "East", "West"]:
        result = find_value(filename, target)
        if result is None:
            print(f"{target!r}: not found")
        else:
            print(f"{target!r}: {result}")

North |142
South|98
East |201
West|77

Show explanation

The bug is that some rows pad the region field with a trailing space and the code compares that field without stripping it, so the comparison fails for those rows even though the values look correct to the naked eye.

Shows: that real-world data often contains invisible characters, and how .strip() and repr() help diagnose string comparison failures.

To find it: print repr(region) for a row that fails the comparison. You will see the trailing space, e.g., 'North ', which is invisible in a normal print but visible in the repr() output.
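
The effect of repr() and .strip() can be seen on a single line from the sample file. A small sketch; the variable names are illustrative.

```python
line = "North |142\n"
parts = line.rstrip("\n").split("|")
region = parts[0]

print(region)                     # looks like North; the trailing space is invisible
print(repr(region))               # 'North '
print(region == "North")          # False
print(region.strip() == "North")  # True
```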

Clean up user-submitted text

Call the function and compare its return value with the original message. Has anything been replaced?

def censor(message, banned_words):
    """Replace each banned word in message with '***'."""
    for word in banned_words:
        message.replace(word, "***")
    return message


if __name__ == "__main__":
    text = "The quick brown fox jumps over the lazy dog"
    banned = ["quick", "lazy"]
    result = censor(text, banned)
    print(f"Result:   {result}")
    print("Expected: The *** brown fox jumps over the *** dog")

Show explanation

The bug is that str.replace returns a new string and the return value is never assigned back, so the original message is unchanged at the end.

Shows: that string methods never mutate their argument, and that every string transformation must be captured in a variable.

To find it: print the return value of censor next to the original message. If the two are identical, the result of each transformation was discarded. Check that the value returned by str.replace is assigned back to a variable.
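
A sketch of the corrected loop, in which each result of str.replace is assigned back to the variable. The name censor_fixed is hypothetical.

```python
def censor_fixed(message, banned_words):
    """Replace each banned word in message with '***' (hypothetical corrected version)."""
    for word in banned_words:
        message = message.replace(word, "***")  # keep the new string that replace returns
    return message


text = "The quick brown fox jumps over the lazy dog"
result = censor_fixed(text, ["quick", "lazy"])
print(result)  # The *** brown fox jumps over the *** dog
print(text)    # the caller's original string is unchanged
```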

Remove expired entries from a list

Run this script and count how many items were removed. Is it the number you expected? Try with a list where every element should be removed.

def remove_negatives(numbers):
    """Remove all negative numbers from the list in place and return it."""
    for n in numbers:
        if n < 0:
            numbers.remove(n)
    return numbers


if __name__ == "__main__":
    data = [-1, -2, 3, -4, 5]
    result = remove_negatives(data)
    print(f"Result:   {result}")
    print("Expected: [3, 5]")

Show explanation

The bug is modifying a list while iterating over it, which causes the loop to skip the element that follows each removed item.

Shows: why mutating a collection during iteration causes unpredictable behavior.

To find it: run with numbers = [-2, -4, -6, -8], which should remove all four elements. Print len(numbers) after the call; you will get 2 instead of 0. Trace the first iteration: removing numbers[0] shifts numbers[1] into position 0, which the loop then skips on its next step.
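
One way to avoid the problem is to build the filtered list first and then write it back in a single step. This is a sketch rather than the only fix; remove_negatives_fixed is a hypothetical name, and slice assignment is used so the caller's list is still updated in place.

```python
def remove_negatives_fixed(numbers):
    """Remove all negative numbers from the list in place and return it."""
    # The comprehension reads the whole list before the slice assignment changes it.
    numbers[:] = [n for n in numbers if n >= 0]
    return numbers


data = [-1, -2, 3, -4, 5]
result = remove_negatives_fixed(data)
print(result)          # [3, 5]
print(result is data)  # True

all_negative = [-2, -4, -6, -8]
print(len(remove_negatives_fixed(all_negative)))  # 0
```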