Intermediate Python

Reverse a sequence for display

Call the function and then print both the original list and the returned value. Has the original list changed?

def reversed_copy(data):
    """Return a reversed copy of data, leaving the original unchanged."""
    data.reverse()
    return data


if __name__ == "__main__":
    original = [1, 2, 3, 4, 5]
    result = reversed_copy(original)
    print(f"Result:   {result}")
    print(f"Original: {original}")


The bug is using list.reverse() (which mutates in place) instead of reversed() or slicing, so the original list is also reversed after the call.

Shows: aliasing and the difference between in-place and copy operations.

To find it: print id(original) before the call and id(result) after. If both print the same number, the function returned the same object rather than a copy, and reversing it also mutated the original. Equivalently, result is original evaluates to True.
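
One possible fix, as a minimal sketch: build the reversed list with slicing so the argument is never mutated. The function name is kept from the exercise; list(reversed(data)) would work just as well.

```python
def reversed_copy(data):
    """Return a reversed copy of data, leaving the original unchanged."""
    # A slice with step -1 builds a new list; data itself is not touched.
    return data[::-1]


original = [1, 2, 3, 4, 5]
result = reversed_copy(original)
print(f"Result:   {result}")
print(f"Original: {original}")
print(f"Same object? {result is original}")
```

With this version the two ids differ and the original list keeps its order.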

Track transactions for each bank account

Create two account objects, add different transactions to each, and then print the transaction history of each. Does each account show only its own transactions?

class BankAccount:
    history = []

    def __init__(self, owner):
        self.owner = owner

    def deposit(self, amount):
        self.history.append((self.owner, "deposit", amount))

    def withdraw(self, amount):
        self.history.append((self.owner, "withdrawal", amount))


if __name__ == "__main__":
    alice = BankAccount("Alice")
    bob = BankAccount("Bob")

    alice.deposit(100)
    bob.deposit(50)
    alice.withdraw(30)

    print(f"Alice's history: {alice.history}")
    print(f"Bob's history:   {bob.history}")


The bug is that history = [] is defined at class level, so all instances share the same list object instead of each having their own, and every account's transactions appear in every other account's history.

Shows: the difference between shared mutable class attributes and per-instance attributes initialized in __init__.

To find it: create two accounts, add one transaction to the first and a different one to the second, then print each account's history (alice.history and bob.history in the script above). If both lists contain both transactions, confirm the sharing with print(BankAccount.history is alice.history), which prints True.
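
A sketch of the usual fix, assuming the only change needed is where the list is created: move history into __init__ so each instance gets its own list.

```python
class BankAccount:
    def __init__(self, owner):
        self.owner = owner
        # Created once per instance, so accounts no longer share one list.
        self.history = []

    def deposit(self, amount):
        self.history.append((self.owner, "deposit", amount))

    def withdraw(self, amount):
        self.history.append((self.owner, "withdrawal", amount))


alice = BankAccount("Alice")
bob = BankAccount("Bob")
alice.deposit(100)
bob.deposit(50)
alice.withdraw(30)

print(f"Alice's history: {alice.history}")
print(f"Bob's history:   {bob.history}")
print(f"Shared list? {alice.history is bob.history}")
```

Each history now holds only its owner's transactions, and the identity check prints False.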

Check whether a running total hits a target

Run this script and examine the two computed values. Are they exactly equal? Try printing each value with many decimal places.

def running_total(amounts):
    """Return the total by accumulating amounts one at a time."""
    result = 0.0
    for a in amounts:
        result += a
    return result


def direct_total(amounts):
    """Return the total by summing all amounts at once."""
    return sum(amounts)


if __name__ == "__main__":
    amounts = [0.1] * 10   # mathematically equals 1.0

    t1 = running_total(amounts)
    t2 = direct_total(amounts)

    print(f"Running total: {t1!r}")
    print(f"Direct total:  {t2!r}")

    if t1 == 1.0:
        print("Running total is exactly 1.0")
    else:
        print(f"Running total is NOT exactly 1.0 (off by {abs(t1 - 1.0)!r})")

    if t1 == t2:
        print("Totals match.")
    else:
        print("Totals differ!")


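The bug is comparing floats with ==. 0.1 has no exact binary representation, so ten additions accumulate rounding error and running_total returns 0.9999999999999999, which is not == 1.0. Whether t1 == t2 holds depends on the Python version: from 3.12 on, sum() uses compensated summation for floats and returns 1.0 here, while earlier versions accumulate the same error as the loop.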

Shows: floating-point representation errors and how to use math.isclose.

To find it: print both values with 20 decimal places using f"{val:.20f}". The running total prints as 0.99999999999999988898 rather than 1.00000000000000000000, revealing the rounding error that makes == 1.0 return False.
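
A minimal sketch of the tolerant comparison, using math.isclose with its default relative tolerance (choosing a tolerance for real data is left as an assumption). math.fsum is shown as an alternative that avoids the accumulated error in the first place.

```python
import math

amounts = [0.1] * 10  # mathematically equals 1.0

total = 0.0
for a in amounts:
    total += a

print(f"{total:.20f}")             # 0.99999999999999988898
print(total == 1.0)                # False: exact comparison fails
print(math.isclose(total, 1.0))    # True: equal within the default rel_tol of 1e-09

# math.fsum tracks the low-order bits that ordinary addition drops.
exact = math.fsum(amounts)
print(exact == 1.0)                # True
```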

Scrape page titles from a list of URLs

Run the scraper with the provided URL list, which includes one malformed URL. Does it process all the valid URLs? Check whether anything is silently discarded.

from urllib.parse import urlparse


URLS = [
    "https://example.com/page1",
    "https://example.com/page2",
    "not-a-valid-url",
    "https://example.com/page4",
    "https://example.com/page5",
]


def fetch_title(url):
    """Return a simulated page title; raises ValueError for malformed URLs."""
    parsed = urlparse(url)
    if not parsed.scheme:
        raise ValueError(f"Invalid URL: {url!r}")
    return f"Title from {parsed.netloc}{parsed.path}"


def scrape_all(urls):
    """Fetch a title for each URL and return the list of results."""
    titles = []
    try:
        for url in urls:
            title = fetch_title(url)
            titles.append(title)
    except Exception:
        pass
    return titles


if __name__ == "__main__":
    results = scrape_all(URLS)
    valid = sum(1 for u in URLS if urlparse(u).scheme)
    print(f"Got {len(results)} titles (expected {valid}):")
    for t in results:
        print(f"  {t}")


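The bug is wrapping the whole fetch loop in try/except Exception: pass, presumably to tolerate network timeouts. The ValueError that fetch_title raises for a malformed URL is also caught and discarded, and because the try block encloses the loop rather than a single iteration, the scraper silently stops at the first malformed URL and never reaches the valid ones after it.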

Shows: how overly broad exception handlers swallow unrelated bugs, and how to use logging.exception to record errors instead of ignoring them.

To find it: add import logging and replace pass in the except block with logging.exception("caught"). Run the script again and read the log output. You will see the traceback of the ValueError raised by fetch_title for the malformed URL, proving that the except block was swallowing an exception it was never meant to handle.
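
One way the handler could be narrowed, as a sketch: put the try inside the loop so one bad URL costs one iteration, catch only ValueError, and log it. The skipped list in the return value is a hypothetical addition for illustration, not part of the exercise.

```python
import logging
from urllib.parse import urlparse

logging.basicConfig(level=logging.WARNING)

URLS = [
    "https://example.com/page1",
    "https://example.com/page2",
    "not-a-valid-url",
    "https://example.com/page4",
    "https://example.com/page5",
]


def fetch_title(url):
    """Return a simulated page title; raises ValueError for malformed URLs."""
    parsed = urlparse(url)
    if not parsed.scheme:
        raise ValueError(f"Invalid URL: {url!r}")
    return f"Title from {parsed.netloc}{parsed.path}"


def scrape_all(urls):
    """Return (titles, skipped); skipped is a hypothetical extra for reporting."""
    titles = []
    skipped = []
    for url in urls:
        try:
            titles.append(fetch_title(url))
        except ValueError:
            # Record the failure with its traceback instead of discarding it.
            logging.exception("Skipping %r", url)
            skipped.append(url)
    return titles, skipped


titles, skipped = scrape_all(URLS)
print(f"Got {len(titles)} titles, skipped {len(skipped)}")
```

Any exception other than ValueError now propagates, so unrelated bugs are no longer hidden.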

Find the top scorer in a spreadsheet export

Run the script with the provided CSV file and read the traceback. Which line in the file triggers the error? Examine that line carefully.

import sys


def top_scorer(filename):
    """Return the (name, score) pair with the highest score."""
    best_name = None
    best_score = -1
    with open(filename) as f:
        next(f)  # skip header
        for line in f:
            parts = line.strip().split(",")
            name = parts[0]
            score = int(parts[1])
            if score > best_score:
                best_score = score
                best_name = name
    return best_name, best_score


if __name__ == "__main__":
    filename = sys.argv[1] if len(sys.argv) > 1 else "commas.csv"
    name, score = top_scorer(filename)
    print(f"Top scorer: {name} ({score})")

name,score
Alice Johnson,95
Bob Smith,87
Martinez, Carlos,92
Wong, Jennifer,88
Diana Prince,79


The bug is that names containing a comma (e.g., "Martinez, Carlos") cause line.split(",") to produce three fields instead of two, so parts[1] holds the second half of the name rather than the score, and int() crashes with a ValueError.

Shows: why hand-rolled CSV parsing fails on real data and when to use the csv module.

To find it: read the ValueError message, which quotes the text that int() could not convert (' Carlos'), then open the CSV in a text editor and find the row containing it. Count the commas on that line: there are two, not one. The extra comma sits inside the name field, and line.split(",") cannot tell it apart from a field separator.
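
A sketch of the csv-module version. It assumes the export wraps names containing a comma in double quotes, which is what the csv module needs in order to keep them in one field; the SAMPLE text below is illustrative data in that assumed format.

```python
import csv
import io

# Assumed export format: fields containing a comma are double-quoted.
SAMPLE = '''name,score
Alice Johnson,95
Bob Smith,87
"Martinez, Carlos",92
"Wong, Jennifer",88
Diana Prince,79
'''


def top_scorer(lines):
    """Return the (name, score) pair with the highest score."""
    reader = csv.DictReader(lines)  # uses the header row for field names
    best = max(reader, key=lambda row: int(row["score"]))
    return best["name"], int(best["score"])


# The quoted name stays in a single field.
rows = list(csv.reader(io.StringIO(SAMPLE)))
print(rows[3])

print(top_scorer(io.StringIO(SAMPLE)))
```

If the names are not quoted in the file, the csv module splits them as well; in that case line.rsplit(",", 1) splits only at the last comma and keeps the name intact.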

Rank files by version number

Run the sort function and examine the output order. Where does file10 appear relative to file2?

def sorted_logs(filenames):
    """Return filenames sorted by their embedded sequence number."""
    return sorted(filenames)


if __name__ == "__main__":
    files = ["file10.txt", "file2.txt", "file1.txt", "file20.txt", "file3.txt"]
    result = sorted_logs(files)
    print("Sorted order:")
    for f in result:
        print(f"  {f}")
    print()
    print("Expected numeric order: file1, file2, file3, file10, file20")


The bug is calling sorted() without a key function, which compares the filenames as strings and places file10 before file2.

Shows: the difference between lexicographic and numeric sort order and how to write a key function that extracts the embedded integer so the files sort as file1, file2, file10.

To find it: run the sort and search for file10 in the output. It appears immediately after file1, before file2. That placement is the signature of alphabetical order, where "10" < "2" because "1" < "2" at the first character.
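
A minimal sketch of such a key function. sequence_number is a hypothetical helper name, and it assumes every filename contains exactly one run of digits that should drive the order.

```python
import re


def sequence_number(filename):
    """Hypothetical helper: return the first run of digits as an int."""
    match = re.search(r"\d+", filename)
    if match is None:
        raise ValueError(f"No sequence number in {filename!r}")
    return int(match.group())


def sorted_logs(filenames):
    """Return filenames sorted by their embedded sequence number."""
    # Comparing ints gives 2 < 10, where comparing strings gives "10" < "2".
    return sorted(filenames, key=sequence_number)


files = ["file10.txt", "file2.txt", "file1.txt", "file20.txt", "file3.txt"]
result = sorted_logs(files)
print(result)
```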

Process a pipeline of records twice

Run the script and look at both outputs. Does each one produce the values you expected?

def positive_numbers(data):
    """Yield each positive number from data."""
    return (x for x in data if x > 0)


if __name__ == "__main__":
    data = [-1, 2, -3, 4, -5, 6, -7, 8]

    gen = positive_numbers(data)
    total = sum(gen)
    count = sum(1 for _ in gen)

    print(f"Total: {total}")
    print(f"Count: {count}")

    if count > 0:
        print(f"Mean:  {total / count}")
    else:
        print("Cannot compute mean: generator was exhausted before count was taken")


The bug is that a generator is exhausted after one pass: sum(gen) consumes every value, so the second pass over the same gen in the count expression yields nothing and count is 0.

Shows: that generators are single-use iterators and when to use lists instead.

To find it: print(type(gen)) to confirm it is a generator. Then, on a fresh generator, call list(gen) twice and print both results. The first call returns [2, 4, 6, 8] and the second returns an empty list, proving the generator was exhausted on the first pass.
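
The exhaustion and one possible fix can be sketched together: materialize the values into a list once, then reuse the list as often as needed. This assumes the data is small enough to hold in memory.

```python
def positive_numbers(data):
    """Return a generator over the positive numbers in data."""
    return (x for x in data if x > 0)


data = [-1, 2, -3, 4, -5, 6, -7, 8]

# Demonstrate the problem: the second pass finds nothing left.
gen = positive_numbers(data)
first_pass = list(gen)
second_pass = list(gen)
print(first_pass)
print(second_pass)

# Fix: store the values in a list, which can be traversed repeatedly.
values = list(positive_numbers(data))
total = sum(values)
count = len(values)
print(f"Total: {total}, Count: {count}, Mean: {total / count}")
```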

Cache results of an expensive calculation

Call the cached function twice with the same positional argument but a different keyword argument each time. Do both calls return the correct result?

def memoize(func):
    """Cache the results of func calls."""
    cache = {}

    def wrapper(*args, **kwargs):
        key = args
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


@memoize
def power(base, exponent=2):
    print(f"  (computing {base}^{exponent})")
    return base ** exponent


if __name__ == "__main__":
    print(f"power(3)            = {power(3)}")
    print(f"power(3, exponent=3)= {power(3, exponent=3)}")
    print(f"power(3, 3)         = {power(3, 3)}")


The bug is that the cache key does not include all function arguments (e.g., ignores keyword arguments), so the decorator returns the same result for different inputs.

Shows: how to construct correct cache keys and test with varied inputs.

To find it: call the cached function twice with the same positional argument but different keyword arguments, e.g., power(3) then power(3, exponent=3). If both return the same value (9 here), add print(key) inside wrapper to show that the key is identical for both calls.
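
A sketch of a key that covers keyword arguments too. It assumes every argument is hashable; functools.wraps is added only to preserve the wrapped function's name and docstring.

```python
import functools


def memoize(func):
    """Cache the results of func calls, keyed on all arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Sorting the keyword items makes the key independent of keyword order.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


@memoize
def power(base, exponent=2):
    return base ** exponent


print(power(3))
print(power(3, exponent=3))
print(power(3, 3))
```

Note that power(3, 3) and power(3, exponent=3) still get separate cache entries, which costs one extra computation but never returns a wrong result. The standard library's functools.lru_cache also includes keyword arguments in its key.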

Extend a base class with new attributes

Create an instance of the subclass and try to access an attribute that is set in the parent's __init__. Does it exist?

class Animal:
    def __init__(self, name, sound):
        self.name = name
        self.sound = sound

    def speak(self):
        return f"{self.name} says {self.sound}"


class Dog(Animal):
    def __init__(self, name):
        self.tricks = []

    def learn_trick(self, trick):
        self.tricks.append(trick)

    def show_tricks(self):
        return f"{self.name} knows: {', '.join(self.tricks)}"


if __name__ == "__main__":
    dog = Dog("Rex")
    dog.learn_trick("sit")
    dog.learn_trick("shake")
    print(f"Tricks: {dog.tricks}")
    print(dog.speak())


The bug is forgetting super().__init__(), so the parent's __init__ is never called and required attributes are missing when a subclass method tries to use them.

Shows: Python's method resolution order and how to use super() correctly.

To find it: instantiate the subclass and immediately try to access a parent-defined attribute. The AttributeError names the missing attribute. Search the subclass __init__ for a call to super().__init__() — its absence is the cause.
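
A sketch of the repaired subclass. The sound value "Woof" is an assumption made for illustration, since the exercise never states what a Dog should pass to the parent.

```python
class Animal:
    def __init__(self, name, sound):
        self.name = name
        self.sound = sound

    def speak(self):
        return f"{self.name} says {self.sound}"


class Dog(Animal):
    def __init__(self, name):
        # Run the parent initializer first so name and sound exist.
        # "Woof" is an assumed value, not given in the exercise.
        super().__init__(name, sound="Woof")
        self.tricks = []

    def learn_trick(self, trick):
        self.tricks.append(trick)

    def show_tricks(self):
        return f"{self.name} knows: {', '.join(self.tricks)}"


dog = Dog("Rex")
dog.learn_trick("sit")
dog.learn_trick("shake")
print(dog.speak())
print(dog.show_tricks())
print([cls.__name__ for cls in Dog.__mro__])
```

The last line prints the method resolution order that super() follows: Dog, then Animal, then object.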

Write results to disk when processing fails

Run the script so that it raises an exception part-way through writing. Then open the output file. Does it contain complete data?

import os
import sys


def summarize(input_file, output_file):
    """Write uppercased non-blank lines from input_file to output_file."""
    out = open(output_file, "w")
    with open(input_file) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                raise ValueError(f"Unexpected blank line in {input_file!r}")
            out.write(line.upper() + "\n")
    out.close()


if __name__ == "__main__":
    input_file = sys.argv[1] if len(sys.argv) > 1 else "unclosed.txt"
    output_file = "unclosed_out.txt"

    try:
        summarize(input_file, output_file)
    except ValueError as e:
        print(f"Error: {e}")

    if os.path.exists(output_file):
        with open(output_file) as f:
            lines = f.readlines()
        print(f"Output has {len(lines)} line(s) — may be incomplete (buffer not flushed)")
        for line in lines:
            print(f"  {line!r}")

first line
second line

fourth line after blank
fifth line


The bug is using open() without a with statement. When an exception is raised midway through, out.close() is skipped, so data still sitting in the write buffer is not guaranteed to reach the disk and the output file can be left empty or partially written.

Shows: why context managers guarantee file cleanup even when exceptions occur, and how to use with open(…) as f to prevent data loss.

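To find it: run the script so that it raises an exception midway through writing, then open the output file in a text editor. Count the records. If fewer lines arrived than were written before the error, the buffer was never flushed, because the exception skipped the call to close(). On CPython the buffer may still be flushed later, when the abandoned file object is garbage-collected, but the code should not rely on that.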
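
A sketch of the same function with both files managed by one with statement. The temporary directory and its contents are set up only so the example runs on its own; the filenames match the exercise.

```python
import os
import tempfile


def summarize(input_file, output_file):
    """Write uppercased non-blank lines from input_file to output_file."""
    # Both files are closed, and the output flushed, even if the loop raises.
    with open(input_file) as f, open(output_file, "w") as out:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                raise ValueError(f"Unexpected blank line in {input_file!r}")
            out.write(line.upper() + "\n")


# Illustrative setup: a small input file with a blank third line.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "unclosed.txt")
dst = os.path.join(workdir, "unclosed_out.txt")
with open(src, "w") as f:
    f.write("first line\nsecond line\n\nfourth line after blank\n")

try:
    summarize(src, dst)
except ValueError as e:
    print(f"Error: {e}")

with open(dst) as f:
    written = f.readlines()
print(written)
```

The output is still incomplete, because processing stopped at the blank line, but every line written before the error is reliably on disk.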