class: slide-title
Software Design by Example
Serving Web Pages
---

## The Problem

- Uploading and downloading files (Chapter 21) is useful, but we want to do more
- Don't want to create a new protocol for every interaction
- Use a standard protocol in a variety of ways

---

## HTTP

- Hypertext Transfer Protocol (HTTP) specifies what kinds of messages clients and servers can exchange and how those messages are formatted
- Client sends a request as text over a socket connection
- Server replies with a response (also text)
- Requests and responses may carry (non-textual) data with them
- *Server can respond to requests however it wants*

---

## HTTP Requests

- A method such as `GET` or `POST`
- A URL
- A protocol version

```txt
GET /index.html HTTP/1.1
```

---

## Headers

- Requests may also have headers

```txt
GET /index.html HTTP/1.1
Accept: text/html
Accept-Language: en, fr
If-Modified-Since: 16-May-2023
```

- A key can appear any number of times

---

## HTTP Response

- Protocol and version
- A status code and phrase
- Headers, possibly including `Content-Length` (in bytes)
- Blank line followed by content

```txt
HTTP/1.1 200 OK
Date: Thu, 16 June 2023 12:28:53 GMT
Content-Type: text/html
Content-Length: 53

<html>
<body>
<h1>Hello, World!</h1>
</body>
</html>
```

---

## Requests

```py
import requests

response = requests.get("http://third-bit.com/test.html")
print("status code:", response.status_code)
print("content length:", response.headers["content-length"])
print(response.text)
```

```
status code: 200
content length: 103
<html>
  <head>
    <title>Test Page</title>
  </head>
  <body>
    <p>test page</p>
  </body>
</html>
```

---

## HTTP Lifecycle
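
The request/response exchange can be sketched without any networking. The helpers below are illustrative only: `parse_request` and `format_response` are hypothetical names, not part of any library, and the sketch assumes a well-formed request whose lines end with `\r\n`, as HTTP/1.1 messages do on the wire.

```py
def parse_request(text):
    """Split raw request text into method, URL, version, and headers."""
    # Everything before the first blank line is the request head.
    head, _, _ = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        key, _, value = line.partition(":")
        # A key can appear any number of times, so store a list per key.
        headers.setdefault(key.strip().lower(), []).append(value.strip())
    return method, url, version, headers


def format_response(status, phrase, body, content_type="text/html"):
    """Build the bytes of a response: status line, headers, blank line, body."""
    head = (
        f"HTTP/1.1 {status} {phrase}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("utf-8") + body


request = (
    "GET /index.html HTTP/1.1\r\n"
    "Accept: text/html\r\n"
    "Accept-Language: en, fr\r\n"
    "\r\n"
)
method, url, version, headers = parse_request(request)
reply = format_response(200, "OK", b"<h1>Hello, World!</h1>\n")
```

- Real servers such as `http.server` do this parsing for us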
---

## Basic HTTP Server

```py
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = """<html><body><p>test page</p></body></html>"""

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Send the same page no matter which URL was requested.
        content = bytes(PAGE, "utf-8")
        self.send_response(200)
        self.send_header(
            "Content-Type", "text/html; charset=utf-8"
        )
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)

if __name__ == "__main__":
    server_address = ("localhost", 8080)
    server = HTTPServer(server_address, RequestHandler)
    server.serve_forever()
```

---

## Running the Server

```sh
python basic_http_server.py
```

- Displays nothing until we go to `http://localhost:8080` in our browser
- Browser shows page
- Shell shows log messages

```
127.0.0.1 - - [16/Sep/2022 06:34:59] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [16/Sep/2022 06:35:00] "GET /favicon.ico HTTP/1.1" 200 -
```

---

## Serving Files

```py
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler
from pathlib import Path

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            # Map the URL path onto a file below the current directory.
            url_path = self.path.lstrip("/")
            full_path = Path.cwd().joinpath(url_path)
            if not full_path.exists():
                raise ServerException(f"{self.path} not found")
            elif full_path.is_file():
                self.handle_file(self.path, full_path)
            else:
                raise ServerException(f"{self.path} unknown")
        except Exception as msg:
            # Every failure is reported to the client in one place.
            self.handle_error(msg)
```

---

## Read and Reply

- Translate path in URL into path to local file
- Resolve paths relative to server's directory

```py
    def handle_file(self, given_path, full_path):
        try:
            # Read as bytes so that non-text files are not corrupted.
            with open(full_path, 'rb') as reader:
                content = reader.read()
            self.send_content(content, HTTPStatus.OK)
        except IOError:
            raise ServerException(f"Cannot read {given_path}")
```

---

## Handling Errors

```py
ERROR_PAGE = """\
<html>
  <head><title>Error accessing {path}</title></head>
  <body>
    <h1>Error accessing {path}: {msg}</h1>
  </body>
</html>
"""
```

```py
    def handle_error(self, msg):
        content = ERROR_PAGE.format(path=self.path, msg=msg)
        content = bytes(content, "utf-8")
        self.send_content(content, HTTPStatus.NOT_FOUND)
```

- Use `try`/`except` to handle errors in called methods
- Throw low, catch high
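
The code above uses `ServerException` and `send_content` without showing them. A minimal sketch of what they might look like follows, assuming `send_content` sends the same headers as the basic server's `do_GET`. The class name `RequestHandlerSketch` is hypothetical and stands in for the real handler class.

```py
from http import HTTPStatus


class ServerException(Exception):
    """Raised low in the call stack when a request cannot be satisfied."""


class RequestHandlerSketch:
    """Stand-in for RequestHandler; only the assumed helper is shown."""

    def send_content(self, content, status):
        # Assumption: the content type is always HTML, as in the basic server.
        self.send_response(int(status))
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)
```

- Helpers raise `ServerException` (throw low)
- Only `do_GET` catches it and turns it into an error page (catch high)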
---

## Problems

- Client can escape from our sandbox by asking for `http://localhost:8080/../../passwords.txt`
- `send_content` always says it is returning HTML with `Content-Type`
  - Should use things like `image/png` for images
- But we got character encoding right

---

## Test Case

- Want to write this

```py
def test_existing_path(fs):
    content_str = "actual"
    content_bytes = bytes(content_str, "utf-8")
    fs.create_file("/actual.txt", contents=content_str)
    handler = MockHandler("/actual.txt")
    handler.do_GET()
    assert handler.status == int(HTTPStatus.OK)
    assert handler.headers["Content-Type"] == \
        ["text/html; charset=utf-8"]
    assert handler.headers["Content-Length"] == \
        [str(len(content_bytes))]
    assert handler.wfile.getvalue() == content_bytes
```

---

## Combining Code

- Use multiple inheritance
---

## Mock Request Handler

```py
from io import BytesIO

class MockRequestHandler:
    def __init__(self, path):
        self.path = path
        self.status = None
        self.headers = {}
        self.wfile = BytesIO()  # captures the body instead of a socket

    def send_response(self, status):
        self.status = status

    def send_header(self, key, value):
        # A key can appear more than once, so keep a list of values.
        if key not in self.headers:
            self.headers[key] = []
        self.headers[key].append(value)

    def end_headers(self):
        pass
```

---

## Application Code

```py
from pathlib import Path

class ApplicationRequestHandler:
    def do_GET(self):
        try:
            url_path = self.path.lstrip("/")
            full_path = Path.cwd().joinpath(url_path)
            if not full_path.exists():
                raise ServerException(f"'{self.path}' not found")
            elif full_path.is_file():
                self.handle_file(self.path, full_path)
            else:
                raise ServerException(f"Unknown object '{self.path}'")
        except Exception as msg:
            self.handle_error(msg)

    # ...etc...
```

---

## Two Servers

```py
if __name__ == "__main__":
    # Real server: network plumbing plus application logic.
    class RequestHandler(
        BaseHTTPRequestHandler,
        ApplicationRequestHandler
    ):
        pass

    server_address = ("", 8080)
    server = HTTPServer(server_address, RequestHandler)
    server.serve_forever()
```

```py
# Test double: mock plumbing plus the same application logic.
class MockHandler(
    MockRequestHandler,
    ApplicationRequestHandler
):
    pass
```

---

## Summary
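
The two issues listed under Problems remain open. One possible way to address them is sketched below. This is not the chapter's code: `safe_path` and `content_type_for` are hypothetical helper names, and the sketch assumes the sandbox is a single root directory.

```py
import mimetypes
from pathlib import Path


def safe_path(root, url_path):
    """Return the local path for url_path, or None if it leaves root."""
    root = Path(root).resolve()
    # resolve() collapses ".." components before the comparison below.
    candidate = (root / url_path.lstrip("/")).resolve()
    if candidate == root or root in candidate.parents:
        return candidate
    return None


def content_type_for(path):
    """Guess a Content-Type from the file name, with a generic fallback."""
    guessed, _ = mimetypes.guess_type(str(path))
    return guessed or "application/octet-stream"
```

- `safe_path(root, "/../../passwords.txt")` returns `None`, so the handler can refuse the request
- `content_type_for("logo.png")` returns `image/png` instead of always claiming HTML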