The previous article in this series introduced a simple security model based on authentication, authorization, and access control, then described how DrProject implements the first of these. That still leaves two important pieces, though: how to represent who's allowed to do what, and how to enforce those rules.

The key concept here is that of a permission, which combines a user profile (e.g., an account) with the capability to perform some action. A simple authorization system associates every operation with a capability, such as EDIT_WIKI_PAGE or DELETE_USER_ACCOUNT. When a user U attempts to perform an operation that requires a capability C, the authorization system looks for the pair (U,C) in its database (Figure 1). If that pair exists, the operation is allowed; if it doesn't, the operation is denied.

But what about the thing the user is operating on? A filesystem can't simply say that Alan is allowed to view files; it has to make that decision on a file-by-file basis. Should permissions therefore be subject-verb-object triples? We thought about doing this in DrProject, but decided against it. There might actually be situations in which we'd want to control which wiki pages a particular user could edit, but we couldn't think of any. Plus, the fact that we're saving old versions of pages means that if anyone ever does something we don't like, we can undo it [1].

DrProject's authorization database does actually contain triples, though, and the reason is that each installation may support multiple projects. I'll discuss this in more detail in a future posting, but essentially, each project is a logically separate entity, with its own independent collection of wiki pages. DrProject therefore has to record (U,P,C) triples, where U is the user, P is the project, and C is the capability. The table that stores wiki pages also has an extra column to show which project a page belongs to (Figure 2).

We could stop here; DrProject's predecessor Trac did [2].
However, if an installation has 20 projects and 60 users, and six capabilities are required to define what can be done to the wiki, the system administrator potentially has to manage 7200 distinct permissions (20 × 60 × 6). Figuring out who can do what (or worse, who was able to do what, when) quickly becomes an administrative nightmare.

In practice, permissions are usually granted in a few stereotypical ways. Someone who can create pages can almost always edit and delete them; someone who can't create user accounts usually can't change their settings, either.

DrProject makes this concrete by using a level of indirection in its authorization system. A role is a uniquely named (and possibly empty) set of atomic capabilities. Each user has a role with respect to each project. The number of distinct roles is surprisingly small: the DrProject we're using to manage undergraduate projects this term only has three, and I'd be surprised if any system ever needed more than a dozen.

Implementing roles is a little bit trickier than it first seems. To see why, think about how the system defines what someone who hasn't yet logged in is able to do. We represent this by creating a user profile for the anonymous user, and then defining the anonymous user's role with respect to each project. That part is simple. What isn't simple (or at least, it didn't appear simple to me) is that a user's actual permissions must be the union of the permissions defined by her role (if any) with respect to a project, and the permissions that the anonymous user has for that project. If this weren't true, a user could accomplish more by logging out than by staying logged in.

The last thing we have to do with roles is decide which of them count as "being a member" of a project. People use this phrase all the time, but flattening a complex collection of capabilities into a Boolean decision is actually quite hard.
To jump ahead a bit, it seems sensible to say that only the members of a project are automatically on that project's mailing list, but does someone whose role allows them to file tickets, but not to update or work on those tickets, get mail? What about someone who's allowed to view and comment on the wiki, but not allowed to create new pages? We played around with various rules, and finally decided that the simplest thing was to throw out the idea of "membership" per se. Instead, if a role has any capability in the set {MAIL_POST, MAIL_VIEW, MAIL_DELETE}, messages sent to the project mailing list will be forwarded to users with that role.

After all this, access control is fairly straightforward to implement. Every time it receives an HTTP request, DrProject constructs a request object that stores references to a user profile and a project descriptor (Figure 3). The user and project IDs are used to look up a role; that role's capabilities are merged with those that the anonymous user has with respect to that project to create a description of what the user is allowed to do. Every method that requires a capability then checks that its capability is in that set, and throws a permission exception if it's not.

Next time: tickets.
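
For readers who want to see the moving parts, here is a minimal Python sketch of the scheme described in this article: roles as named sets of capabilities, one role per user per project, the union with the anonymous user's role, and a per-method capability check. It is not DrProject's actual code. The table names (ROLES, USER_PROJECT_ROLE), the role names, the WIKI_VIEW capability, the Request class, the PermissionDenied exception, and the requires decorator are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from functools import wraps

ANONYMOUS = "anonymous"  # hypothetical ID of the anonymous user's profile

# Hypothetical roles: each is a uniquely named, possibly empty, set of capabilities.
ROLES = {
    "none": frozenset(),
    "viewer": frozenset({"WIKI_VIEW"}),
    "developer": frozenset({"WIKI_VIEW", "WIKI_EDIT", "MAIL_VIEW", "MAIL_POST"}),
}

# Hypothetical (user, project) -> role name table.
USER_PROJECT_ROLE = {
    (ANONYMOUS, "compilers"): "viewer",
    ("ghopper", "compilers"): "developer",
    (ANONYMOUS, "secret"): "none",
}

MAIL_CAPABILITIES = frozenset({"MAIL_POST", "MAIL_VIEW", "MAIL_DELETE"})


class PermissionDenied(Exception):
    """Raised when a request lacks the capability a method requires."""


def role_capabilities(user, project):
    """Capabilities granted by the user's own role in the project (empty if none)."""
    role = USER_PROJECT_ROLE.get((user, project))
    return ROLES[role] if role is not None else frozenset()


def effective_capabilities(user, project):
    """Union of the user's role and the anonymous user's role for the project."""
    return role_capabilities(user, project) | role_capabilities(ANONYMOUS, project)


@dataclass
class Request:
    """Stand-in for the per-request object: a user, a project, and what is allowed."""
    user: str
    project: str
    capabilities: frozenset = field(init=False)

    def __post_init__(self):
        self.capabilities = effective_capabilities(self.user, self.project)


def requires(capability):
    """Decorator: refuse to run the handler unless the request has the capability."""
    def decorate(handler):
        @wraps(handler)
        def wrapper(request, *args, **kwargs):
            if capability not in request.capabilities:
                raise PermissionDenied(
                    f"{request.user} lacks {capability} on {request.project}"
                )
            return handler(request, *args, **kwargs)
        return wrapper
    return decorate


@requires("WIKI_EDIT")
def edit_wiki_page(request, page, text):
    return f"{request.user} edited {page} in {request.project}"


def gets_project_mail(user, project):
    """Mail goes to users whose own role holds any mail capability."""
    return bool(role_capabilities(user, project) & MAIL_CAPABILITIES)
```

In this sketch a logged-in user with no role in a project still ends up with whatever the anonymous user can do there, so logging out never grants more than staying logged in. Computing the capability set once per request, rather than inside each method, is a design choice of the sketch rather than something the article specifies.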
[1] Well, almost. DrProject's wiki does allow users to completely erase all record of a page from the database. If a malicious or clumsy user does this, the only way to recover the page is from the last system backup. We may get around this in future by associating an "existence" flag with each page to signal whether it should be treated as being there or not, but this is a pretty minor concern.

[2] Trac was actually a little smarter than this, but only a little. Trac managed authorization by storing raw (user, capability) pairs, such as (ghopper, WIKI_EDIT). Both fields were strings; to determine a user's actual capabilities, Trac recursively expanded the "raw" capabilities by looking for them in the USER column. For example, if ghopper had the capability ADMIN, and the "user" ADMIN had the capabilities WIKI_EDIT and WIKI_DELETE, then ghopper was given those capabilities as well. DrProject's strict two-level separation of users from roles, and roles from capabilities, has been a lot easier to administer, test, and debug.
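
The recursive expansion described in footnote [2] can be sketched roughly as follows. This is based only on the description above, not on Trac's source: the function name expand_capabilities, the RAW_PAIRS table, and the guard against cycles are assumptions for illustration.

```python
# Hypothetical raw (user, capability) pairs; both fields are plain strings,
# so a capability name such as ADMIN can also appear in the user column.
RAW_PAIRS = [
    ("ghopper", "ADMIN"),
    ("ADMIN", "WIKI_EDIT"),
    ("ADMIN", "WIKI_DELETE"),
]


def expand_capabilities(user, pairs):
    """Return every capability reachable from `user` by repeated lookup."""
    granted = {}
    for subject, capability in pairs:
        granted.setdefault(subject, set()).add(capability)

    expanded = set()
    pending = [user]
    seen = {user}  # guards against cycles such as (A, B), (B, A)
    while pending:
        subject = pending.pop()
        for capability in granted.get(subject, ()):
            expanded.add(capability)
            # Treat the capability as a possible "user" and expand it in turn.
            if capability not in seen:
                seen.add(capability)
                pending.append(capability)
    return expanded
```

With the sample table, expanding ghopper yields ADMIN, WIKI_EDIT, and WIKI_DELETE. Because users and capabilities share one namespace here, nothing stops the chain from being arbitrarily deep, which is one plausible reason the two-level scheme is easier to reason about.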