Archive for April, 2010

The Chilling Effect of the GPL

April 12th, 2010

Distributed version control systems have finally passed my two-year test [1], so while I was at PyCon in February, I asked a few questions about what it would take to add a Mercurial repository browser to Basie. Two months and a couple of dozen email messages later, the answer seems to be that it can’t be done—at least, not unless we’re willing to use someone else’s definition of “freedom”.

Like Django itself, Basie uses the MIT License, which does little except disclaim liability. In particular, it allows people to create closed-source derivatives and extensions if they think there’s a market for them. Mercurial, on the other hand, uses the GNU General Public License (GPL), which requires that source remain open, and (crucially) that derived works also be GPL’d. The question is, if Basie dynamically loads a plugin module that interfaces with Mercurial, does that make Basie a derived work or not? Matt Mackall, one of Mercurial’s developers, believes that it does:

My position is that Mercurial extensions are very probably derived works as they are potentially (and generally) significantly more intimate with Mercurial internals than a typical library user, and thus are subject to the GPL.

Dirkjan Ochtman, another Mercurial developer, suggests a workaround:

So then the solution is to have a clearly defined plugin API on the other side (e.g. Basie in this case) that can serve multiple VCSs, and then the user (Basie) should be free from the hassles of the license on the other side (Mercurial).

But Matt doesn’t think this solves the problem:

I’m afraid that doesn’t work.

Let’s imagine I’m Evil Corporation and I want to build commercial product X that’s clearly a derived work of Mercurial. It contains secret sauce that would benefit the Mercurial community, but I don’t want to share it. I’ve written to the Mercurial folks and they say “sorry, no, you have to share”. So then I decide, hey, I can put some of my bits in a third package Y, and make it GPL, but explicitly disclaim the API so that product X containing the secret sauce can still do its thing without being disclosed.

The problem is package Y, as a derived work, is not allowed to weaken the license of X+Y. In short, there are no work-arounds, and attempts to make technical work-arounds will end up being fairly transparent to a jury. The GPL is intentionally designed such that sharing the secret sauce is the price of admission.

Unfortunately, all this means it’s also a problem for the good guys too.

Van Lindberg, a lawyer who has written a book on open source licensing, and who disputes the Free Software Foundation’s maximalist interpretation of the GPL, disagrees:

[long discussion elided]

Relative to Greg’s specific use case, a couple principles:

1. Every situation is fact specific.

2. The GPL goes only as far as copyright does. Someone with a first GPL’d program (e.g., Mercurial) can only enforce GPL licensing on a second program if the second program is a derivative work of the first, giving the first program’s author a copyright interest in the second program.

2. A plugin or extension /may be/ a derived work of the host program. In this case, Matt’s contention that plugins/extensions to Mercurial are derived works may be true in general.

3. It is legally and factually improbable that a host program would ever be considered a derivative work of a plugin. Rehashing of the legal analysis available on request – the key here is that a derivative work according to the statute is based upon a “preexisting” work.

4. A plugin that bridges between two separate host programs may be a derivative work of both host programs. However, due to the principles above, the derivative works relationship is probably not transitive from one host, to the plugin, to the second host. Note that this would not necessarily hold if the plugin architecture was a transparent dodge around GPL compliance, and there were other soft factors (such as bundling the plugin together with the host, addressing them as a single program, etc) that indicated as such.

5. Even in the event that principle #4 is incorrect, forcing the user to separately download and install the plugin would make the end user the person who directed and created the combined work, a position that is allowed under the GPL.

Therefore, a separate VCS plugin that allows Basie to interact with Mercurial could be a derivative work of both Basie and Mercurial and thus carry GPL licensing. However, this plugin should be 1) not bundled with Basie; 2) not necessary for Basie’s operation; 3) part of a clean extension API, preferably with multiple plugin implementations; and 4) clearly referred to as a separate program from Basie. This position would be strengthened if Basie adopted the position I advocate for Mercurial above, that plugins not be considered derivative works of Basie.

This leaves me in an uncomfortable position: I don’t feel qualified to disagree with a lawyer, but I also don’t feel qualified to disagree with one of the authors of the software I’m hoping to use. The only options I can see are:

  1. build the plugin anyway and dare Mercurial’s developers to make an issue of it (which would be pretty rude);
  2. give up (which rankles because so many developers are switching to Mercurial or Git, and I don’t want to be left behind); or
  3. change the license on Basie (which rankles because I don’t think I should have to close off options on someone else’s say-so).

George Orwell argued that the real purpose of censorship is to make people worry about what writing something troublesome might cost them, so that they never actually write anything that would have to be censored. Every time someone puts the GPL on something, they are forcing other developers to make the same kind of decision: accept someone else’s idea of what “free” means, or run the risk of not being able to use their “free” software without a whole lot of trouble [2]. Having to make that choice rankles too…

[1] Having watched dozens of bandwagons roll by, I’m not interested in a technology unless it’s still hot two years after first hitting my radar. Students have been telling me since at least 2008 that it’s time to put Subversion in the closet and start using Git or Mercurial; I’d still be happier if there were a clear leader, so that I knew which horse to back, but I’ll accept now that DVCSs aren’t going to subside into nicheness like Java applets, XSLT, and cowboy rap bands.

[2] In subsequent discussion, Matt, Van, and Dirkjan have suggested workarounds, such as shelling out to the Mercurial command-line client and parsing its text output, or reimplementing the Mercurial communication protocol. Both would be exactly the kind of “trouble” to build and maintain that free/open software was supposed to save us from.
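To make the first of those workarounds concrete, here is a minimal sketch of shelling out to the `hg` client and parsing what it prints. It assumes `hg` is on the PATH; the function names `parse_log` and `recent_changesets` and the tab-separated template are illustrative choices for this sketch, not anything Basie or Mercurial prescribes.

```python
import subprocess

# Assumed output format for this sketch: three tab-separated fields,
# one changeset per line. Mercurial expands the \t and \n escapes itself.
TEMPLATE = r"{node|short}\t{author}\t{desc|firstline}\n"


def parse_log(text):
    """Turn the templated output into a list of dicts, skipping malformed lines."""
    changesets = []
    for line in text.splitlines():
        parts = line.split("\t", 2)
        if len(parts) == 3:
            node, author, summary = parts
            changesets.append({"node": node, "author": author, "summary": summary})
    return changesets


def recent_changesets(repo_path, limit=10):
    """Run `hg log` in a separate process and parse its standard output."""
    result = subprocess.run(
        ["hg", "log", "-R", repo_path, "-l", str(limit), "--template", TEMPLATE],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if hg exits with an error
    )
    return parse_log(result.stdout)
```

Even this small example hints at the maintenance cost: the parser depends entirely on the template it asked for, and every new query (diffs, file listings, annotations) needs its own output format and its own parsing code.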

Update: Fog Creek’s Benjamin Pollack tells me that their lawyers agree with Matt and Dirkjan: anything linking to Mercurial was going to have to be GPL’d, so all the components they have built to do that are too:

In Kiln, to avoid running afoul of the GPL, while still making a viable commercial product, what we did was to design the product so that the closed-source browsing/repository management/code review component was kept entirely separate from the open-source code storage component. The former’s written in C#; the latter’s in Python. The storage component links with and is backed by Mercurial, is fully GPL’d, and provides a VCS-agnostic REST API to create and destroy repositories, get file listing and contents, get history, and get diffs. The front-end browsing/management solution has nothing more than an abstract concept of a GUID-identified repository that can be asked for file contents, directory listings, diffs, etc. To ensure that there was no way that the closed-source web site could be considered a derived work, we even whipped up proof-of-concepts where the backend was powered by Git instead of Mercurial. The design can be frustrating, but we think it’s legally sound, and adheres pretty well to the spirit of the GPL.
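The shape of that separation can be sketched in a few lines of Python. This is only an illustration of a front end that knows nothing but an abstract, ID-addressed repository: the names `RepositoryBackend`, `InMemoryBackend`, and `render_listing` are hypothetical, the in-memory backend stands in for a real storage service, and nothing here reproduces Kiln’s actual API. It also says nothing about the legal question, since Kiln’s components are separate programs talking over REST rather than classes in one process.

```python
from abc import ABC, abstractmethod


class RepositoryBackend(ABC):
    """Hypothetical VCS-agnostic interface; the front end sees only this."""

    @abstractmethod
    def list_files(self, repo_id):
        """Return the paths in the repository identified by repo_id."""

    @abstractmethod
    def file_contents(self, repo_id, path):
        """Return the contents of one file as a string."""


class InMemoryBackend(RepositoryBackend):
    """Stand-in backend for this sketch; a real one would wrap a VCS."""

    def __init__(self, repos):
        self._repos = repos  # {repo_id: {path: contents}}

    def list_files(self, repo_id):
        return sorted(self._repos[repo_id])

    def file_contents(self, repo_id, path):
        return self._repos[repo_id][path]


def render_listing(backend, repo_id):
    """Front-end code: depends on the interface, never on a particular VCS."""
    lines = []
    for path in backend.list_files(repo_id):
        size = len(backend.file_contents(repo_id, path))
        lines.append(f"{path} ({size} chars)")
    return "\n".join(lines)
```

Swapping in a second backend class without touching `render_listing` is the in-process analogue of the Git-powered proof of concept described above.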

Uncategorized

Sigma Xi Lecture in Toronto: Managing Without Growth

April 12th, 2010

Managing Without Growth: Slower by Design, Not Disaster
Prof. Peter Victor
Environmental Studies
York University

Wednesday, April 14, 2010, 1:00pm
Senior Combination Room, Trinity College
6 Hoskin Avenue
University of Toronto

Economic growth is the over-arching policy objective of governments worldwide. Yet its long-term viability is increasingly questioned because of environmental impacts and impending and actual shortages of energy and material resources. Furthermore, rising incomes in rich countries bear little relation to gains in happiness and well-being. Growth has not eliminated poverty, brought full employment or protected the environment. Results from a simulation model of the Canadian economy suggest that it is possible to have full employment, eradicate poverty, reduce greenhouse gas emissions and maintain fiscal balance without economic growth. It’s time to turn our attention away from pursuing growth and towards specific objectives more directly relating to our well-being and that of the planet.

All students, faculty, and the general public are welcome.

Uncategorized

On the Failure of Inquiry-Based Teaching

April 12th, 2010

The full title of Kirschner, Sweller, and Clark’s paper is “Why Minimal Guidance During Instruction Does Not Work: An Analysis of the Failure of Constructivist, Discovery, Problem-Based, Experiential, and Inquiry-Based Teaching”. It was published in Educational Psychologist in 2006, but the whole text is available online. If you’re a parent, teacher, or student, it’s well worth a read. From the abstract:

Although unguided or minimally guided instructional approaches are very popular and intuitively appealing…these approaches ignore both the structures that constitute human cognitive architecture and evidence from empirical studies over the past half-century that consistently indicate that minimally guided instruction is less effective and less efficient than…approaches that place a strong emphasis on guidance of the student learning process. The advantage of guidance begins to recede only when learners have sufficiently high prior knowledge to provide “internal” guidance.

A few selections from the main body:

Minimally guided instruction appears to proceed with no reference to the characteristics of working memory, long-term memory, or the intricate relations between them. The result is a series of recommendations that most educators find almost impossible to implement…because they require learners to engage in cognitive activities that are highly unlikely to result in effective learning. As a consequence, the most effective teachers may either ignore the recommendations or, at best, pay lip service to them. (pg. 76)

Inquiry-based instruction requires the learner to search a problem space for problem-relevant information. All problem-based searching makes heavy demands on working memory. Furthermore, that working memory load does not contribute to the accumulation of knowledge in long-term memory because while working memory is being used to search for problem solutions, it is not available and cannot be used to learn… The consequences of requiring novice learners to search for problem solutions using a limited working memory or the mechanisms by which unguided or minimally guided instruction might facilitate change in long-term memory appear to be routinely ignored. The result is a set of differently named but similar instructional approaches requiring minimal guidance that are disconnected from much that we know of human cognition. (pg. 77)

None of [this] would be important if there was a clear body of research…indicating that unguided or minimally guided instruction was more effective than guided instruction. In fact…the reverse is true. Controlled experiments almost uniformly indicate that when dealing with novel information, learners should be explicitly shown what to do and how to do it. (pg. 79)

After a half-century of advocacy associated with instruction using minimal guidance, it appears that there is no body of research supporting the technique. In so far as there is any evidence from controlled studies, it almost uniformly supports direct, strong instructional guidance rather than constructivist-based minimal guidance during the instruction of novice to intermediate learners. Even for students with considerable prior knowledge, strong guidance while learning is most often found to be equally effective as unguided approaches. Not only is unguided instruction normally less effective; there is also evidence that it may have negative results when students acquire misconceptions or incomplete or disorganized knowledge. (pg. 83)

There are well over a hundred references into the literature. I’m going to have to think hard about how I’ve been teaching, and how I should teach. I’m also going to keep an ear open when my daughter starts junior kindergarten to see whether her teachers are basing their methods on evidence or wishful thinking.

Learning

Professors *Can* Teach Open Source

April 12th, 2010

Over at opensource.com, Red Hat’s Greg DeKoenigsberg has a post about a new collaboratively-authored textbook on open source software aimed squarely at undergrad courses. (I blogged about the initial announcement a couple of weeks ago.) As Máirín Duffy points out in the first comment, it’s very code-centric, but in my experience, that’s the right approach: students won’t be ready for discussion of design until they’re proficient in coding [1]. I’m looking forward to borrowing lots from the book for Software Carpentry.

[1] This is, by the way, why I believe that attempts to teach “computational thinking” without first teaching programming are doomed to fail, but that’s a rant for another time.

Uncategorized

Perpetuating Imbalance?

April 12th, 2010

On March 24, a post appeared on the Code Anthem blog titled “Don’t Judge a Developer by Open Source”. Since it starts by saying that the authors are big fans of 37Signals, I skipped over it (I’m not), but when links to it started appearing elsewhere, I went back to have a read. The post’s thesis is that judging developers by looking at their open source contributions is a bad idea. I’ve been doing that for several years (and telling my students that they should contribute to open projects in order to get noticed), so I expected to disagree with the post, but that’s proving hard. In order, the author’s points are:

  1. It’s an arbitrary distinction.
  2. There are smarter ways to spend your time.
  3. Requiring open source contributions is sexist.

The first is moot, and the second is arguable, but the third hits home. Open source is overwhelmingly male: depending on how you count, only 1-2% of open source developers are women, compared to 12-15% in the industry as a whole [1]. That means that if open source is your selection pool, in the long run you’re going to drive the proportion of women in programming down.

My “solution” is to address the underlying imbalance by evening up gender ratios in open source, but (a) that’s going to take a long time (particularly because so many men in open source still refuse to acknowledge that there’s even a problem to address) and (b) even the way I’ve phrased it is a sign that I’m reluctant to admit the problem too. As another poster says elsewhere:

If you insist on a lot of experience in a particular male-dominated sub-culture as a prerequisite for a job, that reads as “we prefer [a subset of] men, basically, or at least people willing to work hard to minimise all the ways in which they aren’t [part of the subset of] men” even if you didn’t intend it to and even if you didn’t want it to.

I hope that course projects like those in UCOSP will prove to be a workable middle ground, i.e., a place where young programmers can build their portfolios and reputations without having to worry that some crank is going to be allowed to sneer, bully, or troll without being held accountable. We hope to know soon whether we’ll be able to run the program again this fall…

[1] The article’s 28% is much higher than any number I’ve ever seen quoted elsewhere, and the source the article cites doesn’t cite an original source itself.

Equity, Learning

Another Software Carpentry Update

April 11th, 2010

From the last week and a half:

Lots of decisions to make in not very much time—as always, feedback and input would be appreciated.

Software Carpentry

PSF Membership

April 9th, 2010

I am pleased to announce that I am now a member of the Python Software Foundation. I’m flattered to have been nominated, and grateful to everyone who didn’t point out that I’m still using 2.6 :-)

Python

Summer School on Mining Software Repositories

April 8th, 2010

http://msrcanada.org/school/

June 9-12, 2010.
Queen’s University, Kingston, ON, Canada.
Sponsored by MITACS.

The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. It has gained popularity since 2004 with the first instance of the MSR workshop (now conference) and continues to be one of the fastest growing fields in the area of software engineering.

This summer school will provide students with opportunities to learn the background needed to excel in this emerging and important field. For researchers, the summer school offers a platform to discuss and collaborate on the future of the MSR field. The summer school is also an opportunity for industry to learn how to adopt MSR ideas in practice. The speakers are leading experts on MSR from academia and industry.

Lecturers

  • Tim Menzies, West Virginia University, USA
  • Audris Mockus, Avaya Labs Research, USA
  • Tao Xie, North Carolina State University, USA
  • Ahmed Hassan, Queen’s University, Canada
  • Daniel German, University of Victoria, Canada
  • Thomas Zimmermann, Microsoft Research, USA

For topics and a schedule visit the school’s web-page at http://msrcanada.org/school/.

Announcements, Research

How Do You See Maps?

April 7th, 2010

Back in the 1990s, I did a bit of volunteer work with the Canadian National Institute for the Blind, and one of the things I learned was that computers often make life even harder for people whose lives are already hard enough. Remember when classified ads went online? It was several years before screen readers like JAWS caught up, which meant that for several years, finding a job or an apartment was even harder for the visually impaired than it had been. And just when things had settled down, AJAX appeared and broke screen readers again.

Another recent(ish) development that has made life harder for the visually impaired is the increased use of maps on the web. One of my grad students, Alecia Fowler, is trying to address that problem by finding out how best to describe maps to people who can’t see them. If you’re willing to give her 30 minutes of your time, please head over to http://www.cs.utoronto.ca/~aleciaf/maps/ and give her little “game” a try.

Research

Communication Matters Most

April 6th, 2010

Tania Samsonova has posted an interesting article discussing the importance of communication skills to job success for junior developers. Drawing on the work of people like Andrew Begel and Beth Simon (who are contributing a chapter to our upcoming book on evidence-based software engineering), Tania talks about how the ability to ask questions and share ideas is a lot more important than specific technical skills. I particularly like this quote:

- Anna, congratulations: your understanding of spoken English improved a lot.

- How do you know? You rarely talk to me anyway.

- You’ve stopped smiling and nodding all the time when people talk to you.

Making Software