Archive for August, 2006

Chris Lenz on Django

August 22nd, 2006

Chris Lenz has been a major contributor to Trac, and spent a month working on DrProject for us earlier this year.  His post in the wake of Guido’s endorsement of Django is a thoughtful look at the framework’s strengths and weaknesses.

Uncategorized

Revised List of Project Students

August 20th, 2006

I’ve updated my roster of project students (appears below the divide): 116 students, of whom 28 have done two projects. I still need to update my map, though, and I don’t have birthplaces for the following people:

  • Lillian Angel
  • Daniel Charles
  • Jim Clarke
  • Victor Glazer
  • John Greene
  • Robert Liu
  • Jason Nolan
  • Andrey Petrov
  • David Scannell
  • Jane Shen
  • Thuan Ta

Details would be welcome…

Uncategorized

Rome In Its Later Days

August 20th, 2006

A survey of 32 European countries, the US, and Japan reveals that the only people less willing to believe in evolution than Americans are those living in Turkey.

I’m surprised no one has cast this as a national security issue. America isn’t the world’s only superpower because its soldiers are braver than everyone else’s, or its politicians smarter, but because of its economic and technological supremacy. The former depends on the latter, which in turn depends on having a large pool of scientifically literate innovators. Lysenkoism and its cousins crippled the Soviet Union during the Cold War; will historians half a century from now point to the “equal rights for superstition” movement as one of the reasons America was surpassed by India and China?

Uncategorized

BarCamp Earth in Toronto

August 18th, 2006

BarCamp Earth is a worldwide set of simultaneous BarCamps, taking place August 25-27 to commemorate the one-year anniversary of the first-ever BarCamp.  Local editions will be held in Toronto, Waterloo, and Sudbury (and I still sometimes feel a pang that Vancouver isn’t local).  There are only 70 spaces in TO, so sign up early.

Uncategorized

When I Rule the World #173: Google’s Summer of Code

August 18th, 2006

The second edition of Google’s Summer of Code is winding down, so I figured it’d be a good time to post my (unsolicited ;-) ) thoughts on what they should do next time around. Here are my starting points:

  1. The world will be a better place if everyone studying software engineering at college or university has an opportunity to get involved in a real open source project.
  2. It’s no good to say, “If they want to volunteer, there’s nothing stopping them,” because the rational economic decision for an undergrad is to invest time in things that will earn them grades — the network benefits of involvement in open source are too fuzzy.
  3. Google can’t afford to give $5000 to 20,000 students a year. Even if they could, the open source community couldn’t handle 20,000 newbies every September.

The solution is to give students course credit for their work on open source projects, but of course, it’s not that simple. First, there has to be a quality filter: someone, somehow, has to filter out those students who would be a net drain, or open source projects will (sensibly) choose not to play. Second, someone has to evaluate the students; at most universities, this can only be done by people who have some official standing.

I think both issues can be resolved by bringing the ACM, the IEEE, and other professional bodies into the loop. Everybody wins:

  1. Students who want to get course credit for open source projects have to be student members of [name of professional body goes here], and have a GPA of [arbitrary threshold around a 'B' goes here]. If there are still too many candidates, it would be simple to restrict entry to the top N students in a one-day programming competition. The professional bodies win by getting more members; the students win (even if they don’t realize it right away) by becoming involved in the organization(s) that are going to shape their careers.
  2. Instead of giving money directly to a couple of hundred students, Google spends part of its money on a staff of 6-8 full-time matchmakers. Google wins by lowering its recruiting costs: as I discovered at EPCC and U of T, projects like this are the best interview tool there is. I predict that if Google takes my advice, a headhunting service called “Google Jobs” will launch within two years, and become self-supporting almost immediately.

That leaves the problem of assessment. Sadly, most computer science professors don’t know enough about modern software to be able to judge whether someone’s end-of-term patch is worth an A or an F. However, many open source projects have at least one committer with an advanced degree of some kind or another. With a bit of prodding from [name of professional body goes here], I could see at least a few universities giving these people adjunct appointments (which are usually unpaid).

Looking at this year’s sponsor organizations, I could see 12-15 groups being willing to give this a trial run. The open source organizations win by getting (a little) more work done, and (more importantly) by growing the pool of potential committers. The universities win because their programs look more relevant (an important issue in an era of declining enrolment), and because this sort of cross-campus evaluation gives them a good way to assess the strengths and weaknesses of those programs.

Would it work? I have no idea, but it’d be fun to try. The biggest hurdle to overcome is probably Google’s well-deserved belief that Summer of Code isn’t broken, and therefore doesn’t need to be fixed. There’s also the complementary problem of getting the attention of the ACM and other bodies, which are not famous for their sense of adventure. Getting universities and open source projects on board would also require some work, but probably not much. As for students, I think they’d have to be beaten off in droves ;-) .

Uncategorized

Oh My God It’s Django!

August 17th, 2006

Guido just pronounced: Django is the web framework

  • Won’t be part of the core, but will be as “standard” as PIL or NumPy
  • This was not what I expected the outcome of my talk would be, but hey, I’ll take it ;-)
  • He hopes that Django and TurboGears will converge

Eric Jones: Enthought Tool Suite

  • Enthought does open source software and consulting for scientific computing with Python
    • One of the sponsors of this conference
    • Providing hosting for Software Carpentry
    • Walked through traits and other offerings

Chris Mueller: Synthetic Programming with Python

  • A Python library for generating assembly code for the Power PC
    • Get rid of the many layers between Python and the hardware
  • Great performance for great effort (up-front about that)
  • Q: How do you debug? A: insert an illegal instruction, back up a few bytes, single-step in GDB
    • Interested in multi-language debugger, but interested in a lot of other things as well

Prabhu Ramachandran: 3D Visualization

  • Author of MayaVi, a better (more Pythonic) wrapper for VTK
  • Impressive — hides as much of the guff in complex 3D scientific visualization as it can

Andrew Straw: Realtime Computing with Python

  • The Grand Unified Fly: a computational simulation of a fruit fly
  • Use Python for high-level stuff, and real-time for sub-millisecond control of motors, etc.
  • Program $20 microcontrollers with Python
    • E.g., Flydra is a multi-headed camera+FireWire system to track fly motion

Lightning Talks

  • Mike Ressler: Prototyping Mid-Infrared Detector Data Processing Algorithms
    • Classic data crunching with NumPy
  • Brian Granger: The State of IPython
    • Sales pitch
  • Travis Oliphant: Array Interface BOF
    • Plea for people to help put together PEP for arrays in Python
  • Travis Vaught: Enstaller
    • Enhancements to Python Egg system (with GUI)
    • Worth tracking
  • Michel Sanner: The Current State of Vision
    • Update on a visual builder for image processing pipelines
    • Very cool — but lots of overlap (it seems) with MayaVi
  • Peter Wang: Quick Overview of Chaco
    • 2D plotting library
    • Repeat of slides from yesterday’s tutorial

William Stein: Software for Algebra and Geometry Experimentation

  • SAGE bundles together lots of other packages used for algebra and exact computation
  • Very particular about getting his whole half hour, despite the late hour ;-)

Alex Clemesha: Mathematica-like Plotting for SAGE

  • Slow cruise through SAGE’s graphics
  • As with web frameworks, Python has too many plotting packages for its own good

Diane Trout: BioHub

  • There’s a lot of sequence data out there
    • And collection is accelerating rapidly
  • BioHub is a Python interface for large-scale genomic analysis
    • A database to link diverse annotation sources
  • I didn’t know that genes have version numbers… ;-)

Greg Wilson: Software Carpentry

  • Last talk of the day — some locals had already headed home, but there were about 70 people present
  • Went well, but no one’s offering to teach the course at their institution this fall
  • Slides available online

Software Carpentry

SciPy’06: First Morning

August 17th, 2006

Guido van Rossum’s Keynote

  • Python 2.5 coming Real Soon (Sept 12)
  • Python 3000 is a brand-new revision of the language
    • Name chosen as a dig at Windows 2000, and so that it couldn’t possibly be late
  • Fix design bugs dating from 1990-91 + get rid of deprecated features
  • First time Guido has allowed himself to be backward incompatible
  • Need process, but don’t want to become C++ or the next Perl 6
  • Alpha early 2007, final a year later (early 2008)
  • Cares a lot about bringing users with him
    • Will go as far as 2.9 (run out of digits)
  • Changes:
    • New keywords allowed
    • dict.keys(), range(), zip() won’t return lists
    • All strings Unicode; mutable ‘bytes’ data type
    • Binary file I/O redesign
    • Drop <> as an alias for !=
    • Etc.
  • See PEP 3099 for things that won’t happen (e.g., programmable syntax)
  • Can’t do perfect mechanical translation (dynamic languages)
    • Use pychecker-like tool to handle 80% of cases
    • Create instrumented Python 2.x that warns about “doomed” constructs
  • See PEP 3100 for the laundry list
  • Small points
    • Kill classic classes
    • Exceptions must derive from BaseException
    • int/int will return a float
    • Remove last differences between int and long
    • Absolute import by default
    • Kill sys.exc_type and friends
    • Kill dict.has_key, file.xreadlines()
    • Kill apply(), input(), buffer(), coerce()
    • Kill ancient library modules; more stdlib cleanup
    • exec becomes a function again
    • Kill `x` in favor of repr(x)
    • Change except clause syntax to except E1, E2, E3 as err
      • Means “as” becomes a keyword
    • [f(x) for x in S] becomes sugar for list(f(x) for x in S)
      • General trend in Python away from lists toward more abstract structures
    • Kill raise E, arg in favor of raise E(arg)
    • zip becomes izip
  • lambda lives!
  • String types reform (bytes and str instead of str and unicode)
    • All data is either binary or text (conversions happen at I/O time)
    • Different APIs for binary and text streams
  • New standard I/O stack
    • C stdio has too many problems
    • Borrow from Java streams API (bleah)
  • Print becomes a function (boo)
    • See mailing list thread for justification
    • But I think that putting the output file at the end in print(x, y, file=z) is going to trip people up
  • Dict views instead of lists
    • dict.keys() and dict.items() return a set view
    • dict.values() will return a bag (multiset) view
    • Can delete from (but not add to) a view
      • Modifies the dict accordingly
  • Drop default implementations of comparison operators
    • <, <=, etc., currently compare by address --- will raise TypeError
    • == and != should remain (useful)
  • Generic and overloaded functions (see his blog — running out of time)
  • Python sprints coming up (Aug 21-24)
  • Q&A
    • Py3K team is smaller than Perl6 — GvR optimistic that people will get the work done
    • Taking advantage of multicore?
      • GvR not a big fan of threads
      • Prefers loose coupling (one process per core)
      • Last attempt to get rid of the GIL slowed Python down by 2X
      • But neither Jython nor IronPython has a GIL
    • Will C-Python API change much?
      • Yup — just like the language
    • PyPy/type inference?
      • Python 4.0 or a sibling language
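
For readers who want to see a few of the changes listed above in running code, here is a minimal sketch written against Python 3 as it was eventually released. It is not taken from the talk: all variable names are my own illustrations, and some details shifted between these 2006 plans and the release (for example, multiple exception types need parentheses when combined with "as"), so treat it as an illustration rather than a record of the slides.

```python
# Illustrative only: behaviour of released Python 3, not the 2006 plans verbatim.
import io

# int / int returns a float; // is the explicit floor division.
half = 1 / 2
floor_half = 1 // 2

# dict.keys() returns a live view rather than a list.
ages = {"ada": 36, "grace": 45}
keys = ages.keys()
ages["alan"] = 41                 # added after the view was created
key_view_is_live = "alan" in keys

# zip() returns a lazy iterator (the old izip behaviour).
pairs_list = list(zip("ab", [1, 2]))

# Text (str) and binary data (bytes) are separate types;
# conversion is explicit, typically at I/O boundaries.
text = "naïve"
data = text.encode("utf-8")
round_trip = data.decode("utf-8")

# New except clause syntax: "as" binds the exception instance.
try:
    int("not a number")
except (ValueError, TypeError) as err:
    caught = type(err).__name__

# print is a function; the output file is a keyword argument at the end.
buffer = io.StringIO()
print("x", "y", file=buffer)
printed = buffer.getvalue()

# Ordering comparisons no longer fall back to comparing addresses.
try:
    object() < object()
    ordering_allowed = True
except TypeError:
    ordering_allowed = False
```

Each result is stored in a variable rather than printed, so the snippet can be pasted into an interpreter and inspected one name at a time.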

Travis Oliphant on the State of NumPy

  • Chair thanked him for everything he’s done to fix numerical Python — standing ovation (well deserved)
  • NumPy 1.0 rc1 will be out in a few weeks
  • Walked through design — tradeoffs between flexibility, portability, and performance very well thought through
  • One part I enjoyed was the way he flipped back and forth between PowerPoint and the interpreter
    • Clear that for him, Python is a tool for thinking with
  • Showed off weave, an Enthought tool for embedding C in Python for array programming
  • Also showed off Pyrex (another tool for the same purpose)

Fernando Perez: Python for Modern Scientific Algorithm Development

  • “Why is Python more than ‘free MATLAB’?”
    • Power of built-in datatypes, higher-level programming, etc.

Michael Aivazis: “Building a Distributed Component Framework”

  • Described a medium-sized framework called pyre
    • 1200 classes, 75K lines of Python, 30K lines of C++
    • Has been running in various incarnations for almost ten years
  • Good discussion of architectural issues — perfect example of the kind of researcher I’d like Software Carpentry to produce

Software Carpentry

Ambient Tech Talks

August 16th, 2006

Ambient Vector has started a series of tech talks.  Their first guest was Nick Koudas, from U of T.  May seem like a small thing, but there haven’t been as many ties between the university and local software companies (other than giants like IBM) as there ought to be — I’m pleased to see that changing.

Uncategorized

The Trouble With Normal

August 16th, 2006

We were listening to Bruce Cockburn on the way up to the cottage last Friday.  “Lovers in a Dangerous Time”, “Rumours of Glory”, “Wondering Where the Lions Are”, and so many others are gorgeous, but the line that stuck in my head was, “The trouble with normal is it always gets worse.”

Amnesty International has been fighting that particular kind of entropy for a long time now.  A few weeks ago, they launched a new campaign to fight repression on the Internet. Having spent an hour on the phone trying to find out what I’m actually allowed to take on a plane these days (see Kung Fu Monkey for an accurate summary of how I feel about the current wave of “security theater”), I’d like to ask you to check out AI’s site.

Uncategorized

SciPy and Software Carpentry

August 16th, 2006

I’m flying down to Caltech this evening to give a talk on Software Carpentry at SciPy’06. There’s been a fair bit of traffic on the web site in the last couple of weeks, and I’m looking forward to hearing how else we can help scientists and engineers become more productive programmers.
Software Carpentry Usage

Total visits: 4354 in 15 days, from 731 distinct domains (excluding obvious robots).

Software Carpentry