Archive for the ‘Extensible Programming’ Category

The Larch Environment

February 9th, 2013

G.W. French, J.R. Kennaway, and A.M. Day: “Programs as visual, interactive documents.” Software – Practice and Experience (2013), DOI: 10.1002/spe.2182.

We present a novel approach to combined textual and visual programming by allowing visual, interactive objects to be embedded within textual source code and segments of source code to be further embedded within those objects. We retain the strengths of text-based source code, while enabling visual programming where it is beneficial. Additionally, embedded objects and code provide a simple object-oriented approach to adding a visual form of LISP-style macros to a language. The ability to freely combine source code and visual, interactive objects with one another allows for the construction of interactive programming tools and experimentation with novel programming language extensions. Our visual programming system is supported by a type coercion-based presentation protocol that displays normal Java and Python objects in a visual, interactive form. We have implemented our system within a prototype interactive programming environment called ‘The Larch Environment’.

This is cool: see their site for more information.

Extensible Programming

What Will Programming Look Like in 2020?

December 29th, 2012

Over on Lambda the Ultimate, Sean McDirmid has asked:

What will programming look like in 2020? Keep in mind that programming in 2012 mostly resembles programming in 2004, so could we even expect any significant changes 8 years from now in the programmer experience? Consider the entire programming stack of language, environment, process, libraries, search technology, and so on.

Most of the respondents believe things will look pretty much like they do today, while a few brave souls hope that more sophisticated type systems, AI-in-the-IDE, power-sensitive computing, and other things that are no longer science fiction will have gained ground. Personally, I think they’re all looking where the light is better. I think there will be big changes over the next seven years, though they won’t actually have moved into the mainstream by 2020: if we’re lucky, they’ll be where transactional memory is today (and if we’re unlucky, they’ll be marooned on the fringe with Prolog and literate programming).

Change #1: we’ll start treating programming language design as a usability problem, and use empirical techniques from HCI to decide how the “normal” bits of languages ought to be presented. Yes, truly novel language features have to be built a few times before it makes sense to do any comparative studies, but 90% or more of any programming language is stuff that’s been built many times before, so we actually can do A/B testing to see whether one way of presenting iteration is easier for people to master and debug than another. For my money, the PLATEAU conference series will be the place to be in 2020, just as MSR and CHASE are where the cool kids in software engineering hang out today. And no, this doesn’t have to be heavyweight or labor-intensive: I did a couple of field tests back in 2000 that took less than an hour each—personally, I’d like to see every enhancement to every programming language tried out this way as part of the proposal and review process.

Change #2: we’ll (finally) separate models from views in programs. Yes, I know, I’ve been predicting this since 2004, but I really do think someone will build a proper CAD system for software in the next two or three years. If you’re new to the concept, the idea is that we should separate the storage and presentation of programs, just as we separate the storage of the plan for a building (as a set of geometric “things” connected by constraints) from its presentation (as shaded isometric drawings, parts lists, wiring diagrams, and so on). Once we free ourselves from the legacy limitations of ASCII as both storage and presentation, we’ll be able to build intentional programming systems, which will, I believe, lead to an explosion in creative problem-solving the likes of which we haven’t seen since the first REPLs and spreadsheets appeared.

Both of these ideas are currently outside the mainstream of programming language research, i.e., they aren’t currently discussed on LtU with any frequency :-) . That leads to an interesting follow-on question: if we look back eight years to Ehud Lamm’s re-launch post, how many of the things discussed then have moved from discussion to implementation to adoption, and how many of the things that have done so were missing from those long-ago discussions?

Extensible Programming

I Have Seen the Future…

February 15th, 2012

…and its name is Bret Victor. (Jump ahead to 7:00 and watch for a couple of minutes if you need to be persuaded…)

Extensible Programming

Where’s My Shell?

November 30th, 2011

My first programming language was PL/1. (Look it up if you need to, kid.) My second was Pascal, and then in the summer of 1982 I was introduced to two more: C, and the Unix shell. I realized that C was a programming language right away, mostly because that’s what people said it was.  It took me longer to realize that the shell was also a programming language; I still remember the first time the sys admin responsible for our VAX 11/780 wrote a ‘for’ loop on the command line to try my program for each of several input files.

But now I’m programming on the web, which means I’m writing Javascript. OK, that’s the equivalent of C: full of traps for the unwary, but able to do quite a bit once you get your head around it.  What I want to know is, where’s my shell?  Where’s the tool that’s not quite as expressive (try manipulating tree-shaped data in the Bourne shell), and not as fast, but lets me do with two or three commands what would take half an hour to code and debug at the base level?

jQuery isn’t what I’m looking for (though I’m very, very grateful that it exists). jQuery isn’t a watertight layer of abstraction above Javascript: it’s leaky, in the sense that users still have to pay attention to Javascript-y things. (“Wait, I forgot to say ‘var’…”) CoffeeScript and other compile-to-JS languages are leaky too, but the shell is as close to being leak-free as any abstraction layer I can think of. In almost 30 years of nearly constant use, I’ve almost never had to think at the C level in order to debug a problem at the shell script level [1].

So what would a shell-like tool for in-browser programming look like? Well, I think it would look something like Microsoft PowerShell. There’d be lots of more-or-less orthogonal tools, each of which did one thing, but instead of communicating via streams of text, they’d send one another streams of objects. Those tools would probably be written in Javascript, and the shell might be as well (just as bash and its ilk are written in C), but users wouldn’t see that.  What they would see is that they could package combinations of tools into scripts that could then be used as tools in their own right. The syntax for simple operations would be much simpler than that of Javascript, at the cost of some expressive power, but that’s OK: 80/20 splits are a good thing.
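To make the "streams of objects" idea concrete, here is a minimal sketch in Python rather than Javascript. Everything in it is an assumption for illustration: the tool names (`list_files`, `where`, `select`, `pipeline`) and the file records are made up, and a real shell would add syntax on top. The point is only that tools pass structured records instead of text, and that a combination of tools is itself a tool.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, object]
Tool = Callable[[Iterable[Record]], Iterator[Record]]


def list_files() -> Iterator[Record]:
    """Hypothetical source tool: emits one record per (made-up) file."""
    yield {"name": "notes.txt", "size": 120}
    yield {"name": "photo.png", "size": 48000}
    yield {"name": "todo.txt", "size": 75}


def where(predicate: Callable[[Record], bool]) -> Tool:
    """Build a filter tool that keeps the records matching the predicate."""
    def tool(records: Iterable[Record]) -> Iterator[Record]:
        return (r for r in records if predicate(r))
    return tool


def select(*fields: str) -> Tool:
    """Build a tool that keeps only the named fields of each record."""
    def tool(records: Iterable[Record]) -> Iterator[Record]:
        return ({f: r[f] for f in fields} for r in records)
    return tool


def pipeline(*tools: Tool) -> Tool:
    """Package several tools into one, so that a script is itself a tool."""
    def tool(records: Iterable[Record]) -> Iterator[Record]:
        for t in tools:
            records = t(records)
        return iter(records)
    return tool


# A "script" built from two smaller tools.
text_file_names = pipeline(
    where(lambda r: str(r["name"]).endswith(".txt")),
    select("name"),
)

result = list(text_file_names(list_files()))
print(result)
```

Because each stage receives records rather than lines of text, no stage has to re-parse the output of the one before it.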

And hey, it would give us a chance to choose command names that are more mnemonic than ‘ls’ and ‘mv’… :-)

[1] And the times I did were my own fault, because one of the components in my pipe was a program I’d written myself, which didn’t fork properly, and—never mind, it’s not important.

Extensible Programming

Two Steps Forward, Two Steps Back?

November 14th, 2011

The November/December 2011 issue of IEEE Software has a good article by Markus Völter titled, “From Programming to Modeling - and Back Again”. In it, the author asks, “What’s the difference between programming and modeling? And should there be one?” His answer to the second question is no: instead of today’s sharp divide between describing the problem domain, and telling a computer what to do, we should use extensible environments that support a continuum between the two. I like his description of what’s wrong with things today: we shouldn’t use concrete syntax as a storage format. But as with so many other articles on extensible programming, I think he glosses over the biggest practical obstacle such systems face: debugging. An abstraction’s usefulness is limited by how fixable it is when it breaks; if you give me a way to program in pictures, in natural language, or in some domain-specific notation, but then require me to wade through automatically-generated spaghetti code to figure out why my description of what I want isn’t doing the right thing, you haven’t really helped me very much. I’m still trying to figure out how to reconcile this with the “pile of crap” problem I ranted about in my previous post, though. If my editor/compiler/debugger are all extensible, then everything I try to do will trip over installation and configuration snags…

Extensible Programming

Extensible Programming: A New Hope

September 16th, 2011

Back in 2004, I wrote an article for ACM Queue titled “Extensible Programming for the 21st Century”. In it, I argued that it was time for programming languages to break free from their textual shackles: we should separate models (the program’s content) from views and controllers (how that content is displayed), just as we have with CAD systems, word processors, and other tools that work with rich, highly-structured content.

Before I look at some recent signs of progress, let me back up and explain both the problem and why I think model-view separation is the solution. What I want is a programming “language” that can be extended in a wide variety of ways. That’s pretty conventional (every modern language lets people define new functions, create new types, overload operators, etc.), but I think that’s much too limited. I want to be able to embed a table in a program as a first-class object, so that instead of:

if x:
    if y:
        z = 0
    else:
        z = 1
else:
    if y:
        z = 2
    else:
        z = 3

I can write

z =               y
              True   False
    x  True     0      1
       False    2      3

Actually, scrap that: what I want is to be able to embed a full-blown spreadsheet in amongst other statements so that I can write:

def func(a, b):
    x = a > b
    y = something(a, b)
    z =                     y
                  True           False
        x  True   a + 1          b + 1
           False  (a + b) - 1    a + b
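For comparison, here is roughly the closest I can get in plain-text Python today: a minimal sketch that stores the table as a dictionary keyed by (x, y), with each cell's formula wrapped in a lambda so that only the selected cell is evaluated. The body of `something` and the final `return z` are assumptions added so the example runs; neither appears in the mock-up above.

```python
def something(a, b):
    # Placeholder (an assumption): stands in for whatever something(a, b) computes.
    return (a + b) % 2 == 0


def func(a, b):
    x = a > b
    y = something(a, b)
    # Keys are (x, y); each cell is a formula, evaluated only when selected.
    table = {
        (True, True): lambda: a + 1,
        (True, False): lambda: b + 1,
        (False, True): lambda: (a + b) - 1,
        (False, False): lambda: a + b,
    }
    z = table[(x, y)]()
    return z


print(func(5, 3))
```

The logic is the same as the table, but the two-dimensional layout that makes the table easy to read is gone, which is the gap the embedded-spreadsheet idea is meant to close.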

I want to be able to embed other things as well: bubble-and-arrow diagrams of finite state machines, before-and-after diagrams of data structures, and everything else that appears in programming textbooks. I can do this if I’m writing in Word or LibreOffice, or giving a lecture in front of a whiteboard. Why can’t I do it when I’m programming?

The standard answer is, “Because programs are text.” To which I can only reply, “No: that’s a storage format for programs—a very limited (and limiting) one.” I don’t know how LibreOffice stores documents, or how SQLite stores a database, and I don’t care. In fact, I shouldn’t care unless something goes wrong (about which more later). In every domain except programming, programmers accept that there should be a division between:

  1. models (the things that make up whatever we’re working with)
  2. views and controllers (how we display and interact with those things)
  3. storage formats (how we store things on disk)

The only place where we don’t do this is when we’re building things for ourselves. We insist that programs be stored as text streams because that’s what Emacs, Vi, and the Unix command-line tools know how to work with, and what our compilers know how to process, but it doesn’t have to be like that. We could store programs as XML documents, blobs of JSON, serialized Python data structures, whatever your favorite database uses, or something else entirely, then teach our controllers and views (i.e., our editors), and our processing tools (compilers and interpreters) to display that however we wanted and let us manipulate it in whatever way made the most sense for the content.
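Python's standard `ast` module already hints at what this separation looks like. The sketch below (which assumes Python 3.9 or later for `ast.unparse`) parses two differently formatted texts into the same model, stores that model as JSON, and renders it back as text. The helper `to_jsonable` and the `_type` key are my own inventions for illustration, not part of any standard storage format.

```python
import ast
import json

# Two textual layouts of the same program.
source_a = "z = a + 1 if x else b + 1"
source_b = "z = (a+1) if x else (b+1)"

# Model: a tree of nodes that does not depend on spacing or parentheses.
model = ast.parse(source_a)
same_model = ast.dump(model) == ast.dump(ast.parse(source_b))


def to_jsonable(node):
    """Recursively convert an AST node into plain dicts and lists."""
    if isinstance(node, ast.AST):
        fields = {name: to_jsonable(value) for name, value in ast.iter_fields(node)}
        return {"_type": type(node).__name__, **fields}
    if isinstance(node, list):
        return [to_jsonable(item) for item in node]
    return node  # strings, numbers, None


# Storage format: one possible choice among many (JSON here).
stored = json.dumps(to_jsonable(model))

# View: one possible rendering of the model (plain text here).
view = ast.unparse(model)

print(same_model)
print(view)
```

Here the text is just one view of the tree; an editor could as easily draw the same conditional as a table or a diagram.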

I thought at the time that doing this would require us to combine three specific technologies:

  1. Programming languages that allow programmers to extend their syntax.
  2. Compilers, linkers, debuggers, and other tools that are frameworks for plug-ins, rather than monolithic applications.
  3. Programs that are stored as XML documents, so programmers can represent and process data and meta-data uniformly.

Looking back seven years later:

  1. #1 was the goal of the exercise, so long as “extending syntax” is understood to mean “with entirely new displays with associated processing rules”. Scheme’s hygienic macros and Python’s operator overloading are great, but they still require you to squeeze your thoughts into a single display form. You don’t have to do that on a whiteboard, when authoring a textbook, or when writing a plugin for Firefox (though in each case, what you’re building will be more comprehensible if it rhymes with things users already understand); why should you have to do it with programs?
  2. #2 is a corollary of #1: if I’m going to add X to my language, I need a way to tell tools how to handle X. The compiler or interpreter needs to know how to translate it into primitives; the optimizer might need special rules for making it fast; the debugger needs to know how to display its operation or internals, my coverage tool probably needs something as well, and so on. This is straightforward engineering: every new “thing” should be a bundle that respects various tools’ APIs, and each tool should provide an API so that people can plug things into it.
  3. I was wrong about #3, partly because I still hadn’t escaped the “storage format equals model” trap, and partly because I thought that XML processing tools like XSLT would catch on and become the basis for a new generation of pipe-and-filter tools to replace the venerable Unix toolset. What I’d say today is, “Programs are stored in some format that’s amenable to low-level inspection by people who are developing new plugin forms, in the same way that SVG diagrams can be viewed as angle-bracketed text if absolutely necessary.”
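Point 2 in the list above can be sketched as a small plug-in registry. This is an illustration under stated assumptions, not a description of any real toolchain: the `Extension` and `Toolchain` classes, the node layout, and the `lookup(...)` primitive are all hypothetical. Each new construct ships as a bundle with one handler per tool, and each tool looks the handler up by the construct's name.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Extension:
    """A bundle for one new construct: one handler per tool (hypothetical)."""
    name: str
    to_primitives: Callable[[dict], str]  # used by the compiler
    describe: Callable[[dict], str]       # used by the debugger


class Toolchain:
    """Tools that expose an API for plugging in new constructs."""

    def __init__(self) -> None:
        self.extensions: Dict[str, Extension] = {}

    def register(self, ext: Extension) -> None:
        self.extensions[ext.name] = ext

    def compile(self, node: dict) -> str:
        # Translate the construct into primitives the core language knows.
        return self.extensions[node["kind"]].to_primitives(node)

    def debug_view(self, node: dict) -> str:
        # Ask the construct how it wants to be displayed while debugging.
        return self.extensions[node["kind"]].describe(node)


tools = Toolchain()
tools.register(Extension(
    name="table",
    to_primitives=lambda n: f"lookup({n['cells']!r}, x, y)",
    describe=lambda n: f"table with {len(n['cells'])} cells",
))

node = {"kind": "table", "cells": [0, 1, 2, 3]}
print(tools.compile(node))
print(tools.debug_view(node))
```

A coverage tool or optimizer would be one more field on the bundle and one more method on the toolchain.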

So, how has reality measured up? Well, as Bill Gates once said, “We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten.” There has been renewed interest in programming language design in the last few years, mostly focused around the meme of functional languages being the only way to cope with massive parallelism (they aren’t, but that’s a different post). Unfortunately, all of these languages have remained trapped in the textual tarpit: whether it’s Scala, Clojure, Haskell, or something more esoteric, their authors all take for granted that a program has to be a stream of characters that can be edited by Pico or Notepad. There are, however, a few signs of progress:

  • Something—no one is quite sure what—is happening at Charles Simonyi’s company Intentional Software. This talk seems to show a working implementation of many of the ideas discussed above, focused on letting business people build things for themselves. That will fail unless someone teaches those business people the principles of computational thinking, but the tool itself shows a lot of promise.
  • Mathematica’s Computable Document Format doesn’t provide the extensibility, but I still think it’s worth watching. As I said in the original article, the biggest obstacle to getting from here to there is that everything has to change at once: editors, compilers, the language, etc. Vendors who own the whole toolchain, like Wolfram (Mathematica), The MathWorks (MATLAB), and Microsoft (.NET) can do this more easily than systems with distributed ownership (the C/Linux/Vi-or-Emacs/GCC toolchain). In that light, Microsoft’s compiler-as-a-service project is very interesting as well…
  • Steven Wittens’s TermKit is also interesting: he’s replacing the plain text streams that classic Unix tools use to communicate with streams of JSON and a rich display. This grassroots, bottom-up approach is an interesting complement to the corporate, top-down approaches above.

But the project that has me really excited is Geoffrey French’s Larch, which leverages Python and Jython to give people a forward migration path. The tutorial videos don’t really do Larch justice, and you have to remember that it’s really a proof of concept, but I’m still very excited. If nothing else, it could finally bring software visualization into the mainstream: as several studies have shown, visualizations are more useful and powerful if people create their own, but existing IDEs and languages make this hard. By making “draw this” a natural part of programming, Larch or something like it could be the next big step forward in programming.

Extensible Programming, Uncategorized

Grown-Up Languages

July 12th, 2011

A few days ago, after browsing the Coffeescript docs and examples, I tweeted, “I will take your new language seriously when you have a symbolic debugger for it. For it, not for the C/JavaScript/whatever it compiles to.” So what exactly did I mean by that?  Well, a debugger is a program that lets you watch and control another program.  (If you’ve never used one, have a look at this 6-minute video.)  Instead of staring at your code, trying to figure out why it’s broken, or adding ‘print’ statements left and right to display values, debuggers let you stop the program at any point and look at the values, or tell the program to execute one line at a time so that you can see which “if/else” branches it’s taking, what parameters are being passed to function calls, and so on.

Debuggers make programming much less painful and much more productive, but a lot of students never pick up the habit of using one.  Personally, I think this is because teachers have never figured out how to put questions about using debuggers on mid-terms: most computer science programs don’t have an equivalent of the “lab exams” that are common in chemistry and biology, and if students are never examined on their ability to do things the right way, they never have to climb the learning curve.  But that’s just a guess, and tangential to the main point of this post.

What I really want to talk about is languages like Coffeescript, and why I won’t use them.  If I write a program in C, Java, Python, or Javascript, it is translated into instructions for some kind of machine (either real hardware, in the case of C, or a virtual machine in the case of the other three).  When that program runs, the debugger can match up the source code of the program and the instructions that are being executed, so that when I say, “Go to the next line,” the debugger can do what I ask.  These debuggers are WYSIWYG: the code I wrote (i.e., the code that’s in my mind) and the operations and data that I’m debugging, line up neatly.

But if I write a program in Coffeescript, it isn’t executed directly. Instead, it is translated into Javascript, and then that is what’s run. If I want to debug that Javascript, I can, but I didn’t write it: a computer program did. Yes, the bits and pieces of that Javascript correspond to my Coffeescript, but the match is not obvious, and may not even be one-to-one. It’s as if I wrote a contract in English in terms of Ontario law, then had to defend it in court in French under Quebec law. Without near-expert understanding of how the one translates to the other, it’s hard or impossible for people to reverse engineer what the debugger is showing them and figure out which bits of the code they actually wrote need to change.
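What a native debugger needs, at a minimum, is a record of which generated lines came from which source line. Here is a minimal sketch of that lookup, written in Python for brevity. The table `LINE_MAP` and its numbers are made up for illustration; a real translator would have to emit something like it alongside the generated code, and a real mapping would also have to cope with the cases that aren't one-to-one.

```python
import bisect

# Hypothetical map emitted by the translator: each pair says "generated code
# starting at this line came from this source line". Sorted by generated line.
LINE_MAP = [(1, 1), (4, 2), (9, 3), (10, 5)]


def source_line_for(generated_line: int) -> int:
    """Return the source line responsible for a line of generated code."""
    starts = [generated for generated, _ in LINE_MAP]
    # Find the last entry that starts at or before the generated line.
    i = bisect.bisect_right(starts, generated_line) - 1
    if i < 0:
        raise ValueError("line precedes all mapped code")
    return LINE_MAP[i][1]


print(source_line_for(6))
```

With a map like this, "go to the next line" can mean the next line of the code I wrote, rather than the next line of the code the translator wrote.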

I first ran into this problem in the 1980s, when the C++ compiler I was working with translated my object-oriented programs into a tangle of strangely-named C functions for me to compile and run.  After a while, I figured out that if the error was in XX_@_YY_@@_ZZ, I should look for a method called XX.YY taking a parameter of type ZZ, but the C that implemented overloaded operators was ugly enough that I never really wrapped my head around it.  Debuggers eventually appeared that could handle C++ “in source”, and I’m sure that if Coffeescript proves popular, a native debugger for it will appear as well.  ‘Til then, as much as I prefer its syntax to Javascript’s, I will (regretfully) turn away…

Extensible Programming

Usability of Programming Languages

March 9th, 2011

Alan Blackwell’s course at Cambridge on the usability of programming languages has as its text a selection of chapters from a 1990 book on the psychology of programming. There’s a ton of great material here: I’d love to see a revival of interest in the topic.

Extensible Programming

Ground Up, in No Particular Order

February 11th, 2011

I’ve said for years that extensible programming systems wouldn’t be designed per se; they’d emerge from the ground up as a younger generation left lines of ASCII text behind without ever really thinking about it.  This new tool from Alex McLean is a case in point: yeah, it’s fragile, and primary colors on black isn’t my favorite look and feel, but man, isn’t it cool?

Extensible Programming

Bottom-Up, Top-Down, and Back to the Future

March 13th, 2010

I just (finally) watched the demo video for Andrew Bragdon’s Code Bubbles. You’ve probably already seen it, but if you haven’t, check it out: it rocks. Like Kael Rowan’s Code Canvas (a Microsoft Research project), it imagines an IDE that is more than just a bunch of 1970s-era TTYs in a frame. I think of these as bottom-up efforts: both still accept that source code must be ASCII tokens, and do the best they can from there. In contrast, Intentional Software’s still-in-beta product (described in this talk) goes further toward treating source “code” as a model in the model-view-controller sense, so that rendering and interaction can be comprehensively customized. I still believe that sooner or later, the maker of a proprietary language (most likely Wolfram Research or The MathWorks, but Microsoft is still in the running) will proudly announce a breakthrough in this area, and that everyone else will scramble to catch up, while graybeards on the sidelines point to the original Smalltalk of the late 1970s and grumble that it has all been done before.

Extensible Programming