Would you like your new programming language to have a million users in a couple of years? You would? Cool—here’s how to do it. Instead of asking yourself, “How will people write loops?” or, “Is this statically or dynamically typed?”, ask yourself, “What can I do in the language to make packaging and installation a zillion times easier?” Because that’s the biggest headache most people have these days: getting a thousand and one bits of code to install on their machine and play nicely once they’re there.
This is why I find Julia uninteresting, and why I think we’d be further ahead if the effort that was put into Python 3 had been put into rationalizing the standard library and sorting out packaging and installation (P&I). I don’t know a single person who switched to Python because of the improvements in 3.*, but I know a lot who would trade their eyeteeth for a P&I system that just plain worked. Similarly, I don’t see a single feature in Julia that’s expressly designed to make P&I easier, which means I don’t see any reason to get excited about it. I’d be interested in pointers to languages and systems that do a significantly better job…
95% of my Python package installation-and-dependency needs are met with “apt-get install $PKG_NAME”, thanks to Debian’s exhaustive packaging ecosystem, regardless of language. Sure, Debian doesn’t have every package on PyPI, but by Sturgeon’s Law it’s probably got most of the ones you actually care about.
The other 5% of my Python package-installation-and-dependency needs are packages I’ve written myself, that are too small for Debian to have noticed. For that tiny subset of cases, I’m more than happy to just untar them somewhere and make sure my $PYTHONPATH is configured appropriately.
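For readers who have not used the untar-and-$PYTHONPATH approach, here is a minimal sketch of what it amounts to. Each directory listed in $PYTHONPATH ends up on `sys.path` when the interpreter starts, and adding the directory by hand at runtime has roughly the same effect. The helper `import_from_directory` and the module name `tinypkg_demo` are made up for illustration; a temporary directory stands in for wherever the tarball was unpacked.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_directory(directory, module_name):
    """Import module_name from directory by putting the directory on sys.path.

    This approximates what a $PYTHONPATH entry does at interpreter startup.
    """
    if directory not in sys.path:
        sys.path.insert(0, directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def demo():
    # Hypothetical "untarred" package: a single module in a scratch directory.
    with tempfile.TemporaryDirectory() as unpacked:
        source = "def greet():\n    return 'hello'\n"
        Path(unpacked, "tinypkg_demo.py").write_text(source)
        module = import_from_directory(unpacked, "tinypkg_demo")
        try:
            return module.greet()
        finally:
            # Leave sys.path as we found it.
            sys.path.remove(unpacked)
```

Nothing here resolves dependencies or versions, which is why this only holds up for small, self-contained packages.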
Wow, pick on a prerelease language (Julia) which is right in the middle of getting a packaging system going.
Pkg is now in the base system and we’re working out the kinks. Last week I fixed up the scaffolding so creating a minimal package is one call. Tools to publish the packages to public repositories don’t exist yet, but making it easy to get stuff up on GitHub (then into the primary package index) is on my to-do list.
Maybe I shouldn’t take it so personally, but considering that the focus of most of us right now is on exactly this, the timing is very curious.
How about this? Is there something we’re not thinking of that leads you to this conclusion? Can you provide a constructive critique on making what we’re doing better?
Aren’t virtual machines, config tools like Vagrant, and admin software like Puppet improving this?
Screwtape, respectfully, that is just not the case for the kind of work that I do. Apt-get, portage, pacman, eventually they are all insufficient… and in most cases difficult to extend. Then once you need to go to Mac or Windows (Python is supposed to be cross-platform, right?) the packaging situation becomes 2-3 orders of magnitude more difficult and much more fragile.
That’s what I attribute Java’s success to, as a matter of fact.
Not that it gets everything right, and a lot of it only comes through the use of tools supporting Maven-style dependency management, but even the original Java had a library management experience far superior to anything else I’d heard of at the time.
And the Java experience has evolved since then, and some of the most important upcoming changes are directly related to that.
Not for me.
As it happens, automation tools for Linux are usually based on Python or Ruby themselves, which pretty much gives up all the advantages that virtual machines gain you. For example, Debian’s package management tools are written in Python, and Puppet is written in Ruby.
For Ruby, rvm takes care of the problem in an extremely awkward manner, and Python’s virtualenv is even worse. They work perfectly, as long as you are doing everything by hand and with root access. Try automating it with limited access, and the way they work gets in the way pretty quickly. Even so, that’s at least better than C/C++, where the only practical option is to rely on static binaries (fortunately, there *is* that option).
Compared to that, Java’s experience is relatively easy (though it relies to some extent on a glacial evolution of the language itself, as opposed to its libraries), and nodejs is a snap. In fact, nodejs does a lot of things right, and I can barely wait to see what will become of it once Harmony improves things on JavaScript’s language side.
Julia has a somewhat unique git-based package system built into the language, which might count as a language feature for P&I, but if you care about P&I, rather than language features, you are probably not Julia’s target audience. Package management can be annoying, but it’s not as annoying as waiting days rather than minutes for your code to run because your language is 100 times slower than Julia.
Screwtape: meaning no offense, but… you’re looking at the problem exactly backwards. It’s not too hard to do one install on one machine one time, especially if you’re intimately familiar with your OS, configs, and the program you want to install. But if I put you in a room with 100 random people with laptops, and told you to get Python 3.whatever up and running before the “real” part of the meeting/class/demo started, you’d be quoting Murphy’s Law, not Sturgeon’s Law. Because 15 of the 100 people would have some baffling roadblock stopping them from just getting on with it– they’d be using an old OS, or a new one, or just one you aren’t familiar with; you’d change the WiFi settings to work with your program, and break it for connecting to their corporate network by accident; or they’d have a different directory structure than you were expecting, and you’d have to dig through God knows how many sub-directories to find where they put something you needed; or on this particular configuration, there’s a dependency on one DLL that you didn’t expect (and updating it could cause a chain reaction with other programs, which you don’t know about either); or any of a zillion other problems.
Oh, and then somebody else fiddles with something else later, and now your program suddenly stops working, or worse, corrupts something important.
Can you imagine putting up with that kind of *&%%^ in any other area of your life? Replace your car’s air filter, then find out you accidentally broke the muffler because it’s not “compatible”? Pay a package delivery company to ship you something, but if it doesn’t arrive you have to go find it yourself? “License” a stove instead of buy it, and have to accept a total denial of responsibility for the 2% of the time that these stoves leak gas and blow up your house?
Getting one very technically proficient guy to install something once on a computer he’s thoroughly familiar with, is too easy. But try and get something to work ALL the time? Work for the guy who has, you know, another job to do– pharmacist/plumber/student/**not a software developer** — and doesn’t have any idea what $PYTHONPATH even is?
Neil: VMs can help, but A) that just pushes the problem back one step, as you have to get the VM working, which can be very non-trivial with some hardware; B) that makes for a HUGE install, totally impractical for most software; and C) there are significant security concerns with “homebrew” VMs. “Meta” software– config tools, admin tools, and so on– can help ***within a given organization*** and ***if there is somebody who can configure and maintain the admin software***. Even then, every layer of complexity means there is another batch of “Heisenbugs” that are vastly more difficult to troubleshoot.
PHP: terrible language, compelling multi-tenant deployment.
If you’re designing a new language, why not look at how PHP makes it simple and cheap to offer PHP hosting in a bargain-priced shared hosting plan, and do that? Requiring a long-lived process per $2/mo hosting customer doesn’t work.
Hosting isn’t the same thing as packaging and installation: someone, somehow, has to get the dozen and one libraries I want to use onto one machine and playing nicely together.
The Enthought package has been vital for getting people to adopt Python at my workplace. Without it, I think everyone would still be working with Matlab. It’s a shame that we have to rely on a company to provide this functionality.
The issue I see is the gap between a packaging and dependency management system provided by a language, and that provided by the system. If I have Perl installed as a system package, how do I install modules? Via the system package manager? Or via CPAN? If the latter, will the former get in the way, with updates to Perl breaking the set of modules I’ve installed? Likewise Python with tools like pip…
Not limited to languages, of course. One of the worst examples of this is the Eclipse IDE – ideally I’d like to treat it the same way as every other package on my system, installed as an RPM, plugins added to it in the same way, and updated automatically from a central server. Except, of course, that Eclipse has its own mechanism for doing exactly the same thing, and the two really don’t co-exist well.
That’s an impossible bar. We don’t know until we fail. And “is often painful” is so vague as to be useless.
I don’t think you get this without going to ML or Haskell. You’re not getting it in a dynamic language, because there’s no way to capture that behavior to begin with (as far as a dynamic language is concerned, everyone’s happy.) And even Haskell doesn’t get this right; Hackage is a mess. (ML, I’m not so sure.)
So in the end I’m still confused. You say that Python 3 should have “fixed” packaging, but they can’t fix the problems you state without having a language that’s no longer Pythonic. To know if a function changed its type (in your example, from a partial function to a total function) you either need static types or a breakthrough in dynamic languages that I, a lowly engineer, am unlikely to deliver. And you want that to break the downstream package. But my application never hits that condition, and v1.6 also fixes a critical security bug, so why should it break?
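To make the partial-versus-total point concrete, here is a small sketch in Python. The two `lookup` functions are hypothetical stand-ins for a "v1.5" and a "v1.6" of some library, not taken from any real package. The first is partial (it raises on unknown keys), the second is total (it always returns something), and nothing the language can see at the signature level tells them apart.

```python
import inspect


def lookup_v1_5(table, key):
    """Hypothetical v1.5: partial, raises KeyError for an unknown key."""
    return table[key]


def lookup_v1_6(table, key):
    """Hypothetical v1.6: total, returns None for an unknown key."""
    return table.get(key)


def same_signature(f, g):
    """Compare what a dynamic language can inspect: parameter names and kinds."""
    return inspect.signature(f) == inspect.signature(g)


def caller_reports_missing(lookup):
    """Downstream code that relies on the exception to detect a missing key."""
    try:
        lookup({"a": 1}, "b")
    except KeyError:
        return True
    return False
```

With these definitions, `same_signature(lookup_v1_5, lookup_v1_6)` is true, yet `caller_reports_missing` gives different answers for the two versions. A caller that never passes an unknown key sees no difference at all, which is the case described above where the upgrade should not break anything.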
I think you’re asking for a sufficiently smart compiler, in the end. I think we all want that. I’m sure there are some very smart people working on it. The rest of us need to get stuff done in the meantime, and I’ll take any improvement over MATLAB’s utter failure to make code reusable.
A half-solution is light years better than no solution at all, and incremental improvements on that better still.
Ugh, I actually said “light years better.” I think you got me worked up a bit!