A More Clearly-Stated Version of My Argument

My last post was perhaps a little poorly stated and a little (intentionally) combative, so I figure I should clarify a few points.  Thanks to everyone who commented or otherwise pointed out errors in my reasoning, over-reaching statements, or other ways in which I’m wrong.  I think this is a subject upon which highly-intelligent, reasonable people will disagree, so this is really just my take on the direction things will go in the future (and not necessarily which direction they should go), based on my (sometimes more poorly-informed than I’d like) opinions about things like the ease of mastering particular languages, regardless of the actual utility of those languages.

My real targets were the arguments I’ve heard time and again lately, the first of which is basically that the coming increase in multicore processors and the basic halt to clockspeed improvements will change programming languages fundamentally, finally pushing high-concurrency languages like Erlang or functional programming in general into the mainstream while leading to a slow abandonment of existing languages that don’t have that level of concurrency support.  The second argument I was targeting is the general subcurrent of “But language X is so much more powerful, so everyone should use it” that always seems to exist in the programming community.

So to make my first argument a bit more explicit, it goes something like this:

  1. Small-scale parallelism (i.e. parallelizing what was previously a single thread of execution across multiple cores on a local box) and large-scale parallelism (i.e. scaling an application across multiple cores on multiple boxes to handle large user/data volumes) are different problems.
     
  2. Small-scale parallelism isn’t really helped too much by just using a functional language.  While in theory a pure functional language can allow for parallelism of certain operations without the programmer having to do anything, in reality sustained usage of multiple cores requires more explicit parallelism on the part of the programmer, where the algorithm is broken down into explicitly independent pieces.  In other words, if you want your HTML template rendering system or your video-encoding program to use multiple cores, you’ll have to design the algorithms with that goal in mind.  Using a functional language might help with the implementation, or it might help with the thought process, but mere use of the language won’t be any kind of a magic bullet, and the same algorithmic approach is generally translatable to an imperative language as well.  The algorithm is the important part if you really want to scale on that level, not the language (though, again, programming in a functional language might help get you thinking the right way).
     
  3. In general, with more software moving off of individual desktops, the need for small-scale parallelism is reduced even further, as per-user/per-request parallelism makes it easy to saturate an N-core box.  If you really care about efficiency at that level, you’ll be more worried about absolute language/framework performance than about concurrency support.
     
  4. Large-scale parallelism is helped by using functional techniques to parallelize and distribute work, but doesn’t necessarily require a functional language, as either per-request parallelism or explicit work-queue-like models can often allow horizontal scaling up to fairly large volumes regardless of the language.
     
  5. More advanced techniques might be required above a certain performance threshold, but the vast majority of applications never make it there and those techniques can impose a huge tax on development, so it’s more important for most developers to focus on getting the application to work rather than getting it to scale infinitely.
     
  6. Therefore, cloud computing/multicore processors aren’t going to be the “killer app” that switches people over to more concurrency-friendly languages.  General developers will switch to those languages or not based on their merits as a problem-solving language and not due to their concurrency support.  There’s a place for high-concurrency languages, but support for concurrency won’t be enough to propel a language into the mainstream.
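
As a rough illustration of points 2 and 4, here is a minimal sketch in Python (an imperative language, chosen only for illustration) of the work-queue model: the programmer splits the job into independent pieces, a pool of workers processes them, and the partial results are merged.  The function names and the sample data are made up for this example.  Note that in standard CPython, threads won’t run CPU-bound code on several cores at once; swapping in ProcessPoolExecutor would, but the shape of the algorithm stays the same.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """One independent unit of work: it reads only its own input."""
    return Counter(chunk.split())


def count_words_parallel(chunks, workers=4):
    """Fan the chunks out to a worker pool, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands each chunk to a worker and yields results in order.
        for partial in pool.map(count_words, chunks):
            total += partial
    return total


# Hypothetical input, already split into pieces that don't depend on each other.
chunks = ["a b a", "b c", "a c c"]
result = count_words_parallel(chunks)
print(result)
```

The parallelism here comes from the decision to split the input into chunks that don’t depend on each other, not from any special concurrency feature of the language.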

So will functional languages go mainstream on their own merits as programming languages?  I honestly don’t think so, though that’s an even-more-contentious argument.  My reasoning there was essentially:

  1. Functional programming techniques don’t come naturally to most people; in my opinion, the human brain is geared toward imperative algorithms that solve problems sequentially, since people are really only capable of doing one thing at a time.  Some people are probably wired a bit differently and have an easier time with functional programming, but most people naturally think in imperative terms, which will always make functional languages more difficult for most people to grasp.
     
  2. Many functional languages tend to be a bit more academic in nature, and as a result include even-more-obscure language features that can be powerful but further increase the barrier to entry for most developers.  Monads already throw most people for a loop for a while, but the syntax and type system in Haskell further add to the difficulty of understanding it, as does the prefix syntax in most Lisp variants.  As a result, in general the most popular functional languages tend to be harder to learn and to master than the most popular imperative languages (at least if you count Python, Ruby, and JavaScript as imperative).
     
  3. Being harder to learn and master means that the languages are also restricted to a subset of the current development community.  There are a large number of people out there who can be reasonably competent in Java or Ruby but who would flail if they were asked to learn Haskell or Scheme or OCaml.  More “advanced” languages tend to be magnifiers; while they can make really-talented developers more productive, they can make less talented developers far less productive.
     
  4. Network effects, community size, talent pool, and barriers to entry matter for a lot of projects.  Most software companies have a significant turnover or growth rate among their staff, making the ability to recruit people and bring them up to speed important.  The best companies will hire for general ability rather than particular skills, but out-sourcing and contract work generally isn’t done that way, and ramp-up time still matters even in the optimal case.  In addition, most companies can’t restrict themselves to the top 5% or 10% of development talent and can’t afford to limit their talent pool by choosing a language many developers won’t ever be able to master.  As a result, the languages that are hardest to learn will inherently be a bit marginalized, since fewer people will already know them, the prospective talent pool will be smaller, and the ramp-up times will be longer.
     
  5. The scalability of certain languages to large, long-lived codebases with large development teams is suspect due to a small sample size.  Most developers and project leads would rather choose a proven technique that they’ve seen work and that’s been used by hundreds or thousands of other teams, or that at least is similar enough, rather than trying to push the envelope with a radically different approach.  There will always be outliers, but most developers are going to choose a well-trod path that they know can work rather than a less-clear path that might lead to increased developer productivity.
     
  6. Mainstream languages will continue to pull in more functional concepts (like closures), eroding some of the advantages that functional languages might otherwise have, meaning that the functional concepts become more mainstream but the languages that they originated in won’t.
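
To make point 6 concrete, here is a small sketch, again in Python, of the same task written once as a plain loop and once with a closure and first-class functions.  The names and data are invented for the example; the point is only that both styles sit side by side in one mainstream language.

```python
def make_threshold_filter(threshold):
    """Return a closure that remembers `threshold` after this call returns."""
    def above(value):
        return value > threshold
    return above


# Hypothetical sample data.
readings = [3, 18, 7, 42, 11]

# Imperative style: an explicit loop that mutates a result list.
selected_loop = []
for reading in readings:
    if reading > 10:
        selected_loop.append(reading * 2)

# Functional style in the same language: a closure plus filter/map.
above_ten = make_threshold_filter(10)
selected_functional = list(map(lambda r: r * 2, filter(above_ten, readings)))

print(selected_loop, selected_functional)
```

Both versions produce the same list, so a developer can reach for whichever reads more naturally for the problem at hand.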

So again, my arguments aren’t about what should happen or which languages are better in any sense; rather, they’re observations about what I think is happening and what will continue to happen in the future.  They’re also part of the reasoning that informs the direction we’ve gone with GScript; we’ve tried to emphasize ease of use, readability, speed of development, and suitability for building tools and frameworks, and we’ve avoided adding features that we feel would complicate the language, even where they might allow for better concurrency support, avoidance of side-effects, or more flexible or extensible syntax.


11 Comments on “A More Clearly-Stated Version of My Argument”

  1. Raoul Duke says:

    @imperative fits people’s heads better.

    i know you are saying “is” not “should” so this is off topic ish, but i’d like to reference this interesting little ditty i came across which linked up in my head to what you’ve been discussing :-)

    http://tinyurl.com/2n7rj5 (see expression vs statement oriented bit of that post).

  2. Raoul Duke says:

    wow i wish the smileys of the world would just remain as pure unadulterated ascii. so much less… AOLish. :-P

  3. Ewan says:

    Liked the original post and thought it both amusing and thought-provoking, and this post, though maybe less amusing, is absolutely bang on the money.

    As a project manager I would not advocate the adoption of a development language that excluded a large percentage of the available programmers. What is important is the development of a product that serves a business need, not a conceptually pretty design and execution of that design.

    Pragmatism must rule in the selection of development tools, not evangelism. The tools are just that, a means to an end and not the end in themselves.

    Having been in the business for over two decades I have seen many fads in development tools and methodologies many professing to be the silver bullet that will solve all our ills or the way that things will inevitably be in the future.

  4. Greg M says:

    “While in theory a pure functional language can allow for parallelism of certain operations without the programmer having to do anything, in reality sustained usage of multiple cores requires more explicit parallelism on the part of the programmer, where the algorithm is broken down into explicitly independent pieces.”

    What’s your basis for this? It seems at odds with the nature of functional programs – being made of implicitly independent operations whose order is only constrained by dataflow and hence amenable to automatic parallelization.

  5. Alan Keefer says:

    @Greg

    My basis for that is that even functional programs impose requirements on dataflow and operation dependence, and even within a functional programming language there are many different algorithms for solving the same problem, some of which increase the parallelism and some of which reduce it.

    For example, in my trivial example of making a PB&J sandwich in the previous post, the naive functional algorithm:

    eat(slice(addTopSlice(addJelly(addPeanutButter(getIngredients())))))

    is not as parallelizable by the compiler/runtime as:

    eat(slice(addHalves(addJelly(getTopSlice()), addPeanutButter(getBottomSlice()))))

    In the former, the compiler/runtime doesn’t have much of a chance to parallelize because of the data dependencies. In the second case, the addJelly() sequence and the addPeanutButter() sequence can execute in parallel. So even in this trivial example, the choice of algorithm determines the degree of parallelism.

    More to my point, I was thinking of situations where the choice is really between structuring the algorithm at the high level with the intention of it being independent versus ignoring that and relying on the language to do it for you.

    Take a more difficult real-world problem like constructing the output for a HTML response from a web server. Do you do that using a template that’s then compiled and processed? Do you do it all as nested function calls? Is it a single function call that packages up the input to the rendering function, meaning everything is dependent on that function returning, or is the input-gathering split out into independent function calls to allow for parallelization?

    Or to take another example, like video encoding. In order to parallelize it, you’ll need to make sure the compression algorithm itself isn’t serial, i.e. that it’s not the case that every frame depends on the previous frame. If you really want to parallelize things, you’ll choose an algorithm that uses key frames or similar to prevent that, allowing you to process multiple segments at once, though even then the degree of parallelism will be limited by how many segments you can split things into, and you might have to do some clever work to stitch things back together at the end.

    If you just go ahead and write your ray tracer or your video encoder or your HTML renderer in a pure functional language assuming it’ll parallelize just because of that, you’ll end up being disappointed. You might get some boost from the compiler or runtime, but you won’t see sustained usage of all the available cores without understanding the dependencies and explicitly parallelizing things yourself.

    To extend the argument: superscalar processors already do look for data-dependencies in instructions so that they can understand what can be executed in parallel within a single core, which helps the process go faster. But they still need help from the compiler to split things on larger-scale lines into multiple threads of execution that are fully independent on a larger scale, and even the best compilers or runtimes still need help from the programmer to split the algorithm at a higher level than that to ensure that there are large swaths of execution that can be independent.

  6. In theory, functional languages[1] should allow a vast array of automatic transformations. In practice... not so much. What they do tends to be simple or hardcoded (or both). Essentially, you’re expecting to code up a bogosort and get it compiled into an adaptive natural merge sort — not gonna happen.

    [1] Look a little deeper and you’ll find you can do the same with imperative languages, you just need a little more work. You need to apply something similar to escape analysis, but on side-effects instead, and once you know a given block is contained you’ve got free rein. Even if not contained it’s no different from a monad.

  7. Raoul Duke says:

    as folks might already have seen:

    http://mags.acm.org/queue/200809/

  8. Greg M says:

    PB&J… if there really are data dependencies, then you can’t do the second one. If there are not, then you can write the first and a contemporary Haskell runtime will perform the second.

    If your point is that these are external actions and therefore possibly can’t be reordered, then OK. But when your external actions are inherently parallelizable (ie web server), then you’re talking about concurrency, not parallelism. It’s parallelism that’s relevant to multi-core, and parallelism that gives an edge to functional programs.

  9. Alan Keefer says:

    Fair enough . . . perhaps that was an overly-trivial example. As Rhamphoryncus points out, an imperative language VM or compiler could do that in the trivial cases, since it can just as easily inline the function calls all the way down and then re-order independent instructions, etc. So what a pure functional language buys you is the ability to avoid side effects that would prevent that parallelization from happening, but even a language that allows for side-effects can do the same transformations if the compiler or VM detects that there aren’t actually side effects within a particular trace of code.

    But again, my point was really about the more complex examples I mentioned. Or, as Rhamphoryncus again states, around things like sorting: no language is really going to fix your code if you use the wrong sorting algorithm. So if you choose an algorithm that’s inherently unparallelizable, the language isn’t going to bail you out, and if you really want parallelism on any kind of sustained level I think you still have to think about the algorithms you’re using and design them appropriately, and you might even have to go so far as to think of alternative ways to use the extra cores in the event that the core part of your application really isn’t that parallelizable.

    To use another example, if you’re developing for the PS3, just writing in Haskell won’t help: you’ll have to figure out how to unwind your central loops to decouple AI, physics, core control and game play, special effect rendering, etc. to take advantage of all the processors. And you’ll have to do that regardless of what language you use. In that case, the programmer has to design the algorithm with the explicit goal of avoiding certain kinds of data dependencies.

    I see the value of functional languages as basically guiding people along the path towards choosing more parallelizable approaches. And there are definitely problem domains where a pure functional approach is far cleaner. Which is one big reason why I think languages that allow for a hybrid of OO and functional techniques will really continue to be far more popular than pure functional approaches, because they can get some of the advantages of functional programming by using closures and first-class functions where that works best while also retaining the ability to use OO/imperative techniques in areas where that’s a more natural fit.

  10. Another point is that automatic parallelization will be fragile. Small, unrelated changes will make your code sequential, rather than parallel. Functional languages don’t consider performance to be part of the behaviour, so they don’t consider drastic changes to be side-effects (even though they are).

    Imperative languages, simply because they don’t like the large transformations that functional languages allow, make performance more predictable and explicit.

    Err, I hope this doesn’t turn into a language war. I like functional programming, I really do. >.>

  11. Brian McGinnis says:

    I suspect we will see closures added to many languages (e.g. Groovy and JRuby added them on top of the Java VM platform). These are useful constructs. A good example of functional additions, as you say, is C# adding lambda functions in 3.0. I hope Java follows suit.

    For example, with Erlang you get concurrency features such as message passing, very lightweight processes, OTP, Mnesia, and a host of other goodies not found elsewhere, such as in Java. I can easily write a lightweight, fault-tolerant, distributed, concurrent server in Erlang that I wouldn’t even bother to try in Java because it would be way too much work. That being said, Groovy and Grails is my go-to kit for web apps because it’s a lot more efficient for me than, say, stock J2EE. So I think functional vs. non-functional is not as important as how well the language’s environment helps along the type of application you are coding.

    Nowadays, a “language” isn’t just syntax and a compiler – it’s the entire ecosystem that it comes with. I think the best choice of programming language for your situation depends more on the non-language aspects associated with it than on the programming language itself.
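
For readers following the PB&J exchange in the comments above, the two versions of the sandwich algorithm can be sketched in Python as follows.  The snake_case functions are stand-ins for the ones named in that comment, and representing a sandwich as a list of layers is purely an assumption for illustration.  In the chained version each step needs the previous result; in the split version the two halves share no data, so they can be handed to separate workers.

```python
from concurrent.futures import ThreadPoolExecutor


def get_top_slice():
    return ["bread"]


def get_bottom_slice():
    return ["bread"]


def add_jelly(layers):
    return layers + ["jelly"]


def add_peanut_butter(layers):
    return layers + ["peanut butter"]


def add_halves(top, bottom):
    # Flip the top half over and put it on the bottom half.
    return bottom + list(reversed(top))


def make_sandwich_chained():
    """Every call depends on the one before it, so nothing can overlap."""
    return add_jelly(add_peanut_butter(get_bottom_slice())) + get_top_slice()


def make_sandwich_split():
    """The two halves are independent, so they can be built concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        top = pool.submit(lambda: add_jelly(get_top_slice()))
        bottom = pool.submit(lambda: add_peanut_butter(get_bottom_slice()))
        return add_halves(top.result(), bottom.result())


print(make_sandwich_chained())
print(make_sandwich_split())
```

Both functions build the same sandwich; only the second exposes two branches that a runtime (or here, an explicit pool) could run at the same time, which is the sense in which the choice of algorithm sets the available parallelism.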

