Quicksort is the new Hello World

alextgordon · on Dec 11, 2010

This article is misguided.

Hello World serves two (and only two!) purposes:

    1. It's traditional
    2. It lets you check everything is working

It isn't a way to evaluate or compare languages. How could it be? It only uses a very small fraction of the language. Note that the Hello World for C is rather long and inelegant. But C itself is definitely an elegant language. So its predictive power is poor.

QuickSort is neither traditional nor something a beginning programmer could use to check everything is working. Therefore it is not a substitute for Hello World.

- - -

If you want to get a feel for a language with one code snippet, allow me to introduce the Trabb-Pardo Knuth algorithm:

    In their 1977 work "The Early Development of Programming Languages",
    Trabb Pardo and Knuth introduced a trivial program which involved
    arrays, indexing, mathematical functions, subroutines, I/O,
    conditionals and iteration. They then wrote implementations of the
    algorithm in several early programming languages to show how such
    concepts were expressed.

http://en.wikipedia.org/wiki/Trabb_Pardo%E2%80%93Knuth_algor...

    ask for 11 numbers to be read into a sequence S
    reverse sequence S
    for each item in sequence S
        do an operation
        if result overflows
            alert user
        else
            print result

bmcleod · on Dec 11, 2010

Hello world is valid for checking that you have everything installed properly and it's the fastest way to make sure you get the most basic syntax of the program.

It's not like the first thing you're going to do is write out all of quicksort to see if it's easy. You're going to fiddle with numbers an arrays a bit first, particularly if your language has some kind of interactive console. In fact, I wouldn't be surprised if the first thing you do in the language is actually 1+2.

There's quite a bit of syntax you need to learn before quicksort becomes a relevant test.

If you're looking for a good language to build a quick website backend in you probably don't even care about that, you'll care more about having nice ways of passing information around and checking that it has a decent inbuilt sort method.

T-R · on Dec 11, 2010

I don't think "Hello World" will ever cease to be useful for the simple task of making sure everything's installed and operating correctly. Quicksort is certainly beautifully concise in Haskell, but it's not in all languages, and I want a low probability of errors when I'm just getting things up and running for the first time.

dshankar · on Dec 11, 2010

For a veteran programmer learning a new language like Haskell, I agree completely.

For a novice programmer learning their first language, they probably don't understand the quicksort algo and probably have no concept of arrays or any data structure. Visually displaying "Hello World" is very simple and teaches you two basic parts of a language: - Syntax - Printing stuff

On another note, someone learning HTML for the first time will have no reason to do quicksort.

_3u10 · on Dec 12, 2010

I don't believe HTML is Turing complete, and thus it's difficult to argue that it is indeed a language for computing.

l0stman · on Dec 11, 2010

    qsort :: Ord a => [a] -> [a]
    qsort []     = []
    qsort (p:xs) = qsort lesser ++ [p] ++ qsort greater
        where
                lesser = filter (< p) xs
                greater = filter (>= p) xs

I know nothing about Haskell but I don't think this code implements the original quicksort algorithm which sorts the input in-place. Moreover the two-pass of filter over the list and the concatenations cause unnecessary overhead. Thus, even if the sample code is simple and elegant, a real-world implementation would surely be a bit longer than the given example.

dagw · on Dec 11, 2010

I find this a recurring problem when trying to learn Haskell. On the one hand you have lots of books and tutorials talking about how simple and elegant Haskell is. Showing all the awesome things you can do with two lines of code. Then I try to write simple and elegant Haskell like that and find it runs an order of magnitude slower than my Python code.

I wish more Haskell proponents would spent less time showing off simple and elegant code, and more time showing fast and correct code.

aristidb · on Dec 11, 2010

This is the real "sort" used by GHC 7: http://hackage.haskell.org/packages/archive/base/4.3.0.0/doc...

It's pretty elegant, too, if less than the inefficient pseudo-qsort shown by the OP.

sizzla · on Dec 11, 2010

Correct me if I am wrong but they are actually using mergesort (mergeAll) as it is significantly faster in Haskell than in-place quicksort (qsort).

That means there is a lot of overhead for doing something simple like an in-place qsort that should be much faster than a mergesort.

I am at a loss on why the provided code is any more elegant than: http://en.literateprograms.org/Merge_sort_%28C_Plus_Plus%29

or this (also taken from rosetta code)?

  #include <iterator>
  #include <algorithm> // for std::partition
  #include <functional> // for std::less
  
  template<typename RandomAccessIterator,
           typename Order>
  void quicksort(RandomAccessIterator first, RandomAccessIterator last, Order order)
  {
    if (last - first > 1)
    {
      RandomAccessIterator split = std::partition(first+1, last, std::bind2nd(order, *first));
      std::iter_swap(first, split-1);
      quicksort(first, split-1, order);
      quicksort(split, last, order);
    }
  }

  template<typename RandomAccessIterator>
 void quicksort(RandomAccessIterator first, RandomAccessIterator last)
  {
    quicksort(first, last, std::less<typename std::iterator_traits<RandomAccessIterator>::value_type>());
  }

The above code is more verbose?

merijnv · on Dec 11, 2010

> That means there is a lot of overhead for doing something simple like an in-place qsort

Well, the problem not so much that there is a lot of overhead for doing an in-place qsort. The problem is that "in-place" anything is impossible in Haskell, it violates referential transparency.

Merge sort however needs no in-place update and can trivially (in Haskell world) be parallelized (merge sort is associative, you can use it as the reduce step of MapReduce). So you are correct in concluding that mergesort is used in Haskell.

You state "something simple like an in-place qsort", your statement about qsort being simple makes some assumptions about your programming languages world, which are not the same for all languages especially not Haskell. No whether all this is a pro and a con depends on what sort of coding you're doing of course.

I think that people call Haskell's code more elegant because they (like me) find C++ template and type annotations to be annoyingly verbose and messy for no good reason. For example for the two quicksort functions:

"template<typename RandomAccessIterator> void quicksort(RandomAccessIterator first, RandomAccessIterator last)"

vs

"quicksort :: (Ord a) => [a] -> [a]" (where "Ord a" means that a is an instance of the typeclass of ordered items).

sizzla · on Dec 12, 2010

If you want to be terse, you can just use typedefs but most of ISO C++03 libraries just assume your editor can autocomplete quickly. The culture is different and absolute clarity is preferred over terseness.

There's a lot of infrastructure behind MapReduce and I bet you any implementation is going to have a lot more code in it than what is in Data.List.

Once things are in RAM (say, less than 1 GB), quicksort is the fastest algorithm (Edit: or some other optimized in-place variation on the theme such as introsort). You can parallelize across cores all you want but all that is going to do is crap up your caches and cause bus contention, making your code more complex and slower. The bottleneck is your databus not the CPU, after all what is the cost of a compare in clock cycles? This is also why mergesort is slower and parallelizing is not going to fix it.

If you want to see things that are parallelizable check out vectorizing libraries such as Thrust that run on massively parallel GPGPUs that have the bus design and RAM to handle that kind of parallelization. They are written in C++ and both the implementation and the code to use them looks almost the same like the code I posted. The compiler technology of other languages is years behind and currently can't even target these kinds of architectures.

I've seen GPU massive parallelization in Haskell and all it did was to generate C code and then invoke the CUDA compiler. This was in a research paper. In the meantime you can fire up nvcc with Thrust and sort bazillion of keys with less than 10 lines of code in C++ using a well-tested library.

Edit: BTW, anyone (and I really don't know who would be doing this) who is even thinking of running a MapReduce Haskell cluster must have lots of free time and $$$ to burn on wasted CPU cycles.

merijnv · on Dec 12, 2010

Well, I didn't mean an actual MapReduce cluster, that's just silly. I was just pointing out that Haskell compilers can convert trivially convert these sorts of things to multicore code. Parallel Haskell basically does this, converting normal code to easy multicore code.

My point wasn't about using Haskell for high performance computing, my point was that Haskell's compiler can convert normal Haskell code to multicore code without (much, if any) work from the programmer. There is a large gap in between sequential single threaded program and a full MapReduce cluster and I think there is a significant amount of software written in this gap where Haskell-like languages could help with the parallelism/concurrency. The only reason I referenced MapReduce is since its a well known example even for people who are not well versed in functional programming (its a very common concept in functional programming).

aristidb · on Dec 12, 2010

No, because it is solving a different problem. Quicksort cannot sort linked-lists, not even in C++. I think most implementations of std::list<>::sort (in C++) use some variant of mergesort.

And it is not true that in-place modification is not possible in Haskell. It is just not considered elegant, but it can be encapsulated in a perfectly Haskell-y way. (Hint: ST monad.)

Also I did not claim that my linked code is more elegant than some other code. Where did I make that claim? So I don't have to defend the claim I did not make, right?

pmjordan · on Dec 11, 2010

In fairness, in-place sorting usually isn't what you want in functional programming. Immutable data structures help enforce the pure functional constraints. Of course, in practice, your library code will probably make a mutable copy, sort that in-place and then return an immutable copy/representation of the result, which is hopefully more efficient. (Disclaimer: I'm no Haskell programmer; my FP experience is limited to Lisp dialects)

EDIT: As a curious example, Clojure's sort function actually converts the sequence to a (Java) Array and calls java.util.Arrays.sort() on it, then turns it back into a sequence:

https://github.com/clojure/clojure/blob/b578c69d7480f621841e...

sizzla · on Dec 11, 2010

Advancing compilers is hard. When people argue efficiency as a compiler implementation detail that is going to get worked out, they forget about many who have fallen before them.

You can argue some older languages were written in a way in which it was (reasonably) easy to write a compiler that generates code with little performance overhead when compared to assembly (at worst a factor of 2 to 4, back in the 80s). Some popular languages keep adding abstractions and constructs that the compiler can actually deal with without some new compiler research breakthrough.

neilc · on Dec 11, 2010

When people argue efficiency as a compiler implementation detail that is going to get worked out, they forget about many who have fallen before them.

Yes; see also http://prog21.dadgum.com/40.html

sizzla · on Dec 11, 2010

Thanks. Beatifully written and I love the print-shop analogy.

It's nice he is into the C&C Portland Wiki, there's golden nuggets of wisdom everywhere. I am reposting the link from his blog. http://c2.com/cgi/wiki?SufficientlySmartCompiler

Personally of the higher-level languages, I found SBCL to be crazy good at optimizing, of course given a few nudges with a compiler directive or two. By inspecting the dissasembly, I could validate it was doing the "right-thing" (TM), but then again, I do the same to check the C/C++ compiled code.

lelele · on Dec 11, 2010

> Thus, even if the sample code is simple and elegant, a real-world implementation would surely be a bit longer than the given example.

Nice thought. What's the purpose of showing code which you won't be using in a real-world application? Making you think the language is more expressive than what it really is?

Many people say Haskell code is elegant and concise. Would this be the case if code we were shown would be real-world Haskell?

srparish · on Dec 11, 2010

Not only is it making two-passes for filtering, it also has to do another pass to append "lesser" onto [p] and "greater". And that is for a single recursion so these multiple passes are happening per-recursive call.

Here's a page with some actual haskell quicksorts:

http://www.haskell.org/haskellwiki/Introduction/Direct_Trans...

hristov · on Dec 11, 2010

Ok, I am still learning Haskell, but I thought I would try to fix the above so that there is only one filter pass:

    qsort :: Ord a => [a] -> [a]
    qsort []     = []
    qsort [x]    = [x]
    qsort (p:xs) = qsort lesser ++ [p] ++ qsort greater
        where
            (lesser, greater) = foldl split ([],[]) xs 
        where 
            split (left, right) x = if x<p then (left:x, right), else (left, right:x)

Does this work?

l0stman · on Dec 12, 2010

I had to look up what the `:' operator does and it adds an element to the beginning of the list. So the last line should be

    split (left, right) x = if x<p then (x:left, right), else (left, x:right)

amalcon · on Dec 11, 2010

It would be pretty difficult to write an in-place sort in Haskell, given that it only has mutable state through monads (and I doubt anyone wants to go into a monad just to sort something). GHC is pretty good at optimizing list concatenations, so that's probably not a big deal either. You're absolutely right about the two-pass filter, though.

sizzla · on Dec 11, 2010

It's not that good, which is why they gave up on using quicksort in Data.List .

brazzy · on Dec 11, 2010

Indeed. It's amazing just how difficult it is to write a correct in-place quicksort that handles all edge cases. It won't be short and elegant either. Just go ahead and try it.

xentronium · on Dec 11, 2010

http://pastie.org/1367726 -- it's not that hard, really :)

* especially with naive pivot from the middle of array

quadhome · on Dec 11, 2010

Don't ever use that pivot code for more then 2^30 elements.

http://googleresearch.blogspot.com/2006/06/extra-extra-read-...

jacquesm · on Dec 11, 2010

Your code:

  program qsortTest;
  const N = 1000;
  var
    a: array [0..N] of integer;
    i: integer;

  procedure quicksort(var a: array of integer; l, r: integer);
  var
    i, j, temp: integer;
    pivot: integer;
  begin
    pivot := a[(l+r) div 2]; i := l; j := r;
    while i < j do 
    begin
      while a[i] < pivot do inc(i);
      while a[j] > pivot do dec(j);
      if i <= j then 
      begin
        temp := a[i];
        a[i] := a[j];
        a[j] := temp;
        inc(i); dec(j);
      end;
    end;
    if j > l then quicksort(a, l, j);
    if i < r then quicksort(a, i, r);
  end;

  begin
    { populating the array }
    for i := 0 to N do
      a[i] := random(N);
    writeln('initial array');
    for i := 0 to N do write(a[i]:6);
    writeln;
    quicksort(a, 0, N);

    writeln;
    writeln('sorted array');
    for i := 0 to N do write(a[i]:6);
    writeln;

  end.

Do for loops in delphi/pascal iterate 'up to' or 'up to and including' ?

If they iterate 'up to' then:

It would seem to me that on the initial call to 'quicksort' the '1000' gets passed in to 'r', the value of 'r' then gets passed in to 'j'.

Now if "'a[j]' < p" will refer to the uninitialized value in a[1000] and compare it to p, the first decrement will not take place, i<=j and now the code will swap with a[i] with a[1000].

This should cause the uninitialized value (0?) to end up as the first element in the sorted output array.

I hope I analyzed that correct, I don't have access to a pascal/delphi compiler here.

When you run your code do you see the output starting with a '0' element ?

Is there another element from the input array missing in the output ?

If pascal/delphi loops iterate up-to-and-including did you actually intend to sort an array of 1001 elements ?

xentronium · on Dec 11, 2010

For loops are "Up to and including".

Arrays boundaries are [0..N] -> [0..1000] (1001 element actually, sorry if that confused you).

[add]: I could rewrite it in C, it should be trivial bijection, I believe.

jacquesm · on Dec 11, 2010

I suspected as much, but indeed it confused the hell out of me, I've never written a line of pascal so my buffer overflow detector false triggered :)

dzorz · on Dec 11, 2010

Although this version is very simple, concatenating and creating lists at runtime ruins performance (and it takes extra memory). Please make it a good example by doing a version of Quicksort with in-place partition.

Nitramp · on Dec 11, 2010

That's actually a nice nab for functional programmers. They run around all the time telling how nice their language of choice is, showing off quicksort in three lines of code.

But please don't have the audacity of pointing out that original quicksort was supposed to be an efficient in place sort, not a mental exercise in beauty of a language. Then you get a Haskell version that is in place, and supposedly fast. It's about 10 lines longer than the C version, uses Monads, ST, recursion, "unsafeRead/Write", "rscan", "lscan", "swap", "sloop", etc., all together about 20 concepts, built-in functions, etc.

Also, of course, if it really should be fast, you have to use several compiler hints and macros to make it type specialize and all. Obviously.

The C version is shorter, uses nothing but array accesses, do/while, if, int comparisons, and recursion. If you want to explain the C version to a novice programmer, it will maybe take you half a day. The inplace Haskell version? Uhhh, yeah.

I can't help it, but things like these leave me with the impression of Haskell being a theoretically really nice language whose utility in actually getting things done seems somewhat questionable.

mwill · on Dec 11, 2010

I still write and run Hello World in languages I am well versed in when I start working in a new environment, purely as a sanity check. In many cases it is the simplest starting point to ensure you have a working environment.

As far as familiarizing myself with a new language, I usually write a Hello World, then add some variables, and some functions or whatever is relevant, and just focus on the syntax of the language, or in the case of a platform or framework, take a look at building trivial components.

Not a language, but a platform example: I recently explored ExpressJS (If you're not familiar, its a framework for node.js web apps.) and my workflow was: Make a hello world server, Add a few routes and a static handler, Add some views, Add some middleware, Build it into a basic blog. (Because it's always a blog first, right?)

The whole process was quite quick and gave me a very strong jumping point for ExpressJS. Implementing quicksort or something similar as my first crack would probably have been pretty useless. So for me its all about context; I tend to start at Hello World and quickly grow something simple, but relevant to the language or platform.

DanielBMarkham · on Dec 11, 2010

Yet another article that completely misses the point. The point of "Hello World" is to show a noob how to fire up the editor, compile, and see something happen.

And btw, here's the answer in F#:

  let rec qsort = function
       | [] -> []
       | x::xs -> let smaller,larger = List.partition (fun y -> y<=x) xs
                  qsort smaller @ [x] @ qsort larger

sizzla · on Dec 11, 2010

This looks like yet another Haskell "variant." From what I can tell this sort does not occur in place and thus would not make a particularly good quicksort. Memory use, number of accesses per element, etc. This style of programming is more suitable for heapsort or maybe mergesort.

zbanks · on Dec 11, 2010

You can't exactly sort "in place" in functional languages like F#/Haskell... at least trivially.

_3u10 · on Dec 12, 2010

Given the level of abstraction in most modern programing languages/ operating systems it's really hard to say that any modern language could implement quicksort in the most strict definitions.

For example, is a quicksort 'in-place' if your memory gets swapped to disk?

noamsml · on Dec 11, 2010

I think I'm not alone when saying that the first thing I write in a new language I learn is some sort of project that requires a good understanding of the language's function-call model, of its object model, of a few specifics (strings, regices, data structures, arithmetic [depending on what I find interesting]), and that requires me to use some part of the standard library not readily available or to decompose my code into multiple files (or both). This helps me understand better what I'm doing, what I like about the language and what I don't. Examples of such projects include IRC bots, Lisp Interpreters, twitter clients, etc. (The one exception is Haskell, where I haven't gotten far enough to understand Monads well so I just wrote a completely functional RPN calculator).

sgt · on Dec 11, 2010

I'm proficient in Hello World in 24 languages. And now you say my knowledge is obsolete?

sb · on Dec 11, 2010

tl; dr, but a quick note: the presented algorithm takes the first element as the pivot element p:

   qsort (p:xs) = ...

This is not recommended as it results in worst-case behavior on sorted input lists. (Another commentor correctly pointed out that it requires extra memory for the intermediate lists, too.)

Deestan · on Dec 11, 2010

It is just a toy implementation. If we're going to worry about practicals, then it's not recommended to use your own general purpose sorting implementations at all. For general purpose sorting, the standard library "always" has the best implementation.

dagw · on Dec 12, 2010

Except it is also the canonical look-how-awesome-Haskell-is example that keeps getting trotted out all the time. Why aren't there more canonical examples of Haskell being both simple and awesome as well as fast and correct?

sizzla · on Dec 11, 2010

Any knowledge about your data set, how much of it varies, etc can vastly improve your performance over any canned standard library solution. Sorting ints? Radix-sort that mofo.

yoyar · on Dec 11, 2010

Hello World is there to validate that you can setup, compile and run a program in your new language of choice. Complicating that process by trying to write something non-trivial defeats the purpose of the exercise. I call BS on this article.

arvinjoar · on Dec 11, 2010

Someone already pointed out that you try out "Hello World" to see if everything is working correctly. Anyway, the author fails to make the distinction between is and ought to be, it's sloppy.

seanmcq · on Dec 11, 2010

This test gives an extreme advantage to lazy functional languages that they don't enjoy in any realistic context.

Trying to come up with a single test that is good turns out to be surprisingly hard. In modern languages I'm looking for the ability to do clean functional code, stateful object code, async programming, a solid module system, and a type system that doesn't get in my way. I really can't imagine a single 5 line program that would encapsulate all of these areas.

pauljonas · on Dec 11, 2010

The only way to learn a new programming language is by writing programs in it. The first program to write is the same for all languages:

Print the words

    hello, world

This is a bug hurdle; to leap over it you have to be able to create the program text somewhere, compile it successfully, load it, run it, and find out where your output went. With these mechanical details mastered, everything else is comparatively easy.

The C Programming Language [Kernighan, Ritchie]

_tef · on Dec 11, 2010

It is like fizzbuzz/binary search in that most people think they can write it and they get it wrong.

Quicksort requires careful selection of pivots, and is unstable. (Many scripting languages (perl, python) are opting for stable sorts)

Check out Bentley&McIllroy's Engineering Quick Sort for a guide about the problems implementing a production ready quicksort.

If you're going to teach them something easy, simple and relatively hard to implement badly, teach them merge sort.

Then teach them adaptive merge sort.

sizzla · on Dec 11, 2010

I would rather teach them:

1) how to write a program that guesses a number you thought of by doing binary search.

2) solve 2-queens

3) solve n-queens

4) solve knapsacking or some other combinatorial problem requiring memoization or DP. Of course, I wouldn't tell them any of those buzzwords that are meant to scare them away from just hacking on it.

Their mind will be blown regardless of the language they used and they'll get a feel for O() without even opening Cormen.

nkassis · on Dec 11, 2010

After I write my "Hello World!" program I usually graduate to writing a tictactoe game. Tictactoe has enough stuff to learn the basics of most languages.

gsivil · on Dec 11, 2010

I think your point is valid and of course interesting for programmers, but KR's "Hello, world!" is still relevant for students and novices. Maybe reading/writing on a file would be more useful nowadays, since it does not demand knowledge of algorithms.

ollysb · on Dec 11, 2010

A large part of the value of a hello world is to check that you've actually installed the language correctly and can compile/run code. Printing hello world is just the simplest way of proving it.

ctdonath · on Dec 11, 2010

It is very relevant for total novices. When someone has no concept about programming, this is the fastest way to "here are the barest basics, and you have already written a program."

Quick sort means nothing to someone who does not yet grok compiling and output, much less sorting as an algorithmic process.

Question isn't whether "hello world" is obsolete, it's the level of knowledge of the author. A rank noob needs to see the simplest possible program, not the highest density functionality.

billmcneale · on Dec 11, 2010

QuickSort is still a very small problem that is not going to show much about the language.

Gosu's example code does a good job at covering a lot of aspects in ~50 lines of code:

http://gosu-lang.org/intro.shtml

LostInTheWoods2 · on Dec 11, 2010

I usually implement a tic-tac-toe game as my first project when learning a new language. Tic-tac-toe, properly implemented, will give you a pretty good tour of a language. But with that said, Hello world, still serves it purpose.

sizzla · on Dec 11, 2010

This is a great example of making something that is challenging for a novice, yet not so challenging they give up. The reward is much higher than watching some numbers show up in the correct order and you tend to cover more of the language (2D array or your own abstraction of such, I/O, control flow, etc.).

quadhome · on Dec 11, 2010

"For every complex problem, there is an answer that is clear, simple-- and wrong."

sizzla · on Dec 11, 2010

Beautiful. We really need more cynicism on HN.

yoshiks · on Dec 12, 2010

Do we all agree to teach from quicksort, to complete newbies of programming, while most experienced people had hard time to understand? Or simply just read this thread and imagine if you are just 12.

eliben · on Dec 11, 2010

Sadly, many programmers don't understand how quicksort works. This makes it somewhat unsuitable to be a hello-world.

sid0 · on Dec 11, 2010

I think something simpler would be a better Hello World equivalent. The first program I generally write in any functional language (usually as a sanity check) is one to calculate the factorial of a number.