Talk:Quicksort (Python)

From LiteratePrograms
Jump to: navigation, search

[edit] Cool

Cool article, Allan. I'm just learning Python and it's neat to see some different ways of leveraging the language features for this simple problem. Deco 03:10, 4 June 2006 (PDT)

Also, the download page version of your program revealed several problems with the Python syntax highlighting, such as missing support for triple quoted strings and busted string highlighting in general. I fixed it (I think). Deco 03:30, 4 June 2006 (PDT)
Honestly, I just learning Python too :-) I think this may have been the first complete Python program I've written from scratch myself. Glad you like the article though! Thanks for fixing the highlighting issues (and my spelling error)! --Allan McInnes (talk) 09:10, 4 June 2006 (PDT)

[edit] Optimized versions slower?

Hey Allan. I did some timing (just using "time python" and some small editing) on a list of 10,000 items and it appears that in fact the first version is much faster than the other two, and the second is faster than the third, with times of 0.19 sec, 2.2 sec, and 3.8 sec. In each case I checked that the list was sorted correctly. Could it be that your intuition about which program is faster was defeated by some clever compiler optimizations? Deco 16:57, 4 June 2006 (PDT)

Yeah, I've been made aware of those problems by User:Leland McInnes. It seems likely that there are several factors in play here:
  • Python does not do tail-call optimization
  • List comprehensions are probably heavily optimized in the Python runtime
  • Python uses "call-by-sharing", which is something like "call-by-value", so memory usage skyrockets on the heavily recursive versions
Leland has said that he'll rework the article with his performance results sometime soon. Oh well — at least I learned something about Python... --Allan McInnes (talk) 19:44, 4 June 2006 (PDT)
Also: I did some experiments and it seems like random access in Python lists is fast. Could it be that they're actually stored as arrays, not linked lists as you might suppose? This is supported by my experiments at Quicksort (Python, arrays) which show that the naive implementation here is competitive with an array-based one, which I didn't expect since arrays should have a big speedup factor from the locality of reference. Deco 17:56, 4 June 2006 (PDT)
From what I understand, Python lists are stored as doubly linked lists, but have fast random access. Leland tells me that the list comprehension quicksort was faster than even the in-place sorts he tried. --Allan McInnes (talk) 19:44, 4 June 2006 (PDT)
If we have fast random access, should we do random pivots? Would eliminate the nasty stack overflow problem on sorted input. Deco 23:19, 4 June 2006 (PDT)
That might be a good idea. Leland suggested something similar after he saw your performance figures in the array article. I suspect it wouldn't be that hard to do - randomly select a an element, then pop that element from the list and make it the pivot. --Allan McInnes (talk) 23:26, 4 June 2006 (PDT)
Looks good, good job. I would also consider switching to a more naive list sorting algorithm for small sublists, like insertion sort - this might be overkill here. An alternate way of addressing the problem of mutating the original list might be what the in-place partition does, exchange it with the last element then use a comprehension over the rest. Deco 00:27, 5 June 2006 (PDT)
Thanks! I appreciate the feedback.
I suppose it might be interesting to see what kind of difference it makes to include something like an insertion-sort for small sublists. I wonder how much of a performance gain we'd actually see. A project for another night perhaps...
--Allan McInnes (talk) 00:59, 5 June 2006 (PDT)
Regarding mutation: I don't think doing an exchange would help. I'd either have to do a pop/append, or a pair of assigments (using s[i] = x), to make the exchange. Both of those operations mutate the list, so I'd still need to make a local copy to prevent mutation of the original list. --Allan McInnes (talk) 01:11, 5 June 2006 (PDT)
Oops, silly me, forgot lists are immutable. Deco 01:24, 5 June 2006 (PDT)
Well, they don't have to be immutable. But maintaining immutability in the randomized-pivot version keeps the behavior consistent with the other implementations. --Allan McInnes (talk) 10:29, 5 June 2006 (PDT)
Oh, I see what you're saying now - all these implementations are out-of-place. That makes sense. Another option I thought of was to split each comprehension into two comprehensions, one for the range of elements below the pivot element and one for the range above it. Deco 11:09, 5 June 2006 (PDT)