Python faster than compiled Haskell?

前端 未结 7 1263
爱一瞬间的悲伤
爱一瞬间的悲伤 2020-11-29 23:16

I have a simple script written in both Python and Haskell. It reads a file with 1,000,000 newline separated integers, parses that file into a list of integers, quick sorts i

相关标签:
7条回答
  • 2020-11-29 23:49

    More a Pythonista than a Haskellite, but I'll take a stab:

    1. There's a fair bit of overhead in your measured runtime just reading and writing the files, which is probably pretty similar between the two programs. Also, be careful that you've warmed up the cache for both programs.

    2. Most of your time is spent making copies of lists and fragments of lists. Python list operations are heavily optimized, being one of the most-frequently used parts of the language, and list comprehensions are usually pretty performant too, spending much of their time in C-land inside the Python interpreter. There is not a lot of the stuff that is slowish in Python but wicked fast in static languages, such as attribute lookups on object instances.

    3. Your Python implementation throws away numbers that are equal to the pivot, so by the end it may be sorting fewer items, giving it an obvious advantage. (If there are no duplicates in the data set you're sorting, this isn't an issue.) Fixing this bug probably requires making another copy of most of the list in each call to qs(), which would slow Python down a little more.

    4. You don't mention what version of Python you're using. If you're using 2.x, you could probably get Haskell to beat Python just by switching to Python 3.x. :-)

    I'm not too surprised the two languages are basically neck-and-neck here (a 10% difference is not noteworthy). Using C as a performance benchmark, Haskell loses some performance for its lazy functional nature, while Python loses some performance due to being an interpreted language. A decent match.

    Since Daniel Wagner posted an optimized Haskell version using the built-in sort, here's a similarly optimized Python version using list.sort():

    mylist = [int(x.strip()) for x in open("data")]
    mylist.sort()
    open("sorted", "w").write("\n".join(str(x) for x in mylist))
    

    3.5 seconds on my machine, vs. about 9 for the original code. Pretty much still neck-and-neck with the optimized Haskell. Reason: it's spending most of its time in C-programmed libraries. Also, TimSort (the sort used in Python) is a beast.

    0 讨论(0)
提交回复
热议问题