Migrated WordGap from Python to Go 1

waterside81 · on March 24, 2013

We're going through the same thing right now. Porting a few thousand lines of Python to Go. It's almost a line for line port, with the odd exception here or there. One thing I really miss going to Go from Python is using getattr or dict.get(some_key, default_value). Also tuples. And marshalling from a JSON blob is kind of a pain in Go.

In fact, I started making a list of things Python devs need to know about when switching to Go. It's all a tradeoff, you lose some expressiveness for performance.

swaroop · on March 24, 2013

Please do post the list of things you mentioned in your second paragraph.

voidlogic · on March 24, 2013

"It's all a tradeoff, you lose some expressiveness for performance."

Not only do you gain performance but I think you gain maintainability and clarity. I value that when you read Go the run-time complexity of operations is very clear. In many other languages you often find O(n) or worse operations that appear to many programs in such a way the uninformed assume they are O(1) operations. In Go that does not happen- almost always what you see is what you get. I will gladly write a few more lines of simple code to gain clarity.

wging · on March 24, 2013

Can you post an example?

earthboundkid · on March 24, 2013

> Also tuples.

A tuple is just a makeshift struct. Think about it this way: by replacing it with a real struct, you're documenting your code.

derefr · on March 24, 2013

Imagine if every function was only allowed to take a single parameter--and in order to write a function that takes multiple, what you really had to do was declare a struct of type "the set of parameters foo() takes" and then construct one whenever you called foo().

This is what it feels like going from a language with tuples, to one without :)

(I would also say "imagine further that you had to do the same whenever you returned multiple values"--but you don't have to imagine ;)

Evbn · on March 24, 2013

Go has tuple return values and of course input values.

The only thing missing is giving a tuples a name. But if you do that, you should make a struct, or you will forget which field is which and have bugs.

georgemcbay · on March 24, 2013

You can do a sorta-like getattr in Go using reflect, though I'm not sure exactly what your use case is. Fields need to be exported (upper-case names) though:

http://play.golang.org/p/XPgHkKCz_D

But you're absolutely right, you trade in some easy expressiveness and lack of boilerplate for better performance and easy concurrency.

But Go is much better about the boilerplate required than most other static/compiled languages, IMO.

waterside81 · on March 24, 2013

I like your solution but that's kind of the rub about Go, at least at this stage of its still-early life. You have to re-invent the wheel a lot. It's frustrating to have to implement things that you know exist elsewhere, especially things like collections.defaultdict in Python.

Agree re: boilerplate. Much slimmer than Java or C++. And the build times, my God, the build times. It almost makes up for not having a true REPL.

earthboundkid · on March 24, 2013

If the original Python is d[key] = d.get(key, 0) + value, in Go it can be replaced with m[key] += value, since the default value for a map[string]int value is 0. This doesn't totally replace the need for defaultdicts, but it should greatly lessen it.

smagch · on March 24, 2013

Using bitly/go-simplejson would help the JSON marshalling problem. https://github.com/bitly/go-simplejson

viraptor · on March 23, 2013

I'd like to see the before and after code. Such difference is not impossible between languages, but it's possible that a better implementation/algorithm in the original code wouldn't bring a similar speedup...

coldtea · on March 23, 2013

>Such difference is not impossible between languages

Not only it's not impossible, but it's also quite common, even for programs having the same program-level algorithmic complexity.

Except if the programs are IO bound (which happens with most of them), you can see real differences between various languages. It doesn't even have to be compiled vs interpreted (well, if you exclude JITs). A V8 JS program can smoke a Python program (when the latter doesn't reach for some C builtin function).

pekk · on March 23, 2013

Of course, a Python program can also smoke a V8 JS program, this happens too. And it is not realistic to compare Python to JS while excluding any commonly used optimization strategy for the former, whether that be C or PyPy or using different language constructs.

kkowalczyk · on March 23, 2013

Neither C nor PyPy is "commonly used optimization strategy" so the comparison is very much correct.

People reach for C when they are desperate for performance, have the skills to do C, have the time to do C and are ok with loosing the cross-platform capabilities. Pure python code runs on mac/windows/unix (90% of the time). The moment you add a C module to the mix, you end up with the usual cross-platform mess (separate installers for each platform, complicated build processes etc.).

icebraining · on March 24, 2013

Reaching for C doesn't have to mean writing C; it can just be using a different library and adapting your code to call it.

coldtea · on March 24, 2013

>Of course, a Python program can also smoke a V8 JS program, this happens too.

A cleverly written Python program can indeed beat a naively written V8/JS program.

But if they have the same program-level complexity and the Python program doesn't delegate to C libraries, V8 wins. That is: when using the same algorithms in both languages, V8 is faster.

We have h264 decoders using Javascript, and even ports of the QT widget stuck into pure JS (using emscripten). And they run in acceptable speed. Those would be hopeless in pure Python without adding new ad-hoc C extensions.

>And it is not realistic to compare Python to JS while excluding any commonly used optimization strategy for the former, whether that be C or PyPy or using different language constructs.

Well, it's fair, in the sense that mainstream JS is fast, whereas mainstream Python (CPython) is not. And you don't have to resort to stuff like Cython, SWIG and such with JS to get that performance.

jerf · on March 24, 2013

This is the sort of algorithm that really beats on CPython's weaknesses. Strings are immutable, so you're sitting there and just creating little tiny string object after little tiny string object, themselves made out of characters that are fairly heavyweight. You could hope that PyPy or something could work out how to optimize it, but I'm not sure PyPy would optimize around the constant strings.

Meanwhile, since Go has true arrays, and since you can probably just assume we're in English only and let byte[] substitute for real unicode, you can write very nearly what you'd write in C, albeit more safely, and not be creating string objects by the thousands, and it wouldn't be particularly unnatural Go code. A 12x differential is very plausible. I'd believe you could even go higher.

This is not to say that Go is awesome or that (C)Python is a bad language in general. I like (C)Python. But it does have some bad cases in it if you need lots of performance, fiddly little string algorithms and lots of numeric computation being two of the big ones. (By which I mean numeric computation in pure CPython, not with Numeric Python or PyPy.) In both cases the sheer rate of object creation and destruction work dwarfs the real work being done.

earthboundkid · on March 24, 2013

This gist of what you said was correct, but you made a few technical errors.

First of all, Go's strings are also immutable. To get around this, you can use the []byte type, as you said, but Go is built with the assumption that all text is UTF-8. Frankly, I find the suggestion that ASCII is good enough, even for Americans, absurd.

Second, Python's "list" type and Go's slice type (what you're calling a "true array") are almost the same under the hood. Both are variable length arrays. (I'm not sure why Python calls it a "list" since it's not a linked list.) The only major difference is that Python's lists are always filled with pointers to the underlying objects. In Go, you have the choice of whether you want to use pointers to the underlying objects or to put the objects into a contiguous block of memory.

Speaking more broadly, there's no reason you couldn't use a lot of different Go practices in Python and end up with much more efficient Python code: pre-allocating lists of the proper size, for example, or making sure to use mutable buffers instead of immutable str objects. However, Go makes it natural to be concerned with these efficiencies, and Python makes it natural to ignore them. Each language has its strengths.

icebraining · on March 24, 2013

Strings are immutable, but bytearrays aren't and they support most string methods.

  g = bytearray('Do this')
  g
  >>> bytearray(b'Do this')
  g[:2] = 'On'
  g
  >>> bytearray(b'On this')

voidlogic · on March 23, 2013

"Such difference is not impossible between languages" You are correct in wanted to make sure the algorithms and data-structures used still have the same run-time complexity. But saying this is impossible is not accurate.

This kind of difference is not uncommon when BOTH the before and after program optimally handle IO (network,file,db,etc), BUT the before program is from the interpreted performance tier (PHP,Perl,Ruby,Python) and the after program is from the compiler tier (C/C++,Java,C#,Go,Scala,Haskell).

These micro-benchmarks, while not proving anything, at least suggest that you could be wrong as well: http://benchmarksgame.alioth.debian.org/u64q/benchmark.php?t...

viraptor · on March 23, 2013

We're in a complete agreement. As I wrote: "Such difference is -not- impossible between languages" :) I just find it rather unlikely.

voidlogic · on March 23, 2013

You find it unlikely that a compiled language might be 15x faster than a interpreted language doing a compute-bound workload, when said compiled language is often 30x faster in posted synthetic benchmarks... I think you are extremely skeptical...

I have ported node.js applications to Go and seen similar substantial speedups. In my cases though I was running in a multi-threading enviroment. not GAE and the performance difference (improvement) was much greater under concurrent load versus single request latency. But that seems reasonable since node.js is closer to Go in performance than Python is.

I guess I just believe it because I have seen it. :)

igouy · on March 24, 2013

>>posted synthetic benchmarks<<

The benchmarks game shows toy programs not synthetic benchmarks ;-)

drivebyacct2 · on March 23, 2013

You find it unlikely a compiled language is faster than an interpreted one...?

gngeal · on March 23, 2013

Sometimes, with smart data structures for specialized cases (persistency, better asymptotic behaviour etc.), it can happen that memory accesses dominate the actual execution. If that happens, and you can't saturate CPU's execution units, the difference between an efficient interpreter and a compiler gets smeared very quickly.

Evbn · on March 24, 2013

Python is compiled. That is the c in .pyc

drivebyacct2 · on March 24, 2013

Sorta, not really the point though. http://stackoverflow.com/questions/2998215/if-python-is-inte...

pjmlp · on March 24, 2013

It seems that Go is killing Python and Ruby projects instead of the languages the original authors thought of.

While in the enterprise everyone is happy using the available set of languages, on HN we continuously see posts about migrations from Ruby/Python to Go.

natefinch · on March 23, 2013

And the app engine only supports a single thread for Go, imagine if you could make it multi-threaded.

phasevar · on March 23, 2013

Multi-threaded wouldn't help the case of a single request, which is what this post is reporting latency numbers on.

voidlogic · on March 23, 2013

True, but it would help average request latency under load. I am hoping with Go 1.1 RC coming out next month that Google might enable multi-threading on the GAE.

Note: Even with Go only given a single core a developer can still write a Go program so they are concurrent. That way when one goroutine blocks on IO, another can use the CPU to make progress; furthermore, when multi-threading is enabled the performance upgrade will be substantial and seamless.

Encosia · on March 23, 2013

Assuming he's searching a trie to build the list of anagrams, he could distribute a single request's work over a thread per letter.

Evbn · on March 24, 2013

If he has multi core, which is often missing in a VM.

tansey · on March 24, 2013

If WordGap was written in pure Python, without the use of any matrix libraries like numpy/scipy, then I think you could have done better. 7x speedup is actually pretty low for algorithms that I've rewritten to use numpy.

bsaul · on March 23, 2013

Rewriting any code in any language pretty much always show improvements. Unless we're sure it's not a different data structure or algorithm it doesn't mean that much.

kkowalczyk · on March 23, 2013

Except it doesn't, not when improvement is so dramatic.

If he rewrote Python in Ruby, the perf would have been more or less the same.

If he rewrote in C, he would get even better perf but at much bigger cost (in dev time).

The point of his article is that Go is almost as easy to write in as Python but gives you 10x perf and that's why Go is attractive. It has the right perf/coding effort ratio.

bsaul · on March 24, 2013

i understand what you're saying, but my point is that even rewriting from python to python would show improvement. i've just coded the same project three times in different languages last year, and my design is getting better every time, and it has (almost) nothing to do with the language themselves. a benchmark is a much more reliable thing if you want to compare relative speed. or a "pet store" app. but not a "real world" project. altough that's still a really nice experience to read about.

Evbn · on March 24, 2013

This comment is making the exact mistake that parent was concerned about.

kristianp · on March 24, 2013

It would have been useful if there were some implementation detail. Did he go from hashes in python to maps in go? Or switch to maps of structs for example?