yobennett's comments

yobennett · on April 22, 2015

Note that Robert Yokota's addendum[1] points out that HBase "cannot achieve both consistency and availability." His earlier results did not take into account that HBase clients continuously retry failed ops. (According to Nicolas Liochon, upon its death a server's regions are moved to another node.) Failures start rolling in once network partition(s) extend beyond the configured timeout. Kyle [2] was not impressed:

"'During the network partition, no requests are successful' is not the best result for a CP system, IMO."

HBase should provide partial availability in the face of partitions.

[1] http://eng.yammer.com/call-me-maybe-hbase-addendum/

[2] https://twitter.com/aphyr/status/509841011816665088
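To make the retry behavior concrete, here is a minimal sketch of a client that keeps retrying a failed op until a configured timeout runs out, which is why failures only surface once a partition outlasts that timeout. This is not HBase's client code; `retry_until_deadline`, `OperationTimedOut`, and the parameters are hypothetical names chosen for illustration.

```python
import time


class OperationTimedOut(Exception):
    """Raised when the retry budget is exhausted (hypothetical name)."""


def retry_until_deadline(op, timeout_s, pause_s=0.1,
                         clock=time.monotonic, sleep=time.sleep):
    """Call op() until it succeeds or timeout_s has elapsed.

    Only ConnectionError is treated as retryable here; that is an
    assumption of this sketch, not a statement about any real client.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return op()
        except ConnectionError as exc:
            # Give up if another pause would push us past the deadline.
            if clock() + pause_s > deadline:
                raise OperationTimedOut(
                    f"gave up after {attempts} attempts") from exc
            sleep(pause_s)
```

A partition shorter than `timeout_s` shows up to the caller only as added latency; a longer one shows up as `OperationTimedOut`.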

krenoten · on April 22, 2015

No system can achieve 100% consistency and 100% availability in the presence of partitions. It's kind of wacky that Aphyr compared those replicated consensus tools and eventually consistent stores with HBase. HBase does not use a consensus protocol for replication; it uses HDFS. HBase is not eventually consistent. There is a single authoritative server for reads and writes of a lexicographic range of keys (a region), which writes immutable store files and a WAL to HDFS. Partial availability for reads may be achievable by allowing non-authoritative regionservers to read the WAL and store files from HDFS, but that would take significant effort, add latency, and limit total cluster size, so it really isn't realistic. I've personally been burned by the client retry thing though, and there's not really a better solution when you consider the types of workloads HBase is actually used for and its incredibly variable latency (it aims only for consistency and very high AVERAGE throughput, at the cost of extremely high WORST-CASE latency). One solution here could be the use of a configurable filesystem queue for clients. This is how you build resilient high-throughput pipelines. HBase is used in some places for OLTP, but only when the read load is very, very low. So the effort to make reads more highly available would be in vain.
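A rough sketch of what a filesystem queue for clients could look like: when a write fails, the record is appended to a local spool file and replayed once the store is reachable again. Nothing here is an HBase API; `SpoolingWriter`, `send`, and the JSON-lines spool format are assumptions made for the example.

```python
import json
import os


class SpoolingWriter:
    """Send records, spooling them to a local file on failure (hypothetical)."""

    def __init__(self, send, spool_path):
        self.send = send              # callable that raises ConnectionError on failure
        self.spool_path = spool_path  # local file holding one JSON record per line

    def write(self, record):
        """Try to send; on failure, persist the record. Returns True if sent."""
        try:
            self.send(record)
            return True
        except ConnectionError:
            with open(self.spool_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(record) + "\n")
                f.flush()
                os.fsync(f.fileno())  # make sure the record survives a crash
            return False

    def drain(self):
        """Replay spooled records in order; stop at the first failure.

        Returns the number of records sent.
        """
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path, encoding="utf-8") as f:
            pending = [json.loads(line) for line in f if line.strip()]
        sent = 0
        for record in pending:
            try:
                self.send(record)
            except ConnectionError:
                break
            sent += 1
        # Rewrite the spool with whatever is left, replacing it atomically.
        tmp_path = self.spool_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            for record in pending[sent:]:
                f.write(json.dumps(record) + "\n")
        os.replace(tmp_path, self.spool_path)
        return sent
```

Note the trade-off: spooled records can land after writes that were issued later, and a crash partway through `drain` can resend records, so this only suits workloads that tolerate reordering and duplicate delivery.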

ddlatham · on April 22, 2015

No database can "achieve both consistency and availability" during a partition.

Also, if you follow the rest of the Twitter conversation you may realize, as they did, that only requests to the minority partition are unsuccessful, which is exactly what you want from CP.

yobennett · on March 1, 2014

Blizzard - San Francisco, CA - Full-time

http://blizzard.com

Blizzard Entertainment is growing our presence in San Francisco and is seeking talented engineers to join the Battle.net team. We're looking for engineers and graphic designers. We build with tools like Java, JS (Node.js), and C++.

This is a perfect opportunity to play a role at one of gaming's most successful and enduring companies. We're tackling challenges of both scale and complexity that deal with millions of daily transactions and views. If you're interested in learning more about the openings and the new office, ping me at bennett@blizzard.com.

More information at http://us.blizzard.com/en-us/company/careers/directory.html#...

Bonus points for reaching diamond league or higher in StarCraft 2.