
Strictly speaking yeah. Practically speaking: Not really true.

Just because something can be done doesn't mean it can be done easily or well. I've done a lot of work with relational databases, and I love them for a lot of data sets. But I have also done a lot of work with graph databases, and they make working with graph-shaped data a pleasure. I could do a graph in SQL, and it's even moderately straightforward in Postgres these days using WITH RECURSIVE, but it's still not as simple as just loading OrientDB or ArangoDB for those tasks.

It's the same reason I keep multiple knives in my kitchen. Sure, I could do everything with an 8" chef's knife, but the paring knife and the boning knife just make some tasks easier.
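For anyone who hasn't seen the WITH RECURSIVE approach mentioned above, here is a minimal sketch of a multi-hop traversal done entirely in SQL. It uses Python's built-in sqlite3 module instead of Postgres so it runs without a server (SQLite accepts the same WITH RECURSIVE syntax for this query). The `edges` table, its `src`/`dst` columns, and the `reachable` helper are made up for illustration.

```python
import sqlite3

def reachable(conn, start):
    """Return the ids of all nodes reachable from `start` (including itself)."""
    rows = conn.execute(
        """
        WITH RECURSIVE walk(id) AS (
            SELECT ?                      -- seed: the starting node
            UNION                         -- UNION (not UNION ALL) drops duplicates,
                                          -- so cycles in the graph terminate
            SELECT e.dst
            FROM edges AS e
            JOIN walk AS w ON e.src = w.id
        )
        SELECT id FROM walk ORDER BY id
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]

# Hypothetical toy graph: a 1 -> 2 -> 3 -> 1 cycle plus a separate 4 -> 5 edge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src INTEGER NOT NULL, dst INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO edges (src, dst) VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 1), (4, 5)],
)

print(reachable(conn, 1))
```

It works, but you end up writing the traversal mechanics by hand in the query, which is the kind of thing a graph database's query language gives you directly.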



Facebook has a very good paper describing how they build a graph store on top of a relational database. Google "facebook tao" for details.

I read that and implemented my own version on top of SQL in < 500 lines of Python code, and found it just perfect for my own use cases. I can easily query any edges or nodes at web speed (< 10 ms) from databases with millions of nodes and edges and GBs of data.

I am curious what I might be missing with that approach compared to a real graph database?


From my quick read of TAO, it seems to do essentially what graph databases do, but with the data storage layer being SQL rather than some other object store, and with an interface layer that is not quite as feature complete. The query layer in TAO seems to lack a way to follow multiple edges without first returning to the application code, which graph DBs present as a native feature. The other thing that's lacking, which some graph DBs provide, is labeled edges, that is, edges that carry data besides just an "edge type".


I implemented one table for Nodes and one for Edges. Each row in the Nodes table has a JSON field holding the data for that node.

If I need more info for a particular "Edge type", I just add a new Node entry of type "Edge_info" that links the Edge type to a JSON document containing that info. I found that very flexible, but I have not used any real graph database.
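To make the layout described above concrete, here is a rough sketch of a two-table setup using Python's built-in sqlite3 and json modules. This is not the actual code from the comment and not TAO's API: the column names, the `edge_type` key inside the "Edge_info" JSON, and the helpers `add_node`, `add_edge`, `neighbors` and `edge_info` are all assumptions made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE nodes (
        id   INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        data TEXT NOT NULL              -- JSON document for this node
    );
    CREATE TABLE edges (
        src  INTEGER NOT NULL REFERENCES nodes(id),
        type TEXT NOT NULL,
        dst  INTEGER NOT NULL REFERENCES nodes(id),
        PRIMARY KEY (src, type, dst)    -- also serves as the lookup index
    );
    """
)

def add_node(node_type, data):
    """Insert a node with a JSON payload and return its id."""
    cur = conn.execute(
        "INSERT INTO nodes (type, data) VALUES (?, ?)",
        (node_type, json.dumps(data)),
    )
    return cur.lastrowid

def add_edge(src, edge_type, dst):
    """Insert a directed, typed edge from src to dst."""
    conn.execute(
        "INSERT INTO edges (src, type, dst) VALUES (?, ?, ?)",
        (src, edge_type, dst),
    )

def neighbors(src, edge_type):
    """Return (id, data) for every node reached from src over edge_type."""
    rows = conn.execute(
        "SELECT n.id, n.data FROM edges AS e JOIN nodes AS n ON n.id = e.dst "
        "WHERE e.src = ? AND e.type = ? ORDER BY n.id",
        (src, edge_type),
    ).fetchall()
    return [(node_id, json.loads(data)) for node_id, data in rows]

def edge_info(edge_type):
    """Return the JSON stored in the "Edge_info" node for edge_type, or None."""
    rows = conn.execute("SELECT data FROM nodes WHERE type = 'Edge_info'")
    for (data,) in rows:
        info = json.loads(data)
        if info.get("edge_type") == edge_type:
            return info
    return None

alice = add_node("user", {"name": "alice"})
bob = add_node("user", {"name": "bob"})
add_edge(alice, "follows", bob)
# Extra info about the "follows" edge type is kept in an "Edge_info" node.
add_node("Edge_info", {"edge_type": "follows", "directed": True})
```

With this layout, `neighbors(alice, "follows")` is a single indexed lookup, which fits the fast single-hop queries described above. Anything multi-hop still has to loop in application code or use a recursive query.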


Curious, since you seem to have experience with both graph and relational databases (and I have not really worked much with the former)... if I had a graph where I just want to compute the shortest weighted path between two nodes, which model would suit better and how much difference would it make in terms of performance?



