Hi Nathan! This is awesome. I'm really excited to dive deeper.
Some questions:
- I'm trying to understand the relationship between ZeroMQ and Kestrel in your architecture. Is ZeroMQ used for message passing, and Kestrel used as a stream source/sink (aka a spout)? In other words, my assumptions are: Zookeeper handles node discovery and coordination, message passing between the Nimbus-managed bolt processes goes through ZeroMQ, and Kestrel queues are used for external integration (data stream sources). Is this correct, or am I missing something?
- Do you have any tutorials on using Cascalog with Storm? Are they compatible, or have you developed a different Clojure programming model/DSL for working with Storm?
You have it correct. ZeroMQ is used for message passing between components, and Zookeeper is used for node discovery and coordination.
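To make that split concrete, here is a rough toy sketch in Python. It is not Storm's API: the names `QueueSpout`, `SplitBolt`, `CountBolt`, and `run_topology` are all made up for illustration, a plain `deque` stands in for an external queue such as Kestrel, and ordinary in-process function calls stand in for the message passing that ZeroMQ would do between processes. Node discovery and coordination (the Zookeeper part) are left out entirely.

```python
from collections import deque


class QueueSpout:
    """Hypothetical spout: pulls items from an external queue (stand-in for Kestrel)."""

    def __init__(self, source):
        self.source = source

    def next_tuple(self):
        # Return the next item, or None once the external queue is empty.
        return self.source.popleft() if self.source else None


class SplitBolt:
    """Hypothetical bolt: splits a sentence into words, emitting one tuple per word."""

    def process(self, sentence):
        return sentence.split()


class CountBolt:
    """Hypothetical bolt: keeps a running count per word and emits nothing downstream."""

    def __init__(self):
        self.counts = {}

    def process(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1
        return []


def run_topology(spout, bolts):
    """Drain the spout and push each tuple through the bolts in order.

    The direct calls here are an assumption standing in for message passing
    between separate processes.
    """
    while True:
        item = spout.next_tuple()
        if item is None:
            break
        pending = [item]
        for bolt in bolts:
            emitted = []
            for tup in pending:
                emitted.extend(bolt.process(tup))
            pending = emitted


external_queue = deque(["the cow jumped", "the moon"])
counter = CountBolt()
run_topology(QueueSpout(external_queue), [SplitBolt(), counter])
print(counter.counts)
```

The point of the sketch is only the shape: the queue is the boundary with the outside world, the spout turns it into a stream, and the bolts only ever see tuples handed to them by whatever does the message passing.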
Realtime processing is fundamentally different from batch processing, so you can't maintain the same semantics of Cascalog on top of Storm. Storm has a small DSL for writing topologies in pure Clojure, but it's not a higher-level abstraction the way Cascalog is. I've started thinking about what a good higher-level abstraction for Storm would look like, but that's still an open question.
Thanks, and again, nice work!