I once had to get my head around over 120k lines of complex, concurrent, buggy spaghetti Java code (for a real-time trading system).
My first attempt was to reverse engineer the code into a UML diagram. For some reason I keep making this mistake. A messy code base will result in an extremely messy diagram. It can give a few insights, but between finding a tool which will work and trying to make sense of a tangled mess of lines, visual diagrams usually aren't worth the time.
I found that a tool called Chronon was somewhat useful (google "DVR for Java"). This tool just records a single run of a program. It is great for going forwards AND BACKWARDS as you step through the code, looking at different threads, the state of various objects, etc.
My strategy was to run the server and have it execute a small and simple bit of functionality (execute a single order), then follow it all the way from input to output. Next, make the scenario a bit more complex and follow that through to completion. This way you get to understand the core functionality and the edge-case code, and you start to get a sense of the performance enhancements, etc.
I found myself making steady progress and fixing a number of bugs, until I hit heisenbugs, caused by overly clever concurrency/object pooling. It is enough to drain your soul :)