> "Global variables" is really language specific terminology for global data. St...

sparkie · on April 12, 2015

> But global variables and SL are not the same thing as side-effects.

They're examples of side effects. It's not good enough to set the values of a global variable or register some service purely for the purpose of a test, because the test then does not reflect the runtime behavior of the code. The benefit of a unit test is to assert that code behaves the same way all the time - not just for specific values you use at the time of testing.

> Your example (of testing a database) seems very confused to me. You're talking about coupling now. Not global variables. Why would you suddenly need a database connection? Why does the existence of global mutable state mean that nothing in your code can be tested independently? You seem to have a strange idea of how software works.

Global variables increase coupling - code which consumes a global variable now has a dependency on all of the code which mutates it. You simply cannot test the consuming code in isolation without regard for the code mutating the variable, unless your test is exhaustive of every possible value which the global variable may contain.
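A minimal Python sketch of that coupling, with hypothetical names (`TAX_RATE`, `net_pay_global`, `net_pay`, `set_emergency_rate`) that are mine and not from any real codebase. The first function reads a module-level variable, so its result depends on whatever code last mutated it; the second takes the rate as an argument and can be tested in isolation.

```python
# Hypothetical module-level state; any code in the program may mutate it.
TAX_RATE = 0.25


def net_pay_global(gross):
    # Result depends on whatever TAX_RATE holds at call time.
    return gross * (1 - TAX_RATE)


def net_pay(gross, tax_rate):
    # Result depends only on the arguments passed in.
    return gross * (1 - tax_rate)


def set_emergency_rate():
    # Some unrelated code path that mutates the global.
    global TAX_RATE
    TAX_RATE = 0.5
```

A test of `net_pay_global` only holds for the value `TAX_RATE` happened to have when the test ran; once `set_emergency_rate()` has been called, the same input gives a different answer.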

My example was not of testing a database, it was about testing algorithms or logic that might exist inside some class named "Person", but which has a data dependency on an actual person (held in a database). If one wants to test the logic only, then mock data must be supplied instead of the real data from the database - else you're not testing only the person, but also testing that the database is connected and that querying it is successful. The correct way to test this is to decouple Person from the database, usually by means of a mock object, or by passing the mock data into the person directly. Either way, it seems the blog author does not do such unit tests, as he doesn't use mock objects.
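Roughly what I mean, as a sketch only. The `Person` fields, the `age_on` method and the `FakePersonStore` interface are all made up for illustration; the point is just that the logic never touches a database, so sample data can be handed in directly or through a stand-in store.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    birth_date: date

    def age_on(self, today):
        # Pure logic: no database access happens here.
        years = today.year - self.birth_date.year
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return years if had_birthday else years - 1


class FakePersonStore:
    """Hypothetical stand-in for a database-backed store."""

    def __init__(self, rows):
        self._rows = rows

    def load(self, person_id):
        # Builds a Person from in-memory sample data instead of a query.
        return Person(**self._rows[person_id])
```

Either construct `Person("Ada", date(1990, 6, 15))` in the test directly, or pass a `FakePersonStore` to whatever code would otherwise load from the database.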

> I do not use mock objects when building my application, and I do not see the sense in using mock objects when testing. If I am going to deliver a real object to my customer then I want to test that real object and not a reasonable facsimile. This is because it would be so easy to put code in the mock object that passes a particular test, but when the same conditions are encountered in the real object in the customer's application the results are something else entirely. You should be testing the code that you will be delivering to your customers, not the code which exists only in the test suite.

The problem with the author's philosophy is that it means when problems do arise in his applications, he must perform whole system testing/debugging to find them. He is missing perhaps the main benefit of unit tests - which is that, when a bug arises, you can quickly eliminate many possible causes because unit tests against those parts of code have succeeded (unless your unit tests were wrong to begin with, which will more or less be the case if they're testing against code which depends on globals).

sago · on April 12, 2015

You're making me feel very dumb. Because several of these seem to be the opposite of what I've observed.

Testing with a mock object implies that the mock object can generate all the required output that the real object can generate that might have some effect on the consuming code. Not only that, but it assumes that the mock object generates the correct data in ways that cannot generate false positives in the test. This doesn't mean you're only testing the client logic. You're now testing the client logic using services that are ad-hoc and aren't guaranteed to behave like the real thing. You're testing a fantasy.

It is far better to test against the real database, using a fixture, or a transaction, or some other way to exercise the actual system with representative data. Mocks have their place in very complex services where this is practically impossible. But they don't suddenly make things better for testing, or more atomic. IMHO, when you have to use a mock, it should be as a last resort, when you have to sacrifice fidelity for tractability. Your code is coupled in behavior to the services it uses; pretending it isn't is just fooling yourself.

I have very much the same problem with people who write unit tests against, say, SQLite databases, rather than the full DBMS. The complexity of 'making sure the database is connected and can be queried' is pretty trivial compared to the complexity of mocking a whole RDBMS interface. Good software engineering will, of course, limit the number of places the database interfaces with (I'm not suggesting code with SQL statements in strings everywhere, that's a straw man). But I'd not accept mocked tests that exist just to avoid a database connection or because the developer doesn't understand how to write a transaction.

So I don't understand. Either you're advocating a very bizarre, and seemingly pathological development style, or you're consistently muddying the waters by comparing good programming in your chosen methodology with bad programming in mine, which just misses the point.

Here's an example, then. For your Person object, on a platform with reasonable transaction/fixture support (like Django), is it better to write your unit test using a mocked ORM layer, or a fixture with the test data in it?

> He is missing perhaps the main benefit of unit tests - which is that, when a bug arises, you can quickly eliminate many possible causes because unit tests against those parts of code have succeeded

I've no idea why this is somehow impossible. I write unit tests at various levels of abstraction. If I have module A, calling module B which calls module C, then I need tests for C, B(+C) and A(+B+C). If I get a failure in A, I make sure that there is a test in B that corresponds to the way A is using B; if there is, it is a problem with A, not B. If B and C were mocked, I'd have no way of knowing if the problem was with the mock logic without having to test C, C-mock, B+C-mock, B-mock, A+B-mock.

> now has a dependency on all of the code which mutates it

This seems a bizarre claim. Does your code have a dependency on everything else that can possibly change what's on the screen? If so, how do you deal with that?

That's why pretending 'global variables' = 'all central resources' seems foolish to me.

sparkie · on April 12, 2015

I probably have quite a fundamentalist view on unit testing because I primarily write purely functional code these days - where a "unit" is a pure function, and it's clearly an isolated unit. Even when I'm back in OOP world though, I basically avoid static variables/globals like the plague. Even where the framework or some library makes use of them, I'll tend to wrap them up and pass them into my code via Main, to make sure that no statics are globally accessible throughout the code.

If I were testing a salary calculation which takes values from a database, and I named my test "Test_salary_calculation_correct", where instead of using some sample data which could easily cover the range of values I need to test against, I instead relied on a database connection, and this test failed because the database was not accessible - I've only confused the developer who picks up my shit where "Test_salary_calculation_correct" fails, and he thinks there's a problem with my calculation rather than a misconfigured firewall somewhere else. The firewall has nothing to do with my salary calculation - why should it have any effect on the test passing?
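Something like this is what I have in mind, as a sketch. The `monthly_salary` function and its formula are invented for illustration (they're not anyone's real payroll logic); what matters is that the test feeds in sample values and has no connection to open.

```python
import unittest


def monthly_salary(annual_salary, bonus=0):
    # Hypothetical calculation, for illustration only.
    return (annual_salary + bonus) / 12


class SalaryTests(unittest.TestCase):
    def test_salary_calculation_correct(self):
        # Sample data covering the range of values, encoded in the test.
        cases = [
            (0, 0, 0),
            (12000, 0, 1000),
            (12000, 1200, 1100),
        ]
        for annual, bonus, expected in cases:
            self.assertEqual(monthly_salary(annual, bonus), expected)
```

If this test goes red, the only thing that can be wrong is the calculation or the test itself, never a firewall.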

The way I see unit tests is this: If you write a test and it passes on your machine, then some other developer takes your code and the same test fails - it's a fuckup on your behalf. Unit tests should not depend on the environment in any way. Actually, by definition, a unit test is a test of a single "unit" - including database access in this goes well beyond the scope of unit testing and into integration testing.

To me it seems you're skipping unit testing and just going on to integration testing with your unit testing framework. I'm not sure what you've observed or where, but I can tell you it's certainly not standard or best practice in the industry. It might possibly tell you something about your own code style though - are you writing units which can be treated in isolation? (Certainly not if you depend on an SL, which is a global context of services with no clear boundary)

Ideally a codebase should be designed to maximize unit-testability and reduce the need for integration testing to as little as possible - since this is where most of the "unexpected", or "out of my control" problems are most likely to occur. This testing is more a case of "am I handling all the relevant exceptions" than getting green lights to pass in a unit testing framework. It doesn't really help to make unit tests against code which is expected to fail out in the wild due to whatever circumstance - what matters here is that your code is prepared for the worst and knows how to recover.

It's these cases where mock classes are particularly useful - because you can forcefully simulate any behavior from the external service and make sure your code is working correctly for all the potential circumstances. Having to rely on divine intervention to trigger some event that may only happen 1% of the time in the real-world situation is hardly practical. Unfortunately testing in the wild is often like this - everything works fine 99% of the time.
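For instance, with Python's standard `unittest.mock`, a rough sketch. The `fetch_salary_or_default` function and the `get_salary` method on the client are hypothetical names; `Mock`, `side_effect` and `return_value` are the real standard-library pieces that let you force the rare failure on demand.

```python
from unittest.mock import Mock


def fetch_salary_or_default(client, employee_id, default=0):
    # `client.get_salary` stands for a hypothetical external-service call.
    try:
        return client.get_salary(employee_id)
    except ConnectionError:
        # Recovery path that rarely triggers against the real service.
        return default


# A client that always fails, so the recovery path runs every time.
failing_client = Mock()
failing_client.get_salary.side_effect = ConnectionError("service unreachable")

# A client that behaves normally, for the happy path.
working_client = Mock()
working_client.get_salary.return_value = 4200
```

Calling `fetch_salary_or_default(failing_client, 7, default=-1)` exercises the error handling without waiting for the real service to fall over.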

Even for cases where you're arguing for a fixture with real test data in it (from a database), the reasonable thing to do is extract this data beforehand and encode it into the unit testing language (which is fairly trivial to do). Now you have a reliable test which will continue to work as you update the code. Testing against live data gives a false sense of security to begin with anyway. Imagine the scenario where you have a bunch of data in the database, you run your unit test against it with all green flags - then after deployment, somebody inserts into the database a value which your code doesn't expect. The unit test shouldn't be testing against real world data, but against data representative of the possible values it should accept (i.e., include all the obvious edge cases which should fail too, but are not likely to exist in the real world DB).
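A small sketch of what "representative data" could look like when encoded in the test itself. The `parse_salary` validator, its rules and the two value lists are assumptions made up for this example; the idea is only that the values which should be rejected sit next to the ones that should pass, whether or not they currently exist in the real DB.

```python
import math


def parse_salary(raw):
    # Hypothetical validator: accepts finite, non-negative numbers only.
    value = float(raw)  # raises ValueError for non-numeric input
    if math.isnan(value) or math.isinf(value):
        raise ValueError("salary must be finite")
    if value < 0:
        raise ValueError("salary must be non-negative")
    return value


def is_rejected(raw):
    # Helper for tests: True if parse_salary refuses the input.
    try:
        parse_salary(raw)
    except ValueError:
        return True
    return False


# Values the code should accept, and edge cases it should refuse.
ACCEPTED = ["0", "1000", "99999.99"]
REJECTED = ["-1", "nan", "inf", "abc", ""]
```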