Experiments in public: Mocking snakes

Many years ago, I was tasked with improving the performance of a suite of unit tests. They were taking ever longer to run, and were beyond 20 minutes when I started working with them. Needless to say, this meant people rarely ran them.

From Miško Hevery:

What I want is for my IDE to run my tests every time I save the code. To do this my tests need to be fast, because my patience after hitting Cntl-S is about two seconds. Anything longer than that and I will get annoyed. If you start running your tests after every save from test zero you will automatically make sure that your test will never become slow, since as soon as your tests start to run slow you will be forced to refactor your tests to make them faster.

The problem was that every test was descended from a common base class, and that class brought up fake versions of most of the application. Well, mostly fake versions. There was still a lot of I/O and network activity involved in bringing up these fakes.

The solution turned out to be mock objects, JMock in this particular case. For those unfamiliar, mock objects are objects that can stand in for your dependencies, and can be programmed to respond in the particular manner necessary for whatever it is you are testing. So if your network client is supposed to return "oops" every time the network connection times out, you can use a mock to stand in for the network connection, rather than relying on lady fortune to drop the network for you (or doing something terrible, like having code in your test that disables your network interface).

There are a couple of general drawbacks to using mock objects, but the primary one is that a mock object only knows what you tell it. If the interface of your dependencies change, your mock object will not know this, and your tests will continue to pass. This is why it is key to have higher level tests, run less frequently, that exercise the actual interfaces between objects, not just the interfaces you have trained your mocks to have.

The other drawbacks have more to do with verbosity and code structure than anything else. In order for a mock to be useful, you need a way to tell your code under test what dependency it is standing in for. In my code, this tends to lead to far more verbose constructors, that detail every dependency of the object. But there are other mechanisms, which I will explore here.

For a more verbose comparison of mock libraries in a variety of use cases, check this out:

http://garybernhardt.github.com/python-mock-comparison/

Hopefully this post will be a more opinionated supplement to that.

There are a couple of categories of things to mock:

Unreliable dependencies (network, file system)
Inconsistent dependencies (time-dependent functionality)
Performance-impacting dependencies (pickling, hashing functions, perhaps)
Calls to the object under test

The last item is certainly not a necessity to mock, but it does come in handy when testing an object with a bunch of methods that call each other. I'll refer to it as "partial mocking" here.

For this article, I'm going to focus on 4 mock object libraries, Mocker, Flexmock, and Fudge, chosen primarily because they are the ones I have experience with. I also added in Mock, but I don't have much experience with it yet. I believe, from my more limited experience with other libraries, that these provide a decent representation of different approaches to mocking challenges.

I'm going to go through common use cases, how each library handles them, and my comments on that. One important note is that I generally don't (and won't here) differentiate between mocks, stubs, spies, etc.

Getting a mock

Gist: "Getting mock objects from different libraries"

	"""Mocker:

	Mocks are obtained from a mocker instance, which is created
	automatically if you create a test case that inherits from
	MockerTestCase.
	"""
	from mocker import Mocker, MockerTestCase

	class TestCaseForMocker(MockerTestCase):

	def test_something(self):
	a_mock = self.mocker.mock()

	def mocker_test():
	mocker = Mocker()
	a_mock = mocker.mock()

	"""Flexmock:

	Mocks are obtained by calling flexmock(), which can take a
	variety of arguments to further detail their behaviour."""

	from flexmock import flexmock

	def flexmock_test():
	a_mock = flexmock()

	"""Fudge:

	There are two ways mocks are obtained in Fudge. The neater
	way, which I'll call the "implicit" way, is with the
	fudge.patch decorator, which performs both creation of the mock,
	and substituting it into the namespace of the test."""

	from unittest import TestCase

	import fudge

	class TestCaseForFudge(TestCase):

	@fudge.patch("time.sleep")
	def test_something(self, a_mock):
	pass

	#Then there's the explicit way:
	def fudge_test():
	a_mock = fudge.fake()

	"""Mock:

	mock.Mock() gives you a mock.
	"""

	import mock

	def mock_test():
	a_mock = mock.Mock()

view raw gistfile1.py hosted with ❤ by GitHub

Dependencies are usually injected in the constructor, in a form like the following:
Gist "Verbose dependency specification for mocking"

	import time

	def __init__(self, time_mod=time):
	self.time = time_mod

	def some_method(self):
	self.time.sleep(10)

view raw gistfile1.py hosted with ❤ by GitHub

This is verbose, especially as we build real objects, which tend to have many dependencies, once you start to consider standard library modules as dependencies. :)

NOTE: Not all standard library modules need to be mocked out. Things like os.path.join or date formatting operations are entirely self contained, and shouldn't introduce significant performance penalties. As such, I tend not to mock them out. That does introduce the unfortunate situation where I will have a call to a mocked out os.path on one line, and call to the real os.path on the next:
Gist: "Confusion when not everything is mocked"

	def delete_file(self, name):
	path = os.path.join(BASE_DIR, name)
	if self.os.path.exists(path):
	os.remove(path)

view raw gistfile1.py hosted with ❤ by GitHub

This can certainly be a bit confusing at times, but I don't yet have a better solution.

However, it is quite explicit, and avoids the need for a dependency injection framework. Not that there's anything wrong with using such a framework, but doing so steepens the learning curve for your code.

Verifying Expectations

One key aspect of using mock objects is ensuring that they are called in the ways you expect. Understanding how to use this functionality can make test driven development very straightforward, because by understanding how your object will need to work with it's dependencies, you can be sure that the interface you are implementing on those dependencies reflects the reality of how it will be used. For this and more, read Mock Roles Not Objects>, by Steve Freeman and Nat Pryce.

...anyway, verification takes different forms across libraries.
Gist: "Verifying mock expectations"

	"""Mocker:

	Setting expectations with Mocker is as simple as calling the
	methods on the mocks, and then switching the mocker instance
	into replay mode."""

	from mocker import Mocker

	def mocker_test():
	mocker = Mocker()
	a_mock = mocker.mock()
	a_mock.do_something()
	mocker.replay()
	a_mock.do_something()
	mocker.verify() #Unnecessary within MockerTestCases

	"""FlexMock:
	Setting expectations with FlexMock is quite straightforward, as
	long as you are using a test runner that FlexMock integrates
	with, otherwise you need to verify manually."""

	from flexmock import flexmock

	def flexmock_test():
	a_mock = flexmock()
	expectation = a_mock.should_receive("do_something").once
	#once is necessary to generate an expectation
	a_mock.do_something()
	expectation.verify()#Usually done automatically

	"""Fudge:

	fudge.verify() can be used to check whether mocks have been used
	as expected. When used within a decorated (@patch or @test)
	method, this is called automatically.
	"""
	import fudge

	def fudge_test():
	stub = fudge.Fake()
	stub.expects("do_something")
	stub.do_something()
	fudge.verify() #Implicit within a decorated method.

	"""Mock:

	Mock supports verifying expectations by interrogating the mocks
	afterwards. This can be rather verbose, but it does force you to
	take verification seriously, which is good."""

	import mock

	def mock_test():
	a_mock = mock.Mock()
	a_mock.do_something()
	a_mock.do_something.assert_called_once_with()

view raw gistfile1.py hosted with ❤ by GitHub

Partial Mocks

Partial mocking is a pretty useful way to ensure your methods are tested independently from each other, and while it is supported by all of the libraries tested here, some make it much easier to work with than others.
Gist "Partial mocks"

	#Need to create this because mocker.patch() and flexmock(object)
	#don't work with builtins.

	class Something(object):

	def do_something(self):
	pass

	"""Mocker:

	Mocker has two methods for partial mocking.
	mocker.proxy() will return a mock object that will forward
	unmatched calls to the original object.
	mocker.patch() is similar, but at replay() time, modifies the
	original object with whatever expectations have been set up on
	it."""

	from mocker import Mocker

	def mocker_test_proxy():
	mocker = Mocker()
	obj = Something()
	partial = mocker.proxy(obj, spec=None)
	partial.do_something()
	mocker.replay()
	partial.do_something()

	def mocker_test_patch():
	mocker = Mocker()
	obj = Something()
	partial = mocker.patch(obj, spec=None)
	partial.do_something()
	mocker.replay()
	obj.do_something()

	"""Flexmock:

	Flexmock uses flexmock(original) for partial mocking.
	If an expectation is set up that does not match an existing
	method on the original object, flexmock.MethodDoesNotExist
	is raised.
	"""

	from flexmock import flexmock

	def flexmock_test():
	obj = Something()
	flexmock(obj)
	obj.should_receive("do_something")
	obj.do_something()

	"""Fudge:

	Fudge has a bit of a roundabout approach to partial mocks.
	First you need to create a mock object, then create a
	PatchHandler with the original class and the method to mock
	and use that to patch() the mock you created. Finally you
	need to restore() the PatchHandler.

	There is also a shortcut using a context manager, which is much
	cleaner, but doesn't scale well if you need to mock out multiple
	calls."""

	import fudge
	from fudge.patcher import PatchHandler

	def fudge_test():
	a_mock = fudge.Fake().is_callable()
	patch = PatchHandler(Something, "do_something")
	patch.patch(a_mock)
	obj = Something()
	obj.do_something()
	patch.restore()

	def fudge_test_cm():
	with fudge.patched_context(Something, 'do_something',
	fudge.Fake().is_callable()):
	obj = Something()
	obj.do_something()

	"""Mock:

	You can create partial mocks with Mock by creating mocks and
	assigning them to the attributes you want to mock out. This
	seems like a fairly limited approach, because it overrides what
	was originally in that attribute."""

	import mock

	def test_mock():
	obj = Something()
	something.do_something = mock.Mock()
	obj.do_something()

view raw gistfile1.py hosted with ❤ by GitHub

Chaining Attributes and Methods

I'm of the opinion that chained attributes are generally indicative of poor separation of concerns, so I don't place too much weight on how the different libraries handle them. That said, I've certainly had need of this functionality when dealing with a settings tree, where it can be much easier to just create a mock if you need to access settings.a.b.c.

Chained methods are sometimes useful (especially if you use SQLAlchemy), as long as they don't impair readability.
Gist "Chaining methods and attributes"

	"""Mocker:

	Chaining attribute or method access in Mocker is trivial. During
	the record phase, every mock operation returns a mock."""

	from mocker import Mocker

	def mocker_test():
	mocker = Mocker()
	a_mock = mocker.mock()
	a_mock.clock().block.shock()
	mocker.replay()
	a_mock.clock().block.shock()

	"""Flexmock:

	Flexmock has a shortcut for simple chained method calls, but a
	verbose syntax for chained attributes, or complex method calls.
	"""

	from flexmock import flexmock

	def flexmock_test_attributes():
	a_mock = flexmock(clock=flexmock(block=flexmock(
	shock=lambda:None)))
	a_mock.clock.block.shock()

	def flexmock_test_methods():
	a_mock = flexmock()
	a_mock.should_receive("clock.block.shock")
	a_mock.clock().block().shock()

	def flexmock_test_methods_complex():
	a_mock = flexmock()
	a_mock.should_receive("clock").with_args("a").and_return(
	flexmock().should_receive("block").with_args("b")\
	.and_return(flexmock().should_receive("shock").mock
	).mock
	)
	a_mock.clock("a").block("b").shock()

	"""Fudge:

	Fudge has a somewhat verbose method of dealing with chained
	methods and attributes."""

	import fudge

	def fudge_test():
	a_mock = fudge.Fake()
	a_mock.provides("clock").returns_fake().has_attr(
	block=fudge.Fake().provides("shock"))
	a_mock.clock().block.shock()

	"""Mock:

	Mock provides a very concise syntax for chaining methods and
	attributes.
	"""
	import mock

	def mock_test():
	a_mock = mock.Mock()
	m_shock = a_mock.clock.return_value.block.shock
	a_mock.clock().block.shock()
	m_shock.assert_called_once_with()

view raw gistfile1.py hosted with ❤ by GitHub

Failures

An important part of any testing tool is how informative it is when things break down. I'm talking about detail of error messages, tracability, etc. There's a couple of errors I can think of that are pretty common. For brevity, I'm only going to show the actual error message, not the entire traceback.

Note: Mock is a bit of an odd duck in these cases, because it lets you do literally anything with a mock. It does have assertions you can use afterwards for most cases, but if an unexpected call is made on your mock, you will not receive any errors. There's probably a way around this.

Arguments don't match expectations, such as when we call time.sleep(4) when our expectation was set up for 6 seconds:

Mocker: MatchError: [Mocker] Unexpected expression: m_time.sleep(4)
Flexmock: InvalidMethodSignature: sleep(4)
Fudge: AssertionError: fake:time.sleep(6) was called unexpectedly with args (4)
Mock: AssertionError: Expected call: sleep(6)
Actual call: sleep(4)

When I first encountered Flexmock's InvalidMethodSignature, it threw me off. I think it could certainly be expanded upon. Otherwise, Mock and Fudge have very nice messages, and as long as you know what was supposed to happen, Mockers is perfectly sufficient.

Unexpected method called, such as when you misspell "sleep":

Mocker: MatchError: [Mocker] Unexpected expression: m_time.sloop
Flexmock: AttributeError: 'Mock' object has no attribute 'sloop'
Fudge (patched time.sleep): AttributeError: 'module' object has no attribute 'sloop'
Fudge: AttributeError: fake:unnamed object does not allow call or attribute 'sloop' (maybe you want Fake.is_a_stub() ?)
Mock: AssertionError: Expected call: sleep(6)
Not called

Mock doesn't tell you that an unexpected method was called. Mocker has what I consider the best implementation here, because it names the mock the call was made on. The second Fudge variant is good, but because you might encounter it or the first variant depending on context, Fudge overall is my least favourite for this. Flexmock simply defers handling this to Python.

Expected method not called:

Mocker: AssertionError: [Mocker] Unmet expectations:
=> m_time.sleep(6)
- Performed fewer times than expected.
Flexmock: MethodNotCalled: sleep(6) expected to be called 1 times, called 0 times
Fudge: AssertionError: fake:time.sleep(6) was not called
Mock: AssertionError: Expected call: sleep(6)
Not called

I think they all do pretty well for this case, which is good, because it's probably the most fundamental.

Roundup

So, having spent a bit of time with all of these libraries, how do I feel about them? Let's bullet point it!

Mocker

Pros

Very explicit syntax
Verbose error messages
Very flexible

Cons

Doesn't support Python 3 and not under active development
Performance sometimes isn't very good, especially with patch()
Quite verbose

Flexmock

Pros

Clean, readable syntax for most operations

Cons

Syntax for chained methods can be very complex
Error messages could be improved

Fudge

Pros

Using @patch is really nice, syntactically
Examples showing web app testing is nice touch

Cons

@patch can interfere with test runner operations (because it affects the entire interpreter?)
Partial mocking is difficult

Mock (preliminary)

Pros

Very flexible

Cons

Almost too flexible. All-accepting mocks make it easy to think you have better coverage then you do (so use coverage.py!)

Acknowledgements

Clearly, a lot of work has been put into these mock libraries and others. So I would like extend some thanks:

Gustavo Niemeyer, for his work on Mocker.
Kumar MacMillan, for his work on Fudge, and for helping me in preparing material for this post.
Herman Sheremetyev, for his work on Flexmock
Michael Foord, for his work on Mock, and for getting me on Planet Python

Additionally, while I didn't work with them for this post, there are a number of other mock libraries worth looking at:

Dingus
Mox
MiniMock, which I've used quite a bit in the past, and I'm delighted to learn that development is continuing on it!

4 comments:

Alec MunroSeptember 8, 2011 at 11:36 AM
Well, it's finally up. This ended up being a much bigger undertaking then I had expected, but it still feels incomplete.

I'm very interested in any feedback, to either expand this post, or perhaps create a follow-up.
Alec MunroSeptember 11, 2011 at 6:29 PM
Thanks to a quick chat with Michael Foord, I realized that my link, and some of my understanding, for Mock, was pointing to a different project.

I've updated the link, and removed some of the inaccurate commentary, but the new documentation he pointed me to has quite a lot to digest, so I'll probably have to update this again in a couple of days with further conclusions.
Ross ReedstromSeptember 12, 2011 at 8:42 AM
tldr; ;-) Yet, but I will! Thanks for putting this up: I've been planning to get around to investigating mocking in python for some time - your post looks like a good general overview and jumping off point.
Alec MunroSeptember 13, 2011 at 6:11 AM
As it turns out, FlexMock does indeed verify expectations automatically, as long as it can integrate with your test runner. Unfortunately, I've been using PyDev for development, and it uses Exceptions for flow control at a certain point, which ends up leaving an exception in sys.exc_info, which FlexMock interprets as a reason not to do the verifying.
I've filed a bug here:
http://sourceforge.net/tracker/?func=detail&aid=3408057&group_id=85796&atid=577329
and hopefully I'll be able to submit a patch shortly.

Thursday, September 8, 2011

Mocking snakes

Getting a mock

Verifying Expectations

Partial Mocks

Chaining Attributes and Methods

Failures

Roundup

Mocker

Flexmock

Fudge

Mock (preliminary)

Acknowledgements

4 comments: