Modifying tests after refactoring

This post is a little long, but ultimately what I would love to hear opinions on could be summarized as:

When refactoring leads to moving logic out of the system under test to another object, what do you do with the “old” tests?


I recently ran into a situation where a refactoring moved a block of code from the system under test to another object. This code contained a little bit of logic that, at the time the class was written, seemed specific to the class, but after another feature required very similar logic, it was moved in order to be shared. The refactor meant creating the new object with the existing collaborator using dependency injection, and all tests were green and everything seemed great.

Here is an attempt at a contrived example of what I’m describing - please disregard that it’s a bit silly and unrealistic. Pretend there are two third party APIs that is not under our control and we will use doubles for. A messages API that only has a method to get all messages and does not provide any filtering. Our first feature asks that send notifications of unread messages. Which we implement like so:

Original Test

it 'notifies of messages that are currently unread' do
  unread_message = Message.new(status: :unread)
  read_message = Message.new(status: :read)
  message_api = stub('message_api', messages: [unread_message, read_message])

  notifier = mock('notifier')
  expect(notifier).to receive(:notify).with([unread_message])
  
  ClassUnderTest.new(message_api, notifier).notify_of_unread_messages
end

Original Class

def initialize(message_api, notifier)
  @message_api = message_api
  @notifier = notifier
end

def notify_of_unread_messages
  @notifier.notify(unread_messages)
end

private

def unread_messages
  @message_api.messages.select { |m| m.status == :unread }
end

A feature later in the project also needed to use the concept of what was previously a private method. For this example, let’s say we now want a dashboard that shows unread messages. This, in addition to some other things, meant that introducing a facade on top of the API would be helpful. So the class was modified as such (ignore names - still just an example):

Class after refactoring

def initialize(message_api, notifier)
  @message_api_facade = MessageApiFacade.new(message_api)
  @notifier = notifier
end

def notify_of_unread_messages
  @notifier.notify(@message_api_facade.unread_messages)
end

Everything is still green. The private method is gone, and now this class is only concerned with the act of getting the right messages and sending the notification. I would count this as a good refactoring, but I have a few problems now.

  • Since we wrote tests for the newly introduced class (MessageApiFacade in this case), tests for the logic that it encapsulates are duplicated (the original class’ tests and the new ones)
  • When writing future tests we would want to stub the facade, but cannot do that at the moment since we create it in the initializer.
  • New developers may be confused when coming into the test class because of setup of objects/properties that we don’t directly care about

What I ended up doing in this instance is: committing and having the CI server build it to show that everything was green after the refactor. Then doing the following modification in a separate commit in an attempt to address the above three points.

New Test

it 'notifies of messages that are currently unread' do
  unread_messages = stub('unread_messages')
  message_api_facade = stub('facade', unread_messages: unread_messages)

  notifier = mock('notifier')
  expect(notifier).to receive(:notify).with(unread_messages)
  
  ClassUnderTest.new(message_api_facade, notifier).notify_of_unread_messages
end

New Class

def initialize(message_api_facade, notifier)
  @message_api_facade = message_api_facade
  @notifier = notifier
end

def notify_of_unread_messages
  @notifier.notify(@message_api_facade.unread_messages)
end

Now this particular test only tests that the interactions work as expected, it does not have any knowledge of what happens to determine whether a message is “unread”.

I did not feel good about this change, though, because I am simultaneously altering test and production code, and I’m not sure that I can still “trust my tests” since the new tests did not come from TDD, but was rather altered after the production code was written (or, in this case, refactored).

Obviously this type of refactoring is very common. What would be the preferred way of handling it from a testing perspective?

I like to test classes that an outsider can operate on, and not those that are implementation details.

In your example you have a test for the method notify_of_unread_messages which uses a private method, that you don’t test because it’s an implementation detail. When you extract that method into a different class, if that class has no interaction with the outside world (through an API, GUI, CLI, etc…), it is just used by your application’s classes, I’d say it is still an implementation detail, and you should not test it.

As for the second class that uses it, just test its behavior as you would normally do.

This is a concept that’s easier to grasp if you think of the unit of unit testing not as a method or a class, but as a certain behavior.

There’s a pretty nice talk about this by Ian Cooper: Ian Cooper: TDD, where did it all go wrong on Vimeo

1 Like

I definitely agree with your overall point - only testing the interactions that are available to the public. I’ll check out that video.

In this case, the extraction does include interaction with the outside world (via API), so I don’t feel that it is purely an implementation detail not to be tested, and that is the crux of the problem.

I came across a blog post yesterday morning from Justin Searls that was perfectly timed and essentially discusses the exact inner struggle that I am facing. The Failures of "Intro to TDD"

I think it was a good read and would recommend checking out the full text, but I’ll try to summarize what I felt were the key points:

  • Leaving redundant tests can cause false negatives in the parent tests. In my example, if the property value for unread messages changed from :unread to :ur, for example, suddenly tests are failing in multiple places instead of being isolated to the class that should own that concern. When introducing a new unit like this, both it and the existing classes should essentially be “re-TDD’d” and allow the higher level integration tests to ensure that the units work correctly together.

  • The reverse argument to the previous point is that “changing the test removes regression value” and that it now “doesn’t really verify anything”. I have heard TDD purists respond that the point of the unit level test is not for the regression value, but in order to derive an optimal design in the first place, which leads into the next point.

  • This test originally specified two things - interaction with an API in addition to logical behavior carried out on the results from those interactions. Juggling these two concerns makes tests difficult to read and change. Ideally this pain would have been felt during the TDD cycle and drawn attention to the fact that the private method really belonged somewhere else to begin with.

So I think that, if faced with my scenario, Justin’s advice would be to essentially do what I have done here, but to recognize that perhaps my approach to the intial testing should be altered to catch these before requiring the refactor. He describes his approach to TDD as being heavy in “reductionism” and relying heavily on the concepts put forth by Growing Object Oriented Software. This also seems very similar to Gary Bernhardt’s Functional Core, Imperative Shell. I’m interested to try implementing a feature completely in this style.

I’d love to get feedback on these ideas or thoughts on that blog post. Also: I’ll introduce a new question:

Is interacting with a collaborator as well as directly executing logical code “too much” for a class? Would a design where classes either organize messages between collaborators or encapsulate logic ultimately be easier to reason about and change, or is that too extreme?

@wpgreenway You might want to take a look at this article I wrote, as you seem to be interested in the topic.

I think the approach I follow in my article (largely based on Uncle Bob’s, I must confess), answers your new question. Do you agree?