Unit Testing Unordered Lists

Background

If you’ve read just about any of my earlier posts, you know that I write tests for Pulsar. This includes unit tests, functional tests, and performance tests, all in addition to the demo / visual tester applications that I usually post my screenshots from. The details on the structure to my codebase are probably worth outlining in a future post, but in this post now I’m going to focus in on the unit tests.

Implementation Note: I use UnitTest++ as my test harness. It is simple, lightweight, and quick to integrate into a codebase. If you don’t have a testing framework setup for your codebase yet, I highly recommend UnitTest++.

Unit Testing Unordered Lists

Every once in a while (primarily in my Geometry / Collision Detection library), I have run into the need for unordered lists in my test suite. In fact, it has come up often enough that I decided to share some thoughts and my solution here.

Although I’m going to explain this issue through a somewhat contrived example, I chose it because the premise is easy to understand and visualize; certainly this could be extended to other, more complex cases.

Let’s say you have the following Rectangle class that represents an 2D axis-aligned box:

class Rectangle
{
public:
   Vector2 mMin;
   Vector2 mMax;
};

Next, say we are writing a ComputeCorners() member function. As you probably guessed, this routine needs to return four Vector2s that are the coordinates of each of the corners on the Rectangle. This can be done in one of the following two flavors:

  • add a struct with four Vector2s, each with sensible member names
  • use an array: Vector2[4]

If your calling code requires explicit knowledge that maps each returned point to its corresponding corner on the Rectangle, you would probably choose to use the first option.

However, let’s assume that all we actually need are the points themselves and we don’t really care which point is which (maybe we’re just going to insert them into some larger list of points that will be processed, or whatever). Using the array approach, our test code might look something like this:

TEST(CanComputeCorners)
{
   Rectangle rect;
   rect.mMin.Set(1.2f, 1.4f);
   rect.mMax.Set(2.5f, 1.7f);
  
   Vector2 corners[4];
   rect.ComputeCorners(corners);
  
   CHECK_EQUAL(Vector2(1.2f, 1.4f), corners[0]);
   CHECK_EQUAL(Vector2(2.5f, 1.4f), corners[1]);
   CHECK_EQUAL(Vector2(1.2f, 1.7f), corners[2]);
   CHECK_EQUAL(Vector2(2.5f, 1.7f), corners[3]);
}

Without sweating the details of testing against an epsilon, this test looks pretty good at first glance, and it probably succeeds without trouble.

However, there is a subtle problem here: our test is assuming that there is a specific order to the points being returned. Nothing in our function dictates that we absolutely have to return the points in that order, but because of the way we have written our test any change to the order of the returned points in the implementation will result in a failing test.

So how exactly can we test this routine (and others like it) properly? In other words, how can we write our test to be order-independent?

My solution (as straightforward as it may be) was to implement a testing helper routine:

bool ArrayContainsItem(
   const Vector2* itemArray, const uint itemCount,
   const Vector2& itemToFind)
{
   for(uint idx = 0; idx < itemCount; ++idx)
   {
      const Vector2& curItem = itemArray[idx];
      if(curItem == itemToFind)
         return true;
   }
  
   return false;
}

TEST(CanComputeCorners)
{
   Rectangle rect;
   rect.mMin.Set(1.2f, 1.4f);
   rect.mMax.Set(2.5f, 1.7f);
  
   Vector2 corners[4];
   rect.ComputeCorners(corners);
  
   CHECK(ArrayContainsItem(corners, 4, Vector2(1.2f, 1.4f));
   CHECK(ArrayContainsItem(corners, 4, Vector2(2.5f, 1.4f));
   CHECK(ArrayContainsItem(corners, 4, Vector2(1.2f, 1.7f));
   CHECK(ArrayContainsItem(corners, 4, Vector2(2.5f, 1.7f));
}

Notice that this routine is intended for the purposes of simplifying the test and resides with the testing code, not packaged up with the library itself. It is perfectly acceptable to have to perform additional setup (mock objects, etc.) or implement a helper or two for testing.

Also, you could certainly generalize this unit test helper routine a little so it can be used in other unordered list instances -- I've found many uses for my own implementation while testing Pulsar libraries.

Final Thoughts

The real take away from this actually isn't about how to test an unordered set of objects; in fact, I think the lesson is much bigger and important: be careful to write tests for the functionality, not for the implementation.

This key point to this lesson is subtle and sometimes easy to miss / hard to catch.

To be clear, functionality is what the routine does that can be inspected from the outside view of the object, or, in the case of a non-member ("free") function, from the return value. On the other hand, implementation is much more tightly bound to the code itself (the internal process of a routine) and how it operates on an object or data.

In the example case above, we saw that the interface did not specify an ordering for the list of returned corner coordinates, yet our initial test definitively expected an ordering which happened to be based on our knowledge of the implementation.

I'll be the first to admit that it isn't always clear when implementation details have crept into the tests -- this has happened to me a lot more often than I care to reveal (especially while practicing Test-Driven Development).

In closing, I believe that the first line of defense against this problem is to simply be aware that it can show up when testing container return values. As a result, I encourage you to take some additional time to make sure that the tests really are testing the functionality and not the implementation.