## Maximum Weight Bipartite Matching implementation in Java

It’s been too long, but I’ve finally contributed to an open source project again. A number of years ago I implemented a polynomial-time algorithm for find the maximum weight matching of a bipartite graph, based on the Hungarian algorithm.

Well, the code wasn’t being used anymore, and my employer was kind enough to allow me to open source it. I ended up adding it to the jgrapht project. Feels good to be contributing to open source again. The jgrapht project is a very capable theoretical graph library, with a lot of handy algorithms, including some for finding spanning trees and shortest paths.

The jgrapht project is on github now here.

## Unit testing – Build up vs tear down approaches

Suppose we had to unit test the following method:

```public boolean result() {
return a() && b() && c();
}
```

There are a number of different approaches that could be used to test this method. In this case, there are 2^3 = 8 possible test cases, as each of a(),b(),c() could return either true or false:

```a | b | c | result
------------------
f   f   f   f
f   f   t   f
f   t   f   f
f   t   t   f
t   f   f   f
t   f   t   f
t   t   f   f
t   t   t   t
```

For a method like this, it’s not too much work to write a test case for all 8 possibilities. In real life though, we simply can’t test everything. We need to focus our testing to get the best coverage and catch the most likely errors.

Suppose we can’t test all 8 test cases. How should we test the method? What test cases should we focus on? One approach I often see is what I call a “build up” approach:

```@Test
public test_result() {
//assume a,b,c default to false
assertFalse(result());
setA(true);
assertFalse(result());
setB(true);
assertFalse(result());
setC(true);
assertTrue(result());
}
```

I classify this as a “build up” approach because the test case builds the required conditions, one by one, step by step, and tests the method during the build up until the conditions are all met.

The problem is, I don’t think this is great test coverage. To see why, think about the whole purpose of the method: return true when a(),b(),and c() are all true, and false otherwise. With the tests above, how much confidence should a tester have that all three conditions a(),b(),and c() are required for result() to return true? All we really know is that setting c() to true caused result() to return true. We don’t have any insight if c() by itself was the sufficient condition, or if all of a(),b(),and c() were.

```a | b | c | result
------------------
f   f   f   f *
f   f   t   f
f   t   f   f
f   t   t   f
t   f   f   f *
t   f   t   f
t   t   f   f *
t   t   t   t *

* = test cases covered
```

See how column c and the result column are identical for the test cases we covered? The method could be re-written as the following and still pass all tests:

```public boolean result() {
//this still passes the 4 test cases above
return c();
}
```

We want to avoid the direct correlation between the result and one of the conditions.

My preferred approach is the “tear down” method. Instead of building up the conditions one by one, I instead build up all the conditions at the beginning, and then tear down each condition individually, and ensure I get the desired result:

```@Test
public void test_result() {
setA(true);
setB(true);
setC(true);
assertTrue(result());

//now take each condition away individually and test
setA(false);
assertFalse(result());

setA(true);
setB(false);
assertFalse(result());

setB(true);
setC(false);
assertFalse(result());
}
```

You can see with this approach, we still have the same number of tests (4), but we cover the arguably more important test cases:

```a | b | c | result
------------------
f   f   f   f
f   f   t   f
f   t   f   f
f   t   t   f *
t   f   f   f
t   f   t   f *
t   t   f   f *
t   t   t   t *

* = test cases covered
```

There is now no direct correlation between and single condition (either a(), b() or c()) and the result. It would be much more difficult to re-write the result() method and have it pass the four test cases above. This is what we want; a unit test that documents a method’s intentions, and makes it difficult to make a mistake during any future changes, refactoring, or re-implementation.