September 2014

Volume 29 Number 9

The Working Programmer : Fun with C#, Part 2

Ted Neward | September 2014

Ted NewardWelcome back. In my last column, “Fun with C#” (msdn.microsoft.com/magazine/dn754595), I talked briefly about how familiarity with other programming languages can help clarify your thinking around a design problem that might otherwise seem intractable. I introduced a problem I encountered years ago during a consulting engagement in which I was required to reconcile a list of transactions stored locally with a supposedly equal list of transactions stored remotely. I had to either match them by transaction amount—nothing else was guaranteed to match for even the exact same transaction—or flag the unmatched items in the resulting list.

I chose to use F# for that exercise because it’s a language with which I’m familiar. Quite frankly, it could easily have been another language like Scala, Clojure or Haskell. Any functional language would have worked in a similar fashion. The key here isn’t the language itself or the platform on which it ran, but the concepts involved in a functional language. This was a pretty functional-friendly problem.

The F# Solution

Just to revisit, look at the F# solution in Figure 1 before looking at how it would translate into C#.

Look at the previous column to see a brief recap of the F# syntax used in Figure 1, especially if you aren’t familiar with F#. I’m also going to be going over it as I transform it into C#, so you could also just dive in.

Figure 1 The F# Solution for Resolving Disparate Transactions

type Transaction =
  {
    amount : float32;
    date : DateTime;
    comment : string
  }
type Register =
  | RegEntry of Transaction * Transaction
  | MissingRemote of Transaction
  | MissingLocal of Transaction
let reconcile (local : Transaction list) 
  (remote : Transaction list) : Register list =
  let rec reconcileInternal outputSoFar local remote =
    match (local, remote) with
    | [], _
    | _, [] -> outputSoFar
    | loc :: locTail, rem :: remTail ->
      match (loc.amount, rem.amount) with
      | (locAmt, remAmt) when locAmt = remAmt ->
        reconcileInternal (RegEntry(loc, rem) :: 
          outputSoFar) locTail remTail
      | (locAmt, remAmt) when locAmt < remAmt ->
        reconcileInternal (MissingRemote(loc) :: 
          outputSoFar) locTail remote
      | (locAmt, remAmt) when locAmt > remAmt ->
        reconcileInternal (MissingLocal(rem) :: 
          outputSoFar) local remTail
      | _ ->
        failwith "How is this possible?"
  reconcileInternal [] local remote

The C# Solution

At the starting point, you need the Transaction and Register types. The Transaction type is easy. It’s a simple structural type with three named elements, making it easy to model as a C# class:

class Transaction
{
  public float Amount { get; set; }
  public DateTime Date { get; set; }
  public String Comment { get; set; }
}

Those automatic properties make this class almost as short as its F# cousin. In all honesty, if I was going to really make it a one-to-one translation of what the F# version does, I should introduce overridden Equals and GetHashCode methods. For the purpose of this column, though, this will work.

Things get tricky with the discriminated union Register type. Like an enumeration in C#, an instance of a Register type can only be one of three possible values (RegEntry, MissingLocal or Missing­Remote). Unlike a C# enumeration, each of those values in turn can contain data (the two Transactions that matched for RegEntry, or the missing Transaction for either MissingLocal or Missing­Remote). While it would be easy to create three distinct classes in C#, these three classes must be somehow related. We need a List that can contain any of the three—but only those three—for the returned output, as shown in Figure 2. Hello, inheritance.

Figure 2 Use Inheritance to Include Three Distinct Classes

class Register { }
  class RegEntry : Register
  {
    public Transaction Local { get; set; }
    public Transaction Remote { get; set; }
  }
  class MissingLocal : Register
  {
    public Transaction Transaction { get; set; }
  }
  class MissingRemote : Register
  {
    public Transaction Transaction { get; set; }
  }

It’s not ridiculously complicated, just more verbose. And if this were production-facing code, there are a few more methods I should add—Equals, GetHashCode and, almost certainly, ToString. Although there might be a few ways to make it more idiomatically C#, I’ll write the Reconcile method fairly close to its F# inspiration. I’ll look for idiomatic optimizations later.

The F# version has an “outer,” publicly accessible function recur­sively called into an inner, encapsulated function. However, C# has no concept of nested methods. The closest I can approximate is with two methods—one declared public and one private. Even then, this isn’t quite the same. In the F# version, the nested function is encapsulated from everyone, even other functions in the same module. But it’s the best we can get, as you can see in Figure 3.

Figure 3 The Nested Function Is Encapsulated Here

class Program
{
  static List<Register> ReconcileInternal(List<Register> Output,
             List<Transaction> local,
             List<Transaction> remote)
  {
    // . . .
  }
  static List<Register> Reconcile(List<Transaction> local,
             List<Transaction> remote)
  {
    return ReconcileInternal(new List<Register>(), local, remote);
  }
}

As a side note, I can now accomplish the “hidden from everybody” recursive approach by writing the internal function entirely as a lambda expression referenced as a local variable inside Reconcile. That said, that’s probably a little too much slavish adherence to the original and entirely not idiomatic to C#.

It isn’t something most C# developers would do, but it would have almost the same effect as the F# version. Inside of Reconcile­Internal, I have to explicitly extract data elements used. Then I explicitly write it in an if/else-if tree, as opposed to the more succinct and terse F# pattern match. However, it’s really the exact same code. If either the local or remote list is empty, I’m done recursing. Just return the output and call it a day, like so:

static List<Register> ReconcileInternal(List<Register> Output,
              List<Transaction> local,
              List<Transaction> remote)
{
  if (local.Count == 0)
    return Output;
  if (remote.Count == 0)
    return Output;

Then, I need to extract the “heads” of each list. I also need to keep a reference to the remainder of each list (the “tail”):

Transaction loc = local.First();
List<Transaction> locTail = local.GetRange(1, local.Count - 1);
Transaction rem = remote.First();
List<Transaction> remTail = remote.GetRange(1, remote.Count - 1);

This is one place where I can introduce a huge performance hit if I’m not careful. Lists are immutable in F#, so taking the tail of a list is simply taking a reference to the second item in the list. No copies are made.

C#, however, has no such guarantees. That means I could end up making complete copies of the list each time. The GetRange method says it makes “shallow copies,” meaning it will create a new List. However, it will point to the original Transaction elements. This is probably the best I can hope for without getting too exotic. Having said that, if the code becomes a bottleneck, get as exotic as necessary.

Looking again at the F# version, what I’m really examining in the second pattern match is the amounts in the local and remote transaction, as shown in Figure 4. So I extract those values as well, and start comparing them.

Figure 4 The F# Version Examines the Local and Remote Amounts

float locAmt = loc.Amount;
  float remAmt = rem.Amount;
  if (locAmt == remAmt)
  {
    Output.Add(new RegEntry() { Local = loc, Remote = rem });
    return ReconcileInternal(Output, locTail, remTail);
  }
  else if (locAmt < remAmt)
  {
    Output.Add(new MissingRemote() { Transaction = loc });
    return ReconcileInternal(Output, locTail, remote);
  }
  else if (locAmt > remAmt)
  {
    Output.Add(new MissingLocal() { Transaction = rem });
    return ReconcileInternal(Output, local, remTail);
  }
  else
    throw new Exception("How is this possible?");
}

Each branch of the tree is pretty easy to understand at this point. I add the new element to the Output list, then recurse with the unprocessed elements of the local and remote lists.

Wrapping Up

If the C# solution is really this elegant, why bother taking the stop through F# in the first place? It’s hard to explain unless you go through the same process. Essentially, I needed the stop through F# to flesh out the algorithm in the first place. My first attempt at this was an absolute disaster. I started by iterating through the two lists using double “foreach” loops. I was trying to track the state along the way, and I ended up with a huge, steaming mess I would never have been able to debug in a million years.

Learning how to “think differently” (to borrow a famous computer company’s marketing line from a few decades ago) yields results, not the choice of language itself. I could’ve easily told this story going through Scala, Haskell or Clojure. The point wasn’t the language feature set, but the concepts behind most functional languages—recursion, in particular. That’s what helped break through the mental logjam.

This is part of the reason developers should learn a new programming language every year, as first suggested by one of the Pragmatic Programmers, Dave Thomas, of Ruby fame. Your mind can’t help but be exposed to new ideas and new options. Similar kinds of ideas emerge when a programmer spends some time with Scheme, Lisp or with a stack-based language such as Forth—or with a prototype-based language like Io.

If you’d like a lightweight introduction to a number of different languages all off the Microsoft .NET Framework platform, I highly recommend Bruce Tate’s book, “Seven Languages in Seven Weeks” (Pragmatic Bookshelf, 2010). Some of them you wouldn’t use directly on the .NET platform. Then again, sometimes the win is in how we think about the problem and frame the solution, not necessarily reusable code. Happy coding!


Ted Neward is the CTO at iTrellis, a consulting services company. He has written more than 100 articles and authored and coauthored a dozen books, including “Professional F# 2.0” (Wrox, 2010). He’s a C# MVP and speaks at conferences around the world. He consults and mentors regularly—reach him at ted@tedneward.com or ted@itrellis.com if you’re interested in having him come work with your team, and read his blog at blogs.tedneward.com.

Thanks to the following Microsoft technical expert for reviewing this article: Lincoln Atkinson