Overview: Understanding Functional Types

Visual Studio 2010

Applies to: Functional Programming

Published: January 2010

Authors: Tomas Petricek and Jon Skeet

Referenced Image

Get this book in Print, PDF, ePub and Kindle at manning.com. Use code “MSDN37b” to save 37%.

Summary: F# is a statically typed language, where every expression has a type. This overview explains why types don't have to be written explicitly in F# and also introduces the most important types used in functional programming.

This topic contains the following sections.

This article is an excerpt from Real World Functional Programming: With Examples in F# and C# by Tomas Petricek with Jon Skeet from Manning Publications (ISBN 9781933988924, copyright Manning Publications 2009, all rights reserved). No part of these chapters may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, electrostatic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.

In statically typed languages such as C#, Java, or F#, every expression has a type. In functional languages, everything is written as an expression (for more information about this topic, see Overview: Programming with Expressions). Therefore, saying that every expression has a type is a very strong statement. It means that any syntactically correct piece of F# code has some type. The type says what kind of results the expression can produce, so it is valuable information about the code.

This overview discusses two aspects of F# types:

  • In C#, types are specified explicitly most of the time. When writing a method, the developer has to provide the types of its parameters and the return type. In F#, writing types is optional. F# uses a mechanism called type inference to deduce the types automatically at compile-time.

  • F# programs work with primitive types (like integer, string, or floating point number), which are the same as in any other language. They also work with types composed from these primitive types. Functional languages provide richer ways for creating compound types, and this article discusses some of the interesting types.

Many interesting programs and code snippets in functional languages are written by combining functions. For example, a certain list processing can be implemented by taking a function that filters a list and give it a function that specifies the filtering condition as an argument. One way to look at types is that they serve as grammar rules for composing functions and other primitives.

The type of a function specifies how it can be used and combined with other functions. This not only verifies that we use functions in a correct way, but it is also a useful guide when writing code. For reusable functions that work with some data structures (such as lists), the type often gives enough information to implement the function.

The next section introduces the first topic of this overview—type inference. This can be demonstrated using C#, but F# takes the idea further.

When most of the types have a simple name such as int or Random, there’s only a small need for type inference. Writing the type names by hand isn’t difficult. C# 2.0 supports generics, so type names become more complex. The types in functional languages like F# are also quite complicated, especially when they involve types of functions.

A simple form of type inference for local variables is available in C# 3.0. A type name in a declaration can be usually replaced with the var keyword:

var num = 10;
var str = "Hello world!";

The first line declares a variable called num and initializes its value to 10. The compiler can easily infer that the expression on the right-hand side is of type int so it knows that the type of the variable must also be int. Note that this code means exactly the same thing as if it explicitly included the type. During the compilation, the C# compiler replaces var with the actual type, so there is no additional work done at runtime. This is particularly useful when working with complex generic types. For example:

var dict = new Dictionary<string, List<IComparable<int>>>();

Without the var keyword, the same type would appear twice on a single line:

  • When declaring the variable.

  • When creating an instance of Dictionary class.

The type inference in C# is limited to local variable declarations. Many F# snippets don’t mention any types at all because the type inference works on function parameters and return types as well. (If it fails to deduce some type, it is possible to specify the type explicitly, but this is a relatively rare occurrence.)

To demonstrate how this works, the following snippet shows a simple function that takes two parameters, adds them, and formats the result using the String.Format method. The first part shows valid F# code. The second shows how the program would look in C# if implicit typing were extended to allow you to use the var keyword in other places:

// Pseudo-C# version that uses 'var' everywhere
var Add(var a, var b) {
   var res = a + b;
   return String.Format("{0} + {1} = {2}", a, b, res);

The F# syntax is designed so that writing types is optional. In the pseudo-C# version, all types were replaced with the var keyword. This is (in principle) what the F# compiler sees when compiling the F# code. If you paste the above code to F# Interactive, it will be processed correctly and F# Interactive will report that the function takes two integers as arguments and returns a string:

val add : int -> int -> string

How can the F# compiler figure this out?

  • The first hint it has is that the function is adding the values a and b. In F#, + can add any numeric types or to concatenate strings. If the compiler doesn’t know anything else about the types of values; it assumes that we’re adding two integers.

  • From the addition, the compiler can deduce that both a and b are integers. Using this information, it can find the appropriate overload of the String.Format method.

  • The method returns a string so the compiler can deduce that the return type of the add function is also a string.

Type inference makes it possible to avoid errors and use all of the other benefits of static typing (like hints to developers when writing the code) at almost no price because the types are inferred automatically in most cases. When using F# in Visual Studio, the type inference is running in the background. When hovering over a value with a mouse pointer, it instantly shows the type. The background compilation also reports any typing errors instantly, so it produces a similar experience as when writing C# code.

All programming languages use several primitive types (like integers, characters, or floating-point numbers) and more complicated types composed from these primitive types. The following section introduces some of the compound types.

Perhaps the most interesting compound type in functional programming is a function. This may sound a bit unfamiliar. In other programming languages, functions aren't treated as types. They are a special language construct that is vastly different from other types like integers, strings, or .NET classes. However, functional languages treat functions in the same way as other types. (An example can be found in Overview: Working with Immutable Data)

The following list gives a brief review of compound types in F#. They are not types in the same sense as, for example, int or System.Random. Instead, they are type constructors. They describe how to create types from other, more primitive, types:

  • Function types are written using arrows. They represent parameterized code that can be provided with necessary arguments (to specify parameters of the code) and executed to get a result. For example, a function that converts an integer to a string has a type int -> string. Functions can also have multiple parameters. A function taking two integers and returning another integer (e.g., mathematical operators such as * or +) has a type int -> int -> int.

  • Tuple types can be used to group multiple values of different types into a single value. In functional languages, code is written as an expression that returns a result. What if an expression needs to return two values, for example, the name and the price of a product? The simplest option to group values is to use a tuple. The type string * int is a tuple that contains both a string and an integer. (Another related type in F# is record, which is just like a tuple, but individual elements have labels.)

  • Discriminated union types can be used for representing values that can be chosen from several alternatives. An example may be the date of an event in a scheduling application. The event can eitherbe scheduled to happen Once or it can happen Repeatedly. In the first case, it carries just a date (DateTime) but, in the second case, it needs the starting date (DateTime) together with an interval between the repetitions (TimeSpan). Unlike functions and tuples, discriminated unions must be declared explicitly.

These types demonstrate the elementary ways of constructing complex types in functional languages. Of course, F# works with other types as well. Most importantly, it is possible to access any .NET types including classes, interfaces, and arrays.

Functions and tuples can easily be represented in C#. Functions roughly correspond to delegates (e.g., Func<int, string>) and tuples are available in .NET 4.0 and you can use them from C# (e.g., Tuple<string, int>). The last type, discriminated unions, is specific to functional languages. The next section looks at them in more detail.

This section demonstrates the usefulness of discriminated unions using a common example—an application that works with graphical shapes. It uses a simple representation of shape, so it will be a rectangle, an ellipse (defined by the corners of a bounding rectangle), or a shape composed of two other shapes.

Object-oriented approach would suggest to define an abstract class to represent a shape (called Shape) and three derived classes to represent the three different cases (Ellipse, Rectangle, and Composed). The representation of shapes now involves four classes. Also, it isn't yet known how the shapes will be used. The application will likely draw them but it is early to say what arguments are needed for the drawing. This means that it is difficult to write any abstract methods in the Shape class. Is there a more straightforward way to represent the idea?

Creating Discriminated Union for Shapes

The original idea was simpler than a full-blown class hierarchy; it described the representation of a shape with three different cases. The example just attempts to define a data structure for representing the shape and F# allows doing exactly that:

type Shape = 
    | Rectangle of Point * Point       
    | Ellipse of Point * Point
    | Composed of Shape * Shape        

This code creates a discriminated union type called Shape, which is closer to the original description of the problem. The type declaration contains three cases that cover three possible representations of the shape. When working with values of this type in F#, the expression Rectangle(pt1, pt2) creates a rectangle. Unlike with unions in the C language, the value is tagged "See discriminated union", which means that it is always known which of the options it represents.

This fact is quite important for working with discriminated union values. The next section introduces pattern matching, a concept that makes many typical functional programming tasks easy. Even though pattern matching doesn’t look like a concept related to types, there are some important connections. Among other things, it can be used for implementing functions that work with discriminated unions.

Processing Shapes with Pattern Matching

Functional data types reveal much more about the structure of the type. A nice demonstration of this property is a discriminated union—when working with this type, you always know what kind of values the type can represent. (In the previous example, it could be a rectangle, an ellipse, or a composed shape.)

A function for working with discriminated unions must specify what the program should do for each of the cases. The construct that makes this possible is in many ways similar to the switch from C#, but there are several important differences. The following listing shows the shapeArea function, which uses the F# pattern matching construct match to calculate the area occupied by a shape:

let rec shapeArea shape = 
    match shape with
    // Two cases to handle primitive shapes
    | Rectangle(pfrom, pto) ->
        rectangleArea(pfrom, pto)
    | Ellipse(pfrom, pto) ->
        ellipseArea(pfrom, pto)
    // Special case when one rectangle is nested inside another
    | Composed(Rectangle(from1, to1), Rectangle(from2, to2))
            when isNestedRectangle(from2, to2, from1, to1) ->
        rectangleArea(from1, to1)
    // General case when we need to calculate areas of 
    // shapes and then subtract the intersection area
    | Composed(shape1, shape2) ->
        let area1 = shapeArea shape1
        let area2 = shapeArea shape2
        area1 + area2 - (intersectionArea shape1 shape2)

The first important difference from the C# case statement is that F# match can deconstruct the matched value. In the example above, this is used in all cases. The different cases (separated by the | are usually called clauses.

The area of a rectangle or ellipse can be calculated using the two points that specify the bounding rectangle. The clause of match construct can just provide two names (pfrom and pto) and pattern matching assigns values to these names when the branch is executed.

The second case is similar to the first one, but the third case is more interesting. The pattern that specifies conditions under which the branch should be followed (which is specified between the bar symbol and the arrow ->) is quite complicated. The pattern only matches when the shape is of type Composed and both of the shapes that form the composed shape are rectangles. Instead of giving names for the values inside the Composed pattern, the pattern contains two other patterns (Rectangle, twice). This is called a nested pattern and it proves very useful. Additionally, this pattern contains a when clause, which can specify any arbitrary condition. The example calls the isNestedRectangle function to test whether the second rectangle is nested inside the first one. If this pattern is matched, the function gets information about the two rectangles. It also knows that the second one is nested inside the first one so it can optimize the calculation and return the area of the first rectangle.

The F# compiler has full information about the structure of the type, so it can verify that no cases are missed. If the snippet didn't include the last case, the compiler would report a warning that there are valid shapes that have not been handled (for example, a shape composed from two ellipses). The implementation of the last case is more difficult so, if a program often composes two rectangles, the optimization in the third case would be quite useful.

Even though there’s no simple way to create a discriminated union type in C#, the concept is still valuable even for C# developers. Many of the programming problems C# developer faces can be represented using discriminated unions and the knowledge of discriminated unions provides a simple mental concept for thinking about the problem.

Readers familiar with the object-oriented design pattern called the composite design pattern may have recognized it in the previous example. The example creates a more complicated shape by composing it from two other shapes. Functional programming often uses discriminated unions to represent program data, so, in many cases, the composite design pattern disappears.

When implementing the problem in C#, it is possible to encode a discriminated union as a class hierarchy (with a base class and a derived class for every case). The recursive processing function can be replaced with the visitor design pattern. Mentally, one can still work with the simple concept, which makes thinking about the application architecture easier. In functional programming, this kind of data structure is used frequently, which also explains why functional languages support more flexible pattern matching constructs. The example in the previous section demonstrated that the F# match expression can simplify the implementation of rather sophisticated constructs. This type of simplification is quite common when using functional languages: an appropriate model and a bit of help from the language can go a long way to keeping code readable.

This overview looked at functional types and it explained how they are used in F#. The F# language uses type inference, which is a mechanism that makes it possible to have fully statically type-safe code without actually writing the types by hand (in most cases). The compiler deduces the types from the way we use values.

Next, the article introduced different types used in functional languages. The most common types are functions (representing computations), tuples (for grouping multiple values into a single one), and discriminated unions (for representing alternatives). Discriminated unions are unique to functional programming, so they were discussed in more detail using a representation of shapes as an example. Working with discriminated unions (as well as tuples) is simplified by pattern matching. An example demonstrated how to use it to process the shape type, but pattern matching is useful in many other situations.

Most functional languages use similar types, although only some of them infer and verify them at compile-time (as F# does). However, types like functions, tuples, and unions are some of the several key functional concepts. The following articles discuss other concepts that form the functional programming style:

This overview discussed types and some of their benefits. More information that can help with understanding the benefits of other functional concepts can be found in the following articles:

To download the code snippets shown in this article, go to http://code.msdn.microsoft.com/Chapter-1-Introducing-F-c967460d

This article is based on Real World Functional Programming: With Examples in F# and C#. Book chapters related to the content of this article are:

  • Book Chapter 2: “Core concepts in functional programming” introduces the statically-typed nature of F# including a discussion about units of measure that can be used to make your code even safer.

  • Book Chapter 5: “Using functional values locally” gives a detailed overview of the most important functional types including tuples, functions, and discriminated unions.

The following MSDN documents are related to the topic of this article: