Advanced Basics

The LINQ Enumerable Class, Part 2

Ken Getz

Code download available at: AdvancedBasics2008_09a.exe (180 KB)

Contents

Converting Sequences
Positioning within Sequences
Calculating Sequences
Performing Set Operations
Summing It Up

In the last installment, I took you on a quick tour through approximately half of the methods in the System.Linq.Enumerable class. This class, which provides all the extension methods that make LINQ queries possible in Visual Basic® and C#, also extends other classes, such as List(Of T) and Array, making it possible to use querying semantics with objects that wouldn't otherwise support querying methods. You can review that column in the July 2008 issue at msdn.microsoft.com/magazine/cc700332.

This time, I'll complete the tour of the methods, looking at methods that handle converting, positioning, calculating, and performing set operations on sequences of objects. (You should download the sample application, which contains all the code I show here. Following along in the app will help you understand how the code works, especially since you can alter it to experiment as you go.) Just note that since this article was originally published, Visual Studio 2008 SP1 has been released. Breaking changes in that update required several small changes to the demonstration code for this article, and the changes are currently reflected in the online version of the article. If you previously downloaded the demonstration code, and it no longer works because you’ve upgraded to SP1, please download the latest version, available here.

Converting Sequences

If you need to take the results of working with an enumerable sequence and pass them to some method that requires a specific type, or you need to call a method of a specific type using data that's stored in an enumerable sequence, you may need to call one of the Enumerable class methods that convert the data to a different type. For example, imagine that you have customer data in a sequence, and you'd like a comma-delimited list of customer names. Although you can solve this problem several other ways (as you'll see when you investigate the Aggregate method later in the column), the String.Join method solves this problem handily.

The problem is that this method only accepts an array of strings as input. The answer is the Enumerable.ToArray method, which converts the collection to an array:

' From ToArrayDemo in the sample:
Dim db As New SimpleDataContext
Dim customers = _
  From cust In db.Customers _
  Where cust.Country = "France" _
  Select cust.ContactName
Dim nameList As String = String.Join(", ", customers.ToArray())

Using the data in the Northwind sample database, this code fills the nameList variable with the following text:

Frédérique Citeaux, Laurence Lebihan, Janine Labrune, Martine Rancé, Carine Schmitt, Daniel Tonini, 
Annette Roulet, Marie Bertrand, Dominique Perrier, Mary Saveley, Paul  Henriot

You can convert from an enumerable sequence into a generic Dictionary, although you must at least supply a function that indicates how you want to generate the key values. Note that the Enumerable.ToDictionary method provides several overloads, allowing you to specify various combinations of key selector, element (value) selector, and key comparer.

Imagine that you've extracted product information from the Northwind sample database and you'd like an in-memory dictionary using product ID values as the key. The code in Figure 1 handles this task for you.

Figure 1 Creating a Dictionary

' From ToDictionaryDemo in the sample:
Dim db As New SimpleDataContext
Dim someProducts = _
  From prod In db.Products _
  Where prod.CategoryID = 1

Dim productsDictionary = _
  someProducts.ToDictionary( _
    Function(prod) prod.ProductID)

' Display the contents of the dictionary:
Dim sw As New StringWriter
For Each kvp In productsDictionary
  sw.WriteLine("{0}: {1}", _
    kvp.Key, kvp.Value.ProductName)
Next

This code takes the contents of the someProducts variable and converts it to a Dictionary, using the ProductID field as the key value. The code then loops through all the items in the dictionary, printing the key (that is, the ProductID field) and one field from the value of each dictionary item.

What if you want to use a value that isn't a simple type as the key in your dictionary? Maybe you want to use the entire Product as the dictionary key. In that case, you must again supply an instance of a custom comparer so that the dictionary has a means of comparing instances of the key.

To use the entire Product as the key value for each item in the dictionary, simply supply a custom class that implements IEqualityComparer(Of Product). The class in Figure 2, which is also in the sample project, determines that two products are equal if their ProductID fields are equal.

Figure 2 Product Comparer

Public Class ProductComparer
  Implements IEqualityComparer(Of Product)

  Public Function Equals1( _
    ByVal x As Product, ByVal y As Product) As Boolean _
    Implements IEqualityComparer(Of Product).Equals

    Return x.ProductID.Equals(y.ProductID)

  End Function

  Public Function GetHashCode1( _
    ByVal obj As Product) As Integer _
    Implements IEqualityComparer(Of Product).GetHashCode

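    ' Hash on the same field that Equals compares, so that
    ' equal products always produce equal hash codes:
    Return obj.ProductID.GetHashCode()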

  End Function
End Class

In Figure 3, the first lambda expression calculates the key, the second calculates the value, and the instance of the ProductComparer class provides the means of comparing two keys. Figures 1 and 3 fill the StringWriter instance with the same output, using differently configured dictionaries to create the same results:

 1: Chai
 2: Chang
24: Guaraná Fantástica
34: Sasquatch Ale
35: Steeleye Stout
38: Côte de Blaye
39: Chartreuse verte
43: Ipoh Coffee
67: Laughing Lumberjack Lager
70: Outback Lager
75: Rhönbräu Klosterbier
76: Lakkalikööri

Figure 3 Comparing Two Keys

' From ToDictionaryDemo in the sample:
Dim productsDictionary2 As Dictionary(Of Product, String) = _
  someProducts.ToDictionary( _
    Function(prod) prod, _
    Function(prod) prod.ProductName, _
    New ProductComparer())

' Display information from the new dictionary:

sw = New StringWriter()
For Each kvp In productsDictionary2
  sw.WriteLine("{0}: {1}", kvp.Key.ProductID, kvp.Value)
Next

Some methods specifically require a generic List as input, rather than an enumerable sequence. To convert to a List, use the Enumerable.ToList method. The following code retrieves a list of product names, converts them to a generic List(Of String), and then uses the IndexOf method to locate an item in the list:

' From ToListDemo in the sample:
Dim db As New SimpleDataContext
Dim productNames = _
  db.Products.Select(Function(prod) prod.ProductName).ToList()
Dim results = _
  String.Format("Chang was found at index {0}", _
    productNames.IndexOf("Chang"))

After running the sample code, the variable "results" contains the following text:

Chang was found at index 2

A Dictionary data structure maps a key to a single value. A Lookup data structure maps a key to a group of values. This structure is a perfect match for a hierarchical Enumerable instance (for example, a CategoryID linking to a number of Product instances). The Enumerable.ToLookup method performs the conversion for you, assuming that you have a simple hierarchy of a key to values.

The code in Figure 4 converts an IEnumerable(Of Product) sequence into a Lookup, where each key is a CategoryID, and each value is a Product. The first parameter to the ToLookup method is a function that determines the key, and the second is a function that determines the value for each item.

Figure 4 Convert IEnumerable Sequence into a Lookup

' From ToLookupDemo in the sample:
Dim db As New SimpleDataContext
Dim products = _
  From prod In db.Products _
  Where prod.UnitPrice > 40 _
  Order By prod.CategoryID

Dim lookup = _
  products.ToLookup( _
    Function(prod) CInt(prod.CategoryID), _
    Function(prod) prod)

' Iterate through the lookup:
sw = New StringWriter
For Each grouping In lookup
  sw.WriteLine("Category ID = {0}", grouping.Key)
  For Each value In grouping
    sw.WriteLine("    {0} ({1:C})", _
                 value.ProductName, value.UnitPrice)
  Next
Next

Running the sample code places the text in Figure 5 into the StringWriter variable. Each key is associated with one or more products, and the information is stored in the Lookup instance.
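Once you have a Lookup, you can also index into it by key to retrieve a single group. As a minimal sketch (this code isn't part of the sample project, and the GroupByLength helper and its data are made up for illustration), the following groups an in-memory array of strings by length, so you can try it without the Northwind database:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LookupSketch
  ' Hypothetical helper: groups words by their length.
  Function GroupByLength(ByVal words As IEnumerable(Of String)) _
    As ILookup(Of Integer, String)

    Return words.ToLookup( _
      Function(word) word.Length, _
      Function(word) word)
  End Function

  Sub Main()
    Dim words = New String() {"Chai", "Chang", "Tofu", "Ikura", "Konbu"}
    Dim byLength = GroupByLength(words)

    ' Index by key to retrieve a single group
    ' (displays Chang, Ikura, Konbu):
    Console.WriteLine(String.Join(", ", byLength(5).ToArray()))

    ' No word has 42 letters, so this displays 0:
    Console.WriteLine(byLength(42).Count())
  End Sub
End Module
```

Unlike a Dictionary, a Lookup returns an empty sequence rather than raising an exception when you index with a key it doesn't contain.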

Now, let's say you have a non-generic collection of some sort, but you'd like to apply standard query operators to it. For example, you might have an ArrayList containing data, and you would like to filter the list using a Where method call. The Enumerable.Cast method casts each element of a collection to a specific type and returns a generic IEnumerable instance containing the specified type. You can then operate on the result, as shown in Figure 6.

Figure 5 Text for StringWriter

Category ID = 1
    Côte de Blaye ($263.50)
    Ipoh Coffee ($46.00)
Category ID = 2
    Vegie-spread ($43.90)
Category ID = 3
    Tarte au sucre ($49.30)
    Sir Rodney's Marmalade ($81.00)
    Schoggi Schokolade ($43.90)
Category ID = 4
    Raclette Courdavault ($55.00)
Category ID = 6
    Thüringer Rostbratwurst ($123.79)
    Mishi Kobe Niku ($97.00)
Category ID = 7
    Rössle Sauerkraut ($45.60)
    Manjimup Dried Apples ($53.00)
Category ID = 8
    Carnarvon Tigers ($62.50)

Figure 6 Filtering the ArrayList

    Dim items As New ArrayList
    items.Add("January")
    items.Add("August")
    items.Add("October")
    items.Add("April")

    ' Cast the ArrayList as a queryable group of strings
    ' (if the cast to String failed for any element,
    ' this would raise an exception):
    Dim query = items.Cast(Of String)()

    ' Now, use the Enumerable class to query the data:
    Dim results = _
      query.Where(Function(item) item.StartsWith("A"))

Be aware that Enumerable.Cast throws an exception if it can't convert all the input elements to the specified type. If the collection might contain elements of other types, you can use the Enumerable.OfType method instead, which returns only the elements that are of the specified type. Figure 6 filters the ArrayList data to retrieve only those items that begin with the letter "A." After running the code in Figure 6, the variable named results contains the following items:

August
April
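To see how OfType differs from Cast, consider an ArrayList that holds more than just strings. The following is only a sketch and isn't part of the sample project; the StringsStartingWith helper and the mixed data are assumptions made for illustration:

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic
Imports System.Linq

Module OfTypeSketch
  ' Hypothetical helper: returns only the String elements that
  ' start with the given prefix, ignoring elements of other types.
  Function StringsStartingWith(ByVal items As IEnumerable, _
    ByVal prefix As String) As List(Of String)

    Return items.OfType(Of String)(). _
      Where(Function(item) item.StartsWith(prefix)). _
      ToList()
  End Function

  Sub Main()
    Dim items As New ArrayList
    items.Add("January")
    items.Add(8)           ' Not a String; Cast(Of String) would fail here.
    items.Add("August")
    items.Add(#4/1/2008#)  ' Not a String either.
    items.Add("April")

    ' Displays August and April; the Integer and the Date
    ' are silently skipped:
    For Each item In StringsStartingWith(items, "A")
      Console.WriteLine(item)
    Next
  End Sub
End Module
```

Whether skipping the non-matching elements silently is what you want depends on your data; if an unexpected type indicates a bug, the exception from Cast may be the better choice.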

(Note: The behavior of the Enumerable.Cast method changed between the original version of Visual Studio 2008 and SP1. Originally, the Cast method performed a conversion from the original type to the type specified in the generic parameter. Starting in SP1, the Cast method performs a cast, not a conversion. In other words, the Cast method succeeds only if every element of the collection is already an instance of the new type (or of a type that inherits from or implements it), much as the DirectCast operator requires; it doesn't, for example, convert an Integer element to a Long. The Cast method triggers an exception when you execute the query if any element within the collection can’t be cast to the new type.)

Finally, the Enumerable.AsEnumerable method enables you to treat a source type as IEnumerable so you can use methods of IEnumerable rather than methods in the implemented class. This method is useful in specific circumstances, but it's unlikely that you'll need it in general coding.

The complex example in the documentation explains this method. If you need to coerce your own class to behave as if it were of type IEnumerable, then you should definitely take a look at the Enumerable.AsEnumerable method.
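As a rough illustration of when AsEnumerable matters, imagine a collection class that supplies its own Where method. The CountingList class below is hypothetical (it isn't in the sample project or in the documentation); it simply counts how often its own Where method runs, so you can see which method a call binds to:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical list class that supplies its own Where method:
Public Class CountingList(Of T)
  Inherits List(Of T)

  ' Counts how often the class's own Where method runs:
  Public WhereCalls As Integer = 0

  Public Function Where(ByVal predicate As Func(Of T, Boolean)) _
    As IEnumerable(Of T)

    WhereCalls += 1
    Return Enumerable.Where(Me, predicate)
  End Function
End Class

Module AsEnumerableSketch
  Sub Main()
    Dim months As New CountingList(Of String)
    months.Add("August")
    months.Add("October")

    ' Instance methods win over extension methods, so this
    ' call binds to CountingList.Where:
    Dim q1 = months.Where(Function(m) m.StartsWith("A"))
    Console.WriteLine(months.WhereCalls)

    ' AsEnumerable hides the class's own Where method, so this
    ' call binds to Enumerable.Where and the count doesn't change:
    Dim q2 = months.AsEnumerable(). _
      Where(Function(m) m.StartsWith("A"))
    Console.WriteLine(months.WhereCalls)
  End Sub
End Module
```

Both queries return the same items here; the only difference is which Where implementation does the work.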

Positioning within Sequences

Given that LINQ uses deferred execution to retrieve data, and you may want to virtualize access to large sets of data, you need some way to retrieve a specific number of rows from a data source, starting at a particular offset within the data. To satisfy these needs, you can use the Enumerable.Take and Enumerable.Skip methods. These methods allow you to specify the number of rows to take and the number of rows to skip before starting to take rows. The sample project includes the following simple code, which returns 5 rows after skipping 10 rows:

Dim db As New SimpleDataContext
Dim products = (From p In db.Products _
   Order By p.ProductName _
     Select String.Format("{0}: {1}", p.ProductID, p.ProductName)). _
       Skip(10).Take(5)
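If you page through data regularly, you might wrap the Skip and Take calls in a small helper. The following sketch assumes a zero-based page number; the GetPage name and its parameters are made up for illustration and don't appear in the sample project. It uses an in-memory sequence so that it runs without a database:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module PagingSketch
  ' Hypothetical helper: returns one page of a sequence.
  ' pageNumber is zero-based.
  Function GetPage(Of T)(ByVal source As IEnumerable(Of T), _
    ByVal pageNumber As Integer, ByVal pageSize As Integer) _
    As IEnumerable(Of T)

    Return source.Skip(pageNumber * pageSize).Take(pageSize)
  End Function

  Sub Main()
    Dim numbers = Enumerable.Range(1, 23)

    ' Page 2 with 5 items per page: skips 10 items, takes 5
    ' (displays 11, 12, 13, 14, 15):
    Dim page = GetPage(numbers, 2, 5). _
      Select(Function(n) n.ToString()).ToArray()
    Console.WriteLine(String.Join(", ", page))

    ' Only 3 items remain for the last page, so this displays 3;
    ' Take doesn't fail when fewer items are available:
    Console.WriteLine(GetPage(numbers, 4, 5).Count())
  End Sub
End Module
```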

You can use the Enumerable.TakeWhile and Enumerable.SkipWhile methods to take and skip values in a sequence while some condition is true. The TakeWhile method takes values while a condition is true and returns a sequence containing all the values it took. The SkipWhile method skips values as long as the condition is true and returns the remainder of the input sequence.

The sample procedure in Figure 7 creates a generic List(Of Integer) containing random integers. It then takes values while each item is less than a specific number and displays the results (the GetCommaList method in the sample creates a comma-delimited string containing the contents of the input sequence). The code also shows a second way to call SkipWhile and TakeWhile, passing the index of each item to the function that performs the decision making, and it supplies results shown at the bottom of Figure 7.

Figure 7 SkipWhile and TakeWhile

' From TakeWhileSkipWhileDemo
Dim sw As New StringWriter
Dim rnd As New Random
Dim cutOff As Integer = rnd.Next(50, 100)

' Create a list containing a fixed set of random numbers:
Dim items As New List(Of Integer)
For i As Integer = 1 To 20
  items.Add(rnd.Next(1, 100))
Next

sw.WriteLine("The cutoff value was: {0}", cutOff)
sw.WriteLine("The full list is:")
sw.WriteLine(GetCommaList(items))

Dim list = items.TakeWhile(Function(item) item < cutOff)
' Show the list:
sw.WriteLine()
sw.WriteLine("The result list is (item < cutoff):")
sw.WriteLine(GetCommaList(list))

' Can also pass in the index to the lambda expression:
list = items.TakeWhile(Function(item, index) item > index)
sw.WriteLine()
sw.WriteLine("The result list is (TakeWhile item > index):")
sw.WriteLine(GetCommaList(list))

list = items.SkipWhile(Function(item, index) item > index)
sw.WriteLine()
sw.WriteLine("The result list is (SkipWhile item > index):")
sw.WriteLine(GetCommaList(list))

RESULTS
The cutoff value was: 77
The full list is:
8, 5, 3, 77, 34, 54, 54, 40, 62, 90, 61, 76, 30, 48, 91, 71, 83, 6, 48, 66

The result list is (item < cutoff):
8, 5, 3

The result list is (TakeWhile item > index):
8, 5, 3, 77, 34, 54, 54, 40, 62, 90, 61, 76, 30, 48, 91, 71, 83

The result list is (SkipWhile item > index):
6, 48, 66

Calculating Sequences

The Enumerable class provides several different methods that perform calculations over sequences, and Visual Basic exposes almost all of these as query keywords. For simple calculations, you should use the Enumerable.Average, Enumerable.Count, Enumerable.LongCount, Enumerable.Max, Enumerable.Min, or Enumerable.Sum methods. For any other calculation, use the Enumerable.Aggregate method and provide a function that performs the specific calculation you need.

The Count and LongCount methods require no parameters and count all the elements in the input sequence; you can also specify a function as a parameter and have the function filter the elements before counting. You can call each of the other methods without a parameter, but only if the input sequence contains simple values rather than objects with multiple fields (for example, the sequence of UnitPrice values in Figure 8). Otherwise, you must provide a function that indicates what value you want to calculate. The code in Figure 8 performs some simple calculations. The comments in the code provide further explanation.

Figure 8 Counts and Averages

' From SimpleCalculationDemo in the sample:
Dim db As New SimpleDataContext
Dim decimalResults = _
  From product In db.Products _
  Where product.CategoryID = 1 _
  Select product.UnitPrice

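' The sequence contains only UnitPrice values, so Average
' needs no parameter:
Dim average = decimalResults.Average()

Dim results = _
  From product In db.Products _
  Where product.CategoryID = 1

' The sequence contains whole products, so you must indicate
' which value to average:
average = results.Average(Function(prod) prod.UnitPrice)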

' If you need an Int64 result, use Enumerable.LongCount().
' Supply no parameter to count all elements in the sequence.
' Supply a function to filter the results before counting:
Dim count = results.Count()

' Calculate filtered count:
count = results.Count( _
  Function(prod) prod.ProductName.StartsWith("C"))

' Retrieve the maximum unit price.
Dim max = results.Max(Function(prod) prod.UnitPrice)

' Retrieve the minimum unit price for products 
' whose name starts with C.
Dim min = _
  results. _
    Where(Function(prod) _
      prod.ProductName.StartsWith("C")). _
      Min(Function(prod As Product) prod.UnitPrice)

' Calculate sum of units in stock:
Dim total = results.Sum(Function(prod) prod.UnitsInStock)

Imagine that you need to find the standard deviation of the UnitPrice field in the Products table. This calculation determines a value that indicates how far from the mean, in general, the prices are. To calculate standard deviation, you first calculate the variance in the prices (the average of the squares of the differences between each price and the mean price), and then take the square root of the result. The Enumerable class doesn't provide a simple method to do this for you, but you can use a combination of the Average and Aggregate methods to produce the result.

To call Aggregate, supply a "seed" value (that is, the initial value of the calculated result) and a calculating function that accepts two parameters: the first contains the current total value, and the second contains the specific item to be aggregated. Within the function, perform the calculation. For example, to sum the squares of the difference between the UnitPrice and the mean price, you could create an aggregating function like this:

Function(current As Decimal, item As Product) _
  current + CDec((item.UnitPrice - averagePrice) ^ 2)

The code in Figure 9 calculates the standard deviation.

Figure 9 Standard Deviation

' From AggregateDemo in the sample:
Dim db As New SimpleDataContext
Dim results = _
  From product In db.Products _
  Where product.CategoryID = 1

' Calculate the average price. Note the call to 
' CDec--because the UnitPrice field is nullable, 
' you must convert from Decimal? to Decimal:
Dim averagePrice = _
  CDec(results.Average(Function(prod) prod.UnitPrice))

' Note that the seed value is 0:
Dim sumSquares = results.Aggregate(0, _
  Function(current As Decimal, item As Product) _
    current + CDec((item.UnitPrice - averagePrice) ^ 2))

Dim variance = sumSquares / results.Count
Dim standardDeviation = variance ^ 0.5
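If you'd like to check the arithmetic by hand, here is a minimal sketch that applies the same Average and Aggregate approach to a small in-memory list of Double values instead of the Products table. The StandardDeviation helper and the test values are assumptions made for illustration; like Figure 9, it divides by the number of items (the population form of the calculation):

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module StdDevSketch
  ' Hypothetical helper: population standard deviation.
  Function StandardDeviation(ByVal values As IEnumerable(Of Double)) _
    As Double

    Dim mean = values.Average()

    ' The seed is 0.0, so the running total is a Double:
    Dim sumSquares = values.Aggregate(0.0, _
      Function(current, item) current + (item - mean) ^ 2)

    Return Math.Sqrt(sumSquares / values.Count())
  End Function

  Sub Main()
    ' Mean = 40 / 8 = 5
    ' Squared differences: 9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
    ' Variance = 32 / 8 = 4, so the standard deviation is 2.
    Dim values = New Double() {2, 4, 4, 4, 5, 5, 7, 9}
    Console.WriteLine(StandardDeviation(values))
  End Sub
End Module
```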

You can use the Aggregate method on non-numeric values as well. You previously saw an example that converted a sequence to an array of String values, so that you could use the String.Join method. You can accomplish the same goal using the Aggregate method:

' From AggregateDemo in the sample:
Dim db As New SimpleDataContext
Dim customers = _
  From cust In db.Customers _
  Where cust.Country = "France" _
  Select cust.ContactName

' Note that the seed value is an empty string:
Dim customerNames = customers.Aggregate(String.Empty, _
  Function(current, name) _
    If(String.IsNullOrEmpty(current), name, current & ", " & name))

Performing Set Operations

The Enumerable class provides methods that perform set operations, such as calculations of unions and intersections. This final section investigates the methods of the class that provide these capabilities. The first example, Figure 10, shows the Enumerable.Concat, Enumerable.Union, Enumerable.Intersect, and Enumerable.Except methods. Each method operates on a pair of sequences.

Figure 10 Using Concat, Union, Intersect, Except

' From SimpleSetDemo in the sample: Concat joins two sequences, taking 
' all members of each. Union does the same, but removes duplicates.
Dim db As New SimpleDataContext

' Create two sequences:
Dim group1 = db.Customers.Select(Function(cust) cust.Country).Take(5)
Dim group2 = db.Customers.Select(Function(cust) cust.Country) _
  .Skip(30).Take(5)

' Show both groups:
sw.WriteLine("group1:" & GetCommaList(group1))
sw.WriteLine("group2:" & GetCommaList(group2))

' Return the complete list of countries:
Dim group3 = group1.Concat(group2)

' Return a unique list of countries in the two groups:
group3 = group1.Union(group2)

' Find all elements in group1 that are also in group2: Can supply custom 
' IEqualityComparer instance to compare items if they're not simple 
' types:
group3 = group1.Intersect(group2)

' Find the differences between group1 and group2. That is, return the set 
' of all items that are in the first group but not the second. This is 
' equivalent to taking the distinct items in the first group and removing 
' the contents of the intersection of the two. Can supply custom 
' IEqualityComparer instance to compare items if they're not simple 
' types:
group3 = group1.Except(group2)

RESULTS
group1: Germany, Mexico, Mexico, UK, Sweden
group2: France, France, France, France, Germany

group1.Concat(group2): Germany, Mexico, Mexico, UK, 
Sweden, France, France, France, France, Germany

group1.Union(group2): Germany, Mexico, UK, Sweden, France
group1.Intersect(group2): Germany
group1.Except(group2): Mexico, UK, Sweden

The Concat and Union methods seem similar, combining the results of two sequences. However, the Concat method adds the output of one sequence to another, while the Union method removes duplicates from the result. The Intersect method returns a sequence with all the items common to both input sequences, and the Except method returns all items that are in the first sequence but not in the second (in other words, the difference between the two sequences). Running the code in Figure 10 produces the output at the bottom of the figure (the output first displays the two sequences, then the results of the calculations).

Note that all the samples you see here use simple, default comparers. If you want to perform set operations on sequences of more complex objects, you can call the overloaded versions of the methods that accept instances of custom comparers, as you've seen in previous examples.
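For strings, you don't even need to write your own comparer class, because the .NET Framework's StringComparer class already implements IEqualityComparer(Of String). The following sketch isn't part of the sample project, and the CommonIgnoringCase helper and the country lists are made up for illustration; it passes StringComparer.OrdinalIgnoreCase to the set methods so that differences in case are ignored:

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ComparerSetSketch
  ' Hypothetical helper: returns the items common to both
  ' sequences, ignoring differences in case.
  Function CommonIgnoringCase(ByVal first As IEnumerable(Of String), _
    ByVal second As IEnumerable(Of String)) As String()

    Return first.Intersect(second, _
      StringComparer.OrdinalIgnoreCase).ToArray()
  End Function

  Sub Main()
    Dim group1 = New String() {"Germany", "Mexico", "UK"}
    Dim group2 = New String() {"germany", "France", "uk"}

    ' The default comparer is case-sensitive, so nothing
    ' matches (displays 0):
    Console.WriteLine(group1.Intersect(group2).Count())

    ' With the comparer, Germany and UK match:
    Console.WriteLine(String.Join(", ", _
      CommonIgnoringCase(group1, group2)))

    ' Except and Union accept a comparer in the same way
    ' (displays Mexico):
    Console.WriteLine(String.Join(", ", _
      group1.Except(group2, StringComparer.OrdinalIgnoreCase).ToArray()))
  End Sub
End Module
```

Note that the matching items come from the first sequence, so the result contains "Germany" and "UK" as they are spelled in group1.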

The Enumerable class also supports several more complex set operations, using the Enumerable.Join, Enumerable.GroupBy, and Enumerable.GroupJoin methods. I will describe and demonstrate each of these methods in a moment.

Imagine that some method hands you two sequences that are related, and you want to join them based on the correlation of a key value in both sequences. Obviously, this task is more typically thought of as being something for a relational database to handle. However, you can use the Enumerable.Join method to do this same kind of work. Figure 11 joins a sequence containing categories with a sequence containing products.

Figure 11 Joining Two Sequences

' From ComplexSetOperationDemo in the sample:
Dim categories = _
  db.Categories.Where(Function(cat) cat.CategoryID = 3)

' If the correlation key isn't a simple type, you could
' provide a comparer to compare the two key values:
Dim result = categories.Join(db.Products, _
  Function(cat) cat.CategoryID, _
  Function(prod) prod.CategoryID, _
  Function(cat, prod) _
    New With {.CategoryName = cat.CategoryName, _
      .ProductName = prod.ProductName})

sw = New StringWriter
For Each item In result
  sw.WriteLine("{0} -- {1}", item.CategoryName, item.ProductName)
Next

RESULTS
Confections -- Pavlova
Confections -- Teatime Chocolate Biscuits
Confections -- Sir Rodney's Marmalade
' More items here. . .
Confections -- Tarte au sucre
Confections -- Scottish Longbreads

To call the Join method, you must supply a function that returns the primary key in the parent sequence, a function that returns the foreign key in the child sequence, and a function that projects the data from the two sequences into the output sequence. (If the correlation key isn't a simple type, you must also supply a custom comparer, as you've seen previously, to compare the key values.) Running the sample code in Figure 11 produces the output at the bottom of the figure.

The Enumerable.GroupBy method groups the input sequence by a key value, creating groupings of items. To call the Enumerable.GroupBy method, you must supply at least a function that provides the grouping key. The overload used here accepts three functions: the first provides the grouping key, the second provides the resulting item in the group, and the third provides the contents of each group in the output sequence (in this example, "header" information and an Items property that provides access to the grouped items).

The code in Figure 12 groups a sequence of products by the category ID and displays the results. Running the code creates the output at the bottom of the figure, grouped by category ID.

Figure 12 Group by Category

' From ComplexSetOperationDemo in the sample:
Dim groupedProducts = _
  (From prod In db.Products _
   Where prod.UnitsInStock < 10).GroupBy( _
    Function(prod) prod.CategoryID, _
    Function(prod) _
      New With {prod.ProductName, prod.UnitsInStock}, _
    Function(catID, group) New With _
        {.CategoryID = catID, .Count = group.Count(), _
         .Items = group})

sw = New StringWriter
For Each grouping In groupedProducts
  sw.WriteLine("Category {0}: {1} item(s)", _
    grouping.CategoryID, grouping.Count)
  For Each item In grouping.Items
    sw.WriteLine("{0}{1} ({2})", vbTab, _
      item.ProductName, item.UnitsInStock)
  Next
Next

RESULTS
Category 2: 3 item(s)
  Chef Anton's Gumbo Mix (0)
  Northwoods Cranberry Sauce (6)
  Louisiana Hot Spiced Okra (4)
Category 3: 2 item(s)
  Sir Rodney's Scones (3)
  Scottish Longbreads (6)
Category 4: 2 item(s)
  Gorgonzola Telino (0)
  Mascarpone Fabioli (9)
Category 6: 3 item(s)
  Alice Mutton (0)
  Thüringer Rostbratwurst (0)
  Perth Pasties (0)
Category 7: 1 item(s)
  Longlife Tofu (4)
Category 8: 1 item(s)
  Rogede sild (5)

Finally, the Enumerable.GroupJoin method combines the functionality of the GroupBy and Join methods. This method correlates two sequences based on matching keys and groups the results in the right-hand sequence. Nothing quite like this method exists in standard database functionality, and you may find this method useful when working with related sets of data.

To call the GroupJoin method, you must supply three functions. The first function defines the key on the parent side. The second function defines the key on the child side. The third function projects the data, given the parent row and the child group, into the output format for the "one" set of rows. The two keys must be of the same type, and if the type isn't a simple type, you must provide a custom comparer so that the keys can be compared.

The sample code in Figure 13 joins category and product sequences using the category ID as the correlating key. The code specifies that the output rows contain a Category property (which contains the entire category row), a Count property (which contains the number of child rows), and an Items property (which exposes the group of child rows). The sample iterates through each grouping, prints out information about the header row, and then displays information about each child row. After running the sample, the StringWriter contains the results shown at the bottom of Figure 13.

Figure 13 Joining Category and Product Sequences

' From ComplexSetOperationDemo in the sample:
categories = db.Categories.Where(Function(cat) cat.CategoryID < 5)
products = db.Products

Dim groupedResults = _
  categories.GroupJoin(products, _
  Function(cat) cat.CategoryID, _
  Function(prod) prod.CategoryID, _
  Function(cat, group) _
    New With {.Category = cat, .Count = group.Count, _
      .Items = group})

sw = New StringWriter
For Each grouping In groupedResults
  sw.WriteLine("{0}: {1} item(s)", _
    grouping.Category.CategoryName, grouping.Count)
  For Each item In grouping.Items
    sw.WriteLine("{0}{1} ({2})", vbTab, _
      item.ProductName, item.UnitsInStock)
  Next
Next

RESULTS
Beverages: 12 item(s)
  Chai (39)
  Chang (17)
  ' Items removed here. . .
  Lakkalikööri (57)
Condiments: 12 item(s)
  Aniseed Syrup (13)
  Chef Anton's Cajun Seasoning (53)
  ' Items removed here. . .
  Original Frankfurter grüne Soße (32)
Confections: 13 item(s)
  Pavlova (29)
  ' Items removed here. . .
  Scottish Longbreads (6)
Dairy Products: 10 item(s)
  Queso Cabrales (22)
  ' Items removed here. . .
  Mozzarella di Giovanni (14)

Summing It Up

Through its clever use of extension methods, the Enumerable class makes working with many different types of collections, lists, and sequences simpler. In addition, the Enumerable class provides the "heart" of working with LINQ queries. If you're building applications that use data from a database or any type of data structure in Visual Studio® 2008, it's worth completely internalizing the capabilities of the rich Enumerable class. You'll create more efficient code, and you'll create it faster. I constantly search for ways to make my code more declarative, and the Enumerable class makes it far easier to avoid looping and allows you to write less code.

Insights: Generics and Type Inference

Ken's been demonstrating a number of interesting ways to use lambdas and extension methods together. Let's take a look at the example where he selects all the Strings that start with the letter "A":

Dim results = query.Where(Function(item) item.StartsWith("A"))

This defines a lambda expression that operates over each element in the sequence. But how does the compiler know that item has a StartsWith method? How does the IDE know to provide IntelliSense® after "item dot"? Is that a late-bound call?

No. What's happening here is that the compiler's doing some work on your behalf through a process called "lambda parameter type inference." Let's look at the signature for the Where extension method, shown here:

<Extension()> _
  Function Where(Of T)(source As IEnumerable(Of T), _
  predicate As Func(Of T, Boolean)) As IEnumerable(Of T)

Now when the compiler sees a call to this extension method, it converts it to something like this:

Dim results = Enumerable.Where(query, Function(item) _
  item.StartsWith("A"))

Notice that no generic parameters were passed to the method—using the (Of T) syntax. You could pass them explicitly, but if you don't, the compiler will try to infer the generic type(s) based on the arguments passed in. In this case, the compiler has one generic parameter, "T," which is used in three places (in two arguments and a return value). It also has one type hint, which comes from the parameter query.

The query's type is IEnumerable(Of String), and the corresponding parameter on Where has type IEnumerable(Of T). Thus the compiler is able to infer that T must be of type String.

As a result, the compiler does not need to be given a type for the lambda expression's parameter since it already knows it from the context. Effectively, the lambda is treated exactly the same as if you'd typed this:

Function(item As String) item.StartsWith("A")

This process works the same for lambdas that take multiple parameters, saving you keystrokes and making your code more concise.

Now consider a slightly more complicated example:

Function Sum(Of T)(ByVal x As T, ByVal y As T) As T
  ...
End Function

Dim total = Sum(1, 2.0)

What type should T be now? There are two type hints: one that indicates it's an Integer and one that indicates it's a Double. If you hover over Sum, you'll see that the compiler has inferred T to be a Double (which is a type wide enough to handle both an integer and a double).

In Visual Basic 2005 this would have caused a compile error (try it!); but in Visual Basic 2008 the compiler uses the new dominant type algorithm to select a type that will work for all arguments. How this algorithm works is actually pretty complicated, but you can read a simplified explanation of it in Section 8.13 of the Visual Basic 9.0 Language Specification (see go.microsoft.com/fwlink/?LinkId=123647 ).

What about this case?

Dim total = Sum(1D, 2)

Here the compiler will infer T to be of type Decimal since it's one of the types provided, and converting an Integer to a Decimal is a widening conversion (see go.microsoft.com/fwlink/?LinkId=123649 for more on widening and narrowing conversions).
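If you'd rather check the inference in running code than by hovering in the editor, a sketch like the following may help. The InferredTypeName method is hypothetical; it has the same shape as Sum but simply reports the type argument that the compiler chose. Assuming inference works as described above, the first two calls should display Double and Decimal:

```vb
Imports System

Module InferenceSketch
  ' Hypothetical generic method with the same shape as Sum; it
  ' returns the name of the type argument chosen for T.
  Function InferredTypeName(Of T)(ByVal x As T, ByVal y As T) As String
    Return GetType(T).Name
  End Function

  Sub Main()
    ' Integer and Double arguments:
    Console.WriteLine(InferredTypeName(1, 2.0))

    ' Decimal and Integer arguments:
    Console.WriteLine(InferredTypeName(1D, 2))

    ' You can still pass the type argument explicitly; this
    ' call displays Int64:
    Console.WriteLine(InferredTypeName(Of Long)(1, 2))
  End Sub
End Module
```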

—Jonathan Aneja, Microsoft Visual Basic Team

Send your questions and comments for Ken to basics@microsoft.com.

Ken Getz is a Senior Consultant with MCW Technologies and a courseware author for AppDev (www.appdev.com). He is coauthor of ASP.NET Developers Jumpstart, Access Developer's Handbook, and VBA Developer's Handbook, 2nd Edition. Reach him at keng@mcwtech.com.