Export (0) Print
Expand All

How to: Join Content from Dissimilar Files (LINQ)

This example shows how to join data from two comma-delimited files that share one common value that is used as a matching key. This technique can be useful if you have to combine data from two spreadsheets, or data from a spreadsheet and a file that has another format, into a new file. You can also modify this example to work with any kind of structured text.

To create the data files

  1. Copy these lines into a file that is named scores.csv and save it to the same folder as your project file. This file represents spreadsheet data. Column 1 is the student's ID, and columns 2 through 5 are test scores.

    111, 97, 92, 81, 60
    112, 75, 84, 91, 39
    113, 88, 94, 65, 91
    114, 97, 89, 85, 82
    115, 35, 72, 91, 70
    116, 99, 86, 90, 94
    117, 93, 92, 80, 87
    118, 92, 90, 83, 78
    119, 68, 79, 88, 92
    120, 99, 82, 81, 79
    121, 96, 85, 91, 60
    122, 94, 92, 91, 91
    
  2. Copy these lines into a file that is named names.csv and save it to the same folder as your project file. This file represents a spreadsheet that contains the student's last name, first name, and student ID.

    Omelchenko,Svetlana,111
    O'Donnell,Claire,112
    Mortensen,Sven,113
    Garcia,Cesar,114
    Garcia,Debra,115
    Fakhouri,Fadi,116
    Feng,Hanying,117
    Garcia,Hugo,118
    Tucker,Lance,119
    Adams,Terry,120
    Zabokritski,Eugene,121
    Tucker,Michael,122
    

class JoinStrings
{
    static void Main()
    {
        // Join content from dissimilar files that contain 
        // related information. names.csv contains the student name 
        // plus an ID number. scores. csv contains the ID and a  
        // set of four test scores. The following query joins 
        // the scores to the student names by using ID as a 
        // matching key. 

        string[] names = System.IO.File.ReadAllLines(@"../../../names.csv");
        string[] scores = System.IO.File.ReadAllLines(@"../../../scores.csv");


        // Name:    Last[0],       First[1],  ID[2],     Grade Level[3] 
        //          Omelchenko,    Svetlana,  11,        2 
        // Score:   StudentID[0],  Exam1[1]   Exam2[2],  Exam3[3],  Exam4[4] 
        //          111,           97,        92,        81,        60 

        // This query joins two dissimilar spreadsheets based on common ID value. 
        // Multiple from clauses are used instead of a join clause 
        // in order to store results of id.Split.
        IEnumerable<string> scoreQuery1 =
            from name in names
            let nameFields = name.Split(',')
            from id in scores
            let scoreFields = id.Split(',')
            where nameFields[2] == scoreFields[0]
            select nameFields[0] + "," + scoreFields[1] + "," + scoreFields[2] 
                   + "," + scoreFields[3] + "," + scoreFields[4];

        // Pass a query variable to a method and 
        // execute it in the method. The query itself 
        // is unchanged.
        OutputQueryResults(scoreQuery1, "Merge two spreadsheets:");

        // Keep console window open in debug mode.
        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }

    static void OutputQueryResults(IEnumerable<string> query, string message)
    {
        Console.WriteLine(System.Environment.NewLine + message);
        foreach (string item in query)
        {
            Console.WriteLine(item);
        }
        Console.WriteLine("{0} total names in list", query.Count());
    }
}
/* Output:
Merge two spreadsheets:
Adams, 99, 82, 81, 79
Fakhouri, 99, 86, 90, 94
Feng, 93, 92, 80, 87
Garcia, 97, 89, 85, 82
Garcia, 35, 72, 91, 70
Garcia, 92, 90, 83, 78
Mortensen, 88, 94, 65, 91
O'Donnell, 75, 84, 91, 39
Omelchenko, 97, 92, 81, 60
Tucker, 68, 79, 88, 92
Tucker, 94, 92, 91, 91
Zabokritski, 96, 85, 91, 60
12 total names in list
 */

  • Create a Visual Studio project that targets the .NET Framework version 3.5. By default, the project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace. In C# projects, add a using directive for the System.IO namespace.

  • Copy this code into your project.

  • Press F5 to compile and run the program.

  • Press any key to exit the console window.

Community Additions

ADD
Show:
© 2014 Microsoft