Handling Spaces (MGrammar)

[This content is no longer valid. For the latest information on "M", "Quadrant", SQL Server Modeling Services, and the Repository, see the Model Citizen blog.]

This tutorial builds on the preceding one, Hello World (MGrammar), which introduces creating a domain-specific language (DSL) with the MGrammar feature of the Microsoft code name “M” language.

In this tutorial, you learn how to ignore extra spaces within your input text file. This tutorial builds a DSL that parses a text file that lists all the types contained in a .NET Framework assembly. You can create such a file by using the .NET Framework Reflection API against an assembly. The text file obeys the following rules:

  • Each line starts with the string TYPE.

  • The line contains the string Name= followed by the name of the type.

  • After the Name field, there is a string Access= followed by one of the following strings: public, private, protected, or internal.

  • The Access field is followed by a string Email= followed by an e-mail address.

  • Each of the preceding fields is followed by a one-space separator. This is the restriction that this tutorial removes.

A typical file might look like the following example.

TYPE Name=System.String Access=public Email=janedoe@contoso.com 
TYPE Name=System.Int32 Access=private Email=bbrown@contoso.com 
TYPE Name=System.Byte Access=public Email=johndoe@contoso.com 
TYPE Name=System.Boolean Access=public Email=janedoe@contoso.com 

In the preceding tutorial, you learned how to use “Intellipad” to create a DSL that can recognize a fixed text string.

In this tutorial, you will learn to do the following:

  • Add additional syntax rules that enable you to parse separate fields in the input.

  • Add an interleave rule that specifies whitespace characters to be ignored when parsing the input.

In the tutorial after this one, you will learn how to use the token keyword to define constraints on each field's contents, so that the fields can hold variable values instead of fixed strings. At that point, you will have a DSL that might be useful.

A DSL that Recognizes One Line of Input

  1. First we build a DSL that recognizes a single line of input, much like the preceding tutorial did. Open the “Intellipad” editor by clicking Start, All Programs, Microsoft Oslo SDK, Tools, and Intellipad (Samples Enabled).

  2. Add the following code, which defines a DSL that recognizes a single fixed line of input.

    
    module Types 
    {
       language Parser
       {
          syntax Main = "TYPE Name=System.String Access=public Email=janedoe@contoso.com";
       } 
    }
    
  3. In “Intellipad”, click the DSL menu entry, and select Tree Preview. An Open dialog box appears. Create a new text file, and note that “Intellipad” now displays three additional panes: one on the left in DSL Input Mode, one on the right in Output Mode, and an error pane at the bottom.

  4. To test this language, enter the following line into the left DSL Input Mode pane.

    TYPE Name=System.String Access=public Email=janedoe@contoso.com 
    
  5. Note that there are no errors, because this is the single line of text that this DSL currently recognizes. The output generated in the Output Mode pane is a sequence that contains one element: the entire input string.

  6. In this tutorial, you enable your DSL to handle extra spaces between the different fields. Try changing the input by inserting an extra space between TYPE and Name=. Note the errors that are generated. Undo the change.

Breaking the Input into Fields

  1. Now, enable this language to parse input that may contain extra spaces between the Name, Access, and Email fields. As a first step, add the following rules, which specify the individual pieces of an input line. Add them right after the Main syntax statement.

            syntax Type = "TYPE";
            syntax Name = "Name=System.String";
            syntax Access = "Access=public";
            syntax Email = "Email=janedoe@contoso.com";
    
  2. Next change the Main syntax declaration to reference the other syntax declarations.

          syntax Main = Type Name Access Email;
    

    Note that errors are generated: the grammar no longer accepts the spaces between the fields, because Main is now a sequence of four rules with nothing between them. You can make it accept a single space by defining a Space syntax rule and inserting references to Space between the field names in the Main syntax rule. The following code shows the required changes.

             syntax Main = Type Space Name Space Access Space Email;
                                                 ...........
             syntax Space = "\u0020";
    

    Note the errors disappear and the output pane on the right shows a sequence that contains multiple fields.

  3. Now try changing the input by inserting an extra space between the TYPE and Name= fields. Note the errors that are generated.
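Putting the pieces from steps 1 and 2 together, the grammar at this stage can be sketched as follows. The sketch only assembles rules already shown above, assuming that the elided lines in step 2 stand for the four field rules from step 1. No new rule names are introduced.

```
module Types
{
   language Parser
   {
      // Main names every field and every separator explicitly.
      syntax Main = Type Space Name Space Access Space Email;
      syntax Type = "TYPE";
      syntax Name = "Name=System.String";
      syntax Access = "Access=public";
      syntax Email = "Email=janedoe@contoso.com";
      // Matches exactly one U+0020 (space) character.
      syntax Space = "\u0020";
   }
}
```

Because each Space reference matches exactly one space character, this version accepts the original input line but still rejects a line with two spaces between TYPE and Name=, which is what step 3 demonstrates.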

Handling Arbitrary Amounts of Whitespace

  1. Whitespace is a term that refers to characters you want to ignore, such as spaces, returns, line feeds, possibly tab characters, and others.

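  2. The grammar still fails when the input contains more than one space between fields. MGrammar handles arbitrary amounts of whitespace with the interleave keyword: text that matches an interleave rule is skipped wherever it appears between the elements of a syntax rule. Add the following line of code, and change the Main statement back to its previous form by deleting all the references to the Space syntax rule. Keep the Space rule itself, because the interleave rule refers to it.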

            interleave Whitespace = Space;
    
  3. Try adding multiple spaces to the input line and observe that this does not cause any errors.

    In a later tutorial, you will learn how to specify additional whitespace characters such as tabs or returns.

Example

The following is the complete “M” code sample for this tutorial.

module Types 
{
   language Parser
   {
        syntax Main = Type Name Access Email;
        syntax Type = "TYPE";
        syntax Name = "Name=System.String";
        syntax Access = "Access=public";
        syntax Email = "Email=janedoe@contoso.com";        
        syntax Space = "\u0020";
        interleave Whitespace = Space;
   } 
}
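
As a rough sketch of how the interleave rule might be extended to skip tab characters as well as spaces, the sample above could be changed as follows. The Tab rule name is hypothetical and not part of the original sample, and using the | alternation operator inside the interleave rule is an assumption here. The later tutorial remains the authoritative reference for additional whitespace characters.

```
module Types
{
   language Parser
   {
      syntax Main = Type Name Access Email;
      syntax Type = "TYPE";
      syntax Name = "Name=System.String";
      syntax Access = "Access=public";
      syntax Email = "Email=janedoe@contoso.com";
      syntax Space = "\u0020";
      // Hypothetical rule (not in the original sample): U+0009 is the tab character.
      syntax Tab = "\u0009";
      // Assumption: either a space or a tab is skipped between fields.
      interleave Whitespace = Space | Tab;
   }
}
```

With this change, an input line that separates TYPE and Name=System.String with a tab would be expected to parse the same way as the single-space version.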

See Also

Tasks

Hello World (MGrammar)

Other Resources

interleave