Object Model for "M" Values

[This content is no longer valid. For the latest information on "M", "Quadrant", SQL Server Modeling Services, and the Repository, see the Model Citizen blog.]

 

Clemens Szyperski

November 3, 2009

1.  Introduction

Developing and utilizing domain-specific languages (DSLs) in the SQL Server Modeling CTP typically involves authoring the grammar in “Intellipad” then using the code name “M” command-line tools to process input files against the compiled grammar. The output of this process is “M” language constructs commonly referred to as “M” values. Often, this output takes the form of an abstract syntax tree (AST) in which the highly variable structures of a DSL have been organized into a standardized graph representation. Data represented in “M” values can be checked against schema authored in “M” and can then be loaded into the SQL Server Modeling Services repository database using the mx.exe command-line tool.

Of course, a command-line tool is meant for interactive use and automation scripts. Applications and runtimes also need a means to process “M” values output programmatically, a process known as dynamic parsing. This allows a user to directly enter DSL input at runtime which the application then processes without the need to utilize intermediate command-line tools.

The Object Model (OM) described in this document is what allows this to happen. It’s the lowest-level runtime model to manage “M” values in .NET applications, that is, processing input against a grammar through .NET classes written for this purpose; and it is the same set of classes that “Intellipad” and mx.exe use themselves. Developers will utilize this object model whenever dynamic parsing is needed. Note that it replaces an older object model as described in §1.4 below.

Beyond the dynamic-parsing scenarios outlined above, the object model is a general .NET framework to manipulate arbitrary graph-shaped data. Thus, an “M” values graph is really the “data profile” of “M” with a typical instance looking like this:–

// Populate a small village with some people

Villagers => {

  Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },

  Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },

  Charly => Person { Name => 'Charlotte', Age => 12 }

},

HaveSpouses => { Villagers.Rich, Villagers.Jenn }

All aspects of “M”, including the output structure, are still evolving and, in the November 2009 CTP, several not-yet-finalized language details appear only in “M” values. Specifically, the above example exhibits the following details:

  1. An “M” values instance is a sequence of optionally labeled edges. An edge is a pair (label, targetNode). In the example above, there are two such top-level edges, labeled Villagers and HaveSpouses.
  2. Edge labels appear to the left of the binding operator (=>). An identifier appearing to the right of the binding operator and before the value being bound is called a brand.
  3. “M” values supports a fairly rich set of “M” intrinsic types for leaf values. (See §2.1.3 for the supported “M” types and how they map to CLR types.)
  4. “M” values supports comments (//… and /*…*/ style) and verbatim text (@"…").

Caution: As a consequence of this staging of language changes, the “M” values syntax is temporarily incompatible with that used for graphs (extent initializers) in “M” and also, in different ways, from that used for graph construction in the right-hand side of “M” language rules. All this is going to be cleaned up in the future.

Architecturally, the object model comprises three main subsystems: the graph model, graph stores, and graph readers/writers. The graph model is about construction, traversal/inspection, and updating of graphs. Graph stores provide the backing implementations for different graph representations and bridges to external stores. Readers/writers support the mapping of graphs from and to streams.

This introduction comprises an overview of those three subsystems. The following sections cover each in full detail, followed by a set of fully worked examples. Some readers may prefer to skip forward to the examples sections (§2.7, §3.5, and §4.3) before delving into the details of the design.

The overall object model strikes a balance between usability and convenience on the one hand and implementation flexibility and enabling high performance on the other. Performance, both at the level of individual operations and at the level of scalability to very large graphs, is a central feature of the design. Graphs are a very general and low-level abstraction that many complex designs can be built on top of, but only if the supporting infrastructure can bear such load.

Heads-up: To meet this two-pronged goal, the design is unusual in that it relies on a family of struct types beyond what is common in the .NET framework. Readers unfamiliar with the various consequences of using structs may want to pause and pay attention to the related detail. The key property of the design is that almost all of the structs (but enumerators) are stateless; making their use convenient and straightforward without requiring constant attention to the fact that struct types are being used.

1.1.  Graph Model

An “M” values instance is a directed, edge-labeled graph. Each node is either atomic, holding a single value, or non-atomic, holding a multiset of outgoing edges. Nodes carry an optional brand.  Edges outgoing from a node are either unlabeled or uniquely labeled, relative to their source node. A value held by an atomic node is either of an “M” values classified type or of unclassified type (arbitrary other CLR type).

 

A graph allow for sharing: a single node can be reachable via more than one path (where a path is a sequence of one or more edges). A graph can contain loops: an edge can refer back to its originating node. A graph can contain cycles: a path can refer back to its originating node.

 

1.1.1. Labels and Brands

Edges may be associated with a nominal label, where labels are either strings or integers. The interpretation of labels is left open in the graph model, except that labels are mandated to be unique among edges outgoing from the same source node. A typical interpretation uses labels to designate roles of successor nodes.

Nodes may be associated with a nominal brand, where brands are strings. The interpretation of brands is left open in the graph model, but a typical interpretation uses brands to associate a nominal concept, such as a type or class, with a node.

1.1.2. Graph patterns

The graph model introduces no inherent ordering on successors of a node. Thus, if multiple unlabeled edges originate in a given node, then the successors (the target nodes of these edges) are distinguishable only by their content. It is valid to have any number of fully indistinguishable successors – essentially representing the elements of a multiset as unlabeled successors of a node.

A node that has only labeled outgoing edges, all of which are strings, represents a record. The successors are the members of the record; the labels are the member names.

A node that has at least two successors, all of which are the targets of labeled edges, using integer labels in the range from zero to the number of successors minus one, represents a tuple. The successors are the members of the tuple; the labels are the indices into the tuple.

A node that has only labeled outgoing edges, with exactly the two labels “Head” and “Tail”, represents a list. The successor labeled “Head” holds the first value in the list; the successor labeled “Tail” holds the remaining values in the list. This is the familiar cell-based encoding of lists.

A node with only unlabeled outgoing edges represents an unordered collection. The successors are the members of that collection.

A node with both labeled and unlabeled outgoing edges represents a composite, which is the most general pattern supported.

A node with no successors represents an empty collection, but also serves to represent an empty list, an empty record (a record with zero members), and an empty composite.

The following table summarizes the graph patterns, their corresponding notation in “M” values, and their associated labeling constraints. For tuples and lists, the expanded (and equivalent) notation that shows the representation as labeled graphs is also shown over a grey backdrop.

Graph Pattern

“M” Values Notation

Labeling Constraints

Composite

{ 42, Name => "Jack" }

Collection

Primes { 1, 2, 3, 5, 7 }

No labels

Record

Jack { Age => 42, Name => "Jack" }

All labeled, all string labels

Tuple

( 42, "Jack" )

{ 0 => 42, 1 => "Jack" }

All labeled, all integer labels, labels in range [0..count)

List

Fibonacci [ 1, 1, 2 ]

Fibonacci { Head => 1,

  Tail => { Head => 1,

  Tail => { Head => 2,

  Tail => {} } } }  

All labeled; exactly the labels “Head” and “Tail”

1.1.3. Symbolic References

Normally, all relationships among nodes in a graph are expressed using edges. However, it is also possible to hold unresolved symbolic references to other nodes in a graph. Such references are captured as special atomic values (of type GraphReference). As the one exception to the branding mechanism, atomic nodes holding such symbolic references cannot be branded.

Rationale: nodes holding symbolic references are seen as place holders for the actual target node that they refer to. Reference resolution substitutes the node located by the resolution procedure for the target node that held the reference. This way, the edge label is retained and all properties of the reference target are shared, including the reference target’s brand, if any.

A symbolic reference is a pair (isGlobal, labelSequence). The isGlobal flag determines whether the reference is global or local. A global reference is resolved by starting from a separately defined global source node; if no such node is defined for a given graph, then global references are undefined.

A local reference is resolved by following the label sequence, in order, as designated by the reference, starting at the node holding the symbolic reference.

In the special case where a global source node is defined and where a local reference’s first label does not resolve from the node holding the reference, a special scope determination procedure is applied to determine the node form where the reference is to be resolved. This procedure constructs a path from the global source node to the node holding the reference. If no such path can be constructed, reference resolution fails. If multiple such paths can be constructed, then an arbitrary one of them is chosen. The node closest to the node holding the reference that has an outgoing edge labeled with the first label of the reference at hand is chosen as the scope node. Resolution proceeds from that node.

The textual syntax for symbolic references is [ label ] ('.' label)*. A global reference has no initial label, that is, it starts with a dot: .Jill or .Jack.Spouse.Age. A local reference starts with a label: Name or Spouse.Name.

 

In the example above, the symbolic references Spouse.Age and .Jack.Spouse can both be resolved, assuming that the shaded node is the designated global source node. Note that references can be resolved “through” other references. In the example, .Jill.Spouse.Age resolves by first resolving the encountered reference .Jill and then resolving Age relative to that node.

Global alternatives for Spouse.Age would be .Jack.Age, .Jill.Spouse.Age, and so on. Local alternatives would be Spouse.Spouse.Spouse.Age, and so on. There are no local alternatives for the global reference in the example, .Jack.Spouse.

1.2. Graph Stores

For most cases of programmatic creation, traversal, use, and modification of graphs, the types Node and Edge form the main points of access.

For implementers of custom graph “stores” (a term used broadly for all graph implementations, whether in-memory or virtualized over an external store), the type GraphStore is the center of the design. When creating graphs using Node and Edge without reference to any specific GraphSt0re, the system provides a shadow store implementation.

The architecture of the object model provides substantial flexibility for implementers of graph stores, catering for everything from very simple store implementations that just use a few CLR types to represent graphs as objects to highly sophisticated implementations that virtualize over an external store and use dense, non-object-based cache structures. The external use of such stores is the same in all cases, governed by the fixed types of the graph model.

1.3. Graph Readers and Writers

The types GraphReader and GraphWriter are used to map graphs from and to stream formats. Specific subtypes implement the reading/writing of specific stream representations; most importantly, GraphTextReader and GraphTextWriter implement the mapping from/to text streams.

1.4. Transition from Old to New Object Model

The “M” values object model covered in this document is meant to replace the older design around type IGraphBuilder, GraphBuilder, etc. This older design is to be deprecated and then removed in future releases. At the current point in time (November 2009), some of the larger design still depends on IGraphBuilder; most notably, DynamicParser does.

To enable a staged migration from the old to the new design, class NodeGraphBuilder is included in the current release. It, too, will be removed as soon as IGraphBuilder goes. NodeGraphBuilder bridges from the old IGraphBuilder API to the new GraphStore API.

In the example below, Contact, Alias, and Number are brands and Info is an edge label. Class NodeGraphBuilder maps labels in the IGraphBuilder world to brands in the new design unless an entity, in the IEntityBuilder sense, is built, in which case the entity-member label is mapped to an edge label in the new design.

string contactGrammar = @"

  module M {

    language Contacts {

      syntax Main =

        'Contact' ':' a:Alias => Contact { Info => Alias { a } }

      | 'Contact' ':' p:Number => Contact { Info => Number { p } };        

      token Alias = ('A'..'Z' | 'a'..'z')+;   

      token Number = ('0'..'9' | '-')+;

    }

  }";

CompilationResults results = Compiler.Compile(

  new CompilerOptions {

    Sources = {

      new TextItem {

        Reader = new StringReader(contactGrammar),

        ContentType = TextItemType.MGrammar }

    }});

DynamicParser parser = results.ParserFactories["M.Contacts"].Create();

parser.GraphBuilder = new NodeGraphBuilder();

With the parser’s graph builder set to an instance of NodeGraphBuilder, the results of DynamicParser.Parse calls are always of type Node (not just object, as in the old design). The following examples show the parsing of sample input sentences (underlined) and their interpretation using the (new) graph model.

Node contact = (Node)parser.Parse(

  new StringTextStream("Contact:gatsby"), null);

Node info = contact.Edges["Info"];

Node infoSingleValue = info.ViewAllNodes().Single();

Assert.AreEqual(contact.Brand, "Contact");

Assert.AreEqual(info.Brand, "Alias");

Assert.AreEqual(infoSingleValue.AtomicValue, "gatsby");

contact = (Node)parser.Parse(

  new StringTextStream("Contact:555-1212"), null);

info = contact.Edges["Info"];

infoSingleValue = info.ViewAllNodes().Single();

Assert.AreEqual(contact.Brand, "Contact");

Assert.AreEqual(info.Brand, "Number");

Assert.AreEqual(infoSingleValue.AtomicValue, "555-1212");

2. Graph Model

The type Node is the design center for almost all cases where graphs are being created, traversed, read, written, or updated. Node is a struct and leads to a family of further struct types that, together, enable allocation- and copy-free operations over graphs while providing a convenient type surface that is friendly for Linq and other modern patterns, such as initializers, as supported by C# 3.0.

Rationale: These struct types contain no implementation or representation decisions of their own and delegate all such decisions through type GraphStore to specific implementations. It is useful to think of GraphStore as a large method dispatch type that provides convenient defaults for implementers of graph stores, but that is not meant to be used directly. Access to GraphStore’s through Node and associated types is much more convenient and equally efficient. In fact, each of the struct types in this design contains exactly two reference fields: a reference to the underlying graph store and a store-specific key object. The only exceptions are the enumerator structs that hold additional enumeration state.

2.1. Node

The Node struct has a large surface area. For the purposes of documentation, it is broken up into pieces in this section and its subsections. The following shows the main surface of Node, with references to subsections for elided members.

namespace System.Dataflow {

  public struct Node : IEquatable<Node>, INotifyPropertyChanged {

    public static readonly Node Default;

    // Constructors, listed in §2.1.4

    // Explicit and implicit conversions, listed in §2.1.5

    // Read/Write methods, listed in §2.1.6

    // Navigation methods, listed in §2.1.7

    public static bool operator ==(Node left, Node right);

    public static bool operator !=(Node left, Node right);

    public object AtomicValue { get; set; }

    public Identifier Brand { get; set; }

    public EdgeCollection Edges { get; }

    public bool IsDefault { get; }

    public bool IsListOrEmptyCollection { get; }

    public bool IsReadOnly { get; }

    public bool IsRecordOrEmptyCollection { get; }

    public NodeKind NodeKind { get; }

    public GraphStore Store { get; }

    public object StoreKey { get; }

    public event PropertyChangedEventHandler PropertyChanged { add; remove; }

    public static GraphStore CreateDefaultStore();

    public static Node CreateNodeForStoreKey(GraphStore store, object key);

    public override bool Equals(object obj);

    public bool Equals(Node other);

    public override int GetHashCode();

    public override string ToString();

    public NodeCollection ViewAllNodes();

    public GraphList ViewAsList();

    public GraphDictionary ViewAsRecord();

    public GraphList ViewAsTuple();

    public GraphList ViewOrderedNodes();

    public NodeCollection ViewUnorderedNodes();

  }

}

The properties of Node directly reflect the structure of the graph model; they are covered in detail in §2.1.1.

2.1.1. Node Properties

public object AtomicValue { get; set; }

public Identifier Brand { get; set; }

public EdgeCollection Edges { get; }

public bool IsDefault { get; }

public bool IsListOrEmptyCollection { get; }

public bool IsReadOnly { get; }

public bool IsRecordOrEmptyCollection { get; }

public NodeKind NodeKind { get; }

public GraphStore Store { get; }

public object StoreKey { get; }

Atomic nodes have an AtomicValue, non-atomic nodes have an atomic value of null. All nodes, but atomic nodes holding a symbolic reference, can carry a Brand (§1.1.1). The kind of a node is reflected by its NodeKind property (§2.1.2).

Non-atomic nodes have a collection of outgoing Edges; for atomic nodes, this collection is always empty. For a discussion of the rich surface provided over edge collections, see §.

Since non-atomic nodes with an empty collection of outgoing edges are reported as having a node kind of Collection, the tests IsListOrEmptyCollection and IsRecordOrEmptyCollection are provided to handle the case where a node of interest should follow the list or record pattern, but where the empty case is permissible.

If the backing store does not support updates of a node, it is marked as IsReadOnly.

A node that has been constructed using Node’s default constructor , is marked as IsDefault. A node that has been constructed using any but Node’s default constructor will have a valid store (Store) and store key (StoreKey). The details of those two concepts can be found in §3.1, but can be ignored for all but advanced scenarios.

2.1.2. NodeKind

Nodes are of one of a small number of defined kinds.

namespace System.Dataflow {

  public enum NodeKind {

    Undefined, Atomic, Composite, Collection, List, Record, Tuple

  }

}

Only nodes constructed using the Node default constructor will be of kind Undefined.

2.1.3. Classified value types

Atomic nodes hold a value of a classified type: a type that is part of the profile’s set of atomic types. The following table shows how the classified CLR types map to “M” types. C# synonyms, as uses in the type excerpts in this document, are also shown.

C# Synonym CLR Type “M” Type

bool

System.Boolean

Logical

byte[]

System.Byte[]

Binary

Date

System.Date

Date

DateTime

System.DateTime

DateTime

DateTimeOffset

System.DateTimeOffset

DateTimeOffset

Time

System.Time

Time

decimal

System.Decimal

Decimal28

double

System.Double

Double

GraphReference

System.Dataflow.GraphReference

Guid

System.Guid

Guid

int

System.Int32

Integer32

long

System.Int64

Integer64

Time

System.Time

Time

Note that types System.Date and System.Time are not part of the .NET framework (yet) and found in System.Dataflow.dll.

2.1.4. Node Constructors

Type Node has a constructor overloads that fall into two groups: constructors for atomic value nodes and constructors for non-atomic nodes. These groups are covered in that order below.

Constructors for atomic nodes take value of one of the classified types (see §2.1.3) and, optionally, a brand and a store implementation to use. If the value type is not known statically, then the object-typed constructors can be used.

public Node(object constant);

public Node(Identifier brand, object constant);

public Node(GraphStore store, Identifier brand, object constant);

When holding a strongly typed value, the following constructor overloads can be used.

Rationale: Using these constructors enables the underlying graph store to avoid boxing. Whether values get boxed depends on the specific store implementation.

public Node(bool constant);

public Node(Identifier brand, bool constant);

public Node(GraphStore store, Identifier brand, bool constant);

public Node(Identifier brand, byte[] constant);

public Node(GraphStore store, Identifier brand, byte[] constant);

public Node(Date constant);

public Node(Identifier brand, Date constant);

public Node(GraphStore store, Identifier brand, Date constant);

public Node(DateTime constant);

public Node(Identifier brand, DateTime constant);

public Node(GraphStore store, Identifier brand, DateTime constant);

public Node(DateTimeOffset constant;

public Node(Identifier brand, DateTimeOffset constant;

public Node(GraphStore store, Identifier brand, DateTimeOffset constant);

public Node(decimal constant);

public Node(Identifier brand, decimal constant;

public Node(GraphStore store, Identifier brand, decimal constant);

public Node(double constant);

public Node(Identifier brand, double constant);

public Node(GraphStore store, Identifier brand, double constant);

public Node(GraphReference constant);

public Node(GraphStore store, GraphReference constant);

public Node(Guid constant);

public Node(Identifier brand, Guid constant);

public Node(GraphStore store, Identifier brand, Guid constant);

public Node(int constant);

public Node(Identifier brand, int constant);

public Node(GraphStore store, Identifier brand, int constant);

public Node(long constant);

public Node(Identifier brand, long constant);

public Node(GraphStore store, Identifier brand, long constant);

public Node(string constant);

public Node(Identifier brand, string constant);

public Node(GraphStore store, Identifier brand, string constant);

public Node(Time constant);

public Node(Identifier brand, Time constant);

public Node(GraphStore store, Identifier brand, Time constant);

The following constructors for non-atomic nodes take a collection of outgoing edges in combination with an optional brand. These constructors are intended to support the effective cloning of non-atomic nodes.

public Node(EdgeCollection edges);

public Node(Identifier brand, EdgeCollection edges);

The following constructors supported graph patterns are composite, collection, record, tuple, and list. If left unspecified, composites are built by default. Optionally, the brand and store to use can be specified. The successors are specified as enumerables or params of edges.

public Node(IEnumerable<Edge> edges);

public Node(Identifier brand, IEnumerable<Edge> edges);

public Node(NodeKind kind, IEnumerable<Edge> edges);

public Node(Identifier brand, NodeKind kind, IEnumerable<Edge> edges);

public Node(GraphStore store, IEnumerable<Edge> edges);

public Node(GraphStore store, Identifier brand, IEnumerable<Edge> edges);

public Node(GraphStore store, NodeKind kind, IEnumerable<Edge> edges);

public Node(GraphStore store, Identifier brand, NodeKind kind,

       IEnumerable<Edge> edges);

public Node(params Edge[] edges);

public Node(Identifier brand, params Edge[] edges);

public Node(NodeKind kind, params Edge[] edges);

public Node(Identifier brand, NodeKind kind, params Edge[] edges);

public Node(GraphStore store, params Edge[] edges);

public Node(GraphStore store, Identifier brand, params Edge[] edges);

public Node(GraphStore store, NodeKind kind, params Edge[] edges);

public Node(GraphStore store, Identifier brand, NodeKind kind,

       params Edge[] edges);

The following constructors support the graph patterns collection, tuple, and list. If left unspecified, collections are built by default. Optionally, the brand and store to use can be specified. The successors are specified as enumerables or params of nodes.

public Node(NodeKind kind);

public Node(Identifier brand, NodeKind kind);

public Node(GraphStore store, NodeKind kind);

public Node(GraphStore store, Identifier brand, NodeKind kind);

public Node(params Node[] nodes);

public Node(Identifier brand, params Node[] nodes);

public Node(NodeKind kind, params Node[] nodes);

public Node(Identifier brand, NodeKind kind, params Node[] nodes);

public Node(GraphStore store, params Node[] nodes);

public Node(GraphStore store, Identifier brand, params Node[] nodes);

public Node(GraphStore store, NodeKind kind, params Node[] nodes);

public Node(GraphStore store, Identifier brand, NodeKind kind,

       params Node[] nodes);

2.1.5. Node Conversions

The following explicit conversions on Node support the extraction of typed values from atomic nodes. They throw if the node is non-atomic or if it holds a value of a different type.

    public static explicit operator bool(Node node);

    public static explicit operator byte[](Node node);

    public static explicit operator char(Node node);

    public static explicit operator Date(Node node);

    public static explicit operator DateTime(Node node);

    public static explicit operator DateTimeOffset(Node node);

    public static explicit operator decimal(Node node);

    public static explicit operator double(Node node);

    public static explicit operator GraphReference(Node node);

    public static explicit operator Guid(Node node);

    public static explicit operator int(Node node);

    public static explicit operator long(Node node);

    public static explicit operator string(Node node);

    public static explicit operator Time(Node node);

The following explicit conversions on Node support the extraction of nullable typed values from atomic nodes. They return null if the node is non-atomic or if it holds a value of a different type.

    public static explicit operator bool?(Node node)

    public static explicit operator char?(Node node)

    public static explicit operator Date?(Node node)

    public static explicit operator DateTime?(Node node)

    public static explicit operator DateTimeOffset?(Node node)

    public static explicit operator decimal?(Node node)

    public static explicit operator double?(Node node)

    public static explicit operator Guid?(Node node)

    public static explicit operator GraphReference?(Node node)

    public static explicit operator int?(Node node)

    public static explicit operator long?(Node node)

    public static explicit operator Time?(Node node)

The following implicit conversions on Node support the conversion of values of classified type to atomic nodes.

    public static implicit operator Node(bool value);

    public static implicit operator Node(char value);

    public static implicit operator Node(Date value);

    public static implicit operator Node(DateTime value);

    public static implicit operator Node(DateTimeOffset value);

    public static implicit operator Node(decimal value);

    public static implicit operator Node(double value);

    public static implicit operator Node(Guid value);

    public static implicit operator Node(GraphReference value);

    public static implicit operator Node(int value);

    public static implicit operator Node(long value);

    public static implicit operator Node(string value);

    public static implicit operator Node(Time value);

2.1.6. Node Read/Write Support

The following methods support the reading of a graph from a stream (yielding a sequence of edges).

Note: Reading directly from a string is mostly useful for simple scenarios such as unit tests.

public static IEnumerable<Edge> ReadFrom(GraphReader reader);

public static IEnumerable<Edge> ReadFrom(GraphReader reader,

       Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFrom(TextReader reader);

public static IEnumerable<Edge> ReadFrom(TextReader reader,

       Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFrom(Stream stream);

public static IEnumerable<Edge> ReadFrom(Stream stream,

       Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFrom(Stream stream,

       Encoding encoding);

public static IEnumerable<Edge> ReadFrom(Stream stream,

       Encoding encoding, Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFrom(string filePath);

public static IEnumerable<Edge> ReadFrom(string filePath,

       Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFrom(string filePath,

       Encoding encoding);

public static IEnumerable<Edge> ReadFrom(string filePath,

       Encoding encoding, Func<GraphStore> createStore);

public static IEnumerable<Edge> ReadFromString(string text);

public static IEnumerable<Edge> ReadFromString(string text,

       Func<GraphStore> createStore);

Given a node, the following methods support the writing of the graph reachable from that node to a stream.

Note: Writing directly to a string is mostly useful for simple scenarios such as unit tests.

Note: Writing to a string writes the entire reachable graph, while calling ToString returns a size-bounded string that summarizes the top structure of the reachable graph.

public void WriteTo(Stream stream);

public void WriteTo(Stream stream, Encoding encoding);

public void WriteTo(string filePath);

public void WriteTo(string filePath, Encoding encoding);

public void WriteTo(TextWriter writer);

public void WriteTo(GraphWriter writer);

public string WriteToString();

Issue: Methods to directly write a collection of edges should be added to match the model of the read methods.

2.1.7. Node Navigation Methods

Given a node, the following methods use that node as a navigation scope and attempt to resolve a given symbolic reference within that scope. If the reference is global (reference.IsGlobal), then the resolution is only supported if the underlying store maintains a set of outer edges. (See §1.1.3 for details on symbolic references and §3.2 for details on reference resolution as performed by graph stores.)

public Node NavigateFromRootTo(GraphReference reference);

public Node NavigateFromRootTo(string path);

public IAsyncResult BeginNavigateFromRootTo(GraphReference reference,

       AsyncCallback callback, object asyncState);

public IAsyncResult BeginNavigateFromRootTo(string path,

       AsyncCallback callback, object state);

public Node EndNavigateFromRootTo(IAsyncResult asyncResult);

public Node NavigateTo(GraphReference reference);

public Node NavigateTo(string path);

public IAsyncResult BeginNavigateTo(GraphReference reference,

       AsyncCallback callback, object asyncState);

public IAsyncResult BeginNavigateTo(string path,

       AsyncCallback callback, object state);

public Node EndNavigateTo(IAsyncResult asyncResult);

2.2. Edge

The type Edge represents a (label, target node) pair. Typically, edges belong to a source node – the edge is then a graph-theoretic labeled directed edge from the source to the target node. However, labeled edges in the object model can also stand on their own, that is, not have an associated source node. Such edges are interpreted as leading into a graph, optionally via a label, while remaining silent on the source of such an edge. In fact, edges may also be shared among multiple source nodes.

namespace System.Dataflow {

  public struct Edge : IEquatable<Edge> {

    public static readonly Edge Default = new Edge();

    public Edge(Node node);

    public Edge(KeyValuePair<Label, Node> pair);

    public Edge(Label label, Node node);

    public static explicit operator Edge(KeyValuePair<Label, Node> pair);

    public static explicit operator KeyValuePair<Label, Node>(Edge edge);

    public static bool operator ==(Edge left, Edge right);

    public static bool operator !=(Edge left, Edge right);

    public bool IsDefault { get; }

    public KeyValuePair<Label, Node> KeyValuePair { get; }

    public Label Label { get; }

    public Node Node { get; }

    public override bool Equals(object obj);

    public bool Equals(Edge other);

    public static IEnumerable<Edge> FromKeyValuePairs(

           IEnumerable<KeyValuePair<Label, Node>> pairs);

    public static IEnumerable<Edge> FromNodes(IEnumerable<Node> nodes);

    public override int GetHashCode();

    public static IEnumerable<KeyValuePair<Label, Node>> ToKeyValuePairs(

           IEnumerable<Edge> edges);

    public static IEnumerable<Node> ToNodes(IEnumerable<Edge> edges);

    public override string ToString();

  }

}

Besides the properties Label and Node representing the pair, property KeyValuePair (and a corresponding constructor and a couple of static helpers) are provided to easy interoperation with existing types defined in terms of KeyValuePair; most importantly, IDictionary<TKey, TValue>.

The static helper methods FromEdges and FromNodes are provided to translate between the two common cases of enumerations of edges and enumerations of (target) nodes.

2.3. Label

Labels are used to represent edge labels. Labels are either based on non-negative integers or on uninterpreted strings. Labels are atomized per AppDomain. That is, two labels are equal exactly if their object references are equal.

namespace System.Dataflow {

  [Serializable]

  public sealed class Label

    : IEquatable<Label>, IComparable, IComparable<Label>

  {

    public static Label Empty { get; }

    public string Text { get; }

    public int Value { get; }

    public static implicit operator Label(string text);

    public static implicit operator Label(int value);

    public static bool operator ==(Label left, Label right);

    public static bool operator !=(Label left, Label right);

    public static bool operator <(Label id1, Label id2);

    public static bool operator <=(Label id1, Label id2);

    public static bool operator >(Label id1, Label id2);

    public static bool operator >=(Label id1, Label id2);

    public int Compare(object other);

    int IComparable.CompareTo(object obj);

    int IComparable<Label>.CompareTo(Label other);

    public override bool Equals(object obj);

    bool IEquatable<Label>.Equals(Label other)

    public static Label Get(int value);

    public static Label Get(string text);

    public static Label Get(string text, int index, int count);

    public override int GetHashCode();

    public static bool IsNullOrEmpty(Label label);

    public override string ToString();

  }

}

2.4. Identifier

Identifiers are used to represent brands. Identifiers are uninterpreted strings that are atomized per AppDomain. That is, two identifiers are equal exactly if their object references are equal.

namespace System.Dataflow {

  public sealed class Identifier

    : IEquatable<Identifier>, IComparable, IComparable<Identifier>

    {

      public static Identifier Empty { get; }

      public string Text { get; }

      public static implicit operator Identifier(string text);

      public static bool operator ==(Identifier left, Identifier right);

      public static bool operator !=(Identifier left, Identifier right);

      public static bool operator <(Identifier id1, Identifier id2);

      public static bool operator <=(Identifier id1, Identifier id2);

      public static bool operator >(Identifier id1, Identifier id2);

      public static bool operator >=(Identifier id1, Identifier id2);

    public int Compare(object other);

    int IComparable.CompareTo(object obj);

    int IComparable<Identifier>.CompareTo(Identifier other);

    public override bool Equals(object obj);

    public static Identifier Get(string text);

    public static Identifier Get(string text, int index, int count);

    public override int GetHashCode();

    public static bool IsNullOrEmpty(Identifier id);

    public override string ToString();

    bool IEquatable<Identifier>.Equals(Identifier other);

  }

}

2.5. GraphReference

Symbolic references are held by atomic nodes as values of type GraphReference.

namespace System.Dataflow {

  public struct GraphReference : IEquatable<GraphReference> {

    public static readonly GraphReference Default = new GraphReference();

    public GraphReference(string path);

    public GraphReference(IEnumerable<Label> segments);

    public GraphReference(params Label[] segments);

    public GraphReference(IEnumerable<string> segments);

    public GraphReference(params string[] segments);

    public static explicit operator GraphReference(string path);

    public static implicit operator Node(GraphReference value);

    public static bool operator ==(GraphReference reference1,

           GraphReference reference2);

    public static bool operator !=(GraphReference reference1,

           GraphReference reference2);

    public int Count { get; }

    public bool IsDefault { get; }

    public bool IsGlobal { get; }

    public IEnumerable<Label> Segments { get; }

    public Label this[int index] { get; }

    public override bool Equals(object obj);

    public bool Equals(GraphReference other);

    public static string Format(GraphReference reference);

    public override int GetHashCode();

    public static Label LabelFromSegment(string segment);

    public override string ToString();

    public static Label[] Parse(string path);

  }

}

A global reference (one starting with a dot in “M” notation) is flagged as IsGlobal and will have a first segment that is empty.

The Format method creates a string from a given reference that adheres to “M” syntax, including proper quoting.

The LabelFromSegment method constructs an integer label if possible and a string label otherwise.

2.6. EdgeCollection

EdgeCollection is a lightweight struct that represents the collection of outgoing edges of the corresponding source node (Source property).

namespace System.Dataflow {

  public struct EdgeCollection : ICollection<Edge> {

    public int Count { get; }

    public bool IsDefault { get; }

    public bool IsReadOnly { get; }

    public Node Source { get; }

    public Node this[Label label] { get; set; }

    public static bool operator ==(EdgeCollection left,

           EdgeCollection right);

    public static bool operator !=(EdgeCollection left,

           EdgeCollection right);

    public void Add(Edge item);

    public void Add(Label label, Node node);

    public void Add(Node node);

    public void Add(params Edge[] edges);

    public void Add(params object[] nodes);

    public void Clear();

    public bool Contains(Edge item);

    public bool ContainsLabel(Label label);

    public void CopyTo(Edge[] array, int arrayIndex);

    public override bool Equals(object obj);

    public EdgeCollectionEnumerator GetEnumerator();

    IEnumerator<Edge> IEnumerable<Edge>.GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator();

    public override int GetHashCode();

    public Node GetLabeledEdgeOrDefault(Label label);

    public IEnumerable<Label> GetLabels();

    public bool Remove(Edge item);

    public override string ToString();

  }

}

The Add overloads on EdgeCollection allow for direct contributions to the edge collection of a node. The C# collection initializer mechanism can be used for this purpose. The following example shows the direct use of Add on an existing node’s edge collection as well as the creation of a new node with a new edge collection created using a collection initializer:–

person.Edges.Add(  // Contribute new person to collection of villagers.

  "Manager",

  new Node(

    "Person",

    new EdgeCollection { { "Name", "John" }, { "Id", 42 } }

  )

);

Rationale: Note that the primary enumerator, EdgeCollectionEnumerator, is again a struct. This enables the performance optimizations around the static enumerator pattern as supported by the C# compiler.

namespace System.Dataflow {

  public struct EdgeCollectionEnumerator : IEnumerator<Edge> {

    public EdgeCollectionEnumerator(GraphStoreEdgeCollectionAdapter adapter,

           EmbeddableEnumeratorState state);

    public Edge Current { get; }

    object IEnumerator.Current { get; }

    public void Dispose();

    public bool MoveNext();

    public void Reset();

  }

}

2.7. Examples

Below is a simple scenario for using Node directly to construct a few simple things:–

var two = new Node(2);

var @true = new Node(true);

var twoAndTrue = new Node(two, @true);

var flag =

  (from node in twoAndTrue.ViewAllNodes()

   where node.AtomicValue is bool

   select (bool)node).Single();

var number =

  (from node in twoAndTrue.ViewAllNodes()

   where node.AtomicValue is int

   select (int)node).Single();

Assert.IsTrue(flag);

Assert.AreEqual(2, number);

Given the two atomic nodes constructed above, a simple tuple can be constructed:–

var pair = new Node(NodeKind.Tuple, two, @true);

var pairAsTuple = pair.ViewAsTuple();

var first = pairAsTuple[0];

var second = pairAsTuple[1];

Assert.AreEqual(2, (int)first);

Assert.IsTrue((bool)second);

Note the use of ViewAsTuple() on a node obtain a view that interprets the node’s successors using the tuple pattern. Given such a view, tuple elements can then be accessed by indexing.

The following example shows the construction of instances of the list pattern as well as the equality properties of atomic and non-atomic nodes. Specifically, atomic nodes have equality defined based on their value, while non-atomic nodes have equality defined based on their (graph-store-managed) identity.

var constants = new Node(NodeKind.List, first, two);

var structures = new Node(NodeKind.List, tuple, tuple);

var constantsAsList = constants.ViewAsList();

var structuresAsList = structures.ViewAsList();

Assert.AreEqual(constantsAsList[0], constantsAsList[1]);   // by value

Assert.AreEqual(structuresAsList[0], structuresAsList[1]); // by identity

Graphs can be read from streamed forms:–

string villageGraph = @"

  {

    Villagers =>

      {

        Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },

        Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },

        Charly => Person { Name => 'Charlotte' }

      },

    HaveSpouses => { Villagers.Rich, Villagers.Jenn }

  }";

var village = Node.ReadFromString(villageGraph).Single().Node;

var villagers = village.Edges["Villagers"];

New successors can be contributed to a non-atomic node:–

villagers.Add(  // Contribute new person to collection of villagers.

  "Billy",

  new Node(

    "Person",

    new EdgeCollection { { "Name", "William" }, { "Age", 3 } }

  )

);

villagers.Add(  // Contribute age to existing person Charly.

  "Charly",

  new Node(new EdgeCollection { { "Age", 12 } })

);

var charly = villagers.Edges["Charly"];

var billy = villagers.Edges["Billy"];

village.Add(

  "Youth",

  new Node(charly, billy)

);

var youth = village.Edges["Youth"];

var charly2 =

  (from node in youth.ViewAllNodes()

   where node.Edges["Name"] == "Charlotte"

   select node).Single();

var billy2 =

  (from node in youth.ViewAllNodes()

   where node.Edges["Age"] == 3

   select node).Single();

Assert.AreEqual(billy, billy2);

Assert.AreEqual(charly, charly2);

Assert.IsTrue(charly.Edges["Age"] == 12);

Remark: The contribution mechanics of the system respect the rules of graph merging (also referred to as graph coalescing) of “M”: if a contributed edge carries a label and the node being contributed to already has an outgoing edge with the same label, then both edges need to point to a non-atomic node. The content of the contributed edge’s target node will be merged into the content of the node that is being contributed to. This process repeats recursively. In the example above, this mechanism is used to add an Age edge to the existing node under Villagers.Charly.

3. Graph Stores

Graph stores are implementations of graph representations, including possible virtualization over external stores.

Besides the abstract type GraphStore, the system provides two concrete implementations: DefaultGraphStore and ShadowGraphStore. Graph stores have a large feature surface area that is mostly covered by defaults provided by GraphStore. This large area allows store implementations to specialize deeply and gain high-performance characteristics; the almost complete coverage by default implementations enables a low barrier to entry for custom graph-store implementations.

The DefaultGraphStore implementation uses in-memory structures to represent graphs and can be instantiated as needed to create separate graphs.

The ShadowGraphStore implementation builds on the default one, but exists in only one singleton instance per AppDomain. It is used automatically when constructing nodes using type Node without specifying a store. The shadow store is stateless, enabling thread-safe use as well as avoiding coupling among graphs independently created using the shadow store. To avoid state, the shadow store does not maintain a collection of outer edges and, therefore, has a default synthetic source node.

3.1. Stores and Store Keys

Graph stores deal in abstract nodes of externally invisible representation. Stores identify such nodes by handing out and taking back store keys: objects of unrefined type object.

A node is uniquely identified by a pair (store, store key). It is possible that a group of related stores share keys. For example, one store may represent a graph’s nodes while another, related store, may represent a particular kind of metadata associated with the nodes of the former store. By using the same keys for both stores, that association is maintained efficiently without “leaking” any particular association mechanism (such as attached properties) and without committing a store to a single metadata story.

The Node/Edge surface types all use (store, store key) pairs as their only state. Creating, copying, and destroying instances of these surface types is therefore very cheap and, in particular, creates no pressure on the garbage collector.

3.2. Reference Resolution

A graph store may provide a reference resolution service. If provided, that service can be used to perform a number of tasks: (i) resolve symbolic references, yielding the target nodes of references, (ii) substitute resolved targets for symbolic references, and (iii) compute a set of symbolic references that, if substituted in for some of a graph’s edges, yields a sharing and cycle free graph.

Most graph stores can work with the default resolver implementation provided in the GraphReferenceResolver class. The default resolver is stateless. Custom resolvers can subclass the default resolver.

namespace System.Dataflow {

  public class GraphReferenceResolver {

    public virtual bool TryBreakAllSharing(Node root,

           out Dictionary<Node, GraphReferenceTable> referenceTables);

    public virtual bool TryResolveAllReferences(Node root);

    public virtual Node ResolveReferenceInScope(Node scope,

           GraphReference reference);

    public virtual IAsyncResult BeginResolveReferenceInScope(

           Node scope, GraphReference reference,

           AsyncCallback callback, object asyncState);

    public virtual Node EndResolveReferenceInScope(IAsyncResult result);

  }

}

BeginResolveReferenceInScope and EndResolveReferenceInScope implement the asynchronous pattern for asynchronous invocation of ResolveReferenceInScope. The default resolver is synchronous, but custom resolvers may opt to implement asynchronous resolution.

Rationale: Asynchronous reference resolution allows virtual graph stores to “page in” further parts of a graph without blocking clients that have been written to use the asynchronous version of ResolveReferenceInScope. This is the only asynchronous entry point into graph stores, keeping the overall programming model straight forward.

3.3. GraphStore

Class GraphStore is the abstract base for all graph store implementations. It provides default implementations for most parts of its (very large) surface area, allowing graph store implementations to be gradually enriched to address performance issues.

3.3.1. Graph store clients

Below is the public surface area of GraphStore; writing client code against GraphStore is for advanced scenarios – most use cases should be use the graph model types (Node, Edge, …) instead.

namespace System.Dataflow {

  public abstract class GraphStore {

    public virtual IEqualityComparer<object> KeyComparer { get; }

    public virtual ICollection<Edge> OuterEdges { get; }

    public virtual GraphReferenceResolver ReferenceResolver { get; set; }

    public virtual Node SyntheticSource { get; }

    public void AddContribution(GraphReader reader);

    public void AddContribution(Node node, IEnumerable<Edge> edges);

    public virtual void AddContribution(Node node, Edge edge);

    public virtual void AddPropertyChangedEventHandler(Node node,

           PropertyChangedEventHandler eventHandler);

    public Node CreateCollection(Identifier brand, NodeKind kind);

    public virtual Node CreateCollection(Identifier brand, NodeKind kind,

           IEnumerable<Node> nodes);

    public virtual Node CreateCollection(Identifier brand, NodeKind kind,

           IEnumerable<Edge> edges);

    public virtual Node CreateConstant(Identifier brand, object value);

    public virtual Node CreateConstant(Identifier brand, byte[] value);

    public virtual Node CreateConstant(Identifier brand, Date value);

    public virtual Node CreateConstant(Identifier brand, DateTime value);

    public virtual Node CreateConstant(Identifier brand,

           DateTimeOffset value);

    public virtual Node CreateConstant(Identifier brand, decimal value);

    public virtual Node CreateConstant(Identifier brand, Double value);

    public virtual Node CreateConstant(Identifier brand, Guid value);

    public virtual Node CreateConstant(Identifier brand, int value);

    public virtual Node CreateConstant(Identifier brand, long value);

    public virtual Node CreateConstant(Identifier brand, bool value);

    public virtual Node CreateConstant(Identifier brand); // creates null

    public virtual Node CreateConstant(GraphReference value);

    public virtual Node CreateConstant(Identifier brand, string value);

    public virtual Node CreateConstant(Identifier brand, Time value);

    public abstract Identifier GetBrand(Node node);

    public virtual ConstantClassification GetConstantClassification(

           object value);

    public static ConstantClassification GetConstantClassification(

           object value);

    public virtual bool GetConstantBoolean(Node node);

    public virtual byte[] GetConstantBytes(Node node);

    public virtual Date GetConstantDate(Node node);

    public virtual DateTime GetConstantDateTime(Node node);

    public virtual DateTimeOffset GetConstantDateTimeOffset(Node node);

    public virtual decimal GetConstantDecimal(Node node);

    public virtual double GetConstantDouble(Node node);

    public virtual GraphReference GetConstantGraphReference(Node node);

    public virtual Guid GetConstantGuid(Node node);

    public virtual int GetConstantInt32(Node node);

    public virtual long GetConstantInt64(Node node);

    public virtual string GetConstantString(Node node);

    public virtual Time GetConstantTime(Node node);

    public virtual object GetConstantValue(Node node);

    public virtual IDictionary<Label, Node> GetLabeledEdges(Node node);

    public abstract NodeKind GetNodeKind(Node node);

    public virtual ICollection<Node> GetUnlabeledEdges(Node node);

    public Node NavigateFromRootTo(GraphReference reference);

    public IAsyncResult BeginNavigateFromRootTo(GraphReference reference,

           AsyncCallback callback, object asyncState);

    public Node EndNavigateFromRootTo(IAsyncResult asyncResult);

    public virtual Node NavigateTo(Node scope, GraphReference reference);

    public virtual IAsyncResult BeginNavigateTo(Node scope,

           GraphReference reference, AsyncCallback callback,

           object asyncState);

    public virtual Node EndNavigateTo(IAsyncResult asyncResult);

    public virtual bool OwnsKey(object key);

    public static GraphStore ReadFrom(TextReader reader);

    public static GraphStore ReadFrom(Func<GraphStore> createDoc,

           TextReader reader);

    public static T ReadFrom<T>(TextReader reader)

           where T : GraphStore, new();

    public static GraphStore ReadFrom(GraphReader reader);

    public static GraphStore ReadFrom(Func<GraphStore> createStore,

           GraphReader reader);

    public static T ReadFrom<T>(GraphReader reader)

           where T : GraphStore, new();

    public static GraphStore ReadFrom(string filePath);

    public static GraphStore ReadFrom(Func<GraphStore> createDoc,

           string filePath);

    public static T ReadFrom<T>(string filePath)

           where T : GraphStore, new();

    public static GraphStore ReadFrom(string filePath, Encoding encoding);

    public static GraphStore ReadFrom(Func<GraphStore> createDoc,

           string filePath, Encoding encoding);

    public static T ReadFrom<T>(string filePath, Encoding encoding)

           where T : GraphStore, new();

    public static GraphStore ReadFrom(Stream stream);

    public static GraphStore ReadFrom(Func<GraphStore> createDoc,

           Stream stream);

    public static T ReadFrom<T>(Stream stream) where T : GraphStore, new();

    public static GraphStore ReadFrom(Stream stream, Encoding encoding);

    public static GraphStore ReadFrom(Func<GraphStore> createDoc,

           Stream stream, Encoding encoding);

    public static T ReadFrom<T>(Stream stream, Encoding encoding)

           where T : GraphStore, new();

    public static GraphStore ReadFromString(string text);

    public static GraphStore ReadFromString(Func<GraphStore> createDoc,

           string text);

    public static T ReadFromString<T>(string text)

           where T : GraphStore, new();

    public virtual void RemovePropertyChangedEventHandler(Node node,

           PropertyChangedEventHandler eventHandler);

    public virtual void ResolveAllReferences();

    public virtual void SetBrand(Node node, Identifier brand);

    public virtual void SetConstantValue(Node node, object value);

    public virtual void SetEdge(Node node, Label label, Node value);

    public virtual void SetListValue(Node list, int index, Node element);

    public virtual void SetReferenceTarget(Action<Node> nodeVariable,

           Node target);

    public virtual string ToString(Node node);

    public void WriteTo(string filePath);

    public void WriteTo(string filePath, Encoding encoding);

    public void WriteTo(Stream stream);

    public void WriteTo(Stream stream, Encoding encoding);

    public void WriteTo(TextWriter target);

    public void WriteTo(GraphWriter writer);

    public virtual void WriteTo(GraphWriter writer, Node root,

           bool includeRootNode);

    public string WriteToString();

  }

}

The SyntheticSource property provides a node that is not technically part of the graph managed by the graph store, but that synthetically gathers all outer edges of the graph. Graph stores may or may not keep track of such outer edges and even if they do, may not do so completely. That is, there may be parts of a graph that are unreachable through any of the outer edges – and thus from the synthetic source node.

Rationale: Providing outer edges enables a general model of disconnected graphs that are still globally reachable. Providing the synthetic source over the collection of outer edges is a convenience mechanism for many graph algorithms that naturally operate from a given single source node.

3.3.2. Adapters for Graph-Pattern Views

Adapters are used by graph stores to possibly provide streamlined, allocation free access and enumeration for specialized views over graph patterns. The adapter design allows for implementations that are stateless singletons – and all default adapters are such singletons.

Adapters map point functions to efficient paths in the store implementation, as available. The default adapters all map to the collection-level getters (and their collection's enumerators).

public virtual GraphStoreNodeCollectionAdapter GetAdapterForAllNodes(

       object storeKey);

public virtual GraphStoreEdgeCollectionAdapter GetAdapterForEdges(

       object storeKey);

public virtual GraphStoreListAdapter<Node> GetAdapterForLists(

       object storeKey);

public virtual GraphStoreDictionaryAdapter GetAdapterForRecords(

       object storeKey);

public virtual GraphStoreNodeCollectionAdapter GetAdapterForRecordValues(

       object storeKey);

public virtual GraphStoreListAdapter<Node> GetAdapterForOrderedNodes(

       object storeKey);

public virtual GraphStoreNodeCollectionAdapter GetAdapterForUnorderedNodes(

       object storeKey);

public virtual GraphStoreListAdapter<Node> GetAdapterForTuples(

       object storeKey);

The (abstract) adapter classes referenced above provide the protocols of corresponding collection abstractions. The following table shows the correspondence:–

Collection Type Graph Adapter

ICollection<Node>

GraphStoreNodeCollectionAdapter

ICollection<Edge>

GraphStoreEdgeCollectionAdapter

IList<T>

GraphStoreListAdapter<T>

IDictionary<Label, Node>

GraphStoreDictionaryAdapter

As an example, below is the class GraphStoreNodeCollectionAdapter:–

namespace System.Dataflow {

  public abstract class GraphStoreNodeCollectionAdapter {

    public abstract void Add(GraphStore store, object storeKey, Node item);

    public abstract void Clear(GraphStore store, object storeKey);

    public abstract bool Contains(GraphStore store, object storeKey,

           Node item);

    public abstract void CopyTo(GraphStore store, object storeKey,

           Node[] array, int arrayIndex);

    public abstract void Dispose(ref EmbeddableEnumeratorState state);

    public abstract int GetCount(GraphStore store, object storeKey);

    public abstract Node GetCurrent(ref EmbeddableEnumeratorState state);

    public abstract EmbeddableEnumeratorState GetEnumerator(GraphStore store,

           object storeKey);

    public abstract bool IsReadOnly(GraphStore store, object storeKey);

    public abstract bool MoveNext(ref EmbeddableEnumeratorState state);

    public abstract void Reset(ref EmbeddableEnumeratorState state);

    public abstract bool Remove(GraphStore store, object storeKey,

           Node item);

    }

}

As long as specific enumeration implementations can live with the limited space to encode state afforded by the struct EmbeddableEnumeratorState, all enumeration can be kept allocation-free. However, it is equally simple to just store a reference to a standard (heap-allocated) enumerator in the object reference held by that struct and use that to connect to regular enumerable collections.

namespace System.Dataflow {

  public struct EmbeddableEnumeratorState {

    public GraphStore Store { get; set; }

    public object StoreKey { get; set; }

    public object State { get; set; }

    public long Index { get; set; }

  }

}

3.3.3. Custom graph stores

Class GraphStore implements split/merge mechanisms to enable simple custom graph store implementations while not getting in the way of sophisticated, high-performance ones.

Simple graph stores merely operate in terms of edge collections and GraphStore splits those edge collections into a collection of nodes for unlabeled edges and a dictionary from label to node for labeled edges. From there, it projects mutable views that correspond to the list and tuple graph patterns. The following method illustrates how this is achieved by revealing some of the implementation details of GraphStore:–

protected virtual IList<Node> GetTupleValues(Node node) {

  var coll = new NodeAsSplitCollection { Node = node };

  return new TupleView(coll);

}

Here, NodeAsSplitCollection is an internal class used by GraphStore to implement a mutable view implementing ISplitCollection over an arbitrary node. This is the splitting step.

namespace System.Dataflow {

  public interface ISplitCollection : IEnumerable<Edge> {

    Identifier Brand { get; }

    ICollection<Node> UnlabeledEdges { get; }

    bool IsReadOnly { get; }

    IDictionary<Label, Node> LabeledEdges { get; }

  }

}

The second step wraps a mutable tuple view (using the internal class TupleView) around the split view, yielding the mutable IList<Node> that represents the tuple elements.

Sophisticated graph stores operate directly in terms of specialized views with an opportunity to fine-tune the representations chosen for each of the standard graph patterns. Class DefaultGraphStore (§3.4) is an example of such a sophisticated implementation that implements the various graph patterns using separate representation classes and that hooks into the GraphStore logic directly at the ISplitCollection level.

Below is the abstract surface area of GraphStore that every graph store has to implement as well as the protected hooking surface to hook subclasses into the default implementations in the base class:–

namespace System.Dataflow {

  public abstract class GraphStore {

    protected static ICollection<Edge> EmptyEdges { get; }

    // Valid if IsNode(node) && GetKind(node) != Reference,

    // Identifier.Empty otherwise. Never returns null or throws.

    public abstract Identifier GetBrand(Node node);

    protected virtual IEnumerator<Edge> GetEdgeEnumerator(Node node);

    // Valid if IsCollection(node), empty immutable collection otherwise.

    // Never returns null or throws.

    // post: non-null collection of pairs, where pair.Key is either null

    // or unique; may be empty

    protected abstract ICollection<Edge> GetEdges(Node node);

    // pre: IsRecord(node) && HasItem(node, label)

    protected virtual Node GetLabeledEdge(Node node, Label label);

    // pre: IsRecord(node)

    protected virtual Node GetLabeledEdgeOrDefault(Node node, Label label);

    // Valid if IsNode(node), empty enumeration otherwise.

    // Never returns null or throws.

    protected virtual IEnumerable<Node> GetLabeledValues(Node node);

    // Valid if IsNode(node), empty enumeration otherwise.

    // Never returns null or throws.

    protected virtual IEnumerable<Label> GetLabels(Node node);

    // Valid if IsList(node), empty immutable list otherwise.

    // Never returns null or throws.

    protected virtual IList<Node> GetListValues(Node node);

    // Valid if IsNode(node), MNodeKind.None otherwise.

    // Never returns null or throws.

    public abstract NodeKind GetNodeKind(Node node);

    // Valid if IsList(node) or IsTuple(node),

    // empty immutable collection otherwise.

    // Never returns null or throws.

    protected virtual IList<Node> GetOrderedValues(Node node);

    // Valid if IsTuple(node), empty immutable list otherwise.

    // Never returns null or throws.

    protected virtual IList<Node> GetTupleValues(Node node);

    // Default: returns GetUnlabeledNodeCollection(node);

    // override to direct contributions to lists/tuples/...

    protected virtual ICollection<Node> GetUnlabeledContributionTarget(

        Node node);

    // Valid if IsCollection(node), empty immutable collection otherwise.

    // Never returns null or throws.

    protected virtual ICollection<Node> GetValues(Node node);

    // For all Has/Is tests but IsNode:

    // valid if IsNode(node), false otherwise.

    // Never returns null or throws.

    protected virtual bool HasLabel(Node node, Label label);

    // If IsAtomic(node) then !node.IsDefault && IsNode(node)

    //    && !IsCollection(node)

    protected virtual bool IsAtomic(Node node);

    // If IsCollection(node) then !node.IsDefault && IsNode(node)

    //    && IsComposite(node)

    protected virtual bool IsCollection(Node node);

    // If IsComposite(node) then !node.IsDefault && IsNode(node)

    //    && !IsAtomic(node)

    protected virtual bool IsComposite(Node node);

    // If IsList(node) then !node.IsDefault && IsNode(node) && IsRecord(node)

    protected virtual bool IsList(Node node);

    // If IsRecord(node) then !node.IsDefault && IsNode(node)

    //    && IsComposite(node)

    protected virtual bool IsRecord(Node node);

    // If IsTuple(node) then !node.IsDefault && IsNode(node)

    //    && IsRecord(node)

    protected virtual bool IsTuple(Node node);

  }

}

The following shows the virtual members of GraphStore, together with the default behavior exposed.

namespace System.Dataflow {

    public abstract class GraphStore {

        // Default key comparer uses object.Equals/GetHashCode

        public virtual IEqualityComparer<object> KeyComparer { get; }

        // Default returns SyntheticSource.Store.GetEdges(SyntheticSource)

        public virtual ICollection<Edge> OuterEdges { get; }

        // Initialized to direct instance of GraphReferenceResolver

        public virtual GraphReferenceResolver ReferenceResolver { get; set; }

        // Default uses ShadowGraphStore to create a composite node

        // to hold outer edges

        public virtual Node SyntheticSource { get; }

        // Default coalescing logic; override to distribute logic elsewhere

        // or to coalesce lazily.

        public virtual void AddContribution(Node node, Edge edge);

        // Default: empty

        public virtual void AddPropertyChangedEventHandler(Node node,

               PropertyChangedEventHandler eventHandler);

        // Override to accelerate the case where the collection

        // is known to be constructed over unlabeled edges only

        public virtual Node CreateCollection(Identifier brand, NodeKind kind,

               IEnumerable<Node> nodes);

        // Override at least this one method to support store-specific

        // construction. Default throws NotSupportedException, rendering

        // the store essentially read-only.

        public virtual Node CreateCollection(Identifier brand, NodeKind kind,

               IEnumerable<Edge> edges);

        // By default, all constants (incl. null) are representing

        // themselves in a graph. Therefore, by default, only the default

        // brand is accepted; otherwise this method throws.

        public virtual Node CreateConstant(Identifier brand, object value);

        // The following methods, by default, just call through the boxing

        // CreateConstant method above. Override to avoid boxing.

        public virtual Node CreateConstant(Identifier brand, byte[] value);

        public virtual Node CreateConstant(Identifier brand, Date value);

        public virtual Node CreateConstant(Identifier brand, DateTime value);

        public virtual Node CreateConstant(Identifier brand,

               DateTimeOffset value);

        public virtual Node CreateConstant(Identifier brand, decimal value);

        public virtual Node CreateConstant(Identifier brand, Double value);

        public virtual Node CreateConstant(Identifier brand, Guid value);

        public virtual Node CreateConstant(Identifier brand, int value);

        public virtual Node CreateConstant(Identifier brand, long value);

        public virtual Node CreateConstant(Identifier brand, bool value);

        public virtual Node CreateConstant(Identifier brand); // creates null

        public virtual Node CreateConstant(GraphReference value);

        public virtual Node CreateConstant(Identifier brand, string value);

        public virtual Node CreateConstant(Identifier brand, Time value);

        // For adapter retrieval methods, see §3.3.2

        // Default returns GetConstantClassification(node.StoreKey),

        // assuming that constant values represent themselves.

    // Returns ConstantClassification.Node if value cannot be classified,

    // never throws.

    public virtual ConstantClassification GetConstantClassification(

           Node node);

    // The following methods share the precondition:

    // IsConstant(node) && GetConstantClassification(node) == ...

    // Override to enable fast access to type-specific atom representations.

    public virtual bool GetConstantBoolean(Node node);

    public virtual byte[] GetConstantBytes(Node node);

    public virtual Date GetConstantDate(Node node);

    public virtual DateTime GetConstantDateTime(Node node);

    public virtual DateTimeOffset GetConstantDateTimeOffset(Node node);

    public virtual decimal GetConstantDecimal(Node node);

    public virtual double GetConstantDouble(Node node);

    public virtual GraphReference GetConstantGraphReference(Node node);

    public virtual Guid GetConstantGuid(Node node);

    public virtual int GetConstantInt32(Node node);

    public virtual long GetConstantInt64(Node node);

    public virtual string GetConstantString(Node node);

    public virtual Time GetConstantTime(Node node);

    // Valid if IsConstant(node), null otherwise, never throws.

    public virtual object GetConstantValue(Node node);

    // Valid if IsNode(node), empty immutable dictionary otherwise.

    // Never returns null or throws.

    // Default constructs a view over the result returned by GetEdges(node)

    public virtual IDictionary<Label, Node> GetLabeledEdges(Node node);

    // Valid if IsCollection(node), empty immutable collection otherwise.

    // Never returns null or throws.

    // Default constructs a view over the result returned by GetEdges(node)

    public virtual ICollection<Node> GetUnlabeledEdges(Node node);

    // By default, only SyntheticRoot (OuterEdges) is mutable.

    public virtual bool IsReadOnly(Node node);

    // Default returns ReferenceResolver.ResolveReferenceInScope

    public virtual Node NavigateTo(Node scope, GraphReference reference);

    // Default returns ReferenceResolver.BeginNavigateTo

    public virtual IAsyncResult BeginNavigateTo(Node scope,

           GraphReference reference, AsyncCallback callback,

           object asyncState);

    // Default returns ReferenceResolver.EndNavigateTo

    public virtual Node EndNavigateTo(IAsyncResult asyncResult);

    // By default, it is assumed that non-store-specific objects

    // can be used as keys. For instance, stores may represent

    // atomic nodes just by their (possibly boxed) constant value.

    // OwnsKey enables strongly key-owning stores to protect against

    // foreign store keys. However, OwnsKey cannot be used to reliably

    // determine the owner for a store key, unless it is known that

    // _all_ queried stores use strongly owned keys.

    public virtual bool OwnsKey(object key);

    // Default: empty

    public virtual void RemovePropertyChangedEventHandler(Node node,

           PropertyChangedEventHandler eventHandler);

    // Default: calls

    // ReferenceResolver.TryResolveAllReferences(SyntheticSource)

    // and throws if some references cannot be resolved.

    public virtual void ResolveAllReferences();

    // pre: !IsReadOnly(node) && !IsReference(node)

    // Default: throw NotSupportedException

    public virtual void SetBrand(Node node, Identifier brand);

    // pre: !IsReadOnly(node) && IsAtomic(node)

    // Default: throw NotSupportedException

    public virtual void SetConstantValue(Node node, object value);

    // pre: !IsReadOnly(node) && IsCollection(node)

    // Default: throw NotSupportedException

    public virtual void SetEdge(Node node, Label label, Node value);

    // pre: !IsReadOnly(list) && IsList(list)

    // Default: throw NotSupportedException

    public virtual void SetListValue(Node list, int index, Node element);

    // pre: nodeVariable != null && IsNode(target)

    // Note: Stores that ban setting of reference targets need to also

    // ban reference resolution.

    // Default: call nodeVariable(target)

    public virtual void SetReferenceTarget(Action<Node> nodeVariable,

           Node target);

    // post: if string.Empty is returned,

    //    then no special string encoding is available.

    //    (The caller may substitute a default string value in such a case.)

    // Default: builds a debugger-friendly size-limited summary

    //    of the graph reachable from the given node.

    public virtual string ToString(Node node);

    // Write the reachable graph from the given node.

    // If includeRootNode, then include the given node, otherwise write

    //    a sequence of that node’s outgoing edges.

    // Default: use ReferenceResolver.TryBreakAllSharing to convert edges

    //    into symbolic references as needed to eliminate sharing and cycles.

    public virtual void WriteTo(GraphWriter writer, Node root,

           bool includeRootNode)

3.4. DefaultGraphStore

The class DefaultGraphStore uses classes DefaultGraphStoreCollectionNode, DefaultGraphStoreCompositeNode, DefaultGraphStoreConstantNode, DefaultGraphStoreListNode, DefaultGraphStoreNode, DefaultGraphStoreRecordNode, and DefaultGraphStoreTupleNode to implement the object model using CLR objects in a direct way. These classes are available to ease the implementation of similar custom graph stores.

Two base classes simplify the construction of similar representation classes: DefaultGraphStoreNode and DefaultGraphStoreCollectionBase.

namespace System.Dataflow {

  public abstract class DefaultGraphStoreNode {

    public GraphStore Store { get; set; }

    public bool IsReadOnly { get; set; }

  }

}

All default representations maintain a back link to their store, including the representations that hold atomic values. That is, DefaultGraphStore does not use values to represent themselves, that is, to be their own store key. The uniform back link is used to implement a strong version of GraphStore.OwnsKey in DefaultGraphStore:–

public override bool OwnsKey(object key) {

  var node = key as DefaultGraphStoreNode;

  return node != null && node.Store == this;

}

Rationale: A strong OwnsKey implementation helps catching errors in initial multi-store graph code. However, such checking is not required and stores may opt to not pay the price to enable such checking.

namespace System.Dataflow {

  public abstract class DefaultGraphStoreCollectionBase

    : DefaultGraphStoreNode, ISplitCollection

  {

    public static readonly Label HeadLabel = Label.Get("Head");

    public static readonly Label TailLabel = Label.Get("Tail");

    public virtual Identifier Brand { get; set; }

    public virtual bool IsList { get; }

    public virtual bool IsRecord { get; }

    public virtual bool IsTuple { get; }

    public abstract IDictionary<Label, Node> LabeledEdges { get; }

    public abstract ICollection<Node> UnlabeledEdges { get; }

    public virtual void Add(IEnumerable<Edge> edges);

    public virtual void Add(IEnumerable<Node> nodes);

    public virtual void Add(ILookup<Label, Node> nodesByLabel);

    public virtual IEnumerator<Edge> GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator();

    public virtual ICollection<Node> ViewAllNodes();

    public virtual ICollection<Edge> ViewEdges();

    public virtual IList<Node> ViewAsList();

    public virtual IList<Node> ViewAsTuple();

  }

}

Note that the above Add methods assume that coalescing (merging) has already been performed.

The default implementations of View… use the same (internal) view implementations as GraphStore, enabling pointwise implementation of some graph patterns while leaving others to default mechanisms.

3.4.1. Specialized Implementation of Tuple Graph Pattern

Most of the specialized implementations are omitted in this document as their functionality is self evident. However, below are the details of DefaultGraphStoreTupleNode to follow up on the implementation details around the default implementation of the tuple graph pattern (§3.3.3).

namespace System.Dataflow {

  public class DefaultGraphStoreTupleNode : DefaultGraphStoreCollectionBase {

    public override bool IsList { get; }

    public override bool IsRecord { get; }

    public override bool IsTuple { get; }

    public override IDictionary<Label, Node> LabeledEdges { get; }

    public override ICollection<Node> UnlabeledEdges { get; }

    public override void Add(IEnumerable<Edge> edges);

    public override void Add(IEnumerable<Node> nodes);

    public override IEnumerator<Edge> GetEnumerator();

    public override ICollection<Node> ViewAllNodes();

    public override IList<Node> ViewAsList();

    public override IList<Node> ViewAsTuple();

  }

}

3.5. Examples

3.5.1. Custom CLR Objects from Graph

The following example shows a simple mapping implementation that builds a web of strongly-typed custom CLR objects, using a two-pass approach to deal with references. In a first pass, a graph is read into a default store, including reference resolution. In a second pass, a custom factory is used to traverse the graph and create a web of objects:–

string villageGraph = @"

  {

    Villagers =>

    {

      Jenn => { Name => 'Jennifer', Age => 28, Spouse => Rich },

      Rich => { Name => 'Richard', Age => 26, Spouse => Jenn },

      Charly => { Name => 'Charlotte', Age => 12 }

    },

    HaveSpouses => { Villagers.Rich, Villagers.Jenn }

  }";

var root = Node.ReadFromString(villageGraph).Single().Node;

// With the graph store fully loaded, references are now resolved.

// Generate custom objects from graph store, retaining no back link

// to graph store.

var village = VillageFactory.VillageFromGraph(root);

var jenn =

  (from v in village.Villagers

   where v.Name == "Jennifer"

   select v).Single();

var jennsAge = jenn.Age;

var richsSpouse =

 (from v in village.Villagers

  where v.Name == "Richard"

  select v.Spouse).Single();

var richsSpousesAge = richsSpouse.Age;

var jennHasSpouse = village.HaveSpouses.Contains(jenn);

var charly =

 (from v in village.Villagers

  where v.Name == "Charlotte"

  select v).Single();

var charlyHasSpouse = village.HaveSpouses.Contains(charly);

Assert.AreEqual(28, jennsAge);

Assert.AreEqual(28, richsSpousesAge);

Assert.IsTrue(jennHasSpouse);

Assert.IsFalse(charlyHasSpouse);

The simple factory used above is shown below. The factory uses a dictionary to ensure that sharing and cycles in the original graph are preserved in the generated object web.

static class VillageFactory {

  public static Village VillageFromGraph(Node root) {

    var villagers = new Dictionary<Node, Villager>();

    foreach (var node in root.Edges["Villagers"].ViewAllNodes()) {

      var villager = GetVillager(node, villagers);

    }

    var village = new Village();

    foreach (var node in root.Edges["Villagers"].ViewAllNodes()) {

      var villager = GetVillager(node, villagers);

      foreach (var i in node.ViewAsRecord()) {

        switch (i.Key.Text) {

          case "Age": villager.Age = (int)i.Value; break;

          case "Name": villager.Name = (string)i.Value; break;

          case "Spouse": villager.Spouse = GetVillager(i.Value, villagers);

            break;

        }

      }

      village.Villagers.Add(villager);

    }

    foreach (var node in root.Edges["HaveSpouses"].ViewAllNodes()) {

      village.HaveSpouses.Add(GetVillager(node, villagers));

    }

    return village;

  }

  static Villager GetVillager(Node node,

    Dictionary<Node, Villager> villagers)

  {

    Villager villager;

    if (!villagers.TryGetValue(node, out villager)) {

      villagers.Add(node, new Villager());

    }

    return villager;

  }

}

3.5.2. Arbitrary CLR Objects built by Custom Graph Store

This example shows a simple custom graph store that directly builds custom-typed CLR objects.

string villageGraph = @"

  Village {

    Villagers => {

      Jenn => Villager { Name => 'Jennifer', Age => 28, Spouse => Rich },

      Rich => Villager { Name => 'Richard', Age => 26, Spouse => Jenn },

      Charly => Villager { Name => 'Charlotte', Age => 12 }

    },

    HaveSpouses => { Villagers.Rich, Villagers.Jenn }

  }";

var typeMap = new Dictionary<Identifier, Type> {

  { typeof(Village).Name, typeof(Village) },

  { typeof(Villager).Name, typeof(Villager) }

};

Func<Identifier, Type> typeResolver =

  id => {

    Type t = null;

    typeMap.TryGetValue(id, out t);

    return t;

  };

var village = ObjectGraphStore.Load<Village>(villageGraph, typeResolver);

var jenn =

  (from v in village.Villagers

   where v.Name == "Jennifer"

   select v).Single();

var jennsAge = jenn.Age;

var richsSpouse =

  (from v in village.Villagers

   where v.Name == "Richard"

   select v.Spouse).Single();

var richsSpousesAge = richsSpouse.Age;

var jennHasSpouse = village.HaveSpouses.Contains(jenn);

var charly =

  (from v in village.Villagers

   where v.Name == "Charlotte"

   select v).Single();

var charlyHasSpouse = village.HaveSpouses.Contains(charly);

Assert.AreEqual(28, jennsAge);

Assert.AreEqual(28, richsSpousesAge);

Assert.IsTrue(jennHasSpouse);

Assert.IsFalse(charlyHasSpouse);

The store implementation used above is shown below. It uses an externally provided mapping function (typeResolver in the code above) to map brands to .NET types.

class ObjectGraphStore : GraphStore {

  static Edge[] emptyEdges = new Edge[0];

  public virtual Func<Identifier, Type> TypeResolver { get; set; }

  public override Node CreateCollection(Identifier brand, NodeKind kind,

         IEnumerable<Edge> edges)

  {

    var builder = new GraphObjectBuilder();

    if (Identifier.IsNullOrEmpty(brand)) {

      // All unbranded collections are forced into lists.

      builder.Type = typeof(List<Edge>);

    }

    else {

      // All branded collections are forced into specific objects.

      Type type = TypeResolver(brand);

      if (type == null) {

        throw new NotSupportedException(

          "The specified brand cannot be resovled to a type.");

      }

      builder.Type = type;

    }

    builder.Store = this;

    if (edges != null) {

      foreach (var edge in edges) { builder.Edges.Add(edge); }

    }

    return Node.CreateNodeForStoreKey(this, builder);

  }

  public override Identifier GetBrand(Node node) {

    var builder = node.StoreKey as GraphObjectBuilder;

    if (builder != null) { return Identifier.Get(builder.Type.Name); }

    return Identifier.Empty;

  }

  protected override ICollection<Edge> GetEdges(Node node) {

    var builder = node.StoreKey as GraphObjectBuilder;

    if (builder == null) { return emptyEdges; }

    return builder.Edges;

  }

  public override NodeKind GetNodeKind(Node node) {

    var builder = node.StoreKey as GraphObjectBuilder;

    if (builder != null) { return NodeKind.Record; }

    return NodeKind.Atomic;

  }

  public static object GetValue(object key) {

    var builder = key as GraphObjectBuilder;

    if (builder != null) { return builder.GetObject(); }

    return key;

  }

  public static T Load<T>(string source, Func<Identifier, Type> typeResolver)

  {

    var sreader = new StringReader(source);

    var mreader = new GraphTextReader(sreader);

    var store = GraphStore.ReadFrom(

      () => new ObjectGraphStore { TypeResolver = typeResolver },

      mreader);

    return (T)GetValue(store.OuterEdges.Single().Node.StoreKey);

  }

}

The custom store above uses the auxiliary class GraphObjectBuilder to build individual objects. Builders build their object at most once, preserving reference identity and thus sharing and cycles in the build web of objects. The builder protocol uses two phases. In the first, the custom store creates one builder per node in the graph, sets each builder’s Type property and populates each builder’s Edges collection. In the second phase, the custom store requests each builder’s object, which triggers the builder to populate those objects’ properties from the Edges collection.

class GraphObjectBuilder {

  static Type edgeCollectionType = typeof(ICollection<Edge>);

  ICollection<Edge> edges = new List<Edge>();

  object target;

  public ICollection<Edge> Edges { get { return edges; } }

  public Type Type { get; set; }

  public object GetObject() {

    if (target != null) { return target; }

    target = Activator.CreateInstance(Type);

    if (edgeCollectionType.IsAssignableFrom(Type)) {

      var coll = (ICollection<Edge>)target;

      coll.Clear();

      foreach (var m in Edges) { coll.Add(m); }

    }

    else {

      foreach (var m in Edges) {

        var property = Type.GetProperty(m.Label.Text);

        if (property == null) {

          throw new NotSupportedException("Only properties can be set.");

        }

        object value = ObjectGraphStore.GetValue(m.Node.StoreKey);

        if (edgeCollectionType.IsAssignableFrom(value.GetType())) {

          var args = (ICollection<Edge>)value;

          var getter = property.GetGetMethod();

          var coll = getter.Invoke(target, null);

          var adder = coll.GetType().GetMethod("Add");

          foreach (var a in args) {

            var arg = ObjectGraphStore.GetValue(a.Node.StoreKey);

            adder.Invoke(coll, new object[] { arg });

          }

        }

        else {

          var setter = property.GetSetMethod();

          setter.Invoke(target, new object[] { value });

        }

      }

    }

    return target;

  }

}

3.5.3. Objects Built by Custom Graph Store, no References

If it is known that a graph contains no sharing or cycles, then a very simple custom graph store can be used to build custom object webs directly, in a single pass:–

string villageTree = @"

  Village {

    Villagers => {

      Jenn => Villager { Name => 'Jennifer', Age => 28 },

      Rich => Villager { Name => 'Richard', Age => 26 },

      Charly => Villager { Name => 'Charlotte', Age => 12 }

    }

  }";

var village = VillageTreeStore.Load(villageTree);

var jenn =

    (from v in village.Villagers

     where v.Name == "Jennifer"

     select v).Single();

var jennsAge = jenn.Age;

Assert.AreEqual(28, jennsAge);

The custom store VillageTreeStore used above is shown below. The main constraint that enables its simple form is that it does not support references (and thus reference resolution).

The store does implement the minimally required graph traversal methods. The web of objects can therefore be traversed/inspected as a graph as well. Besides being generally useful, this feature enables this simple store to establish sharing relationships: an already built object can be traversed while building the graph. The resulting graph cannot contain cycles, since the one-pass building approach of this store cannot build an object while deferring resolution of references to objects not yet built. In other words, the graphs buildable by this store are always directed acyclic graphs.

The store virtualizes the graph over the underlying objects in a straightforward way that does not afford updates. The graph is not exactly read-only as it supports the bottom-up addition of new nodes (and thus underlying objects) at any time. However, it does not support mutation of already constructed nodes (the edge collections returned are read-only).

If the underlying objects are directly mutated, then the store will reflect such changes in the virtualized graph’s next traversal. This works because the store itself is stateless. There is no notification mechanism for graph clients though.

class VillageTreeStore : GraphStore {

  static readonly Edge[] emptyEdges = new Edge[0];

  public Village Village {

    get { return SyntheticSource.Edges.First().Node.StoreKey as Village; }

  }

  public override Node CreateCollection(Identifier brand, NodeKind kind,

         IEnumerable<Edge> edges)

  {

    if (brand == typeof(Village).Name) {

      return CreateVillage(edges);

    }

    if (brand == typeof(Villager).Name) {

      return CreateVillager(edges);

    }

    if (Identifier.IsNullOrEmpty(brand)) {

      return CreateEnumerableCollection(edges);

    }

    throw new NotSupportedException();

  }

  Node CreateEnumerableCollection(IEnumerable<Edge> edges) {

    return Node.CreateNodeForStoreKey(

      this,

      (edges ?? emptyEdges).Select(m => m.Node.StoreKey));

  }

  Node CreateVillage(IEnumerable<Edge> edges) {

    var village = new Village();

    foreach (var edge in edges) {

      switch (edge.Label.Text) {

        case "Villagers":

          var villagers = (IEnumerable)edge.Node.StoreKey;

          foreach (var v in villagers.Cast<Villager>()) {

            village.Villagers.Add(v);

          }

          break;

        default:

          throw new NotSupportedException();

      }

    }

    return Node.CreateNodeForStoreKey(this, village);

  }

  Node CreateVillager(IEnumerable<Edge> edges) {

    var villager = new Villager();

    foreach (var edge in edges) {

      switch (edge.Label.Text) {

        case "Age":

          villager.Age = (int)edge.Node;

          break;

        case "Name":

          villager.Name = (string)edge.Node;

          break;

        default:

          throw new NotSupportedException();

      }

    }

    return Node.CreateNodeForStoreKey(this, villager);

  }

  public override Identifier GetBrand(Node node) {

    var key = node.StoreKey;

    if (key is Village || key is Villager) { return key.GetType().Name; }

    return Identifier.Empty;

  }

  protected override ICollection<Edge> GetEdges(Node node) {

    var key = node.StoreKey;

    var village = key as Village;

    if (village != null) {

      return new List<Edge> {

        new Edge("Villagers",

          Node.CreateNodeForStoreKey(this, village.Villagers))

      };

    }

    var villager = key as Villager;

    if (villager != null) {

      return new List<Edge> {

        new Edge("Age", Node.CreateNodeForStoreKey(this, villager.Age)),

        new Edge("Name", Node.CreateNodeForStoreKey(this, villager.Name))

      };

    }

    var coll = key as IEnumerable;

    if (coll != null) {

      return

        (from v in coll.Cast<object>()

         select new Edge(Label.Empty, Node.CreateNodeForStoreKey(this, v)))

        .ToList();

     }

     return emptyEdges;

   }

  public override NodeKind GetNodeKind(Node node) {

    var key = node.StoreKey;

    if (key is Village || key is Villager) {

      return NodeKind.Record;

    }

    if (key is IEnumerable && !(key is string)) {

      return NodeKind.Collection;

    }

    return NodeKind.Atomic;

  }

  public static Village Load(string source) {

    return GraphStore.ReadFromString<VillageTreeStore>(source).Village;

  }

}

Note: The techniques used in this example can be combined with the ones used for generic CLR object building exhibited in §3.5.2 to move beyond hard-coded knowledge of specific types. A future version of the SQL Server Modeling CTP may provide more support for such specialized graph store implementations.

4. Graph Readers and Writers

The mapping of graphs from and to serial streams is performed by GraphReader and GraphWriter, respectively. Concrete implementations of these abstract classes are provided for the important special case of text streams: GraphTextReader and GraphTextWriter.

4.1. GraphReader

The GraphReader base class has a low-level bottleneck interface, combined with a set of virtual default implementations built on top (listed separately, further below).

namespace System.Dataflow {

  public abstract class GraphReader : IDisposable {

    protected GraphReader();

    // Available for StartCollection and for

    // Constant tokens if present in the input stream;

    // Identifier.Empty if a brand is not present in the input stream

    // for those tokens; null for other tokens; never throws.

    public abstract Identifier Brand { get; }

    // Available for Constant tokens; ConstantClassification.None otherwise.

    public abstract ConstantClassification ConstantClassification { get; }

    // The number of StartCollections minus number of

    // EndCollections, both counts excluding the current token

    public abstract int Depth { get; }

    // Available for StartCollection and for

    // Constant tokens if present in the input stream;

    // Label.Empty if a label is not present in the input stream

    // for those tokens; null for other tokens; never throws.

    public abstract Label Label { get; }

    // See §4.1.1

    public GraphReaderQuotas Quotas { get; }

    // Current position;

    // initially StartStream when a reader is created and no Read() calls

    // have been made;

    // EndStream once reader has been advanced past the end of the stream.

    public GraphStreamTokenKind TokenKind { get; }

    // Close the reader and any underlying streams held by it.

    public abstract void Close();

    protected virtual void Dispose(bool disposing);

    public void Dispose();

    // Available for Constant tokens; may throw on parse error.

    public abstract object GetConstant();

    // Throws if current token is EndStream.  Otherwise, advance to

    // the next token and set the Label, TokenKind, TextValue

    // and ConstantKind properties according to the new token.

    public abstract bool Read();

  }

}

The set of (mostly virtual) functions built on top of the reader’s bottleneck interface are listed below.

The following set of read methods can be specialized by reader implementations to avoid boxing of values. By default, these all call through ReadConstant, causing values to be boxed.

public virtual bool ReadBoolean();

public virtual byte[] ReadBytes();

public virtual Date ReadDate();

public virtual DateTime ReadDateTime();

public virtual DateTimeOffset ReadDateTimeOffset();

public virtual decimal ReadDecimal();

public virtual double ReadDouble();

public virtual GraphReference ReadGraphReference();

public virtual Guid ReadGuid();

public virtual int ReadInt32();

public virtual long ReadInt64();

public virtual object ReadNull();

public virtual string ReadText();

public virtual Time ReadTime();

The following method allows readers to skip forward over nested non-atomic nodes, while doing minimal work. The reader must be at a Start… token; after the call, the reader will be at the token that is immediately after the matching End… token.

public virtual void Skip();

4.1.1. GraphReaderQuotas

Stream reading opens up systems for potential denial of service attacks. The GraphReaderQuotas class limits the size complexity of streams to preset limits that can be used to curb such attacks.

namespace System.Dataflow {

  public class GraphReaderQuotas {

    public int MaxConstantLength { get; set; }

    public int MaxDepth { get; set; }

    public int MaxLabelLength { get; set; }

    public int MaxReferenceSegments { get; set; }

  }

}

The quota setters can only be called while the reader has not yet been used; all limits must be positive values. The following table shows the quotas and their default limits:–

Quota Default limit Purpose

MaxConstantLength

1024 characters

Maximum acceptable length of serialized constants.

MaxDepth

32

Maximum acceptable nesting depth of non-atomic nodes.

MaxLabelLength

256 characters

Maximum acceptable length of serialized labels and brands.

MaxReferenceSegments

32

Maximum acceptable number of segments in a symbolic reference.

4.1.2. GraphTextReader

Class GraphTextReader exposes almost no new functionality beyond base class GraphReader. It has a set of constructors that support text streams from a number of sources and in a variety of encodings. The one added member is the TextValue property that exposes the source-level textual encoding of the most recently read token, as literally found in the text stream.

namespace System.Dataflow {

  public class GraphTextReader : GraphReader {

    public GraphTextReader(string filePath);

    public GraphTextReader(string filePath, Encoding encoding);

    public GraphTextReader(Stream stream);

    public GraphTextReader(Stream stream, Encoding encoding);

    public GraphTextReader(TextReader reader);

    public override Identifier Brand { get; }

    public override ConstantClassification ConstantClassification { get; }

    public override int Depth { get; }

    public override Label Label { get; }

    public virtual string TextValue { get; }

    public override void Close();

    public override object GetConstant();

    public override bool Read();

    public override void VerifyReaderPosition(

           GraphStreamTokenKind requiredTokenKind, Label requiredLabel);

  }

}

4.2. GraphWriter

The GraphWriter base class has a low-level bottleneck interface, combined with a set of virtual default implementations built on top (listed separately, further below).

namespace System.Dataflow {

  public abstract class GraphWriter : IDisposable {

    // The number of StartCollections written so far minus number of

    // EndCollections written so far.

    public abstract int Depth { get; }

    // The last written token.  Initially, StartStream.

    public abstract GraphStreamTokenKind TokenKind { get; }

    // Close the writer and any underlying streams held by it.

    // This should be idempotent.

    public abstract void Close();

    protected virtual void Dispose(bool disposing);

    public void Dispose();

    // Flush the writer and any underlying streams held by it

    public abstract void Flush();

    // Write the specified token.  The label paramater may be

    // specified only for StartCollection, Constant, and Reference

    // tokens.  The value paramater may be specified only for

    // Constant and Reference tokens.

    public void Write(GraphStreamTokenKind tokenKind,

           Label label, Identifier brand, object value);

    protected abstract void OnWrite(GraphStreamTokenKind tokenKind,

           Label label, Identifier brand, object value);

  }

}

The following convenience functions map specific write requests to the common Write bottleneck interface.

public void WriteConstant(Label label, Identifier brand, object value);

public void WriteEnd();

public void WriteEndCollection();

public void WriteEndComposite();

public void WriteEndRecord();

public void WriteEndList();

public void WriteEndTuple();

public void WriteStart(NodeKind kind, Label label, Identifier brand);

public void WriteStartCollection(Label label, Identifier brand);

public void WriteStartComposite(Label label, Identifier brand);

public void WriteStartRecord(Label label, Identifier brand);

public void WriteStartList(Label label, Identifier brand);

public void WriteStartTuple(Label label, Identifier brand);

The following strongly typed write methods call WriteConstant by default, causing boxing. They can be overridden to avoid boxing.

public virtual void WriteBoolean(Label label, Identifier brand, bool value);

public virtual void WriteBytes(Label label, Identifier brand, byte[] value);

public virtual void WriteDate(Label label, Identifier brand, Date value);

public virtual void WriteDateTime(Label label, Identifier brand,

       DateTime value);

public virtual void WriteDateTimeOffset(Label label, Identifier brand,

       DateTimeOffset value);

public virtual void WriteDecimal(Label label, Identifier brand,

       decimal value);

public virtual void WriteDouble(Label label, Identifier brand, double value);

public virtual void WriteGraphReference(Label label,

       GraphReference reference);

public virtual void WriteGuid(Label label, Identifier brand, Guid value);

public virtual void WriteInt32(Label label, Identifier brand, int value);

public virtual void WriteInt64(Label label, Identifier brand, long value);

public virtual void WriteNull(Label label, Identifier brand);

public virtual void WriteString(Label label, Identifier brand, string value);

public virtual void WriteTime(Label label, Identifier brand, Time value);

The following method copies from the current reader's token to the end of the fragment, both inclusive. If the reader started on a StartXxx token, then the fragment ends with the matching EndXxx token, inclusive. Otherwise, just the current token is copied. After the call, the reader will be at the token just after the copied token.

public virtual void Write(GraphReader reader);

4.2.1. GraphTextWriter

Class GraphTextWriter mostly implements GraphWriter. It provides a set of constructors to write to a variety of stream targets, using a variety of encodings. The one added feature allows control over indentation width per nesting level: if set to 0, indentation is turned off.

namespace System.Dataflow {

  public class GraphTextWriter : GraphWriter {

    public GraphTextWriter(string filePath);

    public GraphTextWriter(string filePath, Encoding encoding);

    public GraphTextWriter(Stream stream);

    public GraphTextWriter(Stream stream, Encoding encoding);

    public GraphTextWriter(TextWriter writer);

    public override int Depth { get; }

    // The set can be called only before the first write call.

    // The default is 2.  A value of 0 value does no indentation.

    public int Indentation { get; set; }

    public override GraphStreamTokenKind TokenKind { get; }

    public override void Close();

    public override void Flush();

    protected override void OnWrite(GraphStreamTokenKind tokenKind,

           Label label, Identifier brand, object value);

  }

}

4.3. Examples

string villageGraph = @"

  {

    Villagers =>

      {

        Jenn => { Name => 'Jennifer', Age => 28, Spouse => Rich },

        Rich => { Name => 'Richard', Age => 26, Spouse => Jenn },

        Charly => { Name => 'Charlotte', Age => 12 }

      },

    HaveSpouses => { Villagers.Rich, Villagers.Jenn }

  }";

  var input = new StringReader(villageGraph);

  var reader = new GraphTextReader(input);

  reader.Read();

  while (reader.Label != "Rich") { reader.Read(); }

  string name = "";

  int age = 0;

  GraphReference spouse = new GraphReference();

  reader.Read();

  while (reader.TokenKind != GraphStreamTokenKind.EndComposite

    && reader.TokenKind != GraphStreamTokenKind.EndStream)

  {

    switch (reader.Label.Text) {

      case "Age": age = reader.ReadInt32(); break;

      case "Name": name = reader.ReadText(); break;

      case "Spouse": spouse = reader.ReadGraphReference(); break;

      default: throw new NotSupportedException("Unsupported label.");

    }

  }

  reader.Close();

  Assert.AreEqual("Richard", name);

  Assert.AreEqual(26, age);

  Assert.AreEqual("Jenn", spouse.ToString());