Clemens Szyperski
Published: June, 2009
1. Introduction
Developing and utilizing domain-specific languages (DSLs) in
“Oslo” typically involves authoring the grammar in “Intellipad” then using the
“M” command-line tools to process input files against the compiled grammar. The
output of this process is “M” language constructs commonly referred to as
MGraph. Often, this output takes the form of an abstract syntax tree (AST) in
which the highly variable structures of a DSL have been organized into a
standardized graph representation. Data represented in MGraph can be checked
against schema authored in MSchema and can then be loaded into the “Oslo”
repository using the mx.exe command-line tool.
Of course, a command-line tool is meant for interactive use
and automation scripts. Applications and runtimes also need a means to process
MGraph output programmatically, a process known as dynamic parsing. This allows
a user to directly enter DSL input at runtime which the application then
processes without the need to utilize intermediate command-line tools.
The MGraph Object Model (OM) described in this document is
what allows this to happen. It’s the lowest-level runtime model to manage
MGraph instances in .NET applications, that is, processing input against a
grammar through .NET classes written for this purpose; and it is the same set
of classes that “Intellipad” and mx.exe use themselves. Developers will utilize
this object model whenever dynamic parsing is needed. Note that it replaces an
older object model as described in §1.4 below.
Beyond the dynamic-parsing scenarios outlined above, the
MGraph OM is a general .NET framework to manipulate arbitrary graph-shaped
data. Thus, MGraph is really the “data profile” of “M” with a typical instance
looking like this:–
// Populate a small village with some people
Villagers => {
Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },
Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },
Charly => Person { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }
MGraph and “M” are still evolving and, in the May 2009
“Oslo” CTP, several not yet finalized language details appear only in MGraph.
Specifically, the above example exhibits the following details:
- An MGraph instance is a sequence of optionally
labeled edges. An edge is a pair (label, targetNode). In the example above,
there are two such top-level edges, labeled Villagers and HaveSpouses.
- Edge labels appear to the left of the binding
operator (=>). An identifier appearing to the right of the binding operator
and before the value being bound is called a brand.
- MGraph supports a fairly rich set of “M”
intrinsic types for leaf values. (See §2.1.3 for the supported “M” types and
how they map to CLR types.)
- MGraph supports comments (//… and /*…*/ style)
and verbatim text (@"…").
Caution: As a
consequence of this staging of language changes, the MGraph syntax is
temporarily incompatible with that used for graphs (extent initializers) in
MSchema and also, in different ways, from that used for graph construction in
the right-hand side of MGrammar rules. (The Spork sample in the May 2009 CTP
shows how M source can be generated programmatically that is compatible with
MSchema.) All this is going to be cleaned up in the near future.
Architecturally, the object model comprises three main
subsystems: the graph model, graph stores, and graph readers/writers. The graph
model is about construction, traversal/inspection, and updating of graphs.
Graph stores provide the backing implementations for different graph
representations and bridges to external stores. Readers/writers support the
mapping of graphs from and to streams.
This introduction comprises an overview of those three
subsystems. The following sections cover each in full detail, followed by a set
of fully worked examples. Some readers may prefer to skip forward to the
examples sections (§2.7, §3.5, and §4.3) before delving into the details of the
design.
The overall object model strikes a balance between usability
and convenience on the one hand and implementation flexibility and enabling
high performance on the other. Performance, both at the level of individual
operations and at the level of scalability to very large graphs, is a central
feature of the design. Graphs are a very general and low-level abstraction that
many complex designs can be built on top of, but only if the supporting
infrastructure can bear such load.
Heads-up: To meet
this two-pronged goal, the design is unusual in that it relies on a family of
struct types beyond what is common in the .NET framework. Readers unfamiliar
with the various consequences of using structs may want to pause and pay
attention to the related detail. The key property of the design is that almost
all of the structs (but enumerators) are stateless; making their use convenient
and straightforward without requiring constant attention to the fact that
struct types are being used.
1.1. Graph Model
An MGraph instance is a directed, edge-labeled graph. Each
node is either atomic, holding a single value, or non-atomic, holding a
multiset of outgoing edges. Nodes carry an optional brand. Edges outgoing from a node are either
unlabeled or uniquely labeled, relative to their source node. A value held by
an atomic node is either of an MGraph classified type or of unclassified type
(arbitrary other CLR type).
.gif)
A graph allow for sharing: a single node can be reachable
via more than one path (where a path is a sequence of one or more edges). A
graph can contain loops: an edge can refer back to its originating node. A
graph can contain cycles: a path can refer back to its originating node.
.gif)
1.1.1. Labels
and Brands
Edges may be associated with a nominal label, where labels
are either strings or integers. The interpretation of labels is left open in
the graph model, except that labels are mandated to be unique among edges
outgoing from the same source node. A typical interpretation uses labels to
designate roles of successor nodes.
Nodes may be associated with a nominal brand, where brands
are strings. The interpretation of brands is left open in the graph model, but
a typical interpretation uses brands to associate a nominal concept, such as a
type or class, with a node.
1.1.2. Graph
patterns
The graph model introduces no inherent ordering on
successors of a node. Thus, if multiple unlabeled edges originate in a given
node, then the successors (the target nodes of these edges) are distinguishable
only by their content. It is valid to have any number of fully
indistinguishable successors – essentially representing the elements of a multiset
as unlabeled successors of a node.
A node that has only labeled outgoing edges, all of which
are strings, represents a record. The successors are the members of the record;
the labels are the member names.
A node that has at least two successors, all of which are
the targets of labeled edges, using integer labels in the range from zero to
the number of successors minus one, represents a tuple. The successors are the
members of the tuple; the labels are the indices into the tuple.
A node that has only labeled outgoing edges, with exactly
the two labels “Head” and “Tail”, represents a list. The successor labeled
“Head” holds the first value in the list; the successor labeled “Tail” holds
the remaining values in the list. This is the familiar cell-based encoding of
lists.
A node with only unlabeled outgoing edges represents an unordered
collection. The successors are the members of that collection.
A node with both labeled and unlabeled outgoing edges
represents a composite, which is the most general pattern supported.
A node with no successors represents an empty collection,
but also serves to represent an empty list, an empty record (a record with zero
members), and an empty composite.
The following table summarizes the graph patterns, their
corresponding notation in MGraph, and their associated labeling constraints.
For tuples and lists, the expanded (and equivalent) notation that shows the
representation as labeled graphs is also shown over a grey backdrop.
1.1.3. Symbolic
References
Normally, all relationships among nodes in a graph are
expressed using edges. However, it is also possible to hold unresolved symbolic
references to other nodes in a graph. Such references are captured as special
atomic values (of type GraphReference). As the one exception to the branding
mechanism, atomic nodes holding such symbolic references cannot be branded.
Rationale: nodes holding symbolic references are seen as
place holders for the actual target node that they refer to. Reference
resolution substitutes the node located by the resolution procedure for the
target node that held the reference. This way, the edge label is retained and
all properties of the reference target are shared, including the reference
target’s brand, if any.
A symbolic reference is a pair (isGlobal, labelSequence).
The isGlobal flag determines whether the reference is global or local. A global
reference is resolved by starting from a separately defined global source node;
if no such node is defined for a given graph, then global references are
undefined.
A local reference is resolved by following the label
sequence, in order, as designated by the reference, starting at the node
holding the symbolic reference.
In the special case where a global source node is defined
and where a local reference’s first label does not resolve from the node
holding the reference, a special scope determination procedure is applied to
determine the node form where the reference is to be resolved. This procedure
constructs a path from the global source node to the node holding the
reference. If no such path can be constructed, reference resolution fails. If
multiple such paths can be constructed, then an arbitrary one of them is chosen.
The node closest to the node holding the reference that has an outgoing edge
labeled with the first label of the reference at hand is chosen as the scope
node. Resolution proceeds from that node.
The textual syntax for symbolic references is [ label ] ('.'
label)*. A global reference has no initial label, that is, it starts with a
dot: .Jill or .Jack.Spouse.Age. A local reference starts with a label: Name or Spouse.Name.
.gif)
In the example above, the symbolic references Spouse.Age and
.Jack.Spouse can both be resolved, assuming that the shaded node is the
designated global source node. Note that references can be resolved “through”
other references. In the example, .Jill.Spouse.Age resolves by first resolving
the encountered reference .Jill and then resolving Age relative to that node.
Global alternatives for Spouse.Age would be .Jack.Age, .Jill.Spouse.Age,
and so on. Local alternatives would be Spouse.Spouse.Spouse.Age, and so on.
There are no local alternatives for the global reference in the example, .Jack.Spouse.
1.2. Graph
Stores
For most cases of programmatic creation, traversal, use, and
modification of graphs, the types Node and Edge form the main points of access.
For implementers of custom graph “stores” (a term used
broadly for all graph implementations, whether in-memory or virtualized over an
external store), the type GraphStore is the center of the design. When creating
graphs using Node and Edge without reference to any specific GraphSt0re, the
system provides a shadow store implementation.
The architecture of the MGraph object model provides
substantial flexibility for implementers of graph stores, catering for
everything from very simple store implementations that just use a few CLR types
to represent graphs as objects to highly sophisticated implementations that
virtualize over an external store and use dense, non-object-based cache
structures. The external use of such stores is the same in all cases, governed
by the fixed types of the graph model.
1.3. Graph
Readers and Writers
The types GraphReader and GraphWriter are used to map graphs
from and to stream formats. Specific subtypes implement the reading/writing of
specific stream representations; most importantly, GraphTextReader and GraphTextWriter
implement the mapping from/to text streams.
1.4. Transition
from Old to New MGraph Object Model
The MGraph object model covered in this document is meant to
replace the older design around type IGraphBuilder, GraphBuilder, etc. This
older design is to be deprecated and then removed in future releases. At the
current point in time (May 2009 “Oslo” CTP), some of the larger design still
depends on IGraphBuilder; most notably, DynamicParser does.
To enable a staged migration from the old to the new design,
class NodeGraphBuilder is included in the current release. It, too, will be
removed as soon as IGraphBuilder goes. NodeGraphBuilder bridges from the old IGraphBuilder
API to the new GraphStore API.
In the example below, Contact, Alias, and Number are brands
and Info is an edge label. Class NodeGraphBuilder maps labels in the IGraphBuilder
world to brands in the new design unless an entity, in the IEntityBuilder
sense, is built, in which case the entity-member label is mapped to an edge
label in the new design.
string contactGrammar = @"
module M {
language Contacts {
syntax Main =
'Contact' ':' a:Alias => Contact { Info => Alias { a } }
| 'Contact' ':' p:Number => Contact { Info => Number { p } };
token Alias = ('A'..'Z' | 'a'..'z')+;
token Number = ('0'..'9' | '-')+;
}
}";
CompilationResults results = Compiler.Compile(
new CompilerOptions {
Sources = {
new TextItem {
Reader = new StringReader(contactGrammar),
ContentType = TextItemType.MGrammar }
}});
DynamicParser parser = results.ParserFactories["M.Contacts"].Create();
parser.GraphBuilder = new NodeGraphBuilder();
With the parser’s graph builder set to an instance of NodeGraphBuilder,
the results of DynamicParser.Parse calls are always of type Node (not just
object, as in the old design). The following examples show the parsing of
sample input sentences (underlined) and their interpretation using the (new)
graph model.
Node contact = (Node)parser.Parse(
new StringTextStream("Contact:gatsby"), null);
Node info = contact.Edges["Info"];
Node infoSingleValue = info.ViewAllNodes().Single();
Assert.AreEqual(contact.Brand, "Contact");
Assert.AreEqual(info.Brand, "Alias");
Assert.AreEqual(infoSingleValue.AtomicValue, "gatsby");
contact = (Node)parser.Parse(
new StringTextStream("Contact:555-1212"), null);
info = contact.Edges["Info"];
infoSingleValue = info.ViewAllNodes().Single();
Assert.AreEqual(contact.Brand, "Contact");
Assert.AreEqual(info.Brand, "Number");
Assert.AreEqual(infoSingleValue.AtomicValue, "555-1212");
2. Graph
Model
The type Node is the design center for almost all cases
where graphs are being created, traversed, read, written, or updated. Node is a
struct and leads to a family of further struct types that, together, enable
allocation- and copy-free operations over graphs while providing a convenient
type surface that is friendly for Linq and other modern patterns, such as
initializers, as supported by C# 3.0.
Rationale: These struct types contain no implementation or
representation decisions of their own and delegate all such decisions through
type GraphStore to specific implementations. It is useful to think of GraphStore
as a large method dispatch type that provides convenient defaults for
implementers of graph stores, but that is not meant to be used directly. Access
to GraphStore’s through Node and associated types is much more convenient and equally
efficient. In fact, each of the struct types in this design contains exactly
two reference fields: a reference to the underlying graph store and a
store-specific key object. The only exceptions are the enumerator structs that
hold additional enumeration state.
2.1. Node
The Node struct has a large surface area. For the purposes
of documentation, it is broken up into pieces in this section and its
subsections. The following shows the main surface of Node, with references to
subsections for elided members.
namespace System.Dataflow {
public struct Node : IEquatable<Node>, INotifyPropertyChanged {
public static readonly Node Default;
// Constructors, listed in §2.1.4
// Explicit and implicit conversions, listed in §2.1.5
// Read/Write methods, listed in §2.1.6
// Navigation methods, listed in §2.1.7
public static bool operator ==(Node left, Node right);
public static bool operator !=(Node left, Node right);
public object AtomicValue { get; set; }
public Identifier Brand { get; set; }
public EdgeCollection Edges { get; }
public bool IsDefault { get; }
public bool IsListOrEmptyCollection { get; }
public bool IsReadOnly { get; }
public bool IsRecordOrEmptyCollection { get; }
public NodeKind NodeKind { get; }
public GraphStore Store { get; }
public object StoreKey { get; }
public event PropertyChangedEventHandler PropertyChanged { add; remove; }
public static GraphStore CreateDefaultStore();
public static Node CreateNodeForStoreKey(GraphStore store, object key);
public override bool Equals(object obj);
public bool Equals(Node other);
public override int GetHashCode();
public override string ToString();
public NodeCollection ViewAllNodes();
public GraphList ViewAsList();
public GraphDictionary ViewAsRecord();
public GraphList ViewAsTuple();
public GraphList ViewOrderedNodes();
public NodeCollection ViewUnorderedNodes();
}
}
The properties of Node directly reflect the structure of the
graph model; they are covered in detail in §2.1.1.
2.1.1. Node
Properties
public object AtomicValue { get; set; }
public Identifier Brand { get; set; }
public EdgeCollection Edges { get; }
public bool IsDefault { get; }
public bool IsListOrEmptyCollection { get; }
public bool IsReadOnly { get; }
public bool IsRecordOrEmptyCollection { get; }
public NodeKind NodeKind { get; }
public GraphStore Store { get; }
public object StoreKey { get; }
Atomic nodes have an AtomicValue, non-atomic nodes have an
atomic value of null. All nodes, but atomic nodes holding a symbolic reference,
can carry a Brand (§1.1.1). The kind of a node is reflected by its NodeKind
property (§2.1.2).
Non-atomic nodes have a collection of outgoing Edges; for
atomic nodes, this collection is always empty. For a discussion of the rich
surface provided over edge collections, see §.
Since non-atomic nodes with an empty collection of outgoing
edges are reported as having a node kind of Collection, the tests IsListOrEmptyCollection
and IsRecordOrEmptyCollection are provided to handle the case where a node of
interest should follow the list or record pattern, but where the empty case is
permissible.
If the backing store does not support updates of a node, it
is marked as IsReadOnly.
A node that has been constructed using Node’s default
constructor (recall that all CLR structs have a default constructor whose
implementation cannot be overridden), is marked as IsDefault. A node that has
been constructed using any but Node’s default constructor will have a valid
store (Store) and store key (StoreKey). The details of those two concepts can
be found in §3.1, but can be ignored for all but advanced scenarios.
2.1.2. NodeKind
Nodes are of one of a small number of defined kinds.
namespace System.Dataflow {
public enum NodeKind {
Undefined, Atomic, Composite, Collection, List, Record, Tuple
}
}
Only nodes constructed using the Node default constructor
will be of kind Undefined.
2.1.3. Classified
value types
Atomic nodes hold a value of a classified type: a type that
is part of the MGraph profile’s set of atomic types. The following table shows
how the classified CLR types map to M types. C# synonyms, as uses in the type
excerpts in this document, are also shown.
Note that types System.Date and System.Time are not part of
the .NET framework (yet) and found in System.Dataflow.dll.
2.1.4. Node
Constructors
Type Node has a constructor overloads that fall into two
groups: constructors for atomic value nodes and constructors for non-atomic
nodes. These groups are covered in that order below.
Constructors for atomic nodes take value of one of the
classified types (see §2.1.3) and, optionally, a brand and a store
implementation to use. If the value type is not known statically, then the object-typed
constructors can be used.
public Node(object constant);
public Node(Identifier brand, object constant);
public Node(GraphStore store, Identifier brand, object constant);
When holding a strongly typed value, the following
constructor overloads can be used.
Rationale: Using these constructors enables the underlying
graph store to avoid boxing. Whether values get boxed depends on the specific
store implementation.
public Node(bool constant);
public Node(Identifier brand, bool constant);
public Node(GraphStore store, Identifier brand, bool constant);
public Node(Identifier brand, byte[] constant);
public Node(GraphStore store, Identifier brand, byte[] constant);
public Node(Date constant);
public Node(Identifier brand, Date constant);
public Node(GraphStore store, Identifier brand, Date constant);
public Node(DateTime constant);
public Node(Identifier brand, DateTime constant);
public Node(GraphStore store, Identifier brand, DateTime constant);
public Node(DateTimeOffset constant;
public Node(Identifier brand, DateTimeOffset constant;
public Node(GraphStore store, Identifier brand, DateTimeOffset constant);
public Node(decimal constant);
public Node(Identifier brand, decimal constant;
public Node(GraphStore store, Identifier brand, decimal constant);
public Node(double constant);
public Node(Identifier brand, double constant);
public Node(GraphStore store, Identifier brand, double constant);
public Node(GraphReference constant);
public Node(GraphStore store, GraphReference constant);
public Node(Guid constant);
public Node(Identifier brand, Guid constant);
public Node(GraphStore store, Identifier brand, Guid constant);
public Node(int constant);
public Node(Identifier brand, int constant);
public Node(GraphStore store, Identifier brand, int constant);
public Node(long constant);
public Node(Identifier brand, long constant);
public Node(GraphStore store, Identifier brand, long constant);
public Node(string constant);
public Node(Identifier brand, string constant);
public Node(GraphStore store, Identifier brand, string constant);
public Node(Time constant);
public Node(Identifier brand, Time constant);
public Node(GraphStore store, Identifier brand, Time constant);
The following constructors for non-atomic nodes take a
collection of outgoing edges in combination with an optional brand. These
constructors are intended to support the effective cloning of non-atomic nodes.
public Node(EdgeCollection edges);
public Node(Identifier brand, EdgeCollection edges);
The following constructors supported graph patterns are composite,
collection, record, tuple, and list. If left unspecified, composites are built
by default. Optionally, the brand and store to use can be specified. The
successors are specified as enumerables or params of edges.
public Node(IEnumerable<Edge> edges);
public Node(Identifier brand, IEnumerable<Edge> edges);
public Node(NodeKind kind, IEnumerable<Edge> edges);
public Node(Identifier brand, NodeKind kind, IEnumerable<Edge> edges);
public Node(GraphStore store, IEnumerable<Edge> edges);
public Node(GraphStore store, Identifier brand, IEnumerable<Edge> edges);
public Node(GraphStore store, NodeKind kind, IEnumerable<Edge> edges);
public Node(GraphStore store, Identifier brand, NodeKind kind,
IEnumerable<Edge> edges);
public Node(params Edge[] edges);
public Node(Identifier brand, params Edge[] edges);
public Node(NodeKind kind, params Edge[] edges);
public Node(Identifier brand, NodeKind kind, params Edge[] edges);
public Node(GraphStore store, params Edge[] edges);
public Node(GraphStore store, Identifier brand, params Edge[] edges);
public Node(GraphStore store, NodeKind kind, params Edge[] edges);
public Node(GraphStore store, Identifier brand, NodeKind kind,
params Edge[] edges);
The following constructors support the graph patterns collection,
tuple, and list. If left unspecified, collections are built by default.
Optionally, the brand and store to use can be specified. The successors are
specified as enumerables or params of nodes.
public Node(NodeKind kind);
public Node(Identifier brand, NodeKind kind);
public Node(GraphStore store, NodeKind kind);
public Node(GraphStore store, Identifier brand, NodeKind kind);
public Node(params Node[] nodes);
public Node(Identifier brand, params Node[] nodes);
public Node(NodeKind kind, params Node[] nodes);
public Node(Identifier brand, NodeKind kind, params Node[] nodes);
public Node(GraphStore store, params Node[] nodes);
public Node(GraphStore store, Identifier brand, params Node[] nodes);
public Node(GraphStore store, NodeKind kind, params Node[] nodes);
public Node(GraphStore store, Identifier brand, NodeKind kind,
params Node[] nodes);
2.1.5. Node
Conversions
The following explicit conversions on Node support the
extraction of typed values from atomic nodes. They throw if the node is
non-atomic or if it holds a value of a different type.
public static explicit operator bool(Node node);
public static explicit operator byte[](Node node);
public static explicit operator char(Node node);
public static explicit operator Date(Node node);
public static explicit operator DateTime(Node node);
public static explicit operator DateTimeOffset(Node node);
public static explicit operator decimal(Node node);
public static explicit operator double(Node node);
public static explicit operator GraphReference(Node node);
public static explicit operator Guid(Node node);
public static explicit operator int(Node node);
public static explicit operator long(Node node);
public static explicit operator string(Node node);
public static explicit operator Time(Node node);
The following explicit conversions on Node support the
extraction of nullable typed values from atomic nodes. They return null if the
node is non-atomic or if it holds a value of a different type.
public static explicit operator bool?(Node node)
public static explicit operator char?(Node node)
public static explicit operator Date?(Node node)
public static explicit operator DateTime?(Node node)
public static explicit operator DateTimeOffset?(Node node)
public static explicit operator decimal?(Node node)
public static explicit operator double?(Node node)
public static explicit operator Guid?(Node node)
public static explicit operator GraphReference?(Node node)
public static explicit operator int?(Node node)
public static explicit operator long?(Node node)
public static explicit operator Time?(Node node)
The following implicit conversions on Node support the
conversion of values of classified type to atomic nodes.
public static implicit operator Node(bool value);
public static implicit operator Node(char value);
public static implicit operator Node(Date value);
public static implicit operator Node(DateTime value);
public static implicit operator Node(DateTimeOffset value);
public static implicit operator Node(decimal value);
public static implicit operator Node(double value);
public static implicit operator Node(Guid value);
public static implicit operator Node(GraphReference value);
public static implicit operator Node(int value);
public static implicit operator Node(long value);
public static implicit operator Node(string value);
public static implicit operator Node(Time value);
2.1.6. Node
Read/Write Support
The following methods support the reading of a graph from a
stream (yielding a sequence of edges).
Note: Reading
directly from a string is mostly useful for simple scenarios such as unit
tests.
public static IEnumerable<Edge> ReadFrom(GraphReader reader);
public static IEnumerable<Edge> ReadFrom(GraphReader reader,
Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFrom(TextReader reader);
public static IEnumerable<Edge> ReadFrom(TextReader reader,
Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFrom(Stream stream);
public static IEnumerable<Edge> ReadFrom(Stream stream,
Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFrom(Stream stream,
Encoding encoding);
public static IEnumerable<Edge> ReadFrom(Stream stream,
Encoding encoding, Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFrom(string filePath);
public static IEnumerable<Edge> ReadFrom(string filePath,
Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFrom(string filePath,
Encoding encoding);
public static IEnumerable<Edge> ReadFrom(string filePath,
Encoding encoding, Func<GraphStore> createStore);
public static IEnumerable<Edge> ReadFromString(string text);
public static IEnumerable<Edge> ReadFromString(string text,
Func<GraphStore> createStore);
Given a node, the following methods support the writing of
the graph reachable from that node to a stream.
Note: Writing
directly to a string is mostly useful for simple scenarios such as unit tests.
Note: Writing to
a string writes the entire reachable graph, while calling ToString returns a
size-bounded string that summarizes the top structure of the reachable graph.
public void WriteTo(Stream stream);
public void WriteTo(Stream stream, Encoding encoding);
public void WriteTo(string filePath);
public void WriteTo(string filePath, Encoding encoding);
public void WriteTo(TextWriter writer);
public void WriteTo(GraphWriter writer);
public string WriteToString();
Issue: Methods to
directly write a collection of edges should be added to match the model of the
read methods.
2.1.7. Node
Navigation Methods
Given a node, the following methods use that node as a
navigation scope and attempt to resolve a given symbolic reference within that
scope. If the reference is global (reference.IsGlobal), then the resolution is
only supported if the underlying store maintains a set of outer edges. (See §1.1.3
for details on symbolic references and §3.2 for details on reference resolution
as performed by graph stores.)
public Node NavigateFromRootTo(GraphReference reference);
public Node NavigateFromRootTo(string path);
public IAsyncResult BeginNavigateFromRootTo(GraphReference reference,
AsyncCallback callback, object asyncState);
public IAsyncResult BeginNavigateFromRootTo(string path,
AsyncCallback callback, object state);
public Node EndNavigateFromRootTo(IAsyncResult asyncResult);
public Node NavigateTo(GraphReference reference);
public Node NavigateTo(string path);
public IAsyncResult BeginNavigateTo(GraphReference reference,
AsyncCallback callback, object asyncState);
public IAsyncResult BeginNavigateTo(string path,
AsyncCallback callback, object state);
public Node EndNavigateTo(IAsyncResult asyncResult);
2.2. Edge
The type Edge represents a (label, target node) pair.
Typically, edges belong to a source node – the edge is then a graph-theoretic labeled
directed edge from the source to the target node. However, labeled edges in the
MGraph object model can also stand on their own, that is, not have an
associated source node. Such edges are interpreted as leading into a graph,
optionally via a label, while remaining silent on the source of such an edge.
In fact, edges may also be shared among multiple source nodes.
namespace System.Dataflow {
public struct Edge : IEquatable<Edge> {
public static readonly Edge Default = new Edge();
public Edge(Node node);
public Edge(KeyValuePair<Label, Node> pair);
public Edge(Label label, Node node);
public static explicit operator Edge(KeyValuePair<Label, Node> pair);
public static explicit operator KeyValuePair<Label, Node>(Edge edge);
public static bool operator ==(Edge left, Edge right);
public static bool operator !=(Edge left, Edge right);
public bool IsDefault { get; }
public KeyValuePair<Label, Node> KeyValuePair { get; }
public Label Label { get; }
public Node Node { get; }
public override bool Equals(object obj);
public bool Equals(Edge other);
public static IEnumerable<Edge> FromKeyValuePairs(
IEnumerable<KeyValuePair<Label, Node>> pairs);
public static IEnumerable<Edge> FromNodes(IEnumerable<Node> nodes);
public override int GetHashCode();
public static IEnumerable<KeyValuePair<Label, Node>> ToKeyValuePairs(
IEnumerable<Edge> edges);
public static IEnumerable<Node> ToNodes(IEnumerable<Edge> edges);
public override string ToString();
}
}
Besides the properties Label and Node representing the pair,
property KeyValuePair (and a corresponding constructor and a couple of static
helpers) are provided to easy interoperation with existing types defined in
terms of KeyValuePair; most importantly, IDictionary<TKey, TValue>.
The static helper methods FromEdges and FromNodes are provided
to translate between the two common cases of enumerations of edges and
enumerations of (target) nodes.
2.3. Label
Labels are used to represent edge labels. Labels are either
based on non-negative integers or on uninterpreted strings. Labels are atomized
per AppDomain. That is, two labels are equal exactly if their object references
are equal.
namespace System.Dataflow {
[Serializable]
public sealed class Label
: IEquatable<Label>, IComparable, IComparable<Label>
{
public static Label Empty { get; }
public string Text { get; }
public int Value { get; }
public static implicit operator Label(string text);
public static implicit operator Label(int value);
public static bool operator ==(Label left, Label right);
public static bool operator !=(Label left, Label right);
public static bool operator <(Label id1, Label id2);
public static bool operator <=(Label id1, Label id2);
public static bool operator >(Label id1, Label id2);
public static bool operator >=(Label id1, Label id2);
public int Compare(object other);
int IComparable.CompareTo(object obj);
int IComparable<Label>.CompareTo(Label other);
public override bool Equals(object obj);
bool IEquatable<Label>.Equals(Label other)
public static Label Get(int value);
public static Label Get(string text);
public static Label Get(string text, int index, int count);
public override int GetHashCode();
public static bool IsNullOrEmpty(Label label);
public override string ToString();
}
}
2.4. Identifier
Identifiers are used to represent brands. Identifiers are
uninterpreted strings that are atomized per AppDomain. That is, two identifiers
are equal exactly if their object references are equal.
namespace System.Dataflow {
public sealed class Identifier
: IEquatable<Identifier>, IComparable, IComparable<Identifier>
{
public static Identifier Empty { get; }
public string Text { get; }
public static implicit operator Identifier(string text);
public static bool operator ==(Identifier left, Identifier right);
public static bool operator !=(Identifier left, Identifier right);
public static bool operator <(Identifier id1, Identifier id2);
public static bool operator <=(Identifier id1, Identifier id2);
public static bool operator >(Identifier id1, Identifier id2);
public static bool operator >=(Identifier id1, Identifier id2);
public int Compare(object other);
int IComparable.CompareTo(object obj);
int IComparable<Identifier>.CompareTo(Identifier other);
public override bool Equals(object obj);
public static Identifier Get(string text);
public static Identifier Get(string text, int index, int count);
public override int GetHashCode();
public static bool IsNullOrEmpty(Identifier id);
public override string ToString();
bool IEquatable<Identifier>.Equals(Identifier other);
}
}
2.5. GraphReference
Symbolic references are held by atomic nodes as values of
type GraphReference.
namespace System.Dataflow {
public struct GraphReference : IEquatable<GraphReference> {
public static readonly GraphReference Default = new GraphReference();
public GraphReference(string path);
public GraphReference(IEnumerable<Label> segments);
public GraphReference(params Label[] segments);
public GraphReference(IEnumerable<string> segments);
public GraphReference(params string[] segments);
public static explicit operator GraphReference(string path);
public static implicit operator Node(GraphReference value);
public static bool operator ==(GraphReference reference1,
GraphReference reference2);
public static bool operator !=(GraphReference reference1,
GraphReference reference2);
public int Count { get; }
public bool IsDefault { get; }
public bool IsGlobal { get; }
public IEnumerable<Label> Segments { get; }
public Label this[int index] { get; }
public override bool Equals(object obj);
public bool Equals(GraphReference other);
public static string Format(GraphReference reference);
public override int GetHashCode();
public static Label LabelFromSegment(string segment);
public override string ToString();
public static Label[] Parse(string path);
}
}
A global reference (one starting with a dot in “M” notation)
is flagged as IsGlobal and will have a first segment that is empty.
The Format method creates a string from a given reference
that adheres to MGraph (and thus “M”) syntax, including proper quoting.
The LabelFromSegment method constructs an integer label if
possible and a string label otherwise.
2.6. EdgeCollection
EdgeCollection is a lightweight struct that represents the
collection of outgoing edges of the corresponding source node (Source property).
namespace System.Dataflow {
public struct EdgeCollection : ICollection<Edge> {
public int Count { get; }
public bool IsDefault { get; }
public bool IsReadOnly { get; }
public Node Source { get; }
public Node this[Label label] { get; set; }
public static bool operator ==(EdgeCollection left,
EdgeCollection right);
public static bool operator !=(EdgeCollection left,
EdgeCollection right);
public void Add(Edge item);
public void Add(Label label, Node node);
public void Add(Node node);
public void Add(params Edge[] edges);
public void Add(params object[] nodes);
public void Clear();
public bool Contains(Edge item);
public bool ContainsLabel(Label label);
public void CopyTo(Edge[] array, int arrayIndex);
public override bool Equals(object obj);
public EdgeCollectionEnumerator GetEnumerator();
IEnumerator<Edge> IEnumerable<Edge>.GetEnumerator();
IEnumerator IEnumerable.GetEnumerator();
public override int GetHashCode();
public Node GetLabeledEdgeOrDefault(Label label);
public IEnumerable<Label> GetLabels();
public bool Remove(Edge item);
public override string ToString();
}
}
The Add overloads on EdgeCollection allow for direct
contributions to the edge collection of a node. The C# collection initializer
mechanism can be used for this purpose. The following example shows the direct
use of Add on an existing node’s edge collection as well as the creation of a
new node with a new edge collection created using a collection initializer:–
person.Edges.Add( // Contribute new person to collection of villagers.
"Manager",
new Node(
"Person",
new EdgeCollection { { "Name", "John" }, { "Id", 42 } }
)
);
Rationale: Note
that the primary enumerator, EdgeCollectionEnumerator, is again a struct. This
enables the performance optimizations around the static enumerator pattern as
supported by the C# compiler.
namespace System.Dataflow {
public struct EdgeCollectionEnumerator : IEnumerator<Edge> {
public EdgeCollectionEnumerator(GraphStoreEdgeCollectionAdapter adapter,
EmbeddableEnumeratorState state);
public Edge Current { get; }
object IEnumerator.Current { get; }
public void Dispose();
public bool MoveNext();
public void Reset();
}
}
2.7. Examples
Below is a simple scenario for using Node directly to
construct a few simple things:–
var two = new Node(2);
var @true = new Node(true);
var twoAndTrue = new Node(two, @true);
var flag =
(from node in twoAndTrue.ViewAllNodes()
where node.AtomicValue is bool
select (bool)node).Single();
var number =
(from node in twoAndTrue.ViewAllNodes()
where node.AtomicValue is int
select (int)node).Single();
Assert.IsTrue(flag);
Assert.AreEqual(2, number);
Given the two atomic nodes constructed above, a simple tuple
can be constructed:–
var pair = new Node(NodeKind.Tuple, two, @true);
var pairAsTuple = pair.ViewAsTuple();
var first = pairAsTuple[0];
var second = pairAsTuple[1];
Assert.AreEqual(2, (int)first);
Assert.IsTrue((bool)second);
Note the use of ViewAsTuple() on a node obtain a view that
interprets the node’s successors using the tuple pattern. Given such a view,
tuple elements can then be accessed by indexing.
The following example shows the construction of instances of
the list pattern as well as the equality properties of atomic and non-atomic
nodes. Specifically, atomic nodes have equality defined based on their value,
while non-atomic nodes have equality defined based on their
(graph-store-managed) identity.
var constants = new Node(NodeKind.List, first, two);
var structures = new Node(NodeKind.List, tuple, tuple);
var constantsAsList = constants.ViewAsList();
var structuresAsList = structures.ViewAsList();
Assert.AreEqual(constantsAsList[0], constantsAsList[1]); // by value
Assert.AreEqual(structuresAsList[0], structuresAsList[1]); // by identity
Graphs can be read from streamed forms:–
string villageGraph = @"
{
Villagers =>
{
Jenn => Person { Name => 'Jennifer', Age => 28, Spouse => Rich },
Rich => Person { Name => 'Richard', Age => 26, Spouse => Jenn },
Charly => Person { Name => 'Charlotte' }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }
}";
var village = Node.ReadFromString(villageGraph).Single().Node;
var villagers = village.Edges["Villagers"];
New successors can be contributed to a non-atomic node:–
villagers.Add( // Contribute new person to collection of villagers.
"Billy",
new Node(
"Person",
new EdgeCollection { { "Name", "William" }, { "Age", 3 } }
)
);
villagers.Add( // Contribute age to existing person Charly.
"Charly",
new Node(new EdgeCollection { { "Age", 12 } })
);
var charly = villagers.Edges["Charly"];
var billy = villagers.Edges["Billy"];
village.Add(
"Youth",
new Node(charly, billy)
);
var youth = village.Edges["Youth"];
var charly2 =
(from node in youth.ViewAllNodes()
where node.Edges["Name"] == "Charlotte"
select node).Single();
var billy2 =
(from node in youth.ViewAllNodes()
where node.Edges["Age"] == 3
select node).Single();
Assert.AreEqual(billy, billy2);
Assert.AreEqual(charly, charly2);
Assert.IsTrue(charly.Edges["Age"] == 12);
Note: The
contribution mechanics of the system respect the rules of graph merging (also
referred to as graph coalescing) of “M”: if a contributed edge carries a label
and the node being contributed to already has an outgoing edge with the same
label, then both edges need to point to a non-atomic node. The content of the
contributed edge’s target node will be merged into the content of the node that
is being contributed to. This process repeats recursively. In the example
above, this mechanism is used to add an Age edge to the existing node under Villagers.Charly.
3. Graph
Stores
Graph stores are implementations of graph representations,
including possible virtualization over external stores.
Besides the abstract type GraphStore, the system provides
two concrete implementations: DefaultGraphStore and ShadowGraphStore. Graph
stores have a large feature surface area that is mostly covered by defaults
provided by GraphStore. This large area allows store implementations to
specialize deeply and gain high-performance characteristics; the almost
complete coverage by default implementations enables a low barrier to entry for
custom graph-store implementations.
The DefaultGraphStore implementation uses in-memory
structures to represent graphs and can be instantiated as needed to create
separate graphs.
The ShadowGraphStore implementation builds on the default
one, but exists in only one singleton instance per AppDomain. It is used
automatically when constructing nodes using type Node without specifying a
store. The shadow store is stateless, enabling thread-safe use as well as
avoiding coupling among graphs independently created using the shadow store. To
avoid state, the shadow store does not maintain a collection of outer edges
and, therefore, has a default synthetic source node.
3.1. Stores
and Store Keys
Graph stores deal in abstract nodes of externally invisible
representation. Stores identify such nodes by handing out and taking back store
keys: objects of unrefined type object.
A node is uniquely identified by a pair (store, store key).
It is possible that a group of related stores share keys. For example, one
store may represent a graph’s nodes while another, related store, may represent
a particular kind of metadata associated with the nodes of the former store. By
using the same keys for both stores, that association is maintained efficiently
without “leaking” any particular association mechanism (such as attached
properties) and without committing a store to a single metadata story.
The Node/Edge surface types all use (store, store key) pairs
as their only state. Creating, copying, and destroying instances of these
surface types is therefore very cheap and, in particular, creates no pressure
on the garbage collector.
3.2. Reference
Resolution
A graph store may provide a reference resolution service. If
provided, that service can be used to perform a number of tasks: (i) resolve
symbolic references, yielding the target nodes of references, (ii) substitute
resolved targets for symbolic references, and (iii) compute a set of symbolic
references that, if substituted in for some of a graph’s edges, yields a
sharing and cycle free graph.
Most graph stores can work with the default resolver
implementation provided in the GraphReferenceResolver class. The default
resolver is stateless. Custom resolvers can subclass the default resolver.
namespace System.Dataflow {
public class GraphReferenceResolver {
public virtual bool TryBreakAllSharing(Node root,
out Dictionary<Node, GraphReferenceTable> referenceTables);
public virtual bool TryResolveAllReferences(Node root);
public virtual Node ResolveReferenceInScope(Node scope,
GraphReference reference);
public virtual IAsyncResult BeginResolveReferenceInScope(
Node scope, GraphReference reference,
AsyncCallback callback, object asyncState);
public virtual Node EndResolveReferenceInScope(IAsyncResult result);
}
}
BeginResolveReferenceInScope and EndResolveReferenceInScope
implement the asynchronous pattern for asynchronous invocation of ResolveReferenceInScope.
The default resolver is synchronous, but custom resolvers may opt to implement
asynchronous resolution.
Rationale: Asynchronous reference resolution allows virtual
graph stores to “page in” further parts of a graph without blocking clients
that have been written to use the asynchronous version of ResolveReferenceInScope.
This is the only asynchronous entry point into graph stores, keeping the
overall programming model straight forward.
3.3. GraphStore
Class GraphStore is the abstract base for all graph store
implementations. It provides default implementations for most parts of its
(very large) surface area, allowing graph store implementations to be gradually
enriched to address performance issues.
3.3.1. Graph
store clients
Below is the public surface area of GraphStore; writing
client code against GraphStore is for advanced scenarios – most use cases
should be use the graph model types (Node, Edge, …) instead.
namespace System.Dataflow {
public abstract class GraphStore {
public virtual IEqualityComparer<object> KeyComparer { get; }
public virtual ICollection<Edge> OuterEdges { get; }
public virtual GraphReferenceResolver ReferenceResolver { get; set; }
public virtual Node SyntheticSource { get; }
public void AddContribution(GraphReader reader);
public void AddContribution(Node node, IEnumerable<Edge> edges);
public virtual void AddContribution(Node node, Edge edge);
public virtual void AddPropertyChangedEventHandler(Node node,
PropertyChangedEventHandler eventHandler);
public Node CreateCollection(Identifier brand, NodeKind kind);
public virtual Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Node> nodes);
public virtual Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Edge> edges);
public virtual Node CreateConstant(Identifier brand, object value);
public virtual Node CreateConstant(Identifier brand, byte[] value);
public virtual Node CreateConstant(Identifier brand, Date value);
public virtual Node CreateConstant(Identifier brand, DateTime value);
public virtual Node CreateConstant(Identifier brand,
DateTimeOffset value);
public virtual Node CreateConstant(Identifier brand, decimal value);
public virtual Node CreateConstant(Identifier brand, Double value);
public virtual Node CreateConstant(Identifier brand, Guid value);
public virtual Node CreateConstant(Identifier brand, int value);
public virtual Node CreateConstant(Identifier brand, long value);
public virtual Node CreateConstant(Identifier brand, bool value);
public virtual Node CreateConstant(Identifier brand); // creates null
public virtual Node CreateConstant(GraphReference value);
public virtual Node CreateConstant(Identifier brand, string value);
public virtual Node CreateConstant(Identifier brand, Time value);
public abstract Identifier GetBrand(Node node);
public virtual ConstantClassification GetConstantClassification(
object value);
public static ConstantClassification GetConstantClassification(
object value);
public virtual bool GetConstantBoolean(Node node);
public virtual byte[] GetConstantBytes(Node node);
public virtual Date GetConstantDate(Node node);
public virtual DateTime GetConstantDateTime(Node node);
public virtual DateTimeOffset GetConstantDateTimeOffset(Node node);
public virtual decimal GetConstantDecimal(Node node);
public virtual double GetConstantDouble(Node node);
public virtual GraphReference GetConstantGraphReference(Node node);
public virtual Guid GetConstantGuid(Node node);
public virtual int GetConstantInt32(Node node);
public virtual long GetConstantInt64(Node node);
public virtual string GetConstantString(Node node);
public virtual Time GetConstantTime(Node node);
public virtual object GetConstantValue(Node node);
public virtual IDictionary<Label, Node> GetLabeledEdges(Node node);
public abstract NodeKind GetNodeKind(Node node);
public virtual ICollection<Node> GetUnlabeledEdges(Node node);
public Node NavigateFromRootTo(GraphReference reference);
public IAsyncResult BeginNavigateFromRootTo(GraphReference reference,
AsyncCallback callback, object asyncState);
public Node EndNavigateFromRootTo(IAsyncResult asyncResult);
public virtual Node NavigateTo(Node scope, GraphReference reference);
public virtual IAsyncResult BeginNavigateTo(Node scope,
GraphReference reference, AsyncCallback callback,
object asyncState);
public virtual Node EndNavigateTo(IAsyncResult asyncResult);
public virtual bool OwnsKey(object key);
public static GraphStore ReadFrom(TextReader reader);
public static GraphStore ReadFrom(Func<GraphStore> createDoc,
TextReader reader);
public static T ReadFrom<T>(TextReader reader)
where T : GraphStore, new();
public static GraphStore ReadFrom(GraphReader reader);
public static GraphStore ReadFrom(Func<GraphStore> createStore,
GraphReader reader);
public static T ReadFrom<T>(GraphReader reader)
where T : GraphStore, new();
public static GraphStore ReadFrom(string filePath);
public static GraphStore ReadFrom(Func<GraphStore> createDoc,
string filePath);
public static T ReadFrom<T>(string filePath)
where T : GraphStore, new();
public static GraphStore ReadFrom(string filePath, Encoding encoding);
public static GraphStore ReadFrom(Func<GraphStore> createDoc,
string filePath, Encoding encoding);
public static T ReadFrom<T>(string filePath, Encoding encoding)
where T : GraphStore, new();
public static GraphStore ReadFrom(Stream stream);
public static GraphStore ReadFrom(Func<GraphStore> createDoc,
Stream stream);
public static T ReadFrom<T>(Stream stream) where T : GraphStore, new();
public static GraphStore ReadFrom(Stream stream, Encoding encoding);
public static GraphStore ReadFrom(Func<GraphStore> createDoc,
Stream stream, Encoding encoding);
public static T ReadFrom<T>(Stream stream, Encoding encoding)
where T : GraphStore, new();
public static GraphStore ReadFromString(string text);
public static GraphStore ReadFromString(Func<GraphStore> createDoc,
string text);
public static T ReadFromString<T>(string text)
where T : GraphStore, new();
public virtual void RemovePropertyChangedEventHandler(Node node,
PropertyChangedEventHandler eventHandler);
public virtual void ResolveAllReferences();
public virtual void SetBrand(Node node, Identifier brand);
public virtual void SetConstantValue(Node node, object value);
public virtual void SetEdge(Node node, Label label, Node value);
public virtual void SetListValue(Node list, int index, Node element);
public virtual void SetReferenceTarget(Action<Node> nodeVariable,
Node target);
public virtual string ToString(Node node);
public void WriteTo(string filePath);
public void WriteTo(string filePath, Encoding encoding);
public void WriteTo(Stream stream);
public void WriteTo(Stream stream, Encoding encoding);
public void WriteTo(TextWriter target);
public void WriteTo(GraphWriter writer);
public virtual void WriteTo(GraphWriter writer, Node root,
bool includeRootNode);
public string WriteToString();
}
}
The SyntheticSource property provides a node that is not
technically part of the graph managed by the graph store, but that
synthetically gathers all outer edges of the graph. Graph stores may or may not
keep track of such outer edges and even if they do, may not do so completely.
That is, there may be parts of a graph that are unreachable through any of the
outer edges – and thus from the synthetic source node.
Rationale: Providing
outer edges enables a general model of disconnected graphs that are still
globally reachable. Providing the synthetic source over the collection of outer
edges is a convenience mechanism for many graph algorithms that naturally
operate from a given single source node.
3.3.2. Adapters
for Graph-Pattern Views
Adapters are used by graph stores to possibly provide
streamlined, allocation free access and enumeration for specialized views over
graph patterns. The adapter design allows for implementations that are
stateless singletons – and all default adapters are such singletons.
Adapters map point functions to efficient paths in the store
implementation, as available. The default adapters all map to the
collection-level getters (and their collection's enumerators).
public virtual GraphStoreNodeCollectionAdapter GetAdapterForAllNodes(
object storeKey);
public virtual GraphStoreEdgeCollectionAdapter GetAdapterForEdges(
object storeKey);
public virtual GraphStoreListAdapter<Node> GetAdapterForLists(
object storeKey);
public virtual GraphStoreDictionaryAdapter GetAdapterForRecords(
object storeKey);
public virtual GraphStoreNodeCollectionAdapter GetAdapterForRecordValues(
object storeKey);
public virtual GraphStoreListAdapter<Node> GetAdapterForOrderedNodes(
object storeKey);
public virtual GraphStoreNodeCollectionAdapter GetAdapterForUnorderedNodes(
object storeKey);
public virtual GraphStoreListAdapter<Node> GetAdapterForTuples(
object storeKey);
The (abstract) adapter classes referenced above provide the
protocols of corresponding collection abstractions. The following table shows
the correspondence:–
As an example, below is the class GraphStoreNodeCollectionAdapter:–
namespace System.Dataflow {
public abstract class GraphStoreNodeCollectionAdapter {
public abstract void Add(GraphStore store, object storeKey, Node item);
public abstract void Clear(GraphStore store, object storeKey);
public abstract bool Contains(GraphStore store, object storeKey,
Node item);
public abstract void CopyTo(GraphStore store, object storeKey,
Node[] array, int arrayIndex);
public abstract void Dispose(ref EmbeddableEnumeratorState state);
public abstract int GetCount(GraphStore store, object storeKey);
public abstract Node GetCurrent(ref EmbeddableEnumeratorState state);
public abstract EmbeddableEnumeratorState GetEnumerator(GraphStore store,
object storeKey);
public abstract bool IsReadOnly(GraphStore store, object storeKey);
public abstract bool MoveNext(ref EmbeddableEnumeratorState state);
public abstract void Reset(ref EmbeddableEnumeratorState state);
public abstract bool Remove(GraphStore store, object storeKey,
Node item);
}
}
As long as specific enumeration implementations can live
with the limited space to encode state afforded by the struct EmbeddableEnumeratorState,
all enumeration can be kept allocation-free. However, it is equally simple to
just store a reference to a standard (heap-allocated) enumerator in the object
reference held by that struct and use that to connect to regular enumerable
collections.
namespace System.Dataflow {
public struct EmbeddableEnumeratorState {
public GraphStore Store { get; set; }
public object StoreKey { get; set; }
public object State { get; set; }
public long Index { get; set; }
}
}
3.3.3. Custom
graph stores
Class GraphStore implements split/merge mechanisms to enable
simple custom graph store implementations while not getting in the way of sophisticated,
high-performance ones.
Simple graph stores merely operate in terms of edge
collections and GraphStore splits those edge collections into a collection of
nodes for unlabeled edges and a dictionary from label to node for labeled
edges. From there, it projects mutable views that correspond to the list and
tuple graph patterns. The following method illustrates how this is achieved by
revealing some of the implementation details of GraphStore:–
protected virtual IList<Node> GetTupleValues(Node node) {
var coll = new NodeAsSplitCollection { Node = node };
return new TupleView(coll);
}
Here, NodeAsSplitCollection is an internal class used by GraphStore
to implement a mutable view implementing ISplitCollection over an arbitrary
node. This is the splitting step.
namespace System.Dataflow {
public interface ISplitCollection : IEnumerable<Edge> {
Identifier Brand { get; }
ICollection<Node> UnlabeledEdges { get; }
bool IsReadOnly { get; }
IDictionary<Label, Node> LabeledEdges { get; }
}
}
The second step wraps a mutable tuple view (using the
internal class TupleView) around the split view, yielding the mutable IList<Node>
that represents the tuple elements.
Sophisticated graph stores operate directly in terms of
specialized views with an opportunity to fine-tune the representations chosen
for each of the standard graph patterns. Class DefaultGraphStore (§3.4) is an
example of such a sophisticated implementation that implements the various
graph patterns using separate representation classes and that hooks into the GraphStore
logic directly at the ISplitCollection level.
Below is the abstract surface area of GraphStore that every
graph store has to implement as well as the protected hooking surface to hook
subclasses into the default implementations in the base class:–
namespace System.Dataflow {
public abstract class GraphStore {
protected static ICollection<Edge> EmptyEdges { get; }
// Valid if IsNode(node) && GetKind(node) != Reference,
// Identifier.Empty otherwise. Never returns null or throws.
public abstract Identifier GetBrand(Node node);
protected virtual IEnumerator<Edge> GetEdgeEnumerator(Node node);
// Valid if IsCollection(node), empty immutable collection otherwise.
// Never returns null or throws.
// post: non-null collection of pairs, where pair.Key is either null
// or unique; may be empty
protected abstract ICollection<Edge> GetEdges(Node node);
// pre: IsRecord(node) && HasItem(node, label)
protected virtual Node GetLabeledEdge(Node node, Label label);
// pre: IsRecord(node)
protected virtual Node GetLabeledEdgeOrDefault(Node node, Label label);
// Valid if IsNode(node), empty enumeration otherwise.
// Never returns null or throws.
protected virtual IEnumerable<Node> GetLabeledValues(Node node);
// Valid if IsNode(node), empty enumeration otherwise.
// Never returns null or throws.
protected virtual IEnumerable<Label> GetLabels(Node node);
// Valid if IsList(node), empty immutable list otherwise.
// Never returns null or throws.
protected virtual IList<Node> GetListValues(Node node);
// Valid if IsNode(node), MNodeKind.None otherwise.
// Never returns null or throws.
public abstract NodeKind GetNodeKind(Node node);
// Valid if IsList(node) or IsTuple(node),
// empty immutable collection otherwise.
// Never returns null or throws.
protected virtual IList<Node> GetOrderedValues(Node node);
// Valid if IsTuple(node), empty immutable list otherwise.
// Never returns null or throws.
protected virtual IList<Node> GetTupleValues(Node node);
// Default: returns GetUnlabeledNodeCollection(node);
// override to direct contributions to lists/tuples/...
protected virtual ICollection<Node> GetUnlabeledContributionTarget(
Node node);
// Valid if IsCollection(node), empty immutable collection otherwise.
// Never returns null or throws.
protected virtual ICollection<Node> GetValues(Node node);
// For all Has/Is tests but IsNode:
// valid if IsNode(node), false otherwise.
// Never returns null or throws.
protected virtual bool HasLabel(Node node, Label label);
// If IsAtomic(node) then !node.IsDefault && IsNode(node)
// && !IsCollection(node)
protected virtual bool IsAtomic(Node node);
// If IsCollection(node) then !node.IsDefault && IsNode(node)
// && IsComposite(node)
protected virtual bool IsCollection(Node node);
// If IsComposite(node) then !node.IsDefault && IsNode(node)
// && !IsAtomic(node)
protected virtual bool IsComposite(Node node);
// If IsList(node) then !node.IsDefault && IsNode(node) && IsRecord(node)
protected virtual bool IsList(Node node);
// If IsRecord(node) then !node.IsDefault && IsNode(node)
// && IsComposite(node)
protected virtual bool IsRecord(Node node);
// If IsTuple(node) then !node.IsDefault && IsNode(node)
// && IsRecord(node)
protected virtual bool IsTuple(Node node);
}
}
The following shows the virtual members of GraphStore,
together with the default behavior exposed.
namespace System.Dataflow {
public abstract class GraphStore {
// Default key comparer uses object.Equals/GetHashCode
public virtual IEqualityComparer<object> KeyComparer { get; }
// Default returns SyntheticSource.Store.GetEdges(SyntheticSource)
public virtual ICollection<Edge> OuterEdges { get; }
// Initialized to direct instance of GraphReferenceResolver
public virtual GraphReferenceResolver ReferenceResolver { get; set; }
// Default uses ShadowGraphStore to create a composite node
// to hold outer edges
public virtual Node SyntheticSource { get; }
// Default coalescing logic; override to distribute logic elsewhere
// or to coalesce lazily.
public virtual void AddContribution(Node node, Edge edge);
// Default: empty
public virtual void AddPropertyChangedEventHandler(Node node,
PropertyChangedEventHandler eventHandler);
// Override to accelerate the case where the collection
// is known to be constructed over unlabeled edges only
public virtual Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Node> nodes);
// Override at least this one method to support store-specific
// construction. Default throws NotSupportedException, rendering
// the store essentially read-only.
public virtual Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Edge> edges);
// By default, all constants (incl. null) are representing
// themselves in a graph. Therefore, by default, only the default
// brand is accepted; otherwise this method throws.
public virtual Node CreateConstant(Identifier brand, object value);
// The following methods, by default, just call through the boxing
// CreateConstant method above. Override to avoid boxing.
public virtual Node CreateConstant(Identifier brand, byte[] value);
public virtual Node CreateConstant(Identifier brand, Date value);
public virtual Node CreateConstant(Identifier brand, DateTime value);
public virtual Node CreateConstant(Identifier brand,
DateTimeOffset value);
public virtual Node CreateConstant(Identifier brand, decimal value);
public virtual Node CreateConstant(Identifier brand, Double value);
public virtual Node CreateConstant(Identifier brand, Guid value);
public virtual Node CreateConstant(Identifier brand, int value);
public virtual Node CreateConstant(Identifier brand, long value);
public virtual Node CreateConstant(Identifier brand, bool value);
public virtual Node CreateConstant(Identifier brand); // creates null
public virtual Node CreateConstant(GraphReference value);
public virtual Node CreateConstant(Identifier brand, string value);
public virtual Node CreateConstant(Identifier brand, Time value);
// For adapter retrieval methods, see §3.3.2
// Default returns GetConstantClassification(node.StoreKey),
// assuming that constant values represent themselves.
// Returns ConstantClassification.Node if value cannot be classified,
// never throws.
public virtual ConstantClassification GetConstantClassification(
Node node);
// The following methods share the precondition:
// IsConstant(node) && GetConstantClassification(node) == ...
// Override to enable fast access to type-specific atom representations.
public virtual bool GetConstantBoolean(Node node);
public virtual byte[] GetConstantBytes(Node node);
public virtual Date GetConstantDate(Node node);
public virtual DateTime GetConstantDateTime(Node node);
public virtual DateTimeOffset GetConstantDateTimeOffset(Node node);
public virtual decimal GetConstantDecimal(Node node);
public virtual double GetConstantDouble(Node node);
public virtual GraphReference GetConstantGraphReference(Node node);
public virtual Guid GetConstantGuid(Node node);
public virtual int GetConstantInt32(Node node);
public virtual long GetConstantInt64(Node node);
public virtual string GetConstantString(Node node);
public virtual Time GetConstantTime(Node node);
// Valid if IsConstant(node), null otherwise, never throws.
public virtual object GetConstantValue(Node node);
// Valid if IsNode(node), empty immutable dictionary otherwise.
// Never returns null or throws.
// Default constructs a view over the result returned by GetEdges(node)
public virtual IDictionary<Label, Node> GetLabeledEdges(Node node);
// Valid if IsCollection(node), empty immutable collection otherwise.
// Never returns null or throws.
// Default constructs a view over the result returned by GetEdges(node)
public virtual ICollection<Node> GetUnlabeledEdges(Node node);
// By default, only SyntheticRoot (OuterEdges) is mutable.
public virtual bool IsReadOnly(Node node);
// Default returns ReferenceResolver.ResolveReferenceInScope
public virtual Node NavigateTo(Node scope, GraphReference reference);
// Default returns ReferenceResolver.BeginNavigateTo
public virtual IAsyncResult BeginNavigateTo(Node scope,
GraphReference reference, AsyncCallback callback,
object asyncState);
// Default returns ReferenceResolver.EndNavigateTo
public virtual Node EndNavigateTo(IAsyncResult asyncResult);
// By default, it is assumed that non-store-specific objects
// can be used as keys. For instance, stores may represent
// atomic nodes just by their (possibly boxed) constant value.
// OwnsKey enables strongly key-owning stores to protect against
// foreign store keys. However, OwnsKey cannot be used to reliably
// determine the owner for a store key, unless it is known that
// _all_ queried stores use strongly owned keys.
public virtual bool OwnsKey(object key);
// Default: empty
public virtual void RemovePropertyChangedEventHandler(Node node,
PropertyChangedEventHandler eventHandler);
// Default: calls
// ReferenceResolver.TryResolveAllReferences(SyntheticSource)
// and throws if some references cannot be resolved.
public virtual void ResolveAllReferences();
// pre: !IsReadOnly(node) && !IsReference(node)
// Default: throw NotSupportedException
public virtual void SetBrand(Node node, Identifier brand);
// pre: !IsReadOnly(node) && IsAtomic(node)
// Default: throw NotSupportedException
public virtual void SetConstantValue(Node node, object value);
// pre: !IsReadOnly(node) && IsCollection(node)
// Default: throw NotSupportedException
public virtual void SetEdge(Node node, Label label, Node value);
// pre: !IsReadOnly(list) && IsList(list)
// Default: throw NotSupportedException
public virtual void SetListValue(Node list, int index, Node element);
// pre: nodeVariable != null && IsNode(target)
// Note: Stores that ban setting of reference targets need to also
// ban reference resolution.
// Default: call nodeVariable(target)
public virtual void SetReferenceTarget(Action<Node> nodeVariable,
Node target);
// post: if string.Empty is returned,
// then no special string encoding is available.
// (The caller may substitute a default string value in such a case.)
// Default: builds a debugger-friendly size-limited summary
// of the graph reachable from the given node.
public virtual string ToString(Node node);
// Write the reachable graph from the given node.
// If includeRootNode, then include the given node, otherwise write
// a sequence of that node’s outgoing edges.
// Default: use ReferenceResolver.TryBreakAllSharing to convert edges
// into symbolic references as needed to eliminate sharing and cycles.
public virtual void WriteTo(GraphWriter writer, Node root,
bool includeRootNode)
3.4. DefaultGraphStore
The class DefaultGraphStore uses classes DefaultGraphStoreCollectionNode,
DefaultGraphStoreCompositeNode, DefaultGraphStoreConstantNode, DefaultGraphStoreListNode,
DefaultGraphStoreNode, DefaultGraphStoreRecordNode, and DefaultGraphStoreTupleNode
to implement the MGraph model using CLR objects in a direct way. These classes
are available to ease the implementation of similar custom graph stores.
Two base classes simplify the construction of similar
representation classes: DefaultGraphStoreNode and DefaultGraphStoreCollectionBase.
namespace System.Dataflow {
public abstract class DefaultGraphStoreNode {
public GraphStore Store { get; set; }
public bool IsReadOnly { get; set; }
}
}
All default representations maintain a back link to their
store, including the representations that hold atomic values. That is, DefaultGraphStore
does not use values to represent themselves, that is, to be their own store
key. The uniform back link is used to implement a strong version of GraphStore.OwnsKey
in DefaultGraphStore:–
public override bool OwnsKey(object key) {
var node = key as DefaultGraphStoreNode;
return node != null && node.Store == this;
}
Rationale: A
strong OwnsKey implementation helps catching errors in initial multi-store
graph code. However, such checking is not required and stores may opt to not
pay the price to enable such checking.
namespace System.Dataflow {
public abstract class DefaultGraphStoreCollectionBase
: DefaultGraphStoreNode, ISplitCollection
{
public static readonly Label HeadLabel = Label.Get("Head");
public static readonly Label TailLabel = Label.Get("Tail");
public virtual Identifier Brand { get; set; }
public virtual bool IsList { get; }
public virtual bool IsRecord { get; }
public virtual bool IsTuple { get; }
public abstract IDictionary<Label, Node> LabeledEdges { get; }
public abstract ICollection<Node> UnlabeledEdges { get; }
public virtual void Add(IEnumerable<Edge> edges);
public virtual void Add(IEnumerable<Node> nodes);
public virtual void Add(ILookup<Label, Node> nodesByLabel);
public virtual IEnumerator<Edge> GetEnumerator();
IEnumerator IEnumerable.GetEnumerator();
public virtual ICollection<Node> ViewAllNodes();
public virtual ICollection<Edge> ViewEdges();
public virtual IList<Node> ViewAsList();
public virtual IList<Node> ViewAsTuple();
}
}
Note that the above Add methods assume that coalescing
(merging) has already been performed.
The default implementations of View… use the same (internal)
view implementations as GraphStore, enabling pointwise implementation of some
graph patterns while leaving others to default mechanisms.
3.4.1. Specialized
Implementation of Tuple Graph Pattern
Most of the specialized implementations are omitted in this
document as their functionality is self evident. However, below are the details
of DefaultGraphStoreTupleNode to follow up on the implementation details around
the default implementation of the tuple graph pattern (§3.3.3).
namespace System.Dataflow {
public class DefaultGraphStoreTupleNode : DefaultGraphStoreCollectionBase {
public override bool IsList { get; }
public override bool IsRecord { get; }
public override bool IsTuple { get; }
public override IDictionary<Label, Node> LabeledEdges { get; }
public override ICollection<Node> UnlabeledEdges { get; }
public override void Add(IEnumerable<Edge> edges);
public override void Add(IEnumerable<Node> nodes);
public override IEnumerator<Edge> GetEnumerator();
public override ICollection<Node> ViewAllNodes();
public override IList<Node> ViewAsList();
public override IList<Node> ViewAsTuple();
}
}
3.5. Examples
3.5.1. Custom
CLR Objects from Graph
The following example shows a simple mapping implementation
that builds a web of strongly-typed custom CLR objects, using a two-pass
approach to deal with references. In a first pass, a graph is read into a
default store, including reference resolution. In a second pass, a custom
factory is used to traverse the graph and create a web of objects:–
string villageGraph = @"
{
Villagers =>
{
Jenn => { Name => 'Jennifer', Age => 28, Spouse => Rich },
Rich => { Name => 'Richard', Age => 26, Spouse => Jenn },
Charly => { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }
}";
var root = Node.ReadFromString(villageGraph).Single().Node;
// With the graph store fully loaded, references are now resolved.
// Generate custom objects from graph store, retaining no back link
// to graph store.
var village = VillageFactory.VillageFromGraph(root);
var jenn =
(from v in village.Villagers
where v.Name == "Jennifer"
select v).Single();
var jennsAge = jenn.Age;
var richsSpouse =
(from v in village.Villagers
where v.Name == "Richard"
select v.Spouse).Single();
var richsSpousesAge = richsSpouse.Age;
var jennHasSpouse = village.HaveSpouses.Contains(jenn);
var charly =
(from v in village.Villagers
where v.Name == "Charlotte"
select v).Single();
var charlyHasSpouse = village.HaveSpouses.Contains(charly);
Assert.AreEqual(28, jennsAge);
Assert.AreEqual(28, richsSpousesAge);
Assert.IsTrue(jennHasSpouse);
Assert.IsFalse(charlyHasSpouse);
The simple factory used above is shown below. The factory
uses a dictionary to ensure that sharing and cycles in the original graph are
preserved in the generated object web.
static class VillageFactory {
public static Village VillageFromGraph(Node root) {
var villagers = new Dictionary<Node, Villager>();
foreach (var node in root.Edges["Villagers"].ViewAllNodes()) {
var villager = GetVillager(node, villagers);
}
var village = new Village();
foreach (var node in root.Edges["Villagers"].ViewAllNodes()) {
var villager = GetVillager(node, villagers);
foreach (var i in node.ViewAsRecord()) {
switch (i.Key.Text) {
case "Age": villager.Age = (int)i.Value; break;
case "Name": villager.Name = (string)i.Value; break;
case "Spouse": villager.Spouse = GetVillager(i.Value, villagers);
break;
}
}
village.Villagers.Add(villager);
}
foreach (var node in root.Edges["HaveSpouses"].ViewAllNodes()) {
village.HaveSpouses.Add(GetVillager(node, villagers));
}
return village;
}
static Villager GetVillager(Node node,
Dictionary<Node, Villager> villagers)
{
Villager villager;
if (!villagers.TryGetValue(node, out villager)) {
villagers.Add(node, new Villager());
}
return villager;
}
}
3.5.2. Arbitrary
CLR Objects built by Custom Graph Store
This example shows a simple custom graph store that directly
builds custom-typed CLR objects.
string villageGraph = @"
Village {
Villagers => {
Jenn => Villager { Name => 'Jennifer', Age => 28, Spouse => Rich },
Rich => Villager { Name => 'Richard', Age => 26, Spouse => Jenn },
Charly => Villager { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }
}";
var typeMap = new Dictionary<Identifier, Type> {
{ typeof(Village).Name, typeof(Village) },
{ typeof(Villager).Name, typeof(Villager) }
};
Func<Identifier, Type> typeResolver =
id => {
Type t = null;
typeMap.TryGetValue(id, out t);
return t;
};
var village = ObjectGraphStore.Load<Village>(villageGraph, typeResolver);
var jenn =
(from v in village.Villagers
where v.Name == "Jennifer"
select v).Single();
var jennsAge = jenn.Age;
var richsSpouse =
(from v in village.Villagers
where v.Name == "Richard"
select v.Spouse).Single();
var richsSpousesAge = richsSpouse.Age;
var jennHasSpouse = village.HaveSpouses.Contains(jenn);
var charly =
(from v in village.Villagers
where v.Name == "Charlotte"
select v).Single();
var charlyHasSpouse = village.HaveSpouses.Contains(charly);
Assert.AreEqual(28, jennsAge);
Assert.AreEqual(28, richsSpousesAge);
Assert.IsTrue(jennHasSpouse);
Assert.IsFalse(charlyHasSpouse);
The store implementation used above is shown below. It uses
an externally provided mapping function (typeResolver in the code above) to map
brands to .NET types.
class ObjectGraphStore : GraphStore {
static Edge[] emptyEdges = new Edge[0];
public virtual Func<Identifier, Type> TypeResolver { get; set; }
public override Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Edge> edges)
{
var builder = new GraphObjectBuilder();
if (Identifier.IsNullOrEmpty(brand)) {
// All unbranded collections are forced into lists.
builder.Type = typeof(List<Edge>);
}
else {
// All branded collections are forced into specific objects.
Type type = TypeResolver(brand);
if (type == null) {
throw new NotSupportedException(
"The specified brand cannot be resovled to a type.");
}
builder.Type = type;
}
builder.Store = this;
if (edges != null) {
foreach (var edge in edges) { builder.Edges.Add(edge); }
}
return Node.CreateNodeForStoreKey(this, builder);
}
public override Identifier GetBrand(Node node) {
var builder = node.StoreKey as GraphObjectBuilder;
if (builder != null) { return Identifier.Get(builder.Type.Name); }
return Identifier.Empty;
}
protected override ICollection<Edge> GetEdges(Node node) {
var builder = node.StoreKey as GraphObjectBuilder;
if (builder == null) { return emptyEdges; }
return builder.Edges;
}
public override NodeKind GetNodeKind(Node node) {
var builder = node.StoreKey as GraphObjectBuilder;
if (builder != null) { return NodeKind.Record; }
return NodeKind.Atomic;
}
public static object GetValue(object key) {
var builder = key as GraphObjectBuilder;
if (builder != null) { return builder.GetObject(); }
return key;
}
public static T Load<T>(string source, Func<Identifier, Type> typeResolver)
{
var sreader = new StringReader(source);
var mreader = new GraphTextReader(sreader);
var store = GraphStore.ReadFrom(
() => new ObjectGraphStore { TypeResolver = typeResolver },
mreader);
return (T)GetValue(store.OuterEdges.Single().Node.StoreKey);
}
}
The custom store above uses the auxiliary class GraphObjectBuilder
to build individual objects. Builders build their object at most once,
preserving reference identity and thus sharing and cycles in the build web of
objects. The builder protocol uses two phases. In the first, the custom store creates
one builder per node in the graph, sets each builder’s Type property and
populates each builder’s Edges collection. In the second phase, the custom
store requests each builder’s object, which triggers the builder to populate
those objects’ properties from the Edges collection.
class GraphObjectBuilder {
static Type edgeCollectionType = typeof(ICollection<Edge>);
ICollection<Edge> edges = new List<Edge>();
object target;
public ICollection<Edge> Edges { get { return edges; } }
public Type Type { get; set; }
public object GetObject() {
if (target != null) { return target; }
target = Activator.CreateInstance(Type);
if (edgeCollectionType.IsAssignableFrom(Type)) {
var coll = (ICollection<Edge>)target;
coll.Clear();
foreach (var m in Edges) { coll.Add(m); }
}
else {
foreach (var m in Edges) {
var property = Type.GetProperty(m.Label.Text);
if (property == null) {
throw new NotSupportedException("Only properties can be set.");
}
object value = ObjectGraphStore.GetValue(m.Node.StoreKey);
if (edgeCollectionType.IsAssignableFrom(value.GetType())) {
var args = (ICollection<Edge>)value;
var getter = property.GetGetMethod();
var coll = getter.Invoke(target, null);
var adder = coll.GetType().GetMethod("Add");
foreach (var a in args) {
var arg = ObjectGraphStore.GetValue(a.Node.StoreKey);
adder.Invoke(coll, new object[] { arg });
}
}
else {
var setter = property.GetSetMethod();
setter.Invoke(target, new object[] { value });
}
}
}
return target;
}
}
3.5.3. Objects
Built by Custom Graph Store, no References
If it is known that a graph contains no sharing or cycles,
then a very simple custom graph store can be used to build custom object webs
directly, in a single pass:–
string villageTree = @"
Village {
Villagers => {
Jenn => Villager { Name => 'Jennifer', Age => 28 },
Rich => Villager { Name => 'Richard', Age => 26 },
Charly => Villager { Name => 'Charlotte', Age => 12 }
}
}";
var village = VillageTreeStore.Load(villageTree);
var jenn =
(from v in village.Villagers
where v.Name == "Jennifer"
select v).Single();
var jennsAge = jenn.Age;
Assert.AreEqual(28, jennsAge);
The custom store VillageTreeStore used above is shown below.
The main constraint that enables its simple form is that it does not support
references (and thus reference resolution).
The store does implement the minimally required graph
traversal methods. The web of objects can therefore be traversed/inspected as a
graph as well. Besides being generally useful, this feature enables this simple
store to establish sharing relationships: an already built object can be
traversed while building the graph. The resulting graph cannot contain cycles,
since the one-pass building approach of this store cannot build an object while
deferring resolution of references to objects not yet built. In other words,
the graphs buildable by this store are always directed acyclic graphs.
The store virtualizes the graph over the underlying objects
in a straightforward way that does not afford updates. The graph is not exactly
read-only as it supports the bottom-up addition of new nodes (and thus
underlying objects) at any time. However, it does not support mutation of
already constructed nodes (the edge collections returned are read-only).
If the underlying objects are directly mutated, then the
store will reflect such changes in the virtualized graph’s next traversal. This
works because the store itself is stateless. There is no notification mechanism
for graph clients though.
class VillageTreeStore : GraphStore {
static readonly Edge[] emptyEdges = new Edge[0];
public Village Village {
get { return SyntheticSource.Edges.First().Node.StoreKey as Village; }
}
public override Node CreateCollection(Identifier brand, NodeKind kind,
IEnumerable<Edge> edges)
{
if (brand == typeof(Village).Name) {
return CreateVillage(edges);
}
if (brand == typeof(Villager).Name) {
return CreateVillager(edges);
}
if (Identifier.IsNullOrEmpty(brand)) {
return CreateEnumerableCollection(edges);
}
throw new NotSupportedException();
}
Node CreateEnumerableCollection(IEnumerable<Edge> edges) {
return Node.CreateNodeForStoreKey(
this,
(edges ?? emptyEdges).Select(m => m.Node.StoreKey));
}
Node CreateVillage(IEnumerable<Edge> edges) {
var village = new Village();
foreach (var edge in edges) {
switch (edge.Label.Text) {
case "Villagers":
var villagers = (IEnumerable)edge.Node.StoreKey;
foreach (var v in villagers.Cast<Villager>()) {
village.Villagers.Add(v);
}
break;
default:
throw new NotSupportedException();
}
}
return Node.CreateNodeForStoreKey(this, village);
}
Node CreateVillager(IEnumerable<Edge> edges) {
var villager = new Villager();
foreach (var edge in edges) {
switch (edge.Label.Text) {
case "Age":
villager.Age = (int)edge.Node;
break;
case "Name":
villager.Name = (string)edge.Node;
break;
default:
throw new NotSupportedException();
}
}
return Node.CreateNodeForStoreKey(this, villager);
}
public override Identifier GetBrand(Node node) {
var key = node.StoreKey;
if (key is Village || key is Villager) { return key.GetType().Name; }
return Identifier.Empty;
}
protected override ICollection<Edge> GetEdges(Node node) {
var key = node.StoreKey;
var village = key as Village;
if (village != null) {
return new List<Edge> {
new Edge("Villagers",
Node.CreateNodeForStoreKey(this, village.Villagers))
};
}
var villager = key as Villager;
if (villager != null) {
return new List<Edge> {
new Edge("Age", Node.CreateNodeForStoreKey(this, villager.Age)),
new Edge("Name", Node.CreateNodeForStoreKey(this, villager.Name))
};
}
var coll = key as IEnumerable;
if (coll != null) {
return
(from v in coll.Cast<object>()
select new Edge(Label.Empty, Node.CreateNodeForStoreKey(this, v)))
.ToList();
}
return emptyEdges;
}
public override NodeKind GetNodeKind(Node node) {
var key = node.StoreKey;
if (key is Village || key is Villager) {
return NodeKind.Record;
}
if (key is IEnumerable && !(key is string)) {
return NodeKind.Collection;
}
return NodeKind.Atomic;
}
public static Village Load(string source) {
return GraphStore.ReadFromString<VillageTreeStore>(source).Village;
}
}
Note: The
techniques used in this example can be combined with the ones used for generic
CLR object building exhibited in §3.5.2 to move beyond hard-coded knowledge of
specific types. A future version of the “Oslo” SDK may provide more support for
such specialized graph store implementations.
4. Graph
Readers and Writers
The mapping of graphs from and to serial streams is
performed by GraphReader and GraphWriter, respectively. Concrete
implementations of these abstract classes are provided for the important
special case of text streams: GraphTextReader and GraphTextWriter.
4.1. GraphReader
The GraphReader base class has a low-level bottleneck
interface, combined with a set of virtual default implementations built on top
(listed separately, further below).
namespace System.Dataflow {
public abstract class GraphReader : IDisposable {
protected GraphReader();
// Available for StartCollection and for
// Constant tokens if present in the input stream;
// Identifier.Empty if a brand is not present in the input stream
// for those tokens; null for other tokens; never throws.
public abstract Identifier Brand { get; }
// Available for Constant tokens; ConstantClassification.None otherwise.
public abstract ConstantClassification ConstantClassification { get; }
// The number of StartCollections minus number of
// EndCollections, both counts excluding the current token
public abstract int Depth { get; }
// Available for StartCollection and for
// Constant tokens if present in the input stream;
// Label.Empty if a label is not present in the input stream
// for those tokens; null for other tokens; never throws.
public abstract Label Label { get; }
// See §4.1.1
public GraphReaderQuotas Quotas { get; }
// Current position;
// initially StartStream when a reader is created and no Read() calls
// have been made;
// EndStream once reader has been advanced past the end of the stream.
public GraphStreamTokenKind TokenKind { get; }
// Close the reader and any underlying streams held by it.
public abstract void Close();
protected virtual void Dispose(bool disposing);
public void Dispose();
// Available for Constant tokens; may throw on parse error.
public abstract object GetConstant();
// Throws if current token is EndStream. Otherwise, advance to
// the next token and set the Label, TokenKind, TextValue
// and ConstantKind properties according to the new token.
public abstract bool Read();
}
}
The set of (mostly virtual) functions built on top of the
reader’s bottleneck interface are listed below.
The following set of read methods can be specialized by
reader implementations to avoid boxing of values. By default, these all call
through ReadConstant, causing values to be boxed.
public virtual bool ReadBoolean();
public virtual byte[] ReadBytes();
public virtual Date ReadDate();
public virtual DateTime ReadDateTime();
public virtual DateTimeOffset ReadDateTimeOffset();
public virtual decimal ReadDecimal();
public virtual double ReadDouble();
public virtual GraphReference ReadGraphReference();
public virtual Guid ReadGuid();
public virtual int ReadInt32();
public virtual long ReadInt64();
public virtual object ReadNull();
public virtual string ReadText();
public virtual Time ReadTime();
The following method allows readers to skip forward over
nested non-atomic nodes, while doing minimal work. The reader must be at a Start…
token; after the call, the reader will be at the token that is immediately
after the matching End… token.
public virtual void Skip();
4.1.1. GraphReaderQuotas
Stream reading opens up systems for potential denial of
service attacks. The GraphReaderQuotas class limits the size complexity of
streams to preset limits that can be used to curb such attacks.
namespace System.Dataflow {
public class GraphReaderQuotas {
public int MaxConstantLength { get; set; }
public int MaxDepth { get; set; }
public int MaxLabelLength { get; set; }
public int MaxReferenceSegments { get; set; }
}
}
The quota setters can only be called while the reader has
not yet been used; all limits must be positive values. The following table
shows the quotas and their default limits:–
4.1.2. GraphTextReader
Class GraphTextReader exposes almost no new functionality
beyond base class GraphReader. It has a set of constructors that support text
streams from a number of sources and in a variety of encodings. The one added
member is the TextValue property that exposes the source-level textual encoding
of the most recently read token, as literally found in the text stream.
namespace System.Dataflow {
public class GraphTextReader : GraphReader {
public GraphTextReader(string filePath);
public GraphTextReader(string filePath, Encoding encoding);
public GraphTextReader(Stream stream);
public GraphTextReader(Stream stream, Encoding encoding);
public GraphTextReader(TextReader reader);
public override Identifier Brand { get; }
public override ConstantClassification ConstantClassification { get; }
public override int Depth { get; }
public override Label Label { get; }
public virtual string TextValue { get; }
public override void Close();
public override object GetConstant();
public override bool Read();
public override void VerifyReaderPosition(
GraphStreamTokenKind requiredTokenKind, Label requiredLabel);
}
}
4.2. GraphWriter
The GraphWriter base class has a low-level bottleneck
interface, combined with a set of virtual default implementations built on top
(listed separately, further below).
namespace System.Dataflow {
public abstract class GraphWriter : IDisposable {
// The number of StartCollections written so far minus number of
// EndCollections written so far.
public abstract int Depth { get; }
// The last written token. Initially, StartStream.
public abstract GraphStreamTokenKind TokenKind { get; }
// Close the writer and any underlying streams held by it.
// This should be idempotent.
public abstract void Close();
protected virtual void Dispose(bool disposing);
public void Dispose();
// Flush the writer and any underlying streams held by it
public abstract void Flush();
// Write the specified token. The label paramater may be
// specified only for StartCollection, Constant, and Reference
// tokens. The value paramater may be specified only for
// Constant and Reference tokens.
public void Write(GraphStreamTokenKind tokenKind,
Label label, Identifier brand, object value);
protected abstract void OnWrite(GraphStreamTokenKind tokenKind,
Label label, Identifier brand, object value);
}
}
The following convenience functions map specific write
requests to the common Write bottleneck interface.
public void WriteConstant(Label label, Identifier brand, object value);
public void WriteEnd();
public void WriteEndCollection();
public void WriteEndComposite();
public void WriteEndRecord();
public void WriteEndList();
public void WriteEndTuple();
public void WriteStart(NodeKind kind, Label label, Identifier brand);
public void WriteStartCollection(Label label, Identifier brand);
public void WriteStartComposite(Label label, Identifier brand);
public void WriteStartRecord(Label label, Identifier brand);
public void WriteStartList(Label label, Identifier brand);
public void WriteStartTuple(Label label, Identifier brand);
The following strongly typed write methods call WriteConstant
by default, causing boxing. They can be overridden to avoid boxing.
public virtual void WriteBoolean(Label label, Identifier brand, bool value);
public virtual void WriteBytes(Label label, Identifier brand, byte[] value);
public virtual void WriteDate(Label label, Identifier brand, Date value);
public virtual void WriteDateTime(Label label, Identifier brand,
DateTime value);
public virtual void WriteDateTimeOffset(Label label, Identifier brand,
DateTimeOffset value);
public virtual void WriteDecimal(Label label, Identifier brand,
decimal value);
public virtual void WriteDouble(Label label, Identifier brand, double value);
public virtual void WriteGraphReference(Label label,
GraphReference reference);
public virtual void WriteGuid(Label label, Identifier brand, Guid value);
public virtual void WriteInt32(Label label, Identifier brand, int value);
public virtual void WriteInt64(Label label, Identifier brand, long value);
public virtual void WriteNull(Label label, Identifier brand);
public virtual void WriteString(Label label, Identifier brand, string value);
public virtual void WriteTime(Label label, Identifier brand, Time value);
The following method copies from the current reader's token
to the end of the fragment, both inclusive. If the reader started on a StartXxx
token, then the fragment ends with the matching EndXxx token, inclusive.
Otherwise, just the current token is copied. After the call, the reader will be
at the token just after the copied token.
public virtual void Write(GraphReader reader);
4.2.1. GraphTextWriter
Class GraphTextWriter mostly implements GraphWriter. It
provides a set of constructors to write to a variety of stream targets, using a
variety of encodings. The one added feature allows control over indentation
width per nesting level: if set to 0, indentation is turned off.
namespace System.Dataflow {
public class GraphTextWriter : GraphWriter {
public GraphTextWriter(string filePath);
public GraphTextWriter(string filePath, Encoding encoding);
public GraphTextWriter(Stream stream);
public GraphTextWriter(Stream stream, Encoding encoding);
public GraphTextWriter(TextWriter writer);
public override int Depth { get; }
// The set can be called only before the first write call.
// The default is 2. A value of 0 value does no indentation.
public int Indentation { get; set; }
public override GraphStreamTokenKind TokenKind { get; }
public override void Close();
public override void Flush();
protected override void OnWrite(GraphStreamTokenKind tokenKind,
Label label, Identifier brand, object value);
}
}
4.3. Examples
string villageGraph = @"
{
Villagers =>
{
Jenn => { Name => 'Jennifer', Age => 28, Spouse => Rich },
Rich => { Name => 'Richard', Age => 26, Spouse => Jenn },
Charly => { Name => 'Charlotte', Age => 12 }
},
HaveSpouses => { Villagers.Rich, Villagers.Jenn }
}";
var input = new StringReader(villageGraph);
var reader = new GraphTextReader(input);
reader.Read();
while (reader.Label != "Rich") { reader.Read(); }
string name = "";
int age = 0;
GraphReference spouse = new GraphReference();
reader.Read();
while (reader.TokenKind != GraphStreamTokenKind.EndComposite
&& reader.TokenKind != GraphStreamTokenKind.EndStream)
{
switch (reader.Label.Text) {
case "Age": age = reader.ReadInt32(); break;
case "Name": name = reader.ReadText(); break;
case "Spouse": spouse = reader.ReadGraphReference(); break;
default: throw new NotSupportedException("Unsupported label.");
}
}
reader.Close();
Assert.AreEqual("Richard", name);
Assert.AreEqual(26, age);
Assert.AreEqual("Jenn", spouse.ToString());