By Zoiner Tejada, Hershey Technologies
Published: December, 2008
Summary: The value of using Workflow extends beyond building logic to support
and orchestrate human processes. WF's ability to efficiently support
long-running workflows applies equally well to automatic processes that have
long life-spans, such as polling services that wait for intervals measured in
hours or days and then wake up to perform some task. The key factor is that
these services must minimize their resource utilization when they are not
performing the processing task, which leads to more desirable scalability
characteristics.
An excellent example of this type of process is a feed
aggregator or blog monitor. In this scenario, we will define a Blog Monitoring
process that one might use to collect blog entries relevant to a user and
periodically send a summary of those entries to the user. A user configures the
list of blogs (or feeds) by entering the URL of the RSS or Atom source and
optionally specifying a keyword that must be found in the summary of the blog
entry. In addition, the user configures two time-related events: when the
service collects the entries from the feeds and when the service sends the
notification that aggregates the matching feed items. Besides being able to
subscribe to the notifications provided by this service, the user should also
have the ability to stop receiving notices.
The business case for such functionality is, for example, to
allow a user to control when the summary arrives in e-mail separately from when
the processing occurs. Alternatively, business requirements might dictate
a delay between the processing and the transmission, such as for complying
with regulations or providing services with different SLAs (for example, a
demo service that enforces a twelve-hour delay). By using Workflow to implement
this service we gain the ability to define this logic graphically and, if we
choose to enable persistence, the ability to persist the workflows when they
are between aggregation cycles. Most importantly, for the purposes of this
article, the implementations function as an example of a workflow orchestrating
calls to external systems.
The Blog Monitoring State Machine
We begin our review of the workflow implementation using the
State Machine based model. This implementation serves to highlight a unique
benefit of the State Machine Workflow: the inheritance of event handlers across
multiple states. This is possible because States themselves are composable:
a given State can contain other States. The high-level view of the
workflow implementation is shown in Figure 1 and discussed in the text that
follows.
Figure 1 - High-level view of the Blog Monitor State Machine Workflow
Note: It is worthwhile observing that by definition the root
of all state machine workflows (the workflow itself) is a state, and by virtue
of adding states to it you are in fact building nested states. In effect, we could have added the CancelMonitoring
event driven activity to the root workflow itself and it would cover any states
added within the workflow scope.
Recall from the scenario introduction that a user
configures the system to send syndication notifications by specifying the URL
of the feed, an optional keyword filter, and the times at which the feeds are
aggregated and the notice is sent, respectively. An example of this
configuration is illustrated in Figure 2.
Figure 2 - Configuring the Subscription
Once the user has completed the configuration, the process
is launched by clicking Subscribe. This effectively shuttles over the list of
feeds, filters and times as initialization parameters to a newly created
workflow instance. As can be seen from
Figure 1, the workflow will begin in the state called “PollingState”. Notice
how this state is nested within the state labeled MonitoringState, and has the
state NotificationState as a sibling. The reason for nesting the states like
this is to support the sharing of the logic that occurs when the user chooses
to stop the notification by clicking the Unsubscribe button (shown in Figure
3).
Figure 3 - The Unsubscribe button (after Subscribing, but before initial notification).
This shared logic is defined within the CancelMonitoring
EventDrivenActivity. By configuring the workflow this way, regardless of
whether the workflow is currently processing feeds (the PollingState) or
preparing notifications (the NotificationState), when the user clicks
Unsubscribe and raises the Unsubscribe event against the workflow instance, the
workflow will make an orderly transition to FinalState and thus stop the
process. Figure 4 shows the transition
that occurs when the user clicks Unsubscribe.
Figure 4 - After clicking Unsubscribe.
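The way nested states share an event handler can be sketched outside of WF. The following is a minimal illustration in Python, not the WF API: the State class and its resolve method are hypothetical, the state names follow Figure 1, and the FeedsCollected and NotificationSent event names are assumptions made for the example. An event that the current state does not handle is looked up in its parent, which is how a single Unsubscribe handler on MonitoringState can cover both child states.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class State:
    """A state that may be nested inside a parent state (hypothetical type)."""
    name: str
    parent: Optional["State"] = None
    # Maps an event name to the name of the state to transition to.
    handlers: Dict[str, str] = field(default_factory=dict)

    def resolve(self, event: str) -> Optional[str]:
        """Walk from this state up through its ancestors to find a handler."""
        state: Optional[State] = self
        while state is not None:
            if event in state.handlers:
                return state.handlers[event]
            state = state.parent
        return None


# State names follow Figure 1. Only the Unsubscribe event comes from the
# article; the other event names are assumptions for illustration.
monitoring = State("MonitoringState", handlers={"Unsubscribe": "FinalState"})
polling = State("PollingState", parent=monitoring,
                handlers={"FeedsCollected": "NotificationState"})
notification = State("NotificationState", parent=monitoring,
                     handlers={"NotificationSent": "PollingState"})

# Both child states inherit the parent's Unsubscribe handler.
print(polling.resolve("Unsubscribe"))
print(notification.resolve("Unsubscribe"))
```

Defining the handler once on the parent means a newly added child state would pick up the same cancellation behavior without any extra wiring.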
The PollingState’s WaitForCollect activity first converts the user’s
“Collect At” target time into an interval to wait. Once that interval has
elapsed, a Replicator activity sequentially creates and executes one instance
of a BlogMonitor custom activity for each feed requested by the user
(illustrated in Figure 5).
Figure 5 - The feed processing waits for the Collect At time and then processes each feed sequentially via a Replicator.
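Converting a target time of day into an interval to wait is simple arithmetic, sketched here in Python rather than in the workflow's own code. The function name delay_until is hypothetical, and treating a target that has already passed (or is exactly now) as "the same time tomorrow" is an assumption, consistent with the workflow waiting for the next day's Collect At time.

```python
from datetime import datetime, time, timedelta


def delay_until(target: time, now: datetime) -> timedelta:
    """Return how long to wait from `now` until the next occurrence of `target`."""
    candidate = datetime.combine(now.date(), target)
    if candidate <= now:
        # Today's target time has already passed, so wait until tomorrow's.
        candidate += timedelta(days=1)
    return candidate - now


# Collect At 08:00, evaluated at 06:30 and at 09:00 on the same day.
print(delay_until(time(8, 0), datetime(2008, 12, 1, 6, 30)))
print(delay_until(time(8, 0), datetime(2008, 12, 1, 9, 0)))
```

The resulting interval is the kind of value a delay would be initialized with before the workflow goes idle.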
This custom activity makes use of the SyndicationFeed class
available in the System.ServiceModel.Syndication namespace to process the
result of a simple HTTP GET request against the specified URL (the result is
an XML document). It returns a complex data structure that includes items such
as the Feed’s title, its individual entries, and their titles and summaries.
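To make the filtering step concrete, here is a minimal Python sketch of the idea. It does not use SyndicationFeed; it parses a small inline RSS 2.0 document with the standard library instead of issuing an HTTP request. The function name matching_entries, the sample feed, and the case-insensitive keyword match are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# A tiny RSS 2.0 document standing in for the HTTP GET response (made-up data).
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Sample Feed</title>
<item><title>Hosting workflows</title><description>Notes on workflow persistence.</description></item>
<item><title>Other news</title><description>Nothing relevant here.</description></item>
</channel></rss>"""


def matching_entries(rss_xml: str,
                     keyword: Optional[str] = None) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the feed title and the (title, summary) pairs that match the keyword."""
    channel = ET.fromstring(rss_xml).find("channel")
    feed_title = channel.findtext("title", default="")
    entries = []
    for item in channel.findall("item"):
        title = item.findtext("title", default="")
        summary = item.findtext("description", default="")
        # The keyword is optional; without one, every entry is kept.
        if keyword is None or keyword.lower() in summary.lower():
            entries.append((title, summary))
    return feed_title, entries


print(matching_entries(SAMPLE_RSS, "workflow"))
```

The sketch handles only RSS 2.0 item elements; an Atom source would need its own element names.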
When the replicateMonitors Replicator Activity completes,
all feeds have been processed, so the state machine prepares to send out the
notification by transitioning to the NotificationState.
Upon arriving at the NotificationState, the workflow waits
for the delay derived from the user-specified Send At time value to expire.
When it does, the workflow executes the WaitForSendAt sequence (Figure 6),
which uses a CallExternalMethodActivity to update the user interface with the
data collected during feed processing (Figure 7 shows some sample output). Note
that in a real-world scenario, this could just as easily have sent an e-mail.
Figure 6 - Notifications are sent out after the Send At time is reached.
Figure 7 - Results of monitoring the WF RSS feed, filtered for summaries with the term "workflow".
After the notification has been sent, the workflow goes back
to waiting for the Collect At time (on the following day), as shown by Figure 8
below.
Figure 8 - The workflow returns to feed processing after sending notifications.
Parallelism Considerations
Observe that in this sample implementation, the blog feeds
are collected sequentially and the single thread of execution is held for what
might be a long while, depending on both the length of time it takes to collect
the individual feed results and on the number of feeds requested. This was done
to keep the scenario implementation simple and free of complicating
distractions. However, from a resource usage optimization standpoint this is
not ideal. Unfortunately, the solution adds complexity to the workflow design.
For example, one could modify BlogMonitor to make calls to an external local
service which would make the syndication calls asynchronously. Then one would
add an additional Event Driven sequence to the Polling State that is called by
the local service when the asynchronous call completes with the feed results.
This Event Driven sequence would be structured such that a transition to the
Notification State occurs only when all feed results are in. Clearly this
adds complexity and ends up placing more logic outside of the workflow, a fair
indication that for achieving parallelism in this scenario, the state machine
may not be the best choice.
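The bookkeeping that such an Event Driven sequence would need can be sketched as follows. This is a Python illustration of the "transition only when all results are in" rule, not WF code; the PendingFeeds class and its on_feed_completed method are hypothetical names, and the asynchronous calls themselves are left out.

```python
from typing import Dict, Iterable, List


class PendingFeeds:
    """Tracks which feeds are still outstanding (hypothetical helper)."""

    def __init__(self, urls: Iterable[str]) -> None:
        self.pending = set(urls)
        self.results: Dict[str, List[str]] = {}

    def on_feed_completed(self, url: str, entries: List[str]) -> bool:
        """Record one feed's result; return True once every feed has reported."""
        self.pending.discard(url)
        self.results[url] = entries
        # An empty set means all results are in, so the caller may transition.
        return not self.pending


tracker = PendingFeeds(["feed-a", "feed-b"])
print(tracker.on_feed_completed("feed-a", ["entry 1"]))
print(tracker.on_feed_completed("feed-b", []))
```

Even in this reduced form, the counting logic lives outside the visual model, which illustrates the cost described above.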
Mapping To A Sequential Workflow
The sequential workflow implementation of the Blog
Monitoring process (Figure 9) has many similarities to the state machine.
Preserved are the notions of overarching cancellation logic, delays for Collect
At and Send At as well as the replicator used to spawn as many Blog Monitors as
requested.
Figure 9 - The complete Blog Monitor sequential workflow.
The key difference is that the entire workflow runs within
an EventHandlingScopeActivity, which effectively implements the
CancelMonitoring logic through the Event Handler shown in Figure 10. When the
Unsubscribe event is received, a global Boolean value called
MonitoringCancelled is changed from its default of false to true. The While
loop evaluates this Boolean and stops iterating once it is true. It is also
evaluated at CollectNotCancelled and SendNotCancelled; when it is true, they
skip any subsequent processing so that the workflow completes.
Figure 10 - The unsubscribe event handling sequence.
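The control flow described above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the workflow itself: the BlogMonitorLoop class and its callables are hypothetical, the nesting of the two cancellation checks is a guess at the layout in Figure 9, and the waits are plain callables during which an Unsubscribe event may arrive.

```python
from typing import Callable, List


class BlogMonitorLoop:
    """Mimics the While loop guarded by a cancellation flag (hypothetical)."""

    def __init__(self, collect: Callable[[], List[str]],
                 send: Callable[[List[str]], None]) -> None:
        self.monitoring_cancelled = False  # default of false, as in the article
        self.collect = collect
        self.send = send

    def on_unsubscribe(self) -> None:
        # Plays the role of the Unsubscribe event handler in the scope.
        self.monitoring_cancelled = True

    def run(self, wait_for_collect_at: Callable[[], None],
            wait_for_send_at: Callable[[], None]) -> None:
        while not self.monitoring_cancelled:
            wait_for_collect_at()
            if not self.monitoring_cancelled:      # CollectNotCancelled
                entries = self.collect()
                wait_for_send_at()
                if not self.monitoring_cancelled:  # SendNotCancelled
                    self.send(entries)


sent: List[List[str]] = []
loop = BlogMonitorLoop(collect=lambda: ["entry"], send=sent.append)
cycles = {"count": 0}


def wait_for_collect_at() -> None:
    # Simulate the user unsubscribing during the second Collect At wait.
    cycles["count"] += 1
    if cycles["count"] == 2:
        loop.on_unsubscribe()


loop.run(wait_for_collect_at, lambda: None)
print(sent)
```

Checking the flag after each wait is what lets the loop finish promptly once the user unsubscribes, instead of completing another full collect-and-send cycle.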