There are few places where practicing good defensive coding is more important than in a collaborative application running in an environment over which you have no control. Any number of things could happen after your workflow is installed and activated that would render it totally, or perhaps worse, partially non-functional. Potentially dozens if not hundreds of people have the ability to break your workflow, either maliciously or accidentally, and you can do little to stop them. For this reason, your workflow needs to be cautious, non-trusting, and as much as possible, self-healing.
We’ll cover some techniques for dealing with this potential problem in the following sections. As with many other things in this article, some of these elements are not unique to workflow development. They are simply good programming practices. The twist added by workflow is that you are working in a highly visual environment. It may not occur to you to think along these lines because a large part of what you’re doing does not involve writing code—it is simply configuring pre-built activities.
What happens to a workflow if the history list is deleted in the middle of processing? What if a user or another workflow deletes the list item that your workflow is processing against? These are just two very simple examples of why it is important to verify your objects before you use them. This is different from simple error handling because ideally you want to attempt some sort of recovery that lets you continue processing. If a list you need does not exist or was deleted, perhaps you can recreate it. If a document is locked, you can wait for it to become unlocked, or you can perhaps notify the user who has it locked that you are waiting for it to be available. There are any number of possible paths you can take if you can identify a problem before it is too late.
The problem is that you are working with pre-built components. If, for example, you are using the out-of-the-box LogToHistoryListActivity, how can you ensure that the History list for the workflow (which is not known until run time) exists and is available?
There are three approaches to this and other similar problems that fall within the purview of verify before use. The first two of these are available for both sequential and state machine workflows. The third option, available only for state machines, is one of the primary reasons that state machines are the preferred approach for building rock-solid workflows.
- Make judicious use of Code activities. If you put one of the out-of-the-box Code activities before every single activity in your workflow that reaches outside of itself or your workflow, you could verify that things are as you need them to be. This has the serious downside of instantly doubling the number of activities in your workflow and muddying the presentation in the designer.
- Most activities in your workflows (at least the out-of-the-box activities) have a MethodInvoking property. This property lets you specify a method that the workflow host will call just before it runs the activity. You can add whatever code you need into this method. That code executes just prior to the activity. It has the same effect as placing a Code activity before every other activity in the workflow without cluttering up the designer with multiple extra activities. This approach is nearly perfect for your needs, except for two problems:
  - There is no corresponding method that occurs after the activity is otherwise finished processing. Therefore, anything you do after the activity must be placed into the MethodInvoking property of the next activity. This is not very intuitive, and it is potentially a big problem if the next activity is not known at design time (conditional branching) or if the workflow is updated.
  - If you uncover a problem that you cannot remedy by using code, the only available options in the MethodInvoking property are to throw an error (we will cover error handling shortly) or cancel the workflow. Neither is particularly appealing.
- Take advantage of the fact that state machines have essentially multiple individual sequential workflows (StateInitialization, EventDriven, and StateFinalization) contained within each state. This lets you check the environment in the StateInitialization activity before the main activity in the state executes, similar to the MethodInvoking option. However, unlike with MethodInvoking, you can also do some processing after the activity processes, via StateFinalization, or you can easily switch to a different state and therefore a different stream of processing.
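As a rough illustration of the kind of check that any of these three approaches would run, the following is a minimal sketch of a helper that verifies that a list (for example, the workflow's History list) still exists before an activity uses it. The class and method names are hypothetical and are not part of the original sample; only the SharePoint object model types are real.

```csharp
using System;
using Microsoft.SharePoint;

// Hypothetical helper; not part of the original sample workflow.
public static class ListGuard
{
    // Returns true if a list with the given ID still exists in the web.
    // Walking the collection avoids relying on the indexer, which throws
    // when the list cannot be found.
    public static bool Exists(SPWeb web, Guid listId)
    {
        foreach (SPList list in web.Lists)
        {
            if (list.ID == listId)
            {
                return true;
            }
        }
        return false;
    }
}
```

From a Code activity, a MethodInvoking handler, or a StateInitialization sequence, you might call ListGuard.Exists(workflowProperties.Web, workflowProperties.HistoryListId) and decide what to do when it returns false. What recovery is appropriate (recreating the list, switching state, or failing) depends on your process and is deliberately left out of this sketch.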
Of the three approaches, the third is fairly obviously the best because it gives us the most flexibility. An example might help clarify this approach.
Imagine that you have a workflow operating on a document. At a certain point in the process, the status of the document changes to “Under Review.” There is a column in the SharePoint document library where you record the status, so the workflow has to update the value in the column. Your workflow designer looks something like Figure 4.
Figure 4. Subset of the Document Review workflow shown as a simple sequential workflow
This part of the process is simple. The review of a document starts, the workflow sets the document status to “Under Review,” and it waits for the review to be finished. The code in the set_Status Code activity is as follows.
this.workflowProperties.Item["Status"] = "Under Review";
this.workflowProperties.Item.Update();
You test the workflow, and it runs without a problem in your development, test, and QA environments. So you package it and release it to a pilot group of customers.
Within a week of releasing the code, you start receiving bug reports of failures. Sometimes the process seems to work; sometimes it fails. Can you see the problem? Can you see why it is destined to fail intermittently?
Here’s the problem: every once in a while when your workflow tries to update the status of the document, some other user has the document checked out. When that happens, the workflow fails. Even running as the System Account does not let you sneak past the “Checked Out” barrier.
The solution to this problem is to check whether the document is checked out before you try to update it. If it is not, you should check it out, so that no one else can access it while you work. If it is checked out, you can do any or all of the following:
- Send e-mail to the user asking him or her to release the document.
- Switch states to one in which you are waiting for the item to be released.
- Force the document to be checked in so that you can check it out.
- Any other action necessary for our process.
Although you could put this code directly into your Code activity, that approach buries the functionality in code and makes things a little harder to maintain. Instead, a better approach might be to use the power of a state machine and spread the functionality across StateInitialization, StateFinalization, and multiple states.
Let’s examine this approach. First of all, the designer for this solution looks like Figure 5.
Figure 5. A subset of the state machine to demonstrate proper defensive coding in a workflow
In the interest of keeping things focused, only those states that are necessary for this part of the example are shown here: UnderReview and WaitForCheckIn. This part of the process shows the workflow coming into the UnderReview state (the arrow at the top left that points downward). By viewing this simplistic subset of the process, you can see that there are two basic paths through the process:
- The “main” flow of <Enter> … UnderReview … <Finish>
- The alternate flow of <Enter> … UnderReview … Wait … UnderReview
In the case of the second path, if the document is not available to check out, you enter this secondary flow and wait for it to be available. When it is available, you loop back and reenter the UnderReview state. This can continue for as long as necessary for the document to become available. Every change to the document that is saved back into Office SharePoint Server triggers a re-check, but the process only continues when you can obtain exclusive access to the document.
Starting with the UnderReview state, the StateInitialization activity (initStateUnderReview) is where you see whether you can check out the document to update its Status column. The internal process of this activity looks like Figure 6.
Figure 6. StateInitialization designer for the workflow
The heart of this part of the process is the top Code activity (codeDoCheckOut) and the ifElseActivity (ifDocCheckedOut). The Code activity is responsible for seeing if a user currently has the document checked out. If the document is not checked out, the Code activity checks the document out to the System Account. In either case, the name of the user that has the document checked out (either a user or “System Account” if the workflow checked it out) is then stored in a class-level string variable.
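The article does not list the code behind codeDoCheckOut, so the following is only a sketch of what such a check might look like. The helper class, its method name, and the constant are hypothetical. It also assumes that the workflow identity appears as "System Account", and that SPFile.CheckedOutBy is available (newer versions of the object model offer CheckedOutByUser instead).

```csharp
using Microsoft.SharePoint;

// Hypothetical helper; the original article does not show this code.
public static class CheckOutHelper
{
    // Assumption: the workflow identity is displayed as "System Account".
    public const string SystemAccountName = "System Account";

    // Checks the document out if nobody holds it, and returns the name of
    // whoever holds the checkout afterward. The item is assumed to live in
    // a document library, so item.File is not null.
    public static string CheckOutOrGetOwner(SPListItem item)
    {
        SPFile file = item.File;

        if (file.CheckOutStatus == SPFile.SPCheckOutStatus.None)
        {
            // Nobody has the document; take it for the workflow.
            file.CheckOut();
            return SystemAccountName;
        }

        // Someone else holds the document; report who it is.
        SPUser owner = file.CheckedOutBy;
        return owner != null ? owner.Name : string.Empty;
    }
}
```

The Code activity handler could then store the result in the class-level string, for example sCheckedOutBy = CheckOutHelper.CheckOutOrGetOwner(workflowProperties.Item). Returning a plain string keeps the workflow class serializable.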
After the Code activity does its thing, the process continues on to the ifDocCheckedOut activity. As Figure 6 shows, this ifElseActivity has two branches. The left branch executes if the workflow was able to check out the document; otherwise, the right branch executes. This check is performed with a very simple set of code in the Condition for the ifProcess branch of the ifElseActivity.
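private void verifyCheckedOutTo(object sender, ConditionalEventArgs e)
{
    // True only when the workflow itself (System Account) holds the checkout.
    // string.Equals is null-safe, so an unset sCheckedOutBy evaluates to false.
    e.Result = string.Equals(sCheckedOutBy, "System Account",
        StringComparison.OrdinalIgnoreCase);
}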
If the string variable that contains the name of the user that has the document currently checked out indicates System Account, the condition is true; otherwise it is false. Based upon this, the appropriate branch of the IfElseActivity executes.
Note:
The name of the user who has the document checked out is stored in a string variable instead of using the full SPUser object. Besides the fact that you really only need the user’s name and storing the rest would be wasteful, we also have to deal with the fact that when our workflow dehydrates, it needs to be able to serialize all of its members to save them in persistent storage. Strings can be serialized, but SPUser cannot.
The first two activities in the ifProcess (left) branch do just what their names imply: set the status of the document and check it back in. Because you have made it to this point in the process, you can be certain that you will not have a problem updating the properties of the document because you have already checked it out. After you set the status as you need to, you check it back in so that other users, or other processes, can access it.
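For illustration, the work done by those two activities might be sketched as follows. The helper name and the check-in comment text are hypothetical, and the sketch assumes the document was already checked out to the workflow earlier in the branch.

```csharp
using Microsoft.SharePoint;

// Hypothetical helper; illustrates the "set status, then check in" steps.
public static class StatusHelper
{
    // Updates the Status column and then releases the document.
    // Assumes the caller already holds the checkout on item.File.
    public static void SetStatusAndCheckIn(SPListItem item, string status)
    {
        item["Status"] = status;
        item.Update();

        // Release the document so other users and processes can use it.
        item.File.CheckIn("Status updated by workflow");
    }
}
```

A call such as StatusHelper.SetStatusAndCheckIn(workflowProperties.Item, "Under Review") would cover both steps. Keeping them as separate activities in the designer, as the sample does, makes the process easier to read.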
The next activity, setStateFinal, is the next piece of interest. Again, the activity is aptly named. It is responsible for transitioning our sample workflow to its final (completed) state. Because you have successfully updated the document properties, your work here is finished. The process can now continue on to its next step, or else, as in the case of the sample process, simply end.
The final piece of the sample that we need to examine is the other branch of the ifDocCheckedOut ifElse activity. Looking at this, we can see what happens if our Code activity earlier was unable to check out the document.
Similar to the last activity of the first branch, this branch contains a SetState activity. In this case, it transitions our process to a new state: stateWaitforCheckIn. Here again, this state makes use of the multiple sub-processes available with the state machine, in this case, StateInitialization and EventDriven. In the StateInitialization phase, we take our actions to try to get the document checked in. In this example, that means sending e-mail to the person who has the document checked out. In your case, it could be forcing the document to be checked in, or any number of other possibilities.
The nature of a state machine means that the next phase of this state only executes when the event it is tied to triggers it—hence the event part of the EventDriven name. In this example, we have configured this event to be a change to the document the workflow is running against by using an instance of the onWorkflowItemChanged default activity, the first activity in Figure 7.
Figure 7. EventDriven process in the workflow fires when the payload document is changed
When this activity occurs, something has changed with the payload document; we do not know exactly what has changed, because any change on the document will trigger this activity. As you can see in the previous figure, this part of the EventDriven activity is fairly straightforward. On any change to the document, set the state back to UnderReview.
The first thing the UnderReview state does is check whether or not it can check out the document. However, this time through, because the document is available to be checked out (assuming that the user who had it checked out previously has in fact released their hold on it), the first (ifProcess) branch of the IfElse activity fires, and the workflow is transitioned to the Final state. Now you're finished.
Although this is a simplistic example, it demonstrates well the idea of defensive coding for workflows. As your “main” flow of the process progresses, it must check to ensure that it can continue. If it cannot, it should divert processing off to a secondary (or tertiary, and so on) flow that can deal with the blocking conditions and when appropriate, redirect processing back to the main process flow to “try again.”
Perhaps Proactive Error Avoidance sounds a little grandiose for a workflow, but in reality, it is nothing out of the ordinary—nothing that any other application you write does not likely do already. All it means is that your application actively tries to prevent errors from occurring. Consider this your first line of defense, but it in no way diminishes the need for good defensive coding. It is also important to note that this activity takes place outside of the workflow itself.
The adage that states the best defense is a strong offense applies here. The best way to prevent problems from disrupting your workflow process is to take steps to prevent the problems in the first place. The key to this element is to know what your dependencies are. This is an element that we discussed earlier. It is now time to examine your dependencies list more closely.
Look for items that you can deal with programmatically, independent of your actual workflow process itself. These are going to be proactive attempts at heading off problems before your workflow is running, as opposed to the more reactive approach of the defensive coding strategies discussed above. For the most part, these items fall into the “required item x does not exist” category: something that your application or workflow created or installed has since been deleted or otherwise made inaccessible.
The mechanism for this type of approach is based entirely upon SharePoint event receivers. For example, if your workflow is dependent upon a certain list item existing, it is important that your workflow installer:
- Creates or verifies the existence of the list.
- Adds or verifies the item.
- Registers an Event Receiver to prevent the item from being deleted.
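The last of these steps can be sketched as follows. This is a minimal example under stated assumptions: the receiver class, the installer class, and the item title "Workflow Settings" are all hypothetical, and the receiver assembly is assumed to be strong-named and deployed where SharePoint can load it.

```csharp
using System;
using Microsoft.SharePoint;

// Hypothetical receiver that blocks deletion of one required list item.
public class RequiredItemReceiver : SPItemEventReceiver
{
    // Assumption: the required item is identified by its Title.
    private const string RequiredItemTitle = "Workflow Settings";

    public override void ItemDeleting(SPItemEventProperties properties)
    {
        SPListItem item = properties.ListItem;
        if (item != null && string.Equals(item.Title, RequiredItemTitle,
                StringComparison.OrdinalIgnoreCase))
        {
            properties.Cancel = true;
            properties.ErrorMessage =
                "This item is required by a workflow and cannot be deleted.";
        }
    }
}

// Hypothetical installer step that attaches the receiver to a list.
public static class RequiredItemInstaller
{
    public static void Register(SPList list)
    {
        string assemblyName = typeof(RequiredItemReceiver).Assembly.FullName;
        string className = typeof(RequiredItemReceiver).FullName;

        // Skip registration if the receiver is already attached.
        foreach (SPEventReceiverDefinition definition in list.EventReceivers)
        {
            if (definition.Type == SPEventReceiverType.ItemDeleting &&
                definition.Class == className)
            {
                return;
            }
        }

        list.EventReceivers.Add(
            SPEventReceiverType.ItemDeleting, assemblyName, className);
    }
}
```

Checking for an existing registration makes the installer safe to run more than once, which matters if the same step is also used to repair an installation.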
You can take a similar approach for any of the items supported by SharePoint Event Receivers:
- List Items
- Webs
- Sites
- List Columns
For example, if your workflow creates a custom column named InternalID that links items in a SharePoint list to your core application, you want to make sure that this column is not removed from lists. To do this, your application should include an Event Receiver such as the following.
public override void FieldDeleting(SPListEventProperties properties)
{
    base.FieldDeleting(properties);
    // Block deletion of the InternalID column only; other columns are unaffected.
    if (properties.FieldName.ToLower() == "internalid")
    {
        properties.Cancel = true;
        properties.ErrorMessage = "The InternalID field is required by the " +
            "Contoso CaseTrak application. Please see the CaseTrak " +
            "documentation for options and instructions for deleting this " +
            "field.";
    }
}
Now, when a user (any user, even an administrator) tries to delete this column from your list, they see the following error page.
Figure 8. Error page shown when Event Receiver prevents deletion of a column
Furthermore, when your workflow runs, it is quite likely that the InternalID field is still available to you. Mission accomplished.
Note:
Even with proactive elements such as the one covered in this section in place, defensive coding is still important. If an administrator disables your Event Receiver, the code that prevents the column from being deleted never runs and is therefore unable to stop the column from being deleted. Defensive coding and Proactive Error Avoidance are not mutually exclusive; you must have them both.
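As a sketch of how the two techniques combine, the workflow can still verify that the InternalID column exists before reading it, even though an Event Receiver is supposed to protect it. The helper class and method name below are hypothetical.

```csharp
using Microsoft.SharePoint;

// Hypothetical helper; a defensive read of the protected column.
public static class InternalIdReader
{
    // Returns false instead of throwing when the column has been removed
    // or holds no value, so the workflow can divert to a recovery path.
    public static bool TryGetInternalId(SPListItem item, out string internalId)
    {
        internalId = null;

        // The receiver may have been disabled, so never assume the column.
        if (!item.Fields.ContainsField("InternalID"))
        {
            return false;
        }

        object value = item["InternalID"];
        if (value == null)
        {
            return false;
        }

        internalId = value.ToString();
        return true;
    }
}
```

When this method returns false, a state machine workflow could switch to a secondary state that handles the missing column, in the same way that the earlier example diverts to the WaitForCheckIn state.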