Export (0) Print
Expand All
Separating DSL Semantics from Implementation
Expand Minimize
1 out of 1 rated this helpful - Rate this topic

A Wiki for Open-Vocabulary, Executable English over SQL

Cc168602.skyscrapr_banner(en-us,MSDN.10).gif

Dr. Adrian Walker

December 2007

Summary: This article covers how to use your browser to write English-language specifications, run them as if they were a program, and get English-language explanations. (11 printed pages)

Contents

Introduction: IDEs and Modeling Tools
Writing Business Rules Using Your Own Words and Phrases
Running the Rules
Explanations
Automatically Generated SQL
Conclusion
Sources
 

Introduction: IDEs and Modeling Tools

In an ideal modeling environment, business analysts and architects could write English-language requirements into their browsers—all the while, getting active help. Then, the environment would automatically translate the requirements into code, run the code, and explain the results in English, at the business level. Changes would be agile and auditable, and versions of the requirements and the code would be synchronized. As an analyst or architect, you could show your customers the effects of proposed changes in understandable terms, and you could do this in real time.

As a step in this direction, I'll describe an emerging online technology environment [IBL 2007] that works as a kind of wiki for executable English. Google and other search engines can find the executable English, so that it can be reused. As befits a wiki, shared use is free.

In a companion IASA paper [Walker 2007], we looked at how to plan to use the environment to help with business-IT alignment in current systems. In this article, we'll look at end-to-end implementation of a use case.

You can use the environment to write business rules into a browser, and to run the rules. The rules look rather like classical syllogisms, such as the following:

All men are mortal.

Socrates is a man.

Socrates is mortal.

You can write rules by using your own words and phrases, so that this technology is different from "controlled English," such as is described in [Fuchs 2006]. In controlled English, you would have to stick to writing certain kinds of words and phrases, and there would be a separate task of maintaining a dictionary and a grammatical section for the English language. In this environment, there is no need for such a task. However, we shall see that the English-language meanings are interpreted strictly.

As you write the rules, the system provides advice. As soon as you and the system are happy with your rules, you can ask a question and get an answer in the form of a table that has an English-language heading. To get an answer, the system runs your rules, and optionally generates and runs SQL. You can then get an explanation of an answer, in English, at the business level. Let's look at a use case, and see how to write and run it as business rules.

Writing Business Rules Using Your Own Words and Phrases

Let's look at an example of a business requirement. This one is about how an oil company can make supply-chain decisions to meet demands for its products. Figure 1 outlines how the supply chain works.

Cc168602.06_10_IDE_Fig01(en-us,MSDN.10).gif

Figure 1. An oil-industry supply chain

In ordinary English, the supply-chain requirement looks like the following:

We project that the target region NJ will need 1,000 gallons of product 'y' in October. We then ask what alternative routes and modes of transportation (truck, train, boat, pipe) do we have to get that product to the region. Next, we ask whether there's a refinery nearby that can produce the base product for finished product 'y'. With all of that, we finally say that we need a delivery plan that is optimized to deliver on time, make a profit, and beat the competition.

However, if there is not enough of product 'y', then, depending on the region and the customers, product 'x' or 'z' will do as well; they're just variations of 'y' that use different additives. But they'll only do just as well in region NJ for the season including October.

Ideally, we could just type those two paragraphs into a browser, add some sample data, and ask how to meet the NJ demand. However, that's more than we can achieve. An ideal system that could do that would have to work robustly for many use cases, not just this one. It would also have to know how to ask for more information—in this case, about how to choose the quantities of products to be combined.

What you can do is to type in some business rules, based on the two paragraphs, by using your own words and phrases.

It looks as though there could be more than one demand from NJ, and also demands from other regions. Let's assign a unique ID number to each demand. Then, we would like to write some rules to tell us what amount of each product to use for a demand with a particular ID.

We would like the answer as a table with a heading such as the following:

for demand this-id this-region for this-quantity this-finished-product we use this-amount this-product from this-refinery
 

The answer table will have seven columns, corresponding to the place holders that start with "this-".

To find an answer, we are going to need to know what the corresponding demand is. We'll also need to know how much of each product to use for the demand. Suppose that we type into the browser a rule such as the following:

estimated demand some-id in some-region is for some-quantity gallons of some-finished-product in some-month of some-year
for estimated demand that-id some-fraction of the order will be some-product from some-refinery
that-quantity * that-fraction = some-amount
for demand that-id that-region for that-quantity that-finished-product we use that-amount that-product from that-refinery
 

The first line of the rule will be used to look up a demand. The second line will figure out the fraction of the demand that will be met by each available product. The third line will do a simple multiplication. So, the rule says that, if we can establish some values for the three premises, we also have values for the answer line.

After the rule is typed into the browser, we can ask the system to check it. So far, we have told the system what the last line of the rule means, but we have not told it anything about the first two lines. So, the system will advise us that each of the first two lines should be either the heading of a table or a conclusion of some rule.

While writing the first two lines, we used our own words and phrases. The system did not ask for any further information about the third line, as it already knows how to multiply two numbers; that's predefined. (The online documentation lists the available predefined sentences, and, when writing these, we do have to stick to their defined forms.)

For the first line of the rule, we need some data in a table that has a matching English-language heading, such as the following:

estimated demand this-id in this-region is for this-quantity gallons of this-finished-product in this-month of this-year
523 NJ 1000 product-y October 2005
 

If we type this into the browser and ask the system for advice, it will say that we still must say something about the second sentence of our rule, about what fraction of the order will be contributed by each product. The ordinary English-language requirement statement did not say how to choose the fractions.

So, let's say that the fraction of each product that we will use will be proportional to the stock that we have on hand. We can type in a corresponding rule that looks like the following:

estimated demand some-id in some-region is for some-quantity gallons of some-finished-product in some-month of some-year
for demand that-id for that-finished-product refinery some-refinery can supply some-amount gallons of some-product
for demand that-id the refineries have altogether some-total gallons of acceptable base products
that-amount / that-total = some-long-fraction
that-long-fraction rounded to 2 place(s) after the decimal point is some-fraction
for estimated demand that-id that-fraction of the order will be that-product from some-refinery
 

If we type the rule into the browser, the system will no longer ask for more information about the "for estimated demand" sentence, but it will indicate that it now needs to know about the second and third sentences in the preceding rule.

Continuing like this, we can type rules and tables into the browser, until the system is satisfied that each sentence that we have written is either the last sentence of a rule or matches a table heading. When we have reached that point, the rules and tables are as in [Oil Supply 2007]. No further information is needed, and we are ready to run the rules to find out how to meet the demand.

Before we do that, let's take a quick look at an advantage of writing in this kind of executable English. Suppose that we are about to write some rules, but we think that someone else might already have written some that we could reuse. If our supply-chain example is already in the system, we could try a Google search.

For example, at the time of the writing of this article, typing the following:

some-fraction of the order will be some-product
 

into Google gets the result that is shown in Figure 2.

Cc168602.06_10_IDE_Fig02(en-us,MSDN.10).gif

Figure 2. Result of Google search for some executable English

The first part of the Google result is a link to a file that contains the rules and data. So, we could copy some rules from there and paste them into another browser window for reuse. The second part of the result from Google is a longer paper about this example.

Running the Rules

To run the rules, we switch the browser to a menu of questions. The system automatically arranges the sentences that we wrote in the rules, so that the following sentence:

for demand some-id some-region for some-quantity some-finished-product we use some-amount some-product from some-refinery

 

appears at the top of the menu. If we click on the sentence and then on an Ask button, the Web page changes to the following:

for demand this-id this-region for this-quantity this-finished-product we use this-amount this-product from this-refinery
523 NJ 1000 product-y 190.0 product-x ZyxCo Canada One
523 NJ 1000 product-y 310.0 product-y ZyxCo Canada One
523 NJ 1000 product-y 500.0 product-z ZyxCo Canada One
 

Of course, in general, there could be many more answers. Before clicking the "Ask" button, we could narrow the question down by, say, choosing NJ and product-z from some menus that the system automatically generates.

In many rules-based systems, changing the order in which the rules are typed also changes the answers. However, in this kind of technology, changing the order of the rules does not change the answer. So, writing is more like specifying than programming.

Explanations

As soon as we have the supply-chain answer, we can also get an explanation of each line. For the first line of the answer, the explanation starts out with three steps, such as the following:

1.

estimated demand 523 in NJ is for 1000 gallons of product-y in October of 2005
for estimated demand 523 0.19 of the order will be product-x from ZyxCo Canada One
1000 * 0.19 = 190
for demand 523 NJ for 1000 product-y we use 190 product-x from ZyxCo Canada One
 

2.

estimated demand 523 in NJ is for 1000 gallons of product-y in October of 2005
for demand 523 for product-y refinery ZyxCo Canada One can supply 300 gallons of product-x
for demand 523 the refineries have altogether 1600 gallons of acceptable base products
300 / 1600 = 0.1875
0.1875 rounded to 2 place(s) after the decimal point is 0.19
for estimated demand 523 0.19 of the order will be product-x from ZyxCo Canada One
 

3.

estimated demand 523 in NJ is for 1000 gallons of product-y in October of 2005
in October an order for product-y can consist in whole or part of product-x
in October the refinery ZyxCo Canada One has committed to schedule 300 gallons of product-x
we have truck transportation from refinery ZyxCo Canada One to region NJ
for demand 523 for product-y refinery ZyxCo Canada One can supply 300 gallons of product-x
 

If you would like to run the example [Run Oil Supply 2007], you would see that some of the lines in the explanation are hypertext links, so that you can selectively drill down into detail.

Automatically Generated SQL

So far, we have seen how to look at some English-language requirements, translate them into business rules in English, and run the rules to get answers and explanations. To do this, we used just a few lines of test data that we typed directly into a browser. Real examples often have many lines of data, and these are usually stored in a database-management system that we can query by using SQL.

Suppose that we have our sample data in a Microsoft SQL Server database, instead of alongside the rules that we can look at the browser. Then, we would need somehow to tell the rules where to find the data. We can do this by writing a special kind of rule into the browser.

Previously, we typed into the browser a table like the following:

estimated demand this-id in this-region is for
this-quantity gallons of this-finished-product in this-month of this-year

523 NJ 1000 product-y October 2005

Suppose that the data is actually in a table, named T6, in a MySQL database. Instead of the preceding table, we can type into the browser a rule like the following:

Cc168602.06_10_IDE_Fig03(en-us,MSDN.10).gif

estimated demand this-id in this-region is for this-qty gallons of this-finished-product in this-month of this-yr

The rule says that, to find an estimated demand, we can look on the Web for the site reengineeringllc.com. There, the system should find a MySQL database-management system that contains a database that also is named mysql. In the database, there should be a table named T6. The system should be able to get the data from the table by asking for it at port 3306, giving the ID mysql, and giving the password mysqlmysql.

If we write similar rules for the other tables [Oil Supply SQL 2007], we can again ask the following question:

for demand some-id some-region for some-quantity some-finished-product we use some-amount some-product from some-refinery

The answer is the same as before; but, this time the system has used the information in the rules to generate automatically a SQL query, and to hand it to the database-management system. One strength of SQL is that it is efficient over large amounts of data; thus, this gets us scalability.

It turns out that, although the rules look quite simple, they contain information that leads to SQL that would be too complicated to write reliably by hand.

The following is part of the generated SQL:

select distinct x6,T2.PRODUCT,T1.NAME,T2.AMOUNT,x5 from T6 tt1,T6 tt2,T5,T4,T3,T2,T1,T6,
(select x3 x6,T6.FINISHED_PRODUCT x7,T6.ID x8,tt1.ID x9,tt2.ID x10,sum(x4) x5 from T6,T6 tt1,T6 tt2,
((select T6.ID x3,T3.PRODUCT1,T1.NAME,T2.AMOUNT x4,T2.PRODUCT from T1,T2,T3,T4,T5,T6,T6 tt1,T6 tt2 where
T1.NAME=T2.NAME and T1.REGION=T6.REGION and T2.MONTH1=T4.MONTH1 and
T2.MONTH1=T6.MONTH1 and T2.PRODUCT=T3.PRODUCT2 and T4.MONTH1=T6.MONTH1 and
T3.PRODUCT1=T6.FINISHED_PRODUCT and T3.SEASON=T4.SEASON and T3.SEASON=T5.SEASON and
T4.SEASON=T5.SEASON and T6.ID=tt1.ID and T6.ID=tt2.ID and tt1.ID=tt2.ID)
union (select T6.ID x3,T2.PRODUCT,T1.NAME,T2.AMOUNT x4,T2.PRODUCT from T1,T2,T3,T4,T5,T6,T6 tt1,T6 tt2 where
T1.NAME=T2.NAME and T1.REGION=T6.REGION and T2.MONTH1=T6.MONTH1 and
T2.PRODUCT=T6.FINISHED_PRODUCT and T6.ID=tt1.ID and T6.ID=tt2.ID and tt1.ID=tt2.ID))
group by T6.FINISHED_PRODUCT,T6.ID,tt1.ID,tt2.ID,x3) where
T6.ID=tt2.ID and tt1.ID=T6.ID and T6.FINISHED_PRODUCT=x7 and T6.ID=x8 and tt1.ID=x8 and
tt2.ID=x8 and T1.NAME=T2.NAME and T1.REGION=tt2.REGION and T2.MONTH1=T4.MONTH1 and
T2.MONTH1=tt2.MONTH1 and T2.PRODUCT=T3.PRODUCT2 and T3.PRODUCT1=tt1.FINISHED_PRODUCT and
T3.PRODUCT1=tt2.FINISHED_PRODUCT and T3.SEASON=T4.SEASON and
T3.SEASON=T5.SEASON and T4.MONTH1=tt2.MONTH1 and
T4.SEASON=T5.SEASON and T6.ID=x6 and tt1.FINISHED_PRODUCT=tt2.FINISHED_PRODUCT and
tt1.ID=tt2.ID and tt1.ID=x6 and tt2.ID=x6
order by x6,T2.PRODUCT,T1.NAME,T2.AMOUNT,x5;
 

Of course, because the SQL query is generated and runs automatically, it's normally not necessary for anyone to see it. However, as in the non-SQL case, we can get a step-by-step English-language explanation.

Conclusion

Ideally, business analysts and architects could write requirements in English into their browsers, and a modeling environment would automatically translate the requirements into code, run the code, and explain the results in English, at the business level.

There is an online system that makes progress toward that ideal. Analysts can type in specifications that correspond to their requirements, in the form of business rules in open-vocabulary, open-syntax English. They can then use the browser to run the rules, and to get explanations of the results in English, at the business level. The system can also use the information in the rules to generate and run SQL queries automatically over networked databases. As an analyst or architect, you could use the system to show your customers the effects of proposed changes in understandable terms; and you could do this in real time.

We looked at a two-paragraph requirement for oil-industry supply-chain planning, and we described how to write the corresponding business rules into a browser—along the way, getting help from the system. Then, we ran the rules to get an answer in the form of a table that summarizes the plan. We then got an explanation of how the system used the rules to find the plan.

In the example, we used a small amount of test data. We also showed the system automatically generating SQL for larger amounts of data. The generated SQL is too complex to write reliably by hand, and it would normally not be necessary for anyone to look at it. When generated SQL is used to get an answer, we can still get English-language explanations of results.

In a companion IASA paper [Walker 2007], we looked at how to plan to use this kind of technology to help with work on current systems. In this article, we showed an end-to-end implementation of a use case.

You are cordially invited to use the new technology, by pointing a browser to www.reengineeringllc.com. You will see tutorials, as well as examples that you can view, run, and change. You are most welcome to write and run your own examples. As befits a wiki, shared use of the system is free.

Sources

About the author

Dr. Adrian Walker is the author of over 20 papers, and coauthor of an Addison-Wesley book on business-rule systems and databases. He has been an Assistant Professor at Rutgers University; Manager of Principles and Applications of Logic Programming, IBM Research Laboratory; and Manager, Internet Development at Eventra (a manufacturing supply-chain company). He is currently the Chief Technology Officer of Reengineering LLC.

 

This article was published in Skyscrapr, an online resource provided by Microsoft. To learn more about architecture and the architectural perspective, please visit skyscrapr.net.

 

Did you find this helpful?
(1500 characters remaining)
Thank you for your feedback
Show:
© 2014 Microsoft. All rights reserved.