Peter Koen and Christian Strömsdörfer
Summary: This article examines some important characteristics of software architecture for modern manufacturing applications. This kind of software generally follows the same trends as other types of software, but often with important modifications. Atypical architectural paradigms are regularly applied in manufacturing systems to address the tough requirements of real time, safety, and determinism imposed by such environments.
Introduction
The Automation Pyramid
Highly Distributed, Highly Diverse
Tough Requirements
Real-World Parallelism
Security + Safety + Determinism = Dependability
Architectural Paradigms
The Future of Manufacturing Applications
Cloud Computing and Manufacturing
Almost all items we use on a daily basis (cars, pencils, toothbrushes), and even much of the food we eat, are products of automated manufacturing processes. Although the processes are quite diverse (brewing beer and assembling a car body are quite different), the computing environments controlling them share many characteristics.
Although we refer to “manufacturing” in this article, we could also use the term “automation,” the latter encompassing non-manufacturing environments as well (such as power distribution or building automation). Everything in this article therefore also applies in those environments.
Software in the industry is often illustrated in a triangle (which, strangely enough, is called the “automation pyramid”; see Figure 1). The triangle shows three areas, with the upper two being executed on PC hardware. Although PCs link the world of factory devices with the office and server world of ERP applications, PCs running within the manufacturing application are distinct from those running outside. The manufacturing application works on objects of a raw physical nature (boiling liquids, explosive powders, heavy steel beams), whereas the objects the office and ERP applications handle are purely virtual.
Figure 1. The automation pyramid
The three levels of the automation pyramid are, from top to bottom, information, execution, and control.
The base of the pyramid and the largest number of devices in a manufacturing plant are the controllers driving the machinery. They follow the rules and the guidance of an execution runtime, which controls how the individual steps of the process are performed. Software at the information level controls the manufacturing process. It defines workflows and activities that need to be executed to produce the desired product. This, of course, involves algorithms that try to predict the state of the manufacturing machinery and the supply of resources to optimally schedule execution times. Obviously, the execution of this plan needs to be monitored and evaluated as well. This means that commands are flowing down from information to execution to the control level while reporting data is flowing back up.
The predictability of manufacturing processes and schedules is a tough problem that requires architectural paradigms which may be unfamiliar to developers who deal only with virtual, memory-based objects.
The software discussed in this article lives in the lower part of that triangle, where PCs meet devices, and where the well-known patterns and paradigms for writing successful office, database, or entertainment software no longer apply.
Most manufacturing tasks involve many steps executed by different devices. To produce a large number of items in a given amount of time, manufacturers employ hundreds or even thousands of devices to execute the necessary steps, very often in parallel. All the intelligent sensors, controllers, and workstations taking part in this effort collaborate and synchronize with each other to churn out the required product in a deterministic and reliable way. This is all one huge application, distributed over the factory floor.
The computing systems are rather varied, too. (See Figure 2.) On a factory floor or in a chemical plant, one would probably find all of the following systems:
All of these systems must communicate effectively with each other in order to collaborate successfully. We therefore regularly meet a full zoo of diverse communication strategies and protocols, such as:
Figure 2. Distribution
This diversity of operating systems and means of communication may not look like a system that must be fully deterministic and reliable, but just the opposite is true. However, because of the highly differentiated singular tasks that must be executed to create the final product, each step in this workflow uses the most effective strategy available. If data from a machine-tool controller is needed and it happens to be running Windows 95, DCOM is obviously the better choice to access the data than SOAP.
This last example also shows a common feature of factory floors: The software lives as long as the computer that hosts it. Machine tools are expensive and are not replaced just because there is a new embedded version of Microsoft Windows XP. If a piece of hardware dies, it is replaced with a compatible computer. This may mean that in a 10-year-old factory, a Windows 95 computer has to be replaced with a new one that basically has the same functionality, but is based on an OS like Windows CE or Linux. Unless a factory is brand-new, computers from various time periods are present in a single factory or plant and need to seamlessly interoperate.
Because of this diversity, manufacturing applications rely heavily on standards, especially in terms of communication. The variability in operating systems is decreasing somewhat in favor of Windows or Linux solutions, and this trend will certainly continue in the future. Also, communication is clearly moving away from proprietary designs toward real-time Ethernet protocols and comparable solutions where applicable. Especially in situations where many different transportation and communication protocols have to work together in one place, the focus for future development is on technologies that can deal with many different protocols, formats, and standards. Windows Communication Foundation (WCF) is one of the technologies that alleviates a lot of pain in plant repairs, upgrades, and future development. Using WCF, it is possible to integrate newer components into existing communication infrastructures without compromising the internal advances of the new component. There’s simply a WCF protocol adapter that translates the incoming and outgoing communication to the specific protocol used in the plant. This allows for lower upgrade and repair costs than before.
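The adapter idea in the preceding paragraph can be sketched independently of WCF. The following Python fragment is only an illustration of the pattern, not WCF code: the class names, the neutral dictionary message, and the "KEY=VALUE;KEY=VALUE" legacy frame format are all assumptions made up for this example.

```python
from typing import Dict, Protocol


class PlantProtocolAdapter(Protocol):
    """Translates between a plant-specific wire format and a neutral dict."""

    def decode(self, frame: bytes) -> Dict[str, str]: ...

    def encode(self, message: Dict[str, str]) -> bytes: ...


class LegacyTextAdapter:
    """Hypothetical legacy format: ASCII frames of the form 'KEY=VALUE;KEY=VALUE'."""

    def decode(self, frame: bytes) -> Dict[str, str]:
        text = frame.decode("ascii")
        pairs = (item.split("=", 1) for item in text.split(";") if item)
        return {key: value for key, value in pairs}

    def encode(self, message: Dict[str, str]) -> bytes:
        return ";".join(f"{k}={v}" for k, v in message.items()).encode("ascii")


class NewComponent:
    """A newer component that only ever sees neutral dict messages."""

    def __init__(self, adapter: PlantProtocolAdapter) -> None:
        self._adapter = adapter

    def handle(self, frame: bytes) -> bytes:
        request = self._adapter.decode(frame)
        # The component's own logic is protocol-agnostic (assumed reply shape).
        reply = {"tag": request["tag"], "status": "ok"}
        return self._adapter.encode(reply)
```

Replacing the plant protocol then means writing another adapter class with the same two methods; the component itself stays untouched.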
Modern manufacturing environments offer a set of tough real-time and concurrency requirements. And we are talking “real-time” here: time resolutions of 50 µs and less. Manufacturing became a mainly electrical engineering discipline in the first half of the last century, and those requirements were state of the art when software started to take over in the 1960s and 1970s. The expectations have only risen. For a steel barrel that needs to be stopped, 10 ms is close to an eternity.
Traditionally, these tough requirements have been met by specialized hardware with proprietary operating systems, such as:
These extreme requirements are not expected from PC systems yet, but they, too, must react to very different scenarios in due time. And usually there is no second try. A typical task executed by a PC on a factory floor is to gather messages coming from PLCs. If an error occurs in one of them, it regularly happens that all the other PLCs also start sending messages, like a bunch of confused animals. If the PC in charge misses one of those messages (for instance, because of a buffer overflow), the message is probably gone for good, the cause of the failure might never be detected, and the manufacturer must subsequently throw away all the beer that was brewed on that day.
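The buffer-overflow scenario can be made concrete with a small simulation. This is a minimal sketch under invented assumptions: the `BoundedInbox` class, the drop-newest policy, and the burst sizes are hypothetical and do not describe any particular product. It only shows how a fixed-size buffer loses messages when many PLCs start talking at once, and that counting the losses at least makes them visible.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[int, int]  # (PLC id, sequence number), an assumed message shape


class BoundedInbox:
    """Fixed-capacity message buffer that counts what it has to drop."""

    def __init__(self, capacity: int) -> None:
        self._queue: Deque[Message] = deque()
        self._capacity = capacity
        self.dropped = 0

    def receive(self, message: Message) -> bool:
        """Store the message, or drop it and record the loss if the buffer is full."""
        if len(self._queue) >= self._capacity:
            self.dropped += 1
            return False
        self._queue.append(message)
        return True

    def drain(self) -> List[Message]:
        """Hand all buffered messages to the application and empty the buffer."""
        items = list(self._queue)
        self._queue.clear()
        return items


def simulate_burst(plc_count: int, messages_per_plc: int, capacity: int) -> BoundedInbox:
    """All PLCs send interleaved messages before the PC gets a chance to drain."""
    inbox = BoundedInbox(capacity)
    for seq in range(messages_per_plc):
        for plc in range(plc_count):
            inbox.receive((plc, seq))
    return inbox
```

With ten PLCs sending five messages each into a buffer of 32 slots, 18 of the 50 messages are lost before the application reads anything.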
To summarize:
Factories are rarely built to manufacture a single product synchronously. Rather, they try to make as many copies of the product concurrently as possible. This induces massive parallelism, with appropriate heavyweight synchronization. Today this is implemented as a kind of task parallelism, with each computing unit running as a synchronous system taking care of one of those tasks. All parallelization moves into the communication. (More on how this is embedded in the overall software architecture in the section on architectural paradigms.)
With the advent of multicore chips, however, the current patterns and paradigms need to be revisited. It is still unclear what parallel execution within one of the aforementioned controllers could mean or whether parallelism makes sense within the current execution patterns.
On PC systems, however, multicore architectures can readily be applied for the typical tasks of PCs in manufacturing applications such as data retrieval, data management, data visualization, or workflow control.
Although security and safety are extremely important almost everywhere, manufacturing always adds life-threatening risks such as exploding oil refineries or failing brakes in cars. If a computer virus sneaks into a chemical plant undetected, anything might happen, and the plundering of the company’s accounts would not be the worst of it.
It is better to use the term “dependability” here, because this is what it is all about. Both the manufacturer and the customers consuming the manufacturer’s products depend on the correct execution of the production. This comes down to the following features:
The security requirements make it more or less impossible to connect a factory to the Internet or even to the office. Contact is therefore made based on three principles:
As already mentioned, manufacturing applications do not control virtual objects such as accounts or contracts, but real physical objects such as boiling liquids, explosive powders, or heavy steel beams. The software controlling these objects cannot rely on transactional or load-balancing patterns, because there simply is no rollback and no room for postponing an action.
This results in the application of a rather different architectural paradigm in those environments alongside the more traditional event-driven mechanisms: using a cyclic execution model.
Cyclic execution means that the operating system executes the same sequence of computing blocks within a defined period of time over and over again. The amount of time may vary from sequence to sequence, but the compiler and the operating system guarantee that the total time for execution of all blocks always fits into the length of the cycle.
Events and data in the cyclic paradigm are “pulled” rather than “pushed,” but the “pull” does not happen whenever the software feels like it (which is the traditional “pull” paradigm in some XML parsers), but rather periodically. This is the main reason why the term “cyclic” is used instead of “pull” (or “polling”).
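A minimal sketch of such a cyclic executive follows. It is a simulation under stated assumptions, not a real-time implementation: the `Block` type, the declared worst-case execution times, the simple budget check (standing in for the guarantee that compiler and operating system give on a real controller), and the heater and alarm blocks are all invented for illustration. Inputs are pulled exactly once at the start of each cycle, and the same blocks then run in the same order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Inputs = Dict[str, float]
Outputs = Dict[str, float]


@dataclass
class Block:
    name: str
    worst_case_us: int  # assumed worst-case execution time in microseconds
    run: Callable[[Inputs, Outputs], None]


def fits_cycle(blocks: List[Block], cycle_us: int) -> bool:
    """Static check: the worst cases of all blocks must fit into one cycle."""
    return sum(block.worst_case_us for block in blocks) <= cycle_us


def run_cycles(
    blocks: List[Block],
    cycle_us: int,
    read_inputs: Callable[[int], Inputs],
    n_cycles: int,
) -> List[Outputs]:
    if not fits_cycle(blocks, cycle_us):
        raise ValueError("blocks exceed the cycle budget")
    history: List[Outputs] = []
    for cycle in range(n_cycles):
        inputs = read_inputs(cycle)  # periodic pull: one snapshot per cycle
        outputs: Outputs = {}
        for block in blocks:  # always the same sequence
            block.run(inputs, outputs)
        history.append(outputs)
        # A real controller would now idle until the next cycle tick.
    return history


def heater(inputs: Inputs, outputs: Outputs) -> None:
    outputs["heater_on"] = 1.0 if inputs["temp_c"] < 60.0 else 0.0


def alarm(inputs: Inputs, outputs: Outputs) -> None:
    outputs["alarm"] = 1.0 if inputs["temp_c"] > 90.0 else 0.0


blocks = [Block("heater", 20, heater), Block("alarm", 10, alarm)]
samples = [55.0, 65.0, 95.0]  # one simulated temperature reading per cycle
history = run_cycles(
    blocks,
    cycle_us=100,
    read_inputs=lambda cycle: {"temp_c": samples[cycle]},
    n_cycles=3,
)
```

Because every block sees the same input snapshot within a cycle, the outputs of one cycle are always mutually consistent, which is part of what makes the model easy to reason about.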
The cyclic execution model reduces complexity:
Of course, cyclic execution has a number of drawbacks, too. It uses CPU power even if there is nothing to do. It also limits the duration of any task to the duration of the cycle, which is more likely to range between 100 µs and 100 ms than a second or more. No CPU can do a lot of processing within 100 µs.
A system cannot rely on cyclic execution only. In any environment, things might happen that cannot wait until the cyclic system fancies to read the appropriate variables to find out something’s just about to blow up. So in every cyclic system there is an event-driven demon waiting for the really important stuff to kill the cycle and take over. The problem here is not whether this kind of functionality is needed or to what degree, but rather how to marry these two approaches on a single computer in a way that allows both operating systems to work 100 percent reliably.
The question of how to deal with cyclic and event-driven patterns within a single system becomes highly important to PCs that are part of a distributed manufacturing application. Here lives a system dominated by an event-driven operating system, communicating with mainly cycle-driven systems. Depending on the job, things must be implemented in kernel or user mode.
Those are just two examples of how architects try to let the two paradigms collaborate. None of them works extraordinarily well, although the known solutions are all robust enough today for 24/7 factories. What makes them work, in the end, is probably the huge amount of testing and calibration of timers, threading, and so on that goes into the final products. This shows clearly that there is plenty of room for improvement.
Figure 3. Architectural paradigms: resource usage with cyclic versus event-driven architecture
A lot of hope for future improvements rests on the principles of Service-Oriented Architecture. By allowing cyclic and event-driven systems to be loosely coupled through messages, it is possible to more efficiently route information in the system depending on severity of the event.
Cyclic systems could then do what they are best at: delivering predictable resource usage and results, and simply using a specific time in the cycle to put relevant information into a message queue that then would be serviced by other systems (cyclic or event-driven). This approach would also lessen the chance of error from pulling data from a specific location/variable by queuing the necessary data.
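The coupling described above can be sketched with a plain in-process queue. This is a toy model under assumptions: the message tuple, the function names, and the `None` end-of-stream sentinel are invented here, and a real plant would use a messaging infrastructure rather than Python threads. The cyclic side publishes a snapshot at a fixed point in every cycle; the event-driven side simply blocks until a message arrives.

```python
import queue
import threading
from typing import List, Tuple

Message = Tuple[int, str, float]  # (cycle number, variable name, value)


def cyclic_producer(samples: List[float], outbox: queue.Queue) -> None:
    """Simulated cyclic system: one iteration of the loop is one cycle."""
    for cycle, value in enumerate(samples):
        # ... the control blocks of the cycle would run here ...
        # Fixed slot at the end of the cycle: publish a snapshot instead of
        # letting other systems read the variable directly.
        outbox.put_nowait((cycle, "temp_c", value))
    outbox.put_nowait(None)  # sentinel: no more cycles (assumption of this sketch)


def event_driven_consumer(inbox: queue.Queue, handled: List[Message]) -> None:
    """Event-driven side: sleeps until a message arrives, then reacts."""
    while True:
        message = inbox.get()  # blocks; uses no CPU while idle
        if message is None:
            break
        handled.append(message)


def run_demo(samples: List[float]) -> List[Message]:
    channel: queue.Queue = queue.Queue()  # unbounded, so the producer never blocks
    handled: List[Message] = []
    consumer = threading.Thread(target=event_driven_consumer, args=(channel, handled))
    consumer.start()
    cyclic_producer(samples, channel)
    consumer.join()
    return handled
```

The consumer never touches the producer's variables; it only sees the queued snapshots, in the order in which the cycles produced them.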
The architectural trends in manufacturing software can be summarized as follows:
This development is intensified by the current trends in virtualization. Because manufacturing environments typically involve the duality of device operation by a real-time operating system and visualization and manual control by a Windows system, the offer of having both available on a single computer is extremely intriguing for technical and financial reasons.
This certainly also applies to cloud computing. The offer to have a distributed environment look like a single entity is too tempting not to exercise it. The mode, however, will differ from the typical cloud application. Factories employ their own “local” cloud, rather than “the” cloud.
Currently, there is a lot of discussion and rethinking going on in the manufacturing space. What will the future look like? Should the development favor more cyclic or event-driven execution?
Multicore CPUs add an interesting angle to this discussion. On the one hand, it would mean that there is enough power to simply make everything cycle based and have several cycles run in parallel. On the other hand, it would favor event-driven systems, because the workload can be easily distributed.
We think that the answer is somewhere in the middle. The most interesting scenarios are enabled through the implementation of SOA principles. By changing many parts of the solution into services, it is easier to host cyclic and event-driven modules on the same computer or on several devices, and have them interact. Manufacturing systems are applying modern technology and architecture paradigms to automatically distribute load and move operations between various nodes of the system depending on availability of resources. When you think about it, manufacturing is very similar to large-scale Internet applications. It’s a system of various, similar, but still a little bit different devices and subsystems that operate together to fulfill the needs of a business process. The user doesn’t care about the details of one specific device or service. The highest priority and ultimate result of a well-working manufacturing system is the synchronization and orchestration of devices to build a well-synchronized business process execution engine that results in a flawless product. It’s not much different from the principles you can see today with cloud computing, web services, infrastructure in the cloud, and Software as a Service applications.
Taking analogies of enterprise computing to other areas of manufacturing execution and control opens up a whole new set of technologies, like virtualization and high-performance computing. Virtualization would allow moving, replicating, and scaling the solution in a fast way that simply isn’t possible today. For computation-intensive tasks, work items could be “outsourced” to Microsoft Windows Compute Cluster Server computation farms and wouldn’t affect the local system anymore. Allowing for more cycles or, respectively, for handling more events has a dramatic impact on the architectural paradigms in manufacturing, and no one can say with 100-percent certainty what the model of the future will be. On-demand computational resources could very well enable more use of event-driven architectures in manufacturing.
While these technical advances sound tempting, there are still a lot of problems that need to be solved. For example, there’s a need for real-time operating systems and Windows to coexist on the same device. This can’t be done with current virtualization technology, because there’s currently no implementation that supports guaranteed interrupt delivery for a real-time OS.
When you compare the cloud offerings from multiple vendors to the needs in manufacturing, cloud computing looks like the perfect solution to many problems: integrated, claim-based authentication, “unlimited” data storage, communication services, message relaying, and the one factor that has changed manufacturing and software development as no other factor could have done: economies of scale.
Unfortunately, there is one basic attribute of cloud services that makes the use of current cloud offerings in manufacturing pretty much impossible: the cloud itself.
Manufacturing is a highly sensitive process and any disruption could not only cost millions of dollars, but also be a serious risk to the lives of thousands of consumers. Just imagine what would happen if a hacker got access to the process that controls the recipes for production of food items. It is pretty much unthinkable to ever have a manufacturing environment connected to the cloud.
Still, plenty can be learned from cloud-based infrastructure and services. Let’s look at SQL Server Data Services as an example. SSDS is not simply a hosted SQL Server in the cloud. It’s a completely new structured data storage service in the cloud, designed for simple entities that are grouped together in containers and a very simple query model, which allows direct access to these entities. The big advantage of SSDS is that it is a cloud-based solution that hides the details of replication, backup, or any other maintenance task from the application using it. By providing geo replication and using the huge data centers from Microsoft, it can guarantee virtually unlimited data storage and indestructible data.
Hosting a system like this on premise in manufacturing plants would solve many problems we are facing today: Manufacturing processes are generating a vast amount of data. Every single step of the production process needs to be documented. This means we are looking at large amounts of very simple data entities. The data-collection engine cannot miss any of the generated data; otherwise, the product might not be certified for sale. And one very interesting aspect of manufacturing systems is that once a device starts sending an unusually high amount of data, for example due to an error condition, this behavior quickly replicates on the whole system and suddenly all devices are sending large volumes of log data. This is a situation that’s very difficult to solve with classical enterprise architecture database servers.
The automated replication feature is also of great use: Many manufacturing systems span thousands of devices on literally thousands of square feet of factory floors, often even across buildings or locations. Putting data in time close to where it is needed is essential to guarantee a well-synchronized process.
While cloud computing is still in its infancy and almost all vendors are busy building up their first generation of services in the cloud, the applicability of those services as an on-premise solution is often neglected to stay focused on one core competency.
Christian Strömsdörfer is a senior software architect in a research department of Siemens Automation, where he is currently responsible for Microsoft technologies in automation systems. He has 18 years of experience as a software developer, architect, and project manager in various automation technologies (numerical, logical, and drive controls) and specializes in PC-based high-performance software. He received his M.A. in linguistics and mathematics from Munich University in 1992. He resides in Nuremberg, Germany, with his wife and three children.
Peter Koen is a Senior Technical Evangelist with the Global Partner Team at Microsoft. He works on new concepts, ideas, and architectures with global independent software vendors (ISVs). Siemens is his largest partner where he works with Industry Automation, Manufacturing Execution Systems, Building Technologies, Energy, Healthcare, and Corporate Technologies. Peter resides in Vienna, Austria and enjoys playing the piano as a way to establish his work/life balance.
This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal Web site.