Real-Time Behavior of the .NET Compact Framework

Windows CE .NET
 

Maarten Struys
Michel Verhagen
PTS Software

Applies to:
    Microsoft® Windows® CE .NET
    Microsoft Visual Studio® .NET
    Microsoft Visual Basic® .NET
    Microsoft Visual C#®
    Microsoft .NET Compact Framework


Download CFInRT_used_for_actual_measurements.exe.


Download RTCF.exe.


Download Win32CEAppInRT.exe.

Summary: With the arrival of Visual Studio .NET 2003, with integrated support for Smart Device Programmability, it is possible to develop applications for a broad range of devices by using managed code. Software developers can now use new exciting languages like Visual Basic .NET and Visual C# for device development. Although this sounds promising, one question is still to be answered: Is it possible to make use of the real-time capabilities of Windows CE .NET while using managed code to write applications for an embedded device? This article will answer that question and suggest a possible scenario in which real-time behavior can be combined with Microsoft .NET functionality. (13 printed pages)

Contents

A Managed World and an Unmanaged World
Platform Invoke at Work
A Real-Time Scenario
The Actual Test
The Results
Pitfalls
Proof of Results
Conclusion
Acknowledgements

A Managed World and an Unmanaged World

Some of the advantages of a managed environment like the Microsoft® Common Language Runtime, such as writing safer and platform-independent software, can be a disadvantage in a real-time environment. Typically, you cannot afford to wait for a just-in-time (JIT) compiler to compile a method before using it, and you cannot wait for a garbage collector to reclaim memory that is no longer in use. Both of these features can interfere with deterministic system behavior. It is possible to force the garbage collector to run by calling GC.Collect(). However, you want the garbage collector to perform its task by itself, because it is highly optimized. To allow hard real-time behavior, it would be great if there were a way to distinguish between hard real-time functionality, written in native (unmanaged) Microsoft Win32® code, and other functionality, written in managed code. Using Platform Invoke, or P/Invoke, you can do just that.

Platform Invoke at Work

According to MSDN® Help, P/Invoke is the functionality provided by the common language runtime to enable managed code to call unmanaged native dynamic-link library (DLL) entry points. In other words, P/Invoke gives you an escape route from managed Microsoft .NET code to unmanaged Win32 code. To be able to use this mechanism within Microsoft Windows® CE .NET, the native Win32 functions that you want to call must be exported from a DLL. Because the managed .NET environment does not know anything about C++ name mangling, the functions to be called from within a managed application should also use C naming conventions, that is, be declared extern "C". To be able to use functionality from within a DLL, you need to build a wrapper class around the function entry points from within your managed application. Listing 1 shows an example of a small, unmanaged DLL. Listing 2 shows how to call it from managed code. Because this mechanism works for all exported DLL functions and because almost all Win32 APIs are exported from coredll.dll, it also provides a way to call into almost any Win32 API. We used P/Invoke in our test to have a managed application call into an unmanaged real-time thread.

// This is the function GetTimingInfo that exists in the
// unmanaged Win32 DLL. The function is fed with information,
// originating in an Interrupt Service Thread in the same 
// DLL. On request of the managed application, timing 
// information is copied using a double buffering mechanism.
RTCF_API DWORD GetTimingInfo(LPDWORD lpdwAvgPerfTicks,
    LPDWORD lpdwMax,
    LPDWORD lpdwMin,
    LPDWORD lpdwDeltaMax,
    LPDWORD lpdwDeltaMin)
{
    g_bRequestData = TRUE;
    if (WaitForSingleObject(g_hNewDataEvent,
        1000)==WAIT_OBJECT_0)
    {
        *lpdwAvgPerfTicks = g_dwBufferedAvgPerfTicks;
        *lpdwMax = g_dwBufferedMax;
        *lpdwMin = g_dwBufferedMin;
        *lpdwDeltaMax = g_dwBufferedDeltaMax;
        *lpdwDeltaMin = g_dwBufferedDeltaMin;
        return 1;
    }
    else
        return 0;
}

// GetTimingInfo prototype
#ifdef RTCF_EXPORTS
#define RTCF_API __declspec(dllexport)
#else
#define RTCF_API __declspec(dllimport)
#endif

extern "C"
{
    RTCF_API BOOL Init();
    RTCF_API BOOL DeInit();
    RTCF_API DWORD GetTimingInfo(LPDWORD lpdwAvgPerfTicks,
        LPDWORD lpdwMax,
        LPDWORD lpdwMin,
        LPDWORD lpdwDeltaMax,
        LPDWORD lpdwDeltaMin);
}

Listing 1. Win32 DLL to be called from within managed code

// Wrapper class to be able to P/Invoke into a DLL.
// Exported functions in the DLL are imported by this
// wrapper. Note the use of compiler attributes to identify
// the physical DLL that hosts the exported functions.
using System;
using System.Runtime.InteropServices;

namespace CFinRT
{
    public class WCEThreadIntf
    {
        [DllImport("RTCF.dll")]
        public static extern bool Init();
        [DllImport("RTCF.dll")]
        public static extern bool DeInit();
        [DllImport("RTCF.dll")]
        public static extern uint GetTimingInfo(
            ref uint perfAvg,
            ref uint perfMax,
            ref uint perfMin,
            ref uint perfTickMax,
            ref uint perfTickMin);
    }
}

// Call an unmanaged function from within managed code 
public void CollectValue() 
{
    if (WCEThreadIntf.GetTimingInfo(ref aveSleepTime,
        ref maxSleepTime,
        ref minSleepTime,
        ref curMaxSleepTime,
        ref curMinSleepTime) != 0) 
    {
        curMaxSleepTime = (uint)(float)((curMaxSleepTime *
            scaleValue) / 1.19318);
        curMinSleepTime = (uint)(float)((curMinSleepTime *
            scaleValue) / 1.19318);
        aveSleepTime = (uint)(float)((aveSleepTime *
            scaleValue) / 1.19318);
        maxSleepTime = (uint)(float)((maxSleepTime *
            scaleValue) / 1.19318);
        minSleepTime = (uint)(float)((minSleepTime *
            scaleValue) / 1.19318);
    } 

    StoreValue();
    counter = (counter + 1) % samplesInMinute;
}

Listing 2. Calling into unmanaged code
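The divisor 1.19318 in Listing 2 is consistent with a performance counter that runs at 1.19318 MHz, the classic PC timer frequency. That is an inference from the listing, not something the article states. Under that assumption, and leaving out scaleValue (whose definition is not shown), the conversion from counter ticks to microseconds could be sketched as follows. The names ticksToMicroseconds and kAssumedCounterMHz are made up for this illustration.

```cpp
#include <cstdint>

// Assumed counter frequency in MHz (ticks per microsecond), taken from the
// divisor used in Listing 2. On real hardware, obtain the frequency from
// QueryPerformanceFrequency instead of hard-coding it.
constexpr double kAssumedCounterMHz = 1.19318;

// Convert a tick count from the high-resolution counter to microseconds.
inline double ticksToMicroseconds(std::uint32_t ticks)
{
    return static_cast<double>(ticks) / kAssumedCounterMHz;
}
```

At this frequency one tick lasts about 0.84 microseconds, so an ideal 100 microsecond period corresponds to roughly 119 ticks.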

A Real-Time Scenario

A system needs hard real-time functionality to retrieve information from an external source. The information is stored in the system and will be presented to the user in some graphical way. Figure 1 shows a possible scenario for this problem.

Figure 1. Real-time scenario using both managed and unmanaged code

A real-time thread living inside a native Win32 DLL receives an interrupt from an external source. The thread processes the interrupt and stores relevant information to be presented to the user. On the right side, a separate UI thread, written in managed code, reads information that was previously stored by the real-time thread. Given the fact that context switches between processes are expensive, you want the entire system to live within the same process. If you separate real-time functionality from user interface functionality by putting real-time functionality in a DLL and providing an interface between that DLL and the other parts of the system, you have achieved your goal of having one single process dealing with all parts of the system. Communication between the UI thread and the real-time (RT) thread is possible by means of using P/Invoke to get into the native Win32 code.

The Actual Test

You want to make the test representative, yet as simple as possible so it can also be repeated easily on other systems. For that purpose, the source code to run the experiment yourself is available for download. This test requires a way to feed interrupts into the system and a way to output probe signals so that the performance of the system can be measured. You feed the system with a block wave (a square wave) produced by a signal generator. Of course, the Windows CE .NET operating system should be capable of hosting the .NET Compact Framework. Paul Yao has written an article indicating which Windows CE .NET modules and components should be present to run managed applications. See Microsoft .NET Compact Framework for Windows CE .NET. The test does not depend on a particular interrupt source; any suitable source can be used as input. Listing 3 shows how to hook a physical interrupt to an Interrupt Service Thread.

RTCF_API BOOL Init()
{
    BOOL bRet = FALSE;
    DWORD dwIRQ = IRQ;    // in our case IRQ = 5

    // Get a SysIntr for the specified IRQ   
    if (KernelIoControl(IOCTL_HAL_TRANSLATE_IRQ,
        &dwIRQ,
        sizeof(DWORD),
        &g_dwSysIntr,
        sizeof(DWORD),
        NULL))
    {
        // create an event that will activate our IST
        g_hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);

        if (g_hEvent)
        {
            // Connect the interrupt to our event and
            // create our Interrupt Service Thread.
            // The IST itself is outlined in the pseudocode
            // that follows this listing
            InterruptDisable(g_dwSysIntr);

            if (InterruptInitialize(g_dwSysIntr,
                g_hEvent, NULL, 0))
            {
                g_bFinish = FALSE;
                g_hThread = CreateThread(NULL,
                    0,
                    IST,
                    NULL,
                    0,
                    NULL);
                if (g_hThread)
                {
                    bRet = TRUE;
                }
                else
                {
                    InterruptDisable(g_dwSysIntr);
                    CloseHandle(g_hEvent);
                    g_hEvent = NULL;
                }
            }
            else
            {
                // InterruptInitialize failed: release the event
                CloseHandle(g_hEvent);
                g_hEvent = NULL;
            }
        }
    }
    return bRet;
}

Listing 3. Connecting a physical interrupt to an interrupt service thread

To test the real-time behavior of an application making use of managed code and the .NET Compact Framework, we created a Windows CE .NET platform based on Standard SDK. We also included the RTM version of the .NET Compact Framework in the platform. The operating system runs on a Geode GX1 at 300 megahertz (MHz). We feed the system with a block wave, connected directly to the IRQ5 line on the PC/104 bus (pin 23). The frequency of the block wave is 10 kilohertz (kHz). An interrupt is generated on each rising edge. The interrupt is processed by an interrupt service thread (IST). In the IST we send probe pulses to the parallel port to be able to view an output signal. We also store the time at which the IST was activated, using the high-resolution QueryPerformanceCounter API. To be able to measure timing information over a long period of time, we store the maximum and minimum times in addition to the average time. The time from interrupt occurrence to probe output is an indication of IRQ-to-IST latency. The timing information acquired by the high-resolution timer indicates when the IST is activated. Ideally the interval between activations should be 100 µs for an interrupt rate of 10 kHz. All timing information is passed to the graphical user interface at regular intervals.
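The minimum, maximum and average bookkeeping described above can be pictured with a small accumulator. This is a portable sketch under assumptions, not the code from the downloadable DLL: the type PeriodStats and its members are hypothetical names, and the input is assumed to be the interval between two consecutive IST activations expressed in counter ticks.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Accumulates minimum, maximum and average of the interval between
// consecutive IST activations, measured in counter ticks.
struct PeriodStats
{
    std::uint32_t min = std::numeric_limits<std::uint32_t>::max();
    std::uint32_t max = 0;
    std::uint64_t sum = 0;
    std::uint32_t count = 0;

    // Record one measured interval.
    void add(std::uint32_t periodTicks)
    {
        min = std::min(min, periodTicks);
        max = std::max(max, periodTicks);
        sum += periodTicks;
        ++count;
    }

    // Average interval; 0 if nothing has been recorded yet.
    std::uint32_t average() const
    {
        return count ? static_cast<std::uint32_t>(sum / count) : 0;
    }

    // Spread between the slowest and fastest interval seen so far.
    std::uint32_t spread() const { return count ? max - min : 0; }

    // Start a new sample period.
    void reset() { *this = PeriodStats{}; }
};
```

Keeping only these running values means that each activation costs a few integer operations, and the statistics can be reset whenever a sample period ends.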

Because the .NET Compact Framework itself cannot be used in hard real-time situations like the ones described earlier, we decided to use it for presentation purposes only and to use a DLL, written in Microsoft eMbedded Visual C++® 4.0, for all real-time functionality. For communication between the DLL and the .NET Compact Framework graphical user interface (GUI), a double buffering mechanism is used in combination with P/Invoke. The GUI requests new timing information at regular intervals, making use of a System.Threading.Timer object. The DLL decides when it has time available to pass information to the GUI. Until data is ready, the GUI is blocked. The refresh rate of the information presented in the GUI is user selectable. For our test we used a refresh interval of 50 msec.

The following pseudo code explains the operation of the IST and the mechanism by which the GUI retrieves information, stored in the native Win32 DLL.

Interrupt Service Thread:
Wait
On IRQ 5 send probe pulse to the parallel port
Measure time with QueryPerformanceCounter
Store measured time (min, max, current, average) locally
if (userInterfaceRequestsData) {
    copy measured time information
    reset statistic measure values
    set dataReady event
    userInterfaceRequestsData = false
}  

Managed code, periodic update of display data:

disable timer     // See Pitfalls
call with P/Invoke into the DLL
    // The following steps are implemented in the DLL
    userInterfaceRequestsData = true
    wait for dataReady event
    return measured values
draw measured values on the display, each time using new graphics objects
update marker    // A running vertical bar on the display
enable timer
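The request and dataReady handshake in the pseudocode above can be sketched in portable C++. This is an illustration under assumptions rather than the DLL's implementation: the original uses a flag and a Win32 event, whereas this sketch swaps in std::mutex and std::condition_variable so that it compiles anywhere. The names TimingSnapshot and TimingExchange are hypothetical, and the "reset statistic measure values" step is left to the caller.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Statistics handed from the real-time side to the user interface side.
struct TimingSnapshot
{
    std::uint32_t avg = 0;
    std::uint32_t max = 0;
    std::uint32_t min = 0;
};

class TimingExchange
{
public:
    // UI side: flag that fresh data is wanted.
    void request()
    {
        std::lock_guard<std::mutex> lock(mutex_);
        requested_ = true;
    }

    // Real-time side: called on every activation. Copies the live values
    // into the buffer only if a request is pending; returns true if it did.
    bool publishIfRequested(const TimingSnapshot& live)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!requested_)
            return false;
        buffered_ = live;
        requested_ = false;
        ready_ = true;
        dataReady_.notify_one();
        return true;
    }

    // UI side: block until the buffer has been filled or the timeout
    // expires. Returns false on timeout and leaves 'out' untouched.
    bool waitForData(TimingSnapshot& out, std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!dataReady_.wait_for(lock, timeout, [this] { return ready_; }))
            return false;
        out = buffered_;
        ready_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable dataReady_;
    TimingSnapshot buffered_;
    bool requested_ = false;
    bool ready_ = false;
};
```

The point of the pattern is that the real-time side decides when to copy, so the UI only ever reads a complete, buffered set of values. A production IST would probably avoid taking a lock that a lower-priority thread can hold; the mutex here is a portability shortcut.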

During the test we hooked up an oscilloscope and made printouts of both the scope and the Windows CE .NET graphical display 10 minutes into the experiment. Figure 2 shows the interrupt latency measured with the oscilloscope. In the best case the latency is 14.0 µs and in the worst case it is 54.4 µs, meaning a jitter of 40.4 µs. Figure 3 shows the period at which the IST is activated. This figure is a screen shot of the actual user interface. Ideally the IST should run every 100 µs, which is also the average time during our measurement (the blue line in the middle). We also measured overall minimum (green) and maximum (red) times, in addition to minimum and maximum times over the sample period of 50 milliseconds (the white block). The deviation we found during the test period is limited to ±40 µs.

Figure 2. Managed application: IRQ-to-IST latency

Figure 3. Managed application: IST activation times after running 10 minutes

The Results

We measured over a longer period of time to make sure that both the garbage collector and the JIT compiler were frequently active. Thanks to the folks at Microsoft, we were able to monitor the behavior of the .NET Compact Framework because they provided us with a performance counters registry key. Using this key, a number of performance counters within the .NET Compact Framework are activated. We mainly used this performance information to verify that the JIT compiler and the garbage collector actually ran. It also gave a nice indication of the number of objects used during the course of the test.

// Our periodic timer method in which we want to collect new
// data and refresh the screen
private void OnTimer(object source) 
{
    // Temporarily stop the timer so that further OnTimer calls
    // cannot pile up while this one is still being handled
    if (theTimer != null) 
    {
        theTimer.Change(Timeout.Infinite, dp.Interval);
    }
    Pen blackPen = new Pen(Color.Black);
    Pen yellowPen = new Pen(Color.Yellow);
    Graphics gfx = CreateGraphics();

    td.SetTimePointer(dp.CurrentSample, gfx, blackPen);

    for (int i = 0; i < dp.SamplesPerMeasure; i++) 
    {
        td.ShowValue(dp.CurrentSample, dp[i], gfx, i);
    }

    dp.CollectValue();
    td.SetTimePointer(dp.CurrentSample, gfx, yellowPen);

    gfx.Dispose();
    yellowPen.Dispose();
    blackPen.Dispose();

    // Restart the timer again for the next update
    if (theTimer != null) 
    {
        theTimer.Change(dp.Interval, dp.Interval);
    }
}

Listing 4. Handling timer messages in a managed world

As you can see in listing 4, we instantiate a number of objects each time we update the screen: two pens and a graphics object. The functions td.ShowValue and td.SetTimePointer also create brushes. Because td.SetTimePointer is called twice per screen update, a total of six objects is created during each update of the screen. Because we update the screen every 50 msec (20 times per second), 120 objects are created each second. Over 10 minutes of execution, 72,000 objects are created. All of these objects are potentially subject to garbage collection. In table 1, the number of allocated objects roughly corresponds to these theoretical values.
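The arithmetic behind these figures can be written down as a small helper. It is only a worked example: the function name expectedAllocations and the constant kObjectsPerUpdate are hypothetical, and the value six is taken from the count given in the text rather than measured.

```cpp
#include <cstdint>

// Objects created per screen update, using the count given in the text.
constexpr std::uint32_t kObjectsPerUpdate = 6;

// Expected number of allocations for a given refresh interval and run time,
// both in milliseconds. Partial intervals are ignored.
constexpr std::uint64_t expectedAllocations(std::uint32_t refreshMs,
                                            std::uint64_t runTimeMs)
{
    return (runTimeMs / refreshMs) * kObjectsPerUpdate;
}
```

With a 50 msec refresh interval this gives 120 objects for one second and 72,000 for 600,000 msec. Assuming the Total Program Run Time counter in table 1 is reported in milliseconds, the 603,752 recorded there yields 72,450, against 66,898 objects actually allocated.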

Counter                                            Value       n    mean     min     max
Total Program Run Time                            603752       0       0       0       0
Peak Bytes Allocated                             1115238       0       0       0       0
Number Of Objects Allocated                        66898       0       0       0       0
Bytes Allocated                                  1418216   66898      21       8   24020
Number Of Simple Collections                           0       0       0       0       0
Bytes Collected By Simple Collection                   0       0       0       0       0
Bytes In Use After Simple Collection                   0       0       0       0       0
Time In Simple Collect                                 0       0       0       0       0
Number Of Compact Collections                          1       0       0       0       0
Bytes Collected By Compact Collections            652420       1  652420  652420  652420
Bytes In Use After Compact Collection             134020       1  134020  134020  134020
Time In Compact Collect                              357       1     357     357     357
Number Of Full Collections                             0       0       0       0       0
Bytes Collected By Full Collection                     0       0       0       0       0
Bytes In Use After Full Collection                     0       0       0       0       0
Time In Full Collection                                0       0       0       0       0
GC Number Of Application Induced Collections           0       0       0       0       0
GC Latency Time                                      357       1     357     357     357
Bytes Jitted                                       14046     259      54       1     929
Native Bytes Jitted                                70636     259     272      35    3758
Number of Methods Jitted                             259       0       0       0       0
Bytes Pitched                                          0       0       0       0       0
Number of Methods Pitched                              0       0       0       0       0
Number of Exceptions                                   0       0       0       0       0
Number of Calls                                  3058607       0       0       0       0
Number of Virtual Calls                             1409       0       0       0       0
Number Of Virtual Call Cache Hits                   1376       0       0       0       0
Number of PInvoke Calls                           176790       0       0       0       0
Total Bytes In Use After Collection               421462       1  421462  421462  421462

Table 1. .NET Compact Framework performance results after running the test for 10 minutes

We have included performance counter results for both a 10-minute and a 100-minute run. This data was recorded during the actual test. As you can see, after running for 10 minutes, garbage collection occurred without a noticeable drop in performance. Table 2 shows the performance counters for a run of approximately 100 minutes. Full garbage collection occurred in this run. During this run, only 461,499 objects were created instead of the 720,000 expected objects. This is approximately 35 percent fewer objects than expected. The difference is likely to be caused by the performance counters that, according to Microsoft, result in a performance penalty of about 30 percent within the managed application. However, real-time behavior of the system was not influenced, as shown in figure 4.
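The size of that gap is easy to check. The helper below is a worked example with a hypothetical name, shortfallPercent; it simply expresses the difference between the expected and the measured object count as a percentage of the expected count.

```cpp
// Percentage by which the measured count falls short of the expected count.
inline double shortfallPercent(double expected, double measured)
{
    return 100.0 * (expected - measured) / expected;
}
```

Against the nominal 720,000 objects, 461,499 is about 35.9 percent short. If the Total Program Run Time counter in table 2 (5,844,946) is taken to be milliseconds, the expected count drops to roughly 701,394 and the shortfall to about 34.2 percent, so "approximately 35 percent" is a fair summary either way.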

Counter                                            Value       n    mean     min     max
Execution Engine Startup Time                        478       0       0       0       0
Total Program Run Time                           5844946       0       0       0       0
Peak Bytes Allocated                             1279678       0       0       0       0
Number Of Objects Allocated                       461499       0       0       0       0
Bytes Allocated                                  8975584  461499      19       8   24020
Number Of Simple Collections                           0       0       0       0       0
Bytes Collected By Simple Collection                   0       0       0       0       0
Bytes In Use After Simple Collection                   0       0       0       0       0
Time In Simple Collect                                 0       0       0       0       0
Number Of Compact Collections                         11       0       0       0       0
Bytes Collected By Compact Collections           8514912      11  774082  656456  786476
Bytes In Use After Compact Collection            1679656      11  152696  147320  153256
Time In Compact Collect                             5395       0     490     436     542
Number Of Full Collections                             2       0       0       0       0
Bytes Collected By Full Collection                397428       2  198714    1916  395512
Bytes In Use After Full Collection                 79924       2   39962   17328   62596
Time In Full Collection                               65       2      32       2      63
GC Number Of Application Induced Collections           0       0       0       0       0
GC Latency Time                                     5460      13     420       2     542
Bytes Jitted                                       19143     356      53       1     929
Native Bytes Jitted                                95684     356     268      35    3758
Number of Methods Jitted                             356       0       0       0       0
Bytes Pitched                                      85304     326     261      35    3758
Number of Methods Pitched                            385       0       0       0       0
Number of Exceptions                                   0       0       0       0       0
Number of Calls                                 21778124       0       0       0       0
Number of Virtual Calls                             1067       0       0       0       0
Number Of Virtual Call Cache Hits                   1029       0       0       0       0
Number of PInvoke Calls                          1996991       0       0       0       0
Total Bytes In Use After Collection              5632119      13  433239   84637  493054

Table 2. .NET Compact Framework performance results after running the test for 100 minutes

Figure 4. Managed application: IST activation times after running 100 minutes

The remote process viewer provides more proof of the fact that the garbage collector and the JIT compiler did not influence real-time behavior. Figure 5 shows a screen dump of the remote process viewer for the managed application. All threads in the application (except the real-time thread with priority 0) run at normal priorities (251). During our measurements we did not find that the JIT compiler and garbage collector needed kernel blocking to perform their tasks.

Figure 5. Remote process viewer showing the managed application

Pitfalls

During the test, increasing the frequency of the block wave led to unexpected results in the managed application. Especially when the screen needed to be redrawn frequently because areas of it were invalid, the application randomly hung the system. Investigation showed behavior that experienced Win32 programmers might not expect. In a Win32 application, using a timer results in a WM_TIMER message each time the timer expires. However, WM_TIMER messages are low-priority messages in the message queue, posted only when there are no higher-priority messages to be processed. This behavior can lead to missed timer ticks, but because SetTimer does not give you an accurate timer to begin with, this is not an issue, especially if the timer is used to update a graphical user interface (GUI).

In the managed application, however, we use a System.Threading.Timer object to create a timer. This timer calls a delegate every time it expires. The delegate is called from a separate thread that lives in a thread pool. If the system is too busy with other activities, for example redrawing an entire screen, more timer delegates, each in a separate thread, are activated before previously activated delegates have finished. This can consume all available threads in the thread pool, causing the system to hang. The solution is found in listing 4. Each time a timer delegate is activated, we stop the timer by invoking the Change method of the Timer object, to indicate that we do not want the next timer callback until we have processed the current one. This might result in inaccurate timer intervals. In our case the timer is only used to refresh the screen, so inaccurate timing is not an issue.
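Listing 4 avoids overlapping callbacks by stopping the timer while a callback runs. A related technique, shown here purely as an illustration and not as what the article's code does, is to let the timer keep firing but drop any tick that arrives while the previous one is still being handled. The class name TickGuard is hypothetical.

```cpp
#include <atomic>

// Guards a periodic callback so that a tick arriving while the previous
// one is still being handled is dropped instead of piling up.
class TickGuard
{
public:
    // Returns true if the caller may run the handler; false if a handler
    // is already in progress and this tick should be skipped.
    bool tryEnter()
    {
        return !busy_.exchange(true, std::memory_order_acquire);
    }

    // Must be called when the handler finishes.
    void leave()
    {
        busy_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> busy_{false};
};
```

A callback would start with a tryEnter check and return immediately when it fails, then call leave on the way out. As with the approach in listing 4, ticks can be lost, which is acceptable when the timer only drives screen refreshes.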

Proof of Results

To be able to compare the results of our experiment with typical results in the same setting, we also wrote a Win32 application that invokes the same DLL with real-time functionality. The Win32 application is functionally identical to the managed application. It provides the system with a graphical user interface in which timing information is displayed in a window. This application paints timing results upon reception of WM_TIMER messages, using only Win32 APIs. We did not find any significant difference in performance, as figures 6 and 7 show. In figure 6 the interrupt latency is again measured with an oscilloscope. For the Win32 application, the best-case latency is 14.4 µs and the worst-case latency is 55.2 µs, meaning a jitter of 40.8 µs. These results are practically identical to those of the test run with the .NET Compact Framework managed application.

Figure 6. Win32 application: IRQ-to-IST latency

Figure 7 shows the period at which the IST is activated, again for the Win32 application. Again, the results match those of the .NET Compact Framework managed application. The sources for both the managed application and the Win32 application can be downloaded.

Figure 7. Win32 application: IST activation times after running 10 minutes

Conclusion

It is important to understand that we are not suggesting the .NET Compact Framework for any real-time work by itself. We suggest that it can be used as a presentation layer. In such a system, the .NET Compact Framework can "peacefully coexist" with real-time functionality without affecting the real-time behavior of Windows CE .NET. In this article we have not benchmarked the graphics capabilities of the .NET Compact Framework. In our situation we did not find any significant difference between an application written entirely in Win32 and an application partly written in a managed environment with C#. Given the higher programmer productivity and the richness of the .NET Compact Framework, there are many advantages to writing presentation layers in managed code and writing hard real-time functionality in unmanaged code. The clear distinction between these different types of functionality is something you get for free with this approach.

Acknowledgements

We had been thinking for quite a while about testing the usability of the .NET Compact Framework in real-time scenarios. This test was only possible by cooperating with people and companies that could provide us with the proper hardware and measuring equipment. Therefore we would like to thank Willem Haring of Getronics for his support, ideas and hospitality during this project. We would also like to thank the folks at Delem for their hospitality and for providing us with the necessary equipment to execute our tests.

About the Authors

Michel Verhagen works at PTS Software in the Netherlands. Michel is a Windows CE .NET consultant with four years of experience with Windows CE. His main expertise lies in the area of Platform Builder.

Maarten Struys also works at PTS Software, where he is responsible for the real-time and embedded competence centre. Maarten is an experienced Windows (CE) developer, having worked with Windows CE since its introduction. Since 2000, Maarten has been working with managed code in .NET environments. He is also a freelance journalist for the two leading magazines about embedded systems development in the Netherlands. He recently opened a website with information about .NET in the embedded world.

Additional Resources

For more information about Windows CE .NET, see the Windows Embedded Web site.

For online documentation and context-sensitive Help included with Windows CE .NET, see the Windows CE .NET product documentation.

For more information about Microsoft Visual Studio® .NET, see the Visual Studio Web site.

© 2014 Microsoft. All rights reserved.