How to: Call Native Libraries from F#
Applies to: Functional Programming
Published: January 2010
Authors: Yin Zhu
Summary: This article discusses how to call existing native numerical libraries using Platform Invoke (P/Invoke) from F#. It explains the situations in which it is appropriate to use a native library and demonstrates how to use P/Invoke to call a native matrix multiplication procedure provided by the Netlib BLAS library.
This article is associated with Real World Functional Programming: With Examples in F# and C# by Tomas Petricek with Jon Skeet from Manning Publications (ISBN 9781933988924, copyright Manning Publications 2009, all rights reserved). No part of these chapters may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, electrostatic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.
Most numerical applications for .NET can be written as fully managed code or using existing libraries that wrap a highly optimized native implementation. There are two main situations where using a native library might be appropriate:
You already have, or need to call, a stable numerical library written in a native language such as FORTRAN or C/C++. Rewriting existing, tested code in a .NET language is often time-consuming and error-prone.
Certain numerical operations (such as matrix multiplication and vector dot product) are already available in native libraries. A highly optimized native library that uses special CPU instructions can be more efficient than managed code written in F# or C#.
One way to call a native library from .NET languages is to compile it as a native dynamic-link library (DLL) and use Platform Invoke (P/Invoke) to call it. P/Invoke relies on the data marshaling support provided by the .NET Framework and on the ability to declare the signature of a native function in the .NET language. F# makes the second task very easy by using C-like declarations.
This article shows how to use P/Invoke in F# to call a native library. The example uses the Basic Linear Algebra Subprograms (BLAS) library, which provides a set of functions for vector and matrix operations. In practice, a wrapper for this library is already implemented by the F# MathProvider, but writing one by hand demonstrates the technique nicely. The steps for calling other native libraries are similar. The BLAS implementation used in this article is taken from Netlib.
This article focuses on the F# side of the problem, so it doesn’t explain the compilation of the Netlib BLAS library using a FORTRAN compiler. More information about this topic (as well as about calling native C/C++ code) can be found in the resources referenced at the end of this article.
The result of compiling the Netlib implementation of BLAS is a dynamic-link library named blas.dll. It contains a set of basic numerical procedures for dense vectors and matrices, such as vector dot product and matrix multiplication. The following snippet shows the C signature of the general matrix multiplication subroutine:
extern void dgemm_ (
    char *transa, char *transb,
    int *m, int *n, int *k,
    double *alpha, double *a, int *lda,
    double *b, int *ldb,
    double *beta, double *c, int *ldc );
The function takes three arrays representing matrices and several additional parameters. It multiplies two of the matrices, scales the product by alpha, adds the third matrix scaled by beta, and stores the result in the memory allocated for the third matrix. Mathematically, it performs the following calculation: $C' = \alpha AB + \beta C$
The first two parameters specify whether the input matrices A and B should be transposed. The integer parameters m, n, and k give the matrix dimensions, and lda, ldb, and ldc give the leading dimensions of the three arrays. The parameters double *a, double *b, and double *c are pointers to the (two-dimensional) arrays storing the matrices A, B, and C. Detailed descriptions of the individual parameters can be found in the Wikipedia article General Matrix Multiply.
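To make the calculation concrete, the following is a minimal managed sketch of the same formula, ignoring the transposition flags. The name gemmReference is hypothetical and not part of BLAS; a helper like this is mainly useful for checking the result of a native call on small inputs.

```fsharp
// Hypothetical helper (not part of BLAS): computes alpha*A*B + beta*C
// for A (m x k), B (k x n) and C (m x n) using plain managed loops.
let gemmReference (alpha: float) (a: float[,]) (b: float[,]) (beta: float) (c: float[,]) : float[,] =
    let m = Array2D.length1 a
    let k = Array2D.length2 a
    let n = Array2D.length2 b
    if Array2D.length1 b <> k then invalidArg "b" "inner dimensions must agree"
    if Array2D.length1 c <> m || Array2D.length2 c <> n then invalidArg "c" "C must be m x n"
    Array2D.init m n (fun i j ->
        // Dot product of row i of A with column j of B
        let mutable sum = 0.0
        for p in 0 .. k - 1 do
            sum <- sum + a.[i, p] * b.[p, j]
        alpha * sum + beta * c.[i, j])

// Small worked example: 2 * (A * B) + 1 * C
let sampleA = array2D [ [ 1.0; 2.0 ]; [ 3.0; 4.0 ] ]
let sampleB = array2D [ [ 5.0; 6.0 ]; [ 7.0; 8.0 ] ]
let sampleC = Array2D.create 2 2 1.0
let sampleResult = gemmReference 2.0 sampleA sampleB 1.0 sampleC
// A * B is [[19; 22]; [43; 50]], so sampleResult is [[39; 45]; [87; 101]]
```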
Declaring Native Signatures
When declaring the prototype of a native routine, F# uses the native C notation. This makes it possible to paste declarations from C header files into F# source code with little or no change. The following snippet declares the prototype of the dgemm_ routine. A declaration like this should be placed inside an F# module (declared using the module keyword).
open System.Runtime.InteropServices

[<DllImport("blas.dll", EntryPoint="dgemm_")>]
extern void dgemm_ (
    char *transa, char *transb,
    int *m, int *n, int *k,
    double *alpha, double *a, int *lda,
    double *b, int *ldb,
    double *beta, double *c, int *ldc );
The code declares the signature and specifies where to find the native implementation of the routine. The DllImport attribute is used to specify a P/Invoke stub declaration. The first argument gives the name of the DLL and the EntryPoint property gives the name of the routine in the native code. When the name of the native function is the same as the name of the managed method, then the property can be omitted, but the sample includes it to demonstrate the option. Note that the pointer definition uses double* with the C meaning. This means that double is equivalent to System.Double, which is the same as float in F#. It is possible to use any of these names.
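The same pattern extends to other routines. As a sketch only, the following declares the BLAS dot-product routine inside a module and omits the EntryPoint property because the managed and native names match. The assumption that blas.dll exports the routine under the name ddot_ depends on how the library was compiled and is not confirmed by this article. The last line checks the type equivalence described above.

```fsharp
open System.Runtime.InteropServices

// Extern declarations are placed in a module. The export name ddot_ is an
// assumption about the compiled library; the DLL is only loaded on first call.
module NativeBlas =
    [<DllImport("blas.dll")>]
    extern double ddot_ ( int *n, double *x, int *incx, double *y, int *incy )

// double, float and System.Double all name the same 64-bit type in F#.
let sameType =
    typeof<double> = typeof<float> && typeof<float> = typeof<System.Double>
```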
Using Pinned Pointers
The most critical issue when using Platform Invoke is memory management. The lifetime of objects allocated by managed code is automatically controlled by the .NET Framework, and the garbage collector may relocate them. When passing a pointer to a managed object to native code, it is essential to avoid a relocation of the object. The mechanism that makes this possible is called pinning.
When an object is pinned, the garbage collector will not relocate or destroy the object. After the native code has finished executing, the managed code can unpin the object, and the garbage collector can again freely manipulate it. In the F# PowerPack, the pinning functionality is encapsulated in the PinnedArray and PinnedArray2 types, representing pinned single-dimensional and two-dimensional arrays, respectively.
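The pin, use, unpin sequence can be sketched without the PowerPack types by using the standard GCHandle type from System.Runtime.InteropServices instead. This is an illustration of the mechanism, not the approach taken by the article's sample; the helper name withPinned is hypothetical, and a managed copy stands in for the native call.

```fsharp
open System
open System.Runtime.InteropServices

// Pins a managed array, lets the callback use its fixed address, and
// always unpins it afterwards. withPinned is a hypothetical helper name.
let withPinned (data: float[]) (action: nativeint -> 'T) : 'T =
    let handle = GCHandle.Alloc(data, GCHandleType.Pinned)
    try
        action (handle.AddrOfPinnedObject())
    finally
        handle.Free()

// Stand-in for a native call: copy the values back out through the pointer.
let source = [| 1.0; 2.0; 3.0 |]
let copy =
    withPinned source (fun address ->
        let target = Array.zeroCreate<float> source.Length
        Marshal.Copy(address, target, 0, target.Length)
        target)
```

While the callback runs, the address stays valid because the array cannot be moved; once Free has been called, the address must no longer be used.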
The following snippet shows how to write a wrapper for dgemm_ that takes two matrices in the form of two-dimensional arrays as arguments and multiplies them:
#nowarn "51"
#r "FSharp.PowerPack.dll"
open Microsoft.FSharp.NativeInterop

let matmul_blas (a:float[,]) (b:float[,]) =
    // Get dimensions of the input matrices
    let m = Array2D.length1 a
    let k = Array2D.length2 a
    let n = Array2D.length2 b

    // Allocate array for the result
    let c = Array2D.create n m 0.0

    // Declare arguments for the call
    let mutable arg_transa = 't'
    let mutable arg_transb = 't'
    let mutable arg_m = m
    let mutable arg_n = n
    let mutable arg_k = k
    let mutable arg_alpha = 1.0
    let mutable arg_ldk = k
    let mutable arg_ldn = n
    let mutable arg_beta = 1.0
    let mutable arg_ldm = m

    // Temporarily pin the arrays
    use arg_a = PinnedArray2.of_array2D(a)
    use arg_b = PinnedArray2.of_array2D(b)
    use arg_c = PinnedArray2.of_array2D(c)

    // Invoke the native routine
    dgemm_( &&arg_transa, &&arg_transb, &&arg_m, &&arg_n, &&arg_k, &&arg_alpha, arg_a.Ptr, &&arg_ldk, arg_b.Ptr, &&arg_ldn, &&arg_beta, arg_c.Ptr, &&arg_ldm )

    // Transpose the result to get m*n matrix
    Array2D.init m n (fun i j -> c.[j,i])
The dgemm_ routine takes all the arguments as pointers, so the F# code first needs to define mutable variables for each of the arguments. It gets the dimensions of the matrices, allocates a new array for the result, and then declares the mutable variables that store the dimensions and other arguments.
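The snippet passes 't' for both transposition flags and transposes the result at the end. The article does not spell out why, but a likely reading is the difference in memory layout: .NET stores a two-dimensional array row by row, whereas the FORTRAN BLAS routines read matrices column by column. The following managed sketch (with hypothetical helper names) shows that a row-major buffer, read in column-major order, looks like the transposed matrix.

```fsharp
// Flatten a 2D array in the order .NET stores it (row by row).
let flattenRowMajor (a: float[,]) : float[] =
    [| for i in 0 .. Array2D.length1 a - 1 do
         for j in 0 .. Array2D.length2 a - 1 do
           yield a.[i, j] |]

// Read a flat buffer the way a column-major routine would,
// as a matrix with the given number of rows and columns.
let readColumnMajor (rows: int) (cols: int) (buffer: float[]) : float[,] =
    Array2D.init rows cols (fun i j -> buffer.[i + j * rows])

let rowMajorInput = array2D [ [ 1.0; 2.0; 3.0 ]; [ 4.0; 5.0; 6.0 ] ]      // 2 x 3
let seenByFortran = readColumnMajor 3 2 (flattenRowMajor rowMajorInput)   // 3 x 2
// seenByFortran.[j, i] equals rowMajorInput.[i, j], i.e. the transpose
```

Under this reading, the m x k array a appears to the native routine as a k x m matrix with leading dimension k, which matches the arg_ldk value passed in the wrapper.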
To prevent the relocation of heap-allocated arrays, it is necessary to convert them to pinned arrays. This is done using the PinnedArray2 type from the F# Power Pack. The method of_array2D turns an ordinary array into a PinnedArray2 object, which can be safely passed to the native code.
As already mentioned, the arguments of dgemm_ are pointers. The address of a mutable variable can be obtained using the && operator. The F# compiler emits a warning because using pointers may yield unverifiable code. In the above snippet, the warning is suppressed using the #nowarn "51" directive.
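The following minimal sketch shows the && operator in isolation. It takes the address of a local mutable variable and modifies the variable through that pointer using NativePtr.read and NativePtr.write from FSharp.Core, which is roughly what a native routine does with the pointers it receives. The function name is hypothetical, and warning 9 is suppressed as well because the NativePtr functions also produce unverifiable code.

```fsharp
#nowarn "9"
#nowarn "51"
open Microsoft.FSharp.NativeInterop

// Hypothetical example: increments a local variable through its address.
let incrementThroughPointer (value: int) : int =
    let mutable cell = value
    // && yields a nativeptr<int> pointing at the local variable
    let p : nativeptr<int> = &&cell
    NativePtr.write p (NativePtr.read p + 1)
    // The write through the pointer is visible in the variable itself
    cell

let incremented = incrementThroughPointer 41
```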
After calling the native routine, it is necessary to unpin the arrays so that the garbage collector can fully control them again. The PinnedArray2 object implements the System.IDisposable interface; therefore, the above snippet simply binds each pinned array to a symbol using the use keyword. This guarantees that the pinned array will be disposed of when it leaves the scope. Alternatively, it is possible to invoke the Free method explicitly, but this should be done in a try … finally block to make sure that the object is correctly freed if an exception occurs.
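How the use keyword releases a pinned object can be sketched with a small stand-in type. PinnedBuffer below is hypothetical and built on the standard GCHandle type; it is not the PowerPack's PinnedArray2, but it follows the same idea of pinning on construction and unpinning in Dispose.

```fsharp
open System
open System.Runtime.InteropServices

// Hypothetical stand-in for a pinned-array type: pins in the constructor
// and unpins in Dispose, so that `use` releases it at the end of the scope.
type PinnedBuffer(data: float[]) =
    let mutable handle = GCHandle.Alloc(data, GCHandleType.Pinned)
    member this.Address = handle.AddrOfPinnedObject()
    member this.IsPinned = handle.IsAllocated
    interface IDisposable with
        member this.Dispose() =
            if handle.IsAllocated then handle.Free()

// Returns the buffer (for inspection only) together with its state
// as observed inside the scope of the `use` binding.
let pinBriefly (data: float[]) : PinnedBuffer * bool =
    use buffer = new PinnedBuffer(data)
    let pinnedInside = buffer.IsPinned
    buffer, pinnedInside

let released, wasPinned = pinBriefly [| 1.0; 2.0 |]
// wasPinned is true; released.IsPinned is false once pinBriefly has returned
```

Returning the disposed object is done here only to observe its state; in real code the pinned object should not escape the scope of its use binding.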
This article showed how to invoke native math libraries from F# using the Platform Invoke mechanism. This technique makes it possible to write efficient applications that combine safe managed code and highly-optimized native code written in languages like C or FORTRAN. This scenario may be interesting when building new applications or extensions in F# on top of the existing and tested native libraries.
Calling a native library from F# involves several steps. The first step is to declare the signature of the native function, which is done using standard C syntax. When passing managed arrays to the native code, it is necessary to pin them to prevent the garbage collector from relocating them on the heap. After that, the function can be invoked as usual.
In practice, many numerical libraries already provide wrappers for common native numerical libraries such as BLAS. More information about such libraries for F# can be found in the following articles:
To download the code snippets shown in this article, go to http://code.msdn.microsoft.com/Chapter-4-Numerical-3df3edee
This article is based on Real World Functional Programming: With Examples in F# and C#. Book chapters related to the content of this article are:
Book Chapter 10: “Efficiency of data structures” explains how to write efficient programs in F#. The chapter covers advanced functional techniques and explains how and when to use arrays and functional lists.
Book Chapter 12: “Sequence expressions and alternative workflows” contains detailed information on processing in-memory data (such as lists and seq<'T> values) using higher-order functions and F# sequence expressions.
Book Chapter 13: “Asynchronous and data-driven programming” explains how asynchronous workflows work and uses them to write an interactive script that downloads a large dataset from the Internet.
The following MSDN and blog articles provide more information about calling native code using Platform Invoke from F#:
Compiling Lapack for . (Yin Zhu's blog) explains how to compile the Netlib BLAS library using the MinGW FORTRAN compiler.
P/Invoke (Yin Zhu's blog) gives an example of how to write a native C/C++ procedure, compile it to a DLL using Microsoft Visual C++, and then call it using P/Invoke from F#.
Marshaling Data with Platform Invoke provides more information about how data is marshaled on the boundary between the managed and the native code.