March 2019

Volume 34 Number 3

[Blockchain]

Verify e-Documents with Smart Contracts in Azure Blockchain Development Kit

By Stefano Tempesta | March 2019

The introduction of smart contracts in blockchain networks has created a business logic tier that was missing in the early iterations of blockchain. Smart contracts offer the ability to apply conditional logic to transactions before they’re executed. Still, smart contracts can operate only on data that’s stored on the blockchain digital ledger. Business processes, however, rarely run in isolation. They often need data integration with external systems and devices.

For example, processes may include transactions initiated on a distributed ledger that employs data sourced from an external system, service or device. External systems may be required to react to events raised by smart contracts in response to validation logic. This article describes how to automate document signing and verification workflows in SharePoint, using the recently released Azure Blockchain Development Kit (aka.ms/bcdevkit) to persist the hashes of files and of their metadata on a blockchain digital ledger.

Azure Blockchain Development Kit

The release of the Azure Blockchain Development Kit, built on Microsoft’s serverless technology, represents a milestone in the adoption of blockchain technologies in the enterprise space. Thanks to the Blockchain Development Kit, you can now build solutions that seamlessly integrate blockchain with the best of Microsoft and third-party software applications. As mentioned in its release notes, the initial version of the kit prioritizes capabilities related to three key themes: connecting interfaces, integrating data and systems, and deploying smart contracts and blockchain networks.

Connection includes communication channels such as mobile and Web, SMS and voice, as well as IoT devices and even chat bots. Integration with line-of-business applications spans multiple systems, including SharePoint, OneDrive for Business, Dynamics 365, open source, and any API-enabled platforms, as well as legacy protocols like file systems, FTP servers, or SQL databases. The deployment of smart contracts and blockchain networks will help mainstream blockchain technology in enterprise software development, and introduce governance and DevOps to the blockchain software development practice.

The Blockchain Development Kit works in combination with Azure Logic Apps and Microsoft Flow, which provide a visual design environment for workflows that include more than 200 connectors to Microsoft and third-party systems and services. In concert, they dramatically simplify the development of end-to-end blockchain applications that access on- and off-chain data, handle events generated by the digital ledger, and leverage the Azure ecosystem for a seamless and integrated solution. Let’s explore a practical application in the context of enterprise content management.

Signing Digital Assets

With blockchain, you can imagine a world in which documents are embedded in digital code and stored in transparent, shared databases, where they’re protected from deletion, tampering and revision. In this world every agreement, every process, every task, and every payment would have a digital record and signature that could be identified, validated, stored, and shared. Intermediaries like lawyers, brokers and institutions might no longer be necessary. Individuals, organizations, and machines would freely transact and interact with one another with little friction. This is the immense potential of blockchain.

The potential application of content decentralization and distribution is enormous. With a single, immutable and verifiable record store, people will own their digital identity and records—think of identity or residence documents, medical records, educational or professional certificates and licenses. All these documents and their metadata can be issued on the blockchain and be digitally signed. No more fake certifications, no more degree mills, no more “photoshopped” papers.

Students, for example, may apply for further study, a job, or immigration to another country; and in the process may be required to prove their level of study or knowledge of language to attend university. Entities like recruiters, employers, governments and universities can verify the student’s credentials without relying on central authorities—in just minutes, and with no other intermediaries.

Figure 1 describes this scenario. Certificates are issued by an authority, such as an education institute (1), stored on a centralized document management server (2) or on a distributed file system like IPFS (ipfs.io), and signed with a cryptographic function. I’ll say more about IPFS later in the article. The content hash and the certificate’s metadata hash are then stored on the blockchain digital ledger (3) and attached to the user’s digital identity as the address of a smart contract that stores this information (4). This address acts as a sort of unique authenticity token, which identifies the document unambiguously.

Figure 1 The Signing Actors and Process

A common pattern is to generate a unique hash of the digital asset and a unique hash of the metadata that describes it. Those hashes are then stored on a blockchain. If the authenticity of a document is ever questioned, the off-chain file can be re-hashed at a later time and that hash compared to the on-chain value. If the hash values match, the document is authentic; if even a single character in the document has been modified, the hashes won’t match, making it obvious that a change has occurred.
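
The re-hash-and-compare check can be sketched in a few lines of C#. This is only a minimal illustration under stated assumptions, not code from the Blockchain Development Kit: the DocumentFingerprint class and its method names are hypothetical, and the on-chain value is assumed to be a SHA-256 digest written as a hexadecimal string.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class DocumentFingerprint
{
    // Returns the SHA-256 digest of the content as a lowercase hex string.
    public static string ComputeHex(byte[] content)
    {
        using (SHA256 sha256 = SHA256.Create())
        {
            byte[] digest = sha256.ComputeHash(content);
            return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
        }
    }

    // Re-hashes the off-chain copy and compares it with the value read from the ledger.
    public static bool Matches(byte[] offChainContent, string onChainHashHex)
    {
        return string.Equals(ComputeHex(offChainContent), onChainHashHex,
            StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Sample documents that differ by a single character.
        byte[] original = Encoding.UTF8.GetBytes("Certificate of completion: Jane Doe");
        byte[] tampered = Encoding.UTF8.GetBytes("Certificate of completion: Jane Roe");

        // Stands in for the value that would be read back from the ledger.
        string stored = ComputeHex(original);

        Console.WriteLine(Matches(original, stored)); // True
        Console.WriteLine(Matches(tampered, stored)); // False
    }
}
```

Comparing hexadecimal strings rather than raw bytes keeps the stored value readable and easy to pass between systems as plain text.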

Build the Signing Logic App Flow

Let’s look at a potential implementation of this workflow using Azure Logic Apps. The Logic App flow generates a hash of the document’s content and a hash of its metadata. The document itself stays on SharePoint, while the hashes are stored on an Ethereum network, using the Ethereum connector available as part of the Azure Blockchain Development Kit. The calculation of the hash value is done in an Azure Function built on the .NET runtime stack. The function is based on the HTTP trigger template, and it runs as soon as it receives an HTTP request.

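The code in Figure 2 implements the ComputeHashFunction Azure Function for computing a hash using the SHA256 algorithm. After reading the raw bytes of the request body in the Run method, the function computes the hash using the SHA256 class available in the System.Security.Cryptography namespace. The hash value is returned as a lowercase hexadecimal string, because the raw digest bytes aren’t valid UTF-8 text and decoding them directly into a string would lose information.

Figure 2 The ComputeHashFunction

using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

public static class ComputeHashFunction
{
  [FunctionName("ComputeHashFunction")]
  public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Function,
      "get", "post", Route = null)] HttpRequest req, ILogger log)
  {
    // Read the raw request body so that binary content is hashed byte for byte
    using (var buffer = new MemoryStream())
    {
      await req.Body.CopyToAsync(buffer);
      string hash = ComputeHash(buffer.ToArray());
      return new OkObjectResult(hash);
    }
  }
  private static string ComputeHash(byte[] data)
  {
    // Create a SHA256 hash
    using (SHA256 sha256 = SHA256.Create())
    {
      byte[] bytes = sha256.ComputeHash(data);
      // Convert the byte array to a lowercase hexadecimal string
      return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
  }
}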

The Logic App flow is triggered when a new document is uploaded to a SharePoint site. This event is handled by one of the “When a file is created …” triggers on the SharePoint connector (as depicted in Figure 3). To configure the trigger, after entering your authentication credentials for SharePoint, you have to specify the address of the SharePoint site to monitor for new files, and the specific folder where files are uploaded. You can also set how frequently the folder is polled for new files. A reasonable setting is to check once per minute.

Figure 3 The Logic App Action that Handles the File Creation Event in SharePoint

The next step in the flow is the hashing of the uploaded file’s content and metadata. Because I’ve implemented the hashing logic as an Azure Function, all you need to invoke it is the Choose an Azure function action from the Azure Functions connector. Once you select ComputeHashFunction from the list of available functions, you’ll be prompted to specify the request body that will be passed to the function. This is a JSON object that the function receives as input and whose hash value it returns as output. I’ve defined the following attributes as file metadata, as shown in Figure 4: contentType, etag, id, name and path.

Figure 4 Attributes in the Request Body for the Hash Function
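
Because a hash changes if even the order of the attributes changes, it can help to serialize the metadata in a fixed, canonical form before hashing it. The following C# sketch illustrates one way to do that. The MetadataHasher class is hypothetical, the sample values are made up, and the sketch assumes System.Text.Json is available (.NET Core 3.0 or later); the Logic App designer builds the actual request body for you.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical helper, for illustration only.
public static class MetadataHasher
{
    // Serializes the attributes with keys in ordinal order, so the same
    // metadata always produces the same JSON text and therefore the same hash.
    public static string ToCanonicalJson(IDictionary<string, string> attributes)
    {
        var sorted = new SortedDictionary<string, string>(attributes, StringComparer.Ordinal);
        return JsonSerializer.Serialize(sorted);
    }

    // Hashes the canonical JSON text and returns a lowercase hex digest.
    public static string ComputeHashHex(IDictionary<string, string> attributes)
    {
        byte[] payload = Encoding.UTF8.GetBytes(ToCanonicalJson(attributes));
        using (SHA256 sha256 = SHA256.Create())
        {
            return BitConverter.ToString(sha256.ComputeHash(payload))
                .Replace("-", "").ToLowerInvariant();
        }
    }

    public static void Main()
    {
        // Made-up sample values; real ones come from the SharePoint trigger output.
        var metadata = new Dictionary<string, string>
        {
            ["name"] = "diploma.pdf",
            ["path"] = "/Shared Documents/diploma.pdf",
            ["id"] = "42",
            ["etag"] = "sample-etag-1",
            ["contentType"] = "application/pdf"
        };

        Console.WriteLine(ToCanonicalJson(metadata));
        Console.WriteLine(ComputeHashHex(metadata));
    }
}
```

Whatever canonical form you choose, the verification flow has to rebuild the metadata in exactly the same way, otherwise the hashes won’t match even for an untouched document.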

The previous step hashes the file metadata. Next, I also need to hash the entire file content, so that a fingerprint of its exact state is preserved immutably on the blockchain network. As before, add another Choose an Azure function action from the Azure Functions connector, but this time, instead of the individual file attributes, pick File Content.

Once you’ve obtained the hash values for both file metadata and file content, it’s time to store them on the blockchain network. For this purpose, I’m using Azure Blockchain Workbench (aka.ms/abcworkbench) as the runtime environment for smart contracts running on Ethereum. Blockchain Workbench is expected to support multiple blockchain platforms, but for now I’ll stick to Ethereum.

Access to the digital ledger can be obtained by sending a message to the Azure Service Bus deployed as part of the Blockchain Workbench solution. An external system, such as a Logic App action, can communicate with a smart contract hosted in Blockchain Workbench this way. The message is picked up by the Blockchain Workbench runtime, and a new blockchain transaction containing the message is created. Communication with Ethereum can happen only by generating a transaction that invokes a smart contract, as depicted in Figure 5.

Figure 5 Sending a Message to a Smart Contract

To send a message from a Logic App flow to Service Bus you can use the Send message action on the Service Bus connector. A connection to an Azure Service Bus is identified by a connection name and a connection string. You can enter any convenient name as connection name, and you obtain the Service Bus connection string from the Azure Portal where it’s deployed. The message to send to the Service Bus also requires the following parameters:

  • requestId: A unique identifier for the request generated by the Logic App action
  • processedDateTime: Timestamp of the request being sent
  • userChainIdentifier: User address in the deployed Ethereum network
  • applicationName: Name of the smart contract being invoked on Ethereum
  • workflowName: Name of the workflow being invoked on Blockchain Workbench

I define these parameters as variables in the Logic App flow, by using the Initialize variable action from the Variables connector. The requestId variable can be set to guid(), an expression that generates a unique GUID. The processedDateTime variable can be set to utcNow(), which returns the current timestamp in Coordinated Universal Time (UTC). For userChainIdentifier, you can enter the address of a user in Blockchain Workbench that’s authorized to run the smart contract, whereas applicationName and workflowName are set to the name of the smart contract application and the name of its workflow that processes this transaction.

The next section describes the smart contract that processes the messages sent by the Logic App flow. Figure 6 summarizes the message body, in JSON format, to send to Service Bus. The expressions in <angle brackets> have to be replaced with the corresponding values.

Figure 6 Structure of the Message Sent to Azure Service Bus

{
  "requestId": "<The requestId variable>",
  "userChainIdentifier": "<User address in Azure Blockchain Workbench>",
  "applicationName": "<Smart contract name>",
  "workflowName": "<Smart contract workflow name>",
  "parameters": [
    {
      "name": "registryAddress",
      "value": "<Contract address in Azure Blockchain Workbench>"
    },
    {
      "name": "fileId",
      "value": "<File identifier>"
    },
    {
      "name": "location",
      "value": "<File path>"
    },
    {
      "name": "fileHash",
      "value": "<File content hash>"
    },
    {
      "name": "fileMetadataHash",
      "value": "<File metadata hash>"
    },
    {
      "name": "contentType",
      "value": "<File content type>"
    },
    {
      "name": "etag",
      "value": "<File entity tag>"
    },
    {
      "name": "processedDateTime",
      "value": "<The processedDateTime variable>"
    }
  ],
  "connectionId": 1,
  "messageSchemaVersion": "1.0.0",
  "messageName": "CreateContractRequest"
}
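
If you prefer to assemble the message in code rather than in the Logic App designer, its shape can be sketched in C# as follows. This is a minimal illustration that only builds the JSON text of Figure 6 and sends nothing to Service Bus. The ContractRequestBuilder class is hypothetical, every value passed in Main is a placeholder, and System.Text.Json (.NET Core 3.0 or later) is assumed to be available.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, for illustration only; not part of Blockchain Workbench.
public static class ContractRequestBuilder
{
    // Builds the JSON body of Figure 6 from the values gathered by the flow.
    public static string Build(
        string userChainIdentifier,
        string applicationName,
        string workflowName,
        params (string Name, string Value)[] contractParameters)
    {
        // Each contract parameter becomes a { "name": ..., "value": ... } pair.
        var parameters = new List<Dictionary<string, string>>();
        foreach ((string Name, string Value) p in contractParameters)
        {
            parameters.Add(new Dictionary<string, string>
            {
                ["name"] = p.Name,
                ["value"] = p.Value
            });
        }

        var message = new Dictionary<string, object>
        {
            // Plays the same role as the guid() expression in the flow.
            ["requestId"] = Guid.NewGuid().ToString(),
            ["userChainIdentifier"] = userChainIdentifier,
            ["applicationName"] = applicationName,
            ["workflowName"] = workflowName,
            ["parameters"] = parameters,
            ["connectionId"] = 1,
            ["messageSchemaVersion"] = "1.0.0",
            ["messageName"] = "CreateContractRequest"
        };

        return JsonSerializer.Serialize(
            message, new JsonSerializerOptions { WriteIndented = true });
    }

    public static void Main()
    {
        // All values below are placeholders.
        string body = Build(
            "0x0000000000000000000000000000000000000001",
            "FileContract",
            "FileContract",
            ("fileId", "42"),
            ("fileHash", "sample-content-hash"),
            ("fileMetadataHash", "sample-metadata-hash"),
            ("processedDateTime", DateTime.UtcNow.ToString("o")));

        Console.WriteLine(body);
    }
}
```

Generating requestId and processedDateTime at build time mirrors what the guid and utcNow expressions do in the flow, so each message carries its own identifier and timestamp.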

Smart Contract for Processing Digital Assets

First of all, let me reinforce the message that digital assets aren’t stored on the blockchain. Only the hash values of the file metadata and content are. In this article, I describe the storage of documents on SharePoint, which is a centralized service. In a “pure” blockchain deployment, you may want to decentralize the storage service as well. The InterPlanetary File System (IPFS), which I mentioned earlier, is a peer-to-peer hypermedia protocol that provides decentralized file storage. Integration with IPFS is beyond the scope of this article, but if you’re interested in how this technology can help decentralize storage that isn’t part of a block in a blockchain, you can refer to the “IPFS in Azure” video on Channel 9 (bit.ly/2CURRq0).

As I’m using Azure Blockchain Workbench for running my smart contract, I need two files:

  • FileContract.sol to describe the smart contract itself, in the Solidity programming language.
  • FileContract.json to configure the workflow that’s loaded in Azure Blockchain Workbench as an application.

The FileContract smart contract describes a file through its metadata, based on the values passed in the message sent by Logic App to Blockchain Workbench via Azure Service Bus. Here’s a snippet of the source code of the smart contract that defines these parameters:

contract FileContract
{
  // File metadata
  string public FileId; // File identifier
  string public Location; // File path
  string public FileHash; // File content hash
  string public FileMetadataHash; // File Metadata Hash
  string public ContentType; // File content type
  string public Etag; // File entity tag
  string public ProcessedDateTime; // Timestamp
  address public User; // User address

To store the file metadata on a blockchain, I need to define a file structure, as follows:

struct File {
  string FileId;
  address FileContractAddress;
}

A file entity is identified by its file ID and the address on blockchain of the FileContract smart contract that contains the metadata. This structure is saved in a private collection defined as a dictionary, whose key is the FileId string. The mapping keyword in Solidity defines a dictionary and its key and value types as follows:

mapping(string => File) private Registry;

To save a file entity (its ID and the address of its contract), I simply add the constituent values to the Registry dictionary in the Save method. For simplicity, I’ve omitted the checks that would normally be needed on the validity of the file ID and contract address, and on whether the file already exists in the registry. Here’s the code:

function Save(string fileId, address fileContractAddress) public
{
  Registry[fileId].FileId = fileId;
  Registry[fileId].FileContractAddress = fileContractAddress;
}
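
The checks omitted from Save can be illustrated with a small off-chain model written in C#. This is not Solidity and isn’t meant to be deployed; the FileRegistryModel class is hypothetical and only shows which conditions a stricter Save might enforce. In a Solidity contract, checks of this kind are typically expressed with require statements.

```csharp
using System;
using System.Collections.Generic;

// Off-chain C# model of the registry, for illustration only.
public sealed class FileRegistryModel
{
    // The all-zero address is treated as "no address" in this model.
    private const string ZeroAddress = "0x0000000000000000000000000000000000000000";

    // Maps a file ID to the address of the contract that holds its metadata.
    private readonly Dictionary<string, string> registry =
        new Dictionary<string, string>(StringComparer.Ordinal);

    public void Save(string fileId, string fileContractAddress)
    {
        if (string.IsNullOrEmpty(fileId))
            throw new ArgumentException("File ID is required.", nameof(fileId));
        if (string.IsNullOrEmpty(fileContractAddress) || fileContractAddress == ZeroAddress)
            throw new ArgumentException("Contract address is required.", nameof(fileContractAddress));
        if (registry.ContainsKey(fileId))
            throw new InvalidOperationException("File is already registered.");

        registry[fileId] = fileContractAddress;
    }

    // Returns false, rather than a default value, when the file ID is unknown.
    public bool TryGetFile(string fileId, out string fileContractAddress)
    {
        fileContractAddress = null;
        return fileId != null && registry.TryGetValue(fileId, out fileContractAddress);
    }

    public static void Main()
    {
        var model = new FileRegistryModel();
        model.Save("42", "0x00000000000000000000000000000000000000aa");

        try
        {
            // A second registration of the same file ID is rejected.
            model.Save("42", "0x00000000000000000000000000000000000000bb");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }

        Console.WriteLine(model.TryGetFile("42", out string address) ? address : "not found");
    }
}
```

Rejecting a second registration of the same file ID matters here because silently overwriting an entry would let a later upload replace the contract address that verifiers rely on.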

The Verification Process

Users who need to verify their certificates with a third party do so by sharing the authenticity token (that is, the file contract address), which contains all the information necessary to verify that the document exists and is authentic. Figure 7 describes the parties and actions involved in the verification process. The user retrieves the certificate to verify from its location (1) and initiates a new transaction on the blockchain network, transferring the authenticity token (2) to the verification authority. The authority obtains the content and metadata hashes of the certificate being verified (3), which are stored on the immutable digital ledger, and then compares them with the equivalent hash values computed from the off-chain copy. If the values match, the document is verified (4).

Figure 7 The Verification Actors and Process

Once documents and unstructured data are signed and verified, and the hashes of their content and metadata are stored on a blockchain, the result is an immutable and independently verifiable record of transactions. This process is referred to as proof of existence and proof of authenticity of digital assets.

Proof of existence refers to creating an unalterable date and time stamp for a specific object. This means that you can prove that a certain information object—like an e-mail, document or image—existed at a certain point in time.

Proof of authenticity asserts that an object is authentic, that is, it hasn’t been changed since it was stored at the indicated point in time. This is accomplished by digitally signing the object, which here means creating a hash of it to serve as its unique identifier. The identifier is then committed to the distributed blockchain ledger, and the transaction is time-stamped as well. Because every entry in the blockchain is immutable, you have proof that this specific object existed, in exactly this form, at a certain point in time.

Using the same approach, an object can be verified and validated. A flow similar to the one I described for the signing process computes the unique identifier of the document and checks it against the blockchain ledger. If there’s a match, the smart contract returns the original hash value and the document can be considered authentic. If not, the document being verified isn’t identical to the original copy and should not be trusted. Thus, you’re able to prove that the document, or any digital object, is authentic and existed at a certain moment in time.

The FileContract smart contract exposes a GetFile method that, given a file ID as input, returns its contract address on the blockchain. From the file contract address it’s possible to obtain the file content and metadata hash values and compare them with the hash values of the document being verified, like so:

function GetFile(string fileId) public constant
returns(address fileContractAddress)
{
  return Registry[fileId].FileContractAddress;
}

Wrapping Up

Why use blockchain to sign and verify digital assets, when solutions for electronic signature already exist and are broadly adopted in the industry? In short, blockchains remove the need for a central certificate authority or central time-stamping server and enable digital signatures stored on a blockchain to live independently of the object being signed. This opens up opportunities for parallel signing and independent verification, with or without the object itself.

Traditional e-signing solutions store digital signatures inside the document. This means that whoever needs to check whether a document is signed will have full read access to all the content in the document. Also, because the document changes with each signature, signing documents in parallel isn’t possible: everybody needs to sign the document sequentially. By signing documents on a blockchain, the object itself isn’t changed by the signature, and this enables you to sign documents in parallel and implement business rules based on mandates, the four-eyes principle, majority vote, seniority and the like.

Last but not least, you can register multiple actions in a sequence on a blockchain. Each registration is linked to a specific case, document and task performed by the parties involved, creating a chain of transactions: an auditable trail. This audit trail can be verified by authorized third parties, providing transparency, compliance and, most importantly, trust.

To learn more about the Azure Blockchain Development Kit, you can find a host of videos on Channel 9, under the “Block Talk” show (aka.ms/bcblocktalk). If you wish, you can also stay up-to-date with the latest announcements from the Azure Blockchain product group by following the @MSFTBlockchain Twitter handle (twitter.com/MSFTBlockchain).

The Azure Blockchain Development Kit project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant Microsoft the rights to use your contribution. When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the request appropriately (that is, add labels or comments to your code).


Stefano Tempesta is a Microsoft Regional Director, MVP on AI and Business Applications, and member of the Blockchain Council. A regular speaker at international IT conferences, including Microsoft Ignite and Tech Summit, Tempesta’s interests extend to blockchain and AI-related technologies. He created “Blogchain Space” (blogchain.space), a blog about blockchain technologies, writes for MSDN Magazine and MS Dynamics World, and publishes machine learning experiments on the Azure AI Gallery.

Thanks to the following technical expert for reviewing this article: Jonathan Waldman

