
HDInsight

Updated: October 28, 2013

Windows Azure HDInsight is a service that deploys and provisions Apache™ Hadoop™ clusters in the cloud, providing a software framework designed to manage, analyze, and report on big data. It makes the HDFS/MapReduce software framework and related projects such as Pig and Hive available in a simpler, more scalable, and more cost-efficient environment. The HDInsight SDK also provides the Microsoft Avro Library for data serialization.

The main conceptual documentation that outlines how to get started with the Windows Azure HDInsight Service is available at Windows Azure HDInsight Documentation.

The HDInsight Service uses Windows Azure PowerShell to configure, run, and post-process Hadoop jobs. The documentation for the Windows Azure PowerShell Management cmdlets used to manage HDInsight is available at Windows Azure Management Cmdlets.

The HDInsight Service has a .NET SDK that provides classes for creating, configuring, submitting, and monitoring Hadoop jobs managed by a Windows Azure HDInsight Service. It also provides classes used to manage Windows Azure subscriptions that use the HDInsight Service and to configure the clusters, storage accounts, MapReduce programs, and the Hive and Oozie components associated with the HDInsight clusters in a Windows Azure subscription.

The HDInsight .NET SDK also provides the Microsoft Avro Library, an implementation of the Avro data serialization system. Avro uses rich data structures whose schemas are defined in JSON, and an object container file to store persistent data. The Avro data format can be processed by many languages: C, C++, C#, Java, PHP, Python, and Ruby are currently supported.
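
To make the idea of a JSON-defined Avro schema concrete, the following is a minimal sketch in Python that uses only the standard library. It does not use the Microsoft Avro Library or any Avro implementation. The record name `SensorReading`, its namespace, its fields, and the helper `conforms` are hypothetical and exist only for illustration. The helper checks field names and a small subset of Avro primitive types; it does not perform Avro serialization.

```python
import json

# Hypothetical record schema written in Avro's JSON schema syntax.
# The record name, namespace, and fields are illustrative assumptions.
SENSOR_SCHEMA_JSON = """
{
  "type": "record",
  "name": "SensorReading",
  "namespace": "example.hdinsight",
  "fields": [
    {"name": "device_id", "type": "string"},
    {"name": "timestamp", "type": "long"},
    {"name": "temperature", "type": "double"}
  ]
}
"""

# Map a subset of Avro primitive type names to Python types.
PRIMITIVES = {
    "string": str,
    "int": int,
    "long": int,
    "float": float,
    "double": float,
    "boolean": bool,
}


def conforms(record, schema):
    """Return True if the record has exactly the schema's fields
    and each value matches the field's primitive type."""
    fields = schema["fields"]
    if set(record) != {field["name"] for field in fields}:
        return False
    for field in fields:
        value = record[field["name"]]
        expected = PRIMITIVES[field["type"]]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and expected is not bool:
            return False
        if not isinstance(value, expected):
            return False
    return True


schema = json.loads(SENSOR_SCHEMA_JSON)
reading = {"device_id": "dev-01", "timestamp": 1382918400000, "temperature": 21.5}
print(conforms(reading, schema))
```

A real Avro implementation would go further: it would encode the record in Avro's binary format and write it, together with the schema, into an object container file so that readers in other languages can decode it.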

The documentation for the .NET SDK is available at HDInsight SDK Reference Documentation.
