Performance of Hadoop on Windows in Hyper-V Environments

Microsoft IT White Paper

Writers: Sherman Wang, Liang Mo, Andy Miao

Published: April 2013

Applies to: HDInsight, Hadoop on Windows, Windows Server 2008 R2 with Hyper-V

Summary: Compelling use cases from industry leaders are quickly changing Hadoop from an emerging technology to an industry standard. However, Hadoop requires considerable resources, and in the search for computing power, users are increasingly asking whether it is possible to virtualize Hadoop (that is, to create clusters on a virtual machine farm) in order to build a private cloud infrastructure.

This paper presents the results of internal benchmarks by Microsoft IT, in which the performance of jobs running on a private cloud of virtual machines was compared with that of the same jobs running on physical servers dedicated to Hadoop. The goal was to determine whether Hadoop clusters hosted in Microsoft Hyper-V can be as efficient as physical clusters.

The results indicate that the performance impact of virtualization is small, and that Hadoop on Microsoft Hyper-V offers compelling performance as well as other benefits.

To review the document, please download the Performance of Hadoop on Windows in Hyper-V Environments Word document.