
GZipStream Class

Updated: August 2009

Provides methods and properties used to compress and decompress streams.

Namespace:  System.IO.Compression
Assembly:  System (in System.dll)
public class GZipStream : Stream

This class represents the gzip data format, which uses an industry standard algorithm for lossless file compression and decompression. The format includes a cyclic redundancy check value for detecting data corruption. The gzip data format uses the same algorithm as the DeflateStream class, but can be extended to use other compression formats. The format can be readily implemented in a manner not covered by patents. This class cannot be used to compress files larger than 4 GB.

Compressed GZipStream objects written to a file with an extension of .gz can be decompressed using many common compression tools; however, this class does not inherently provide functionality for adding files to or extracting files from .zip archives.

The compression functionality in DeflateStream and GZipStream is exposed as a stream. Data is read in on a byte-by-byte basis, so it is not possible to perform multiple passes to determine the best method for compressing entire files or large blocks of data. The DeflateStream and GZipStream classes are best used on uncompressed sources of data. If the source data is already compressed, using these classes may actually increase the size of the stream.
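The size effect described above can be checked with a small experiment. This is a minimal sketch, not part of the original topic: the class name GzipSizeDemo and its helpers are hypothetical, and seeded pseudo-random bytes are assumed to be a reasonable stand-in for already-compressed data.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipSizeDemo
{
    // Compresses a byte array in memory and returns the compressed size in bytes.
    public static long CompressedSize(byte[] data)
    {
        using (var target = new MemoryStream())
        {
            // Passing true for leaveOpen keeps the MemoryStream usable
            // after the GZipStream has been disposed.
            using (var gzip = new GZipStream(target, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            }
            // Measure only after disposal, when all buffered data has been flushed.
            return target.Length;
        }
    }

    // Returns pseudo-random bytes (assumption: these behave like data
    // that has already been compressed and has no redundancy left).
    public static byte[] RandomBytes(int count, int seed)
    {
        var bytes = new byte[count];
        new Random(seed).NextBytes(bytes);
        return bytes;
    }
}
```

Calling GzipSizeDemo.CompressedSize(new byte[65536]) should return a value far below 65536, while passing GzipSizeDemo.RandomBytes(65536, 42) typically returns a value slightly above the input size, because the gzip header, trailer, and block overhead are added to data that cannot be reduced.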

Notes to Inheritors:

When you inherit from GZipStream, you must override the following members: CanSeek, CanWrite, and CanRead.

The following example shows how to use the GZipStream class to compress and decompress a directory of files.

using System;
using System.IO;
using System.IO.Compression;

namespace zip
{

    public class Program
    {

        public static void Main()
        {
            // Path to directory of files to compress and decompress.
            string dirpath = @"c:\users\public\reports";

            DirectoryInfo di = new DirectoryInfo(dirpath);

            // Compress the directory's files.
            foreach (FileInfo fi in di.GetFiles())
            {
                Compress(fi);

            }

            // Decompress all *.gz files in the directory.
            foreach (FileInfo fi in di.GetFiles("*.gz"))
            {
                Decompress(fi);

            }


        }

        public static void Compress(FileInfo fi)
        {
            // Skip hidden files and files that are already compressed.
            if ((fi.Attributes & FileAttributes.Hidden) == FileAttributes.Hidden
                    || fi.Extension == ".gz")
            {
                return;
            }

            string gzPath = fi.FullName + ".gz";

            // Open the source file, create the compressed file, and wrap the
            // output in a compression stream.
            using (FileStream inFile = fi.OpenRead())
            using (FileStream outFile = File.Create(gzPath))
            using (GZipStream gzip = new GZipStream(outFile,
                    CompressionMode.Compress))
            {
                // Copy the source file into the compression stream.
                byte[] buffer = new byte[4096];
                int numRead;
                while ((numRead = inFile.Read(buffer, 0, buffer.Length)) != 0)
                {
                    gzip.Write(buffer, 0, numRead);
                }
            }

            // Report the sizes only after the GZipStream has been disposed, so
            // that all buffered data and the gzip trailer have been written.
            Console.WriteLine("Compressed {0} from {1} to {2} bytes.",
                fi.Name, fi.Length, new FileInfo(gzPath).Length);
        }

        public static void Decompress(FileInfo fi)
        {
            // Get the stream of the source file.
            using (FileStream inFile = fi.OpenRead())
            {
                // Get the original file name, for example report.doc from report.doc.gz.
                string curFile = fi.FullName;
                string origName = curFile.Remove(curFile.Length - fi.Extension.Length);

                //Create the decompressed file.
                using (FileStream outFile = File.Create(origName))
                {
                    using (GZipStream Decompress = new GZipStream(inFile,
                            CompressionMode.Decompress))
                    {
                        //Copy the decompression stream into the output file.
                        byte[] buffer = new byte[4096];
                        int numRead;
                        while ((numRead = Decompress.Read(buffer, 0, buffer.Length)) != 0)
                        {
                            outFile.Write(buffer, 0, numRead);
                        }
                        Console.WriteLine("Decompressed: {0}", fi.Name);

                    }
                }
            }
        }

    }
}


System.Object
  System.MarshalByRefObject
    System.IO.Stream
      System.IO.Compression.GZipStream
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.

Windows 7, Windows Vista, Windows XP SP2, Windows XP Media Center Edition, Windows XP Professional x64 Edition, Windows XP Starter Edition, Windows Server 2008 R2, Windows Server 2008, Windows Server 2003, Windows Server 2000 SP4, Windows Millennium Edition, Windows 98, Windows CE, Windows Mobile for Smartphone, Windows Mobile for Pocket PC, Xbox 360, Zune

The .NET Framework and .NET Compact Framework do not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.

.NET Framework

Supported in: 3.5, 3.0, 2.0

.NET Compact Framework

Supported in: 3.5

XNA Framework

Supported in: 3.0

Date          History                  Reason
August 2009   Improved code example.   Information enhancement.

Community Content
Get left with .tar file
Is it just me, or does this just take your .tar.gz file and leave you with a .tar file? Not very helpful.

Please tell me if I'm missing something.
Native Simplicity IS Better!
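If you just want to gzip a file and don't need all the bells, whistles, and complexities of third-party code, this works fine in .NET 2.0. Note that the output is gzip data, not a ZIP archive, so it is saved with a .gz extension:

Imports System.IO
Imports System.IO.Compression

    Public Sub Compress(ByVal FilePath As String)
        ' Reads the whole file into memory, so this is only suitable for small files.
        Dim UncompressedData As Byte() = File.ReadAllBytes(FilePath)
        Dim CompressedData As New MemoryStream()
        ' Leave the MemoryStream open so it can be read after the GZipStream is disposed.
        Dim GZipper As New GZipStream(CompressedData, CompressionMode.Compress, True)
        GZipper.Write(UncompressedData, 0, UncompressedData.Length)
        ' Disposing flushes the remaining compressed data and the gzip trailer.
        GZipper.Dispose()
        ' Write the result next to the source file, for example report.doc.gz.
        File.WriteAllBytes(Path.Combine(Path.GetDirectoryName(FilePath), Path.GetFileName(FilePath) & ".gz"), CompressedData.ToArray())
        CompressedData.Dispose()
    End Sub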
AND IT IS VB.NET NATIVE!!! NO C# to go bald over trying to read and understand.
Big files will throw a System.OutOfMemoryException
Here was my solution:

FileInfo destFile = new FileInfo(ArchiveLocation + nsyskey.ToString() + ".gz");

// Read the source in fixed-size chunks so that a large file never has to fit in memory.
const int MAX_BUFFER_SIZE = 100000;
byte[] buffer = new byte[MAX_BUFFER_SIZE];

// The using blocks close all three streams, even if an exception is thrown.
using (FileStream sourceStream = File.OpenRead(file.FullName))
using (FileStream destStream = File.Create(destFile.FullName))
using (GZipStream compressedzipStream =
    new GZipStream(destStream, CompressionMode.Compress))
{
    int count;
    // Read returns 0 only at the end of the file. A read that returns fewer
    // bytes than requested is not an error, so write exactly what was read.
    while ((count = sourceStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        compressedzipStream.Write(buffer, 0, count);
    }
}

Decompression was very similar, except that you convert the last 4 bytes of the .gz file (the uncompressed length stored in the gzip trailer) into your "last" value, then read from the GZipStream and write to the FileStream.

Reading a Gzip (.gz) file

If you just want to read a gzip-compressed text file, you can use stream composition:


using (var fs = new System.IO.FileStream(path, FileMode.Open, FileAccess.Read))
using (var gzs = new GZipStream(fs, CompressionMode.Decompress))
using (var reader = new System.IO.StreamReader(gzs))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        System.Console.WriteLine(line);
    }
}
  
How to Zip a directory, using DotNetZip

GZipStream doesn't handle ZIP files, but DotNetZip does. You can use DotNetZip (http://dotnetzip.codeplex.com), a free third-party library, to create and read zip files from within any .NET application.

This C# code zips all the files in a specified directory:

using (ZipFile zip = new ZipFile())
{
    zip.AddDirectory(@"MyDocuments\ProjectX", "ProjectX");
    zip.Comment = "This zip was created at " + System.DateTime.Now.ToString("G");
    zip.Save(zipFileToCreate);
}



This C# code unzips a zip file:

string unpackDirectory = "ExtractedFiles";
using (ZipFile zip1 = ZipFile.Read(zipToUnpack))
{
    // Here, we extract every entry, but we could extract conditionally
    // based on entry name, size, date, checkbox status, etc.
    foreach (ZipEntry e in zip1)
    {
        e.Extract(unpackDirectory, ExtractExistingFileAction.OverwriteSilently);
    }
}


DotNetZip is free.

ZipStorer: A Pure C# Class to Store Files in Zip

Note that DeflateStream cannot read or write a .zip file directly, and neither can GZipStream.
The ZipStorer library provides Zip file support for the .NET Framework and .NET Compact Framework in a single, simple class: http://zipstorer.codeplex.com
It also supports non-compressed storage for Silverlight.

Using GZipStream to read compressed XML
The example from the help really confuses things. I had a simple requirement to use XElement.Load to load an XML document from a compressed file rather than from a regular file. Against a regular XML file this is easy:

Dim doc as XElement
doc = XElement.Load("testdata.xml")

In the end I found that doing this with a GZipStream was easy, but not until after I had wasted a load of time using the ReadAllBytes function to read the output from the GZipStream into a MemoryStream and then feeding that MemoryStream as a StreamReader to XElement.Load. This was bugging me: why not put the uncompressed output from the GZipStream straight into the StreamReader for XElement.Load? It works, and this is the code I used:

Dim infile As FileStream = Nothing              ' The testdata.xml.gz file
Dim Decompressed As GZipStream = Nothing        ' The output byte stream of uncompressed data
Dim charsDecompressed As StreamReader = Nothing ' The character input required by XElement.Load()
' TextReader itself is an abstract base class, so a concrete StreamReader is used.

Try
    infile = New FileStream("testdata.xml.gz", FileMode.Open)
    Decompressed = New GZipStream(infile, CompressionMode.Decompress)
    charsDecompressed = New StreamReader(Decompressed)
    ' doc is the XElement declared in the snippet above.
    doc = XElement.Load(charsDecompressed)

Catch ex As XmlException
    ' The XML itself was not properly formed
    MessageBox.Show("The file appears to be corrupt.", "Fatal error", MessageBoxButtons.OK, MessageBoxIcon.Error)
    Return False

Catch ex As InvalidDataException
    ' The decompression process failed somehow
    MessageBox.Show("Extraction error while loading data file.", "Fatal error", MessageBoxButtons.OK, MessageBoxIcon.Error)
    Return False

Catch ex As FileNotFoundException
    ' File could not be found
    MessageBox.Show("The testdata.xml.gz file could not be found.", "Fatal error", MessageBoxButtons.OK, MessageBoxIcon.Error)
    Return False

Catch ex As Exception
    ' Catch everything else
    MessageBox.Show("An unexpected error occurred: " & ex.Message, "Unknown error", MessageBoxButtons.OK, MessageBoxIcon.Error)
    Return False

Finally
    ' Disposing the outermost stream that was created also closes the ones it wraps.
    If charsDecompressed IsNot Nothing Then
        charsDecompressed.Dispose()
    ElseIf Decompressed IsNot Nothing Then
        Decompressed.Dispose()
    ElseIf infile IsNot Nothing Then
        infile.Dispose()
    End If
End Try





objCompressedStream.Length not supported

Hi,
I am doing this:
GZipStream objCompressedStream = new GZipStream(objmod, CompressionMode.Compress, true);
objCompressedStream.Write(btReadArray, 0, aiNo_of_bytes_read);

Now when I try to get the length of objCompressedStream using the objCompressedStream.Length property, it throws an exception saying that the operation is not supported.

Can anyone suggest an alternative, or some way to get the length of the compressed stream?

Thanks
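One possible approach, sketched here under stated assumptions rather than taken from the original question: GZipStream is not seekable, so its Length property throws NotSupportedException, but the underlying stream (objmod in the question) can report how many compressed bytes were written once the GZipStream has been disposed. The class and method names below are hypothetical, and the target stream is assumed to support Length, as MemoryStream and FileStream do.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressedLengthDemo
{
    // Returns true if asking the GZipStream itself for its length is unsupported.
    public static bool LengthIsUnsupported(GZipStream gzip)
    {
        try
        {
            long ignored = gzip.Length;
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    // Writes data through a GZipStream and returns the number of compressed
    // bytes that ended up in the target stream.
    public static long WriteAndMeasure(Stream target, byte[] data)
    {
        long before = target.Length;
        // The third argument (leaveOpen) keeps the target open after disposal.
        using (var gzip = new GZipStream(target, CompressionMode.Compress, true))
        {
            gzip.Write(data, 0, data.Length);
        }
        // Only after disposal has everything been flushed to the target.
        return target.Length - before;
    }
}
```

Reading the target's length before the GZipStream is disposed can give a value that is too small, because compressed data may still be buffered inside the GZipStream.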

Make sure you read these two posts

Using GZipStream for Compression in .NET [Brian Grunkemeyer] http://blogs.msdn.com/bclteam/archive/2005/06/15/429542.aspx

Gives nice, clean code for the Compress and Decompress methods.

Using a MemoryStream with GZipStream [Lakshan Fernando] http://blogs.msdn.com/bclteam/archive/2006/05/10/592551.aspx

Explains a very unintuitive aspect of this stream.

Re: Nice...a slight suggestion
>I would leave optimization to the compiler, myself, and not bother with buffering the ReadAllBytes method

Doing it that way, without a buffer, will be much, much slower.
I timed both the code you provided and the original poster's code. Without a buffer of at least 1024 bytes, the code ran 3 times slower for a sample 25 MB file.
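For comparison, a buffered version of ReadAllBytes might look like the following. This is a sketch, not the original poster's code: the class name StreamBytes and the 4096-byte buffer size are assumptions, chosen to match the buffer used in the official example.

```csharp
using System.IO;

public static class StreamBytes
{
    // Reads a stream to its end in 4096-byte chunks and returns all bytes.
    public static byte[] ReadAllBytes(Stream stream)
    {
        var buffer = new byte[4096];
        using (var ms = new MemoryStream())
        {
            int read;
            // Read returns 0 at the end of the stream; it may return fewer
            // bytes than requested, so copy only what was actually read.
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                ms.Write(buffer, 0, read);
            }
            return ms.ToArray();
        }
    }
}
```

The method works with any readable stream, including a GZipStream opened in CompressionMode.Decompress.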

ZLIB Compression
One more thing: if you are looking to handle Zlib compression (IETF RFC 1950), that is in DotNetZip, too. There is a ZlibStream class that handles it nicely, and it works very similarly to the GZipStream class.
creating or manipulating Zip files
There are 2 problems with this class.
  1. It does not handle ZIP files.
  2. It is dysfunctional: it can actually inflate data during "compression". There's something wrong with the logic. It's a known problem but as yet unfixed.
On the first item, the Deflate compression algorithm is what is used in .zip files. A GZIP stream is just a DEFLATEd stream with a header and trailer surrounding it. A zip file is a set of DEFLATEd streams with metadata surrounding all of them. This class helps you read or write a GZIP stream, but it will not help you read or write a zip file. A GZIP file (sometimes with the extension .gz) is not the same thing as a ZIP file. Similar ideas but not the same thing.
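The header and trailer mentioned above can be seen by inspecting GZipStream output directly. The sketch below is illustrative only and the class name GzipLayout is hypothetical. It relies on the gzip format described in RFC 1952: the data starts with the bytes 0x1F 0x8B followed by the compression method (8 for DEFLATE), and ends with a CRC-32 followed by the uncompressed size (modulo 2^32) as a little-endian 32-bit value.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipLayout
{
    // Compresses a byte array into gzip format in memory.
    public static byte[] Compress(byte[] data)
    {
        using (var target = new MemoryStream())
        {
            using (var gzip = new GZipStream(target, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, data.Length);
            }
            return target.ToArray();
        }
    }

    // Checks the two gzip magic bytes and the DEFLATE method identifier.
    public static bool HasGzipHeader(byte[] gz)
    {
        return gz.Length >= 3 && gz[0] == 0x1F && gz[1] == 0x8B && gz[2] == 8;
    }

    // Reads the last four bytes: the uncompressed size, stored little-endian.
    public static uint ReadUncompressedSize(byte[] gz)
    {
        int n = gz.Length;
        return (uint)gz[n - 4]
            | ((uint)gz[n - 3] << 8)
            | ((uint)gz[n - 2] << 16)
            | ((uint)gz[n - 1] << 24);
    }
}
```

A zip file has no such single header and trailer pair; it stores one DEFLATEd stream per entry plus a central directory, which is why GZipStream cannot read or write one.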

The java.util.zip classes do handle .zip files. These classes are available through the VJ# runtime, but there are a few problems with them: (1) the Java classes are unwieldy to use in .NET: there are no progress events, for example, and the enumeration is all Java-esque; (2) there are a bunch of bugs in those classes that have not been and will not be fixed in the VJ# runtime; (3) the VJ# runtime is huge; (4) the VJ# runtime is no longer supported (it is not shipped with VS2008).

There is a good 3rd party library, though, that solves all these problems: DotNetZip. It provides a GZipStream that NEVER inflates your data, and actually deflates it substantially better than the built-in class. Also, DotNetZip can read and write .zip files.

Find it at:
http://www.codeplex.com/DotNetZip

It's free to use. It is written in C#, but you can use it from any language. It works in WinForms apps, console apps, ASP.NET apps, PowerShell, anything you write in .NET. It is fast, compresses well (better than the built-in GZipStream), and is easy to use. You'll find lots of examples on that site.
How to zip a directory
Any suggestions on how to zip an entire directory using GZipStream?
I'm currently using java.util.zip classes in the vjslib.dll. It works well but I'd rather avoid using an outside library if I can avoid it.
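GZipStream compresses a single stream, so it cannot package a directory on its own. For readers on .NET Framework 4.5 or later (an assumption; the versions covered by this topic do not include it), the ZipFile class in System.IO.Compression offers a built-in alternative. A minimal sketch, with a hypothetical wrapper class name:

```csharp
using System.IO.Compression;

public static class DirectoryZipDemo
{
    // Creates a .zip archive containing every file under sourceDirectory.
    // On .NET Framework this needs a reference to the
    // System.IO.Compression.FileSystem assembly.
    public static void ZipDirectory(string sourceDirectory, string zipPath)
    {
        // Throws an IOException if zipPath already exists.
        ZipFile.CreateFromDirectory(sourceDirectory, zipPath);
    }
}
```

For example, DirectoryZipDemo.ZipDirectory(@"c:\users\public\reports", @"c:\users\public\reports.zip") would archive the directory used in the official example above. The archive should be written outside the source directory so that it does not try to include itself.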
Nice...a slight suggestion

Sweet!

I would leave optimization to the compiler, myself, and not bother with buffering the ReadAllBytes method...laziness. (And create a ToArray() or ToMemoryStream() extension method on Stream itself. Extension methods really help plug some huge holes!)

const int EndOfStream = -1;
private static byte[] ReadAllBytes(Stream stream)
{
    using (var ms = new MemoryStream())
    {
        for (int b = stream.ReadByte(); b != EndOfStream; b = stream.ReadByte())
            ms.WriteByte((byte) b);
        return ms.ToArray();
    }
}

  • 7/19/2008
  • G1