Microsoft Cabinet Format

 

Microsoft Cabarc User's Guide

Copyright © 1997 Microsoft Corporation. All rights reserved.

Topics in this section

Introduction

Creating Cabinets

List Cabinet Contents

Extracting Cabinets

Introduction

The Cabinet Format

The cabinet format provides a way to efficiently package multiple files. The key features of the cabinet format are that multiple files may be stored in a single cabinet ("CAB file"); and that data compression is performed across file boundaries, significantly improving the compression ratio.

Depending upon the number of files to be compressed, and the expected access patterns (sequential or random access; whether most of the files will be requested at once or only a small portion), cabinets can be constructed in different ways. One key concept of the cabinet file is the folder. A folder is a collection of one or more files that are compressed together as a single entity. By compressing files in this way, the compression ratio is improved. The downside is that random access time suffers, since in order for any particular file in a folder to be decoded, all preceding files in the same folder must also be decoded.

Cabarc

Cabarc is a utility that creates, extracts, and lists the contents of cabinet files (CABs), using a command line interface similar to that of popular archiving tools. Cabarc supports wildcards and recursive directory searches.

Command Line Usage

Cabarc is used as follows:

Usage: CABARC [options] command cabfile [@list] [files] [dest_dir]

Currently, only three commands are supported: N (create new cabinet), L (list contents of an existing cabinet), and X (extract files from a cabinet). These commands are described in the following sections.

Options must appear before the command name, and cannot be combined (for example, to set the -r and -p options, use -r -p, and not -rp).

Creating Cabinets

Cabinets are created using the n command, followed by the name of the cabinet to create, followed by a filename list, as shown below:

cabarc n mycab.cab prog.c prog.h prog.exe readme.txt

The above command creates the cabinet mycab.cab containing the files "prog.c", "prog.h", "prog.exe", and "readme.txt", in a single folder, using the default compression mode, MSZIP.

Wildcards

Cabarc supports wildcards in the filename list, as shown in the example below:

cabarc n mycab.cab prog.* readme.txt

Folders

By default, all files are added to a single folder (compression history) in the cabinet. It is possible to tell cabarc to begin a new folder, by inserting the plus (+) symbol as a file to be added, as shown below:

cabarc n mycab.cab test.c main.c + test.exe *.obj

The above command creates the cabinet "mycab.cab" with one folder containing "test.c" and "main.c", and a second folder containing "test.exe" and all files matching "*.obj".

Path Name Preservation

By default, directory names are not preserved in the cabinet; only the filename component is stored. For example, the following command will result in the filename "prog.c" being stored in the cabinet:

cabarc n mycab.cab c:\source\myproj\prog.c

In order to preserve path names, the -p option should be used:

cabarc -p n mycab.cab c:\mysource\myproj\prog.c

This command will cause the file to be named "mysource\myproj\prog.c" in the cabinet. Note that the c:\ prefix is still stripped from the filename; cabarc will not allow absolute paths to be stored in the cabinet, nor will it extract such absolute paths.

Path Stripping

In many situations it may be desirable to preserve some of the path name, but not all of it. For example, one might wish to archive everything in the c:\mysource\myproj\ directory, but store only the myproj\ component of the path. This can be accomplished with the path stripping option, -P (capital P).

cabarc -p -P mysource\ n mycab.cab c:\mysource\myproj\prog.c

The -P option strips any strings which begin with the provided string (wildcards are not supported in this case; it is a simple text match). Any absolute path prefixes such as c:\ or \ are stripped before the comparison takes place, so these characters should not be included in the -P option.

The -P option may be used multiple times to strip out multiple paths; cabarc builds a list of all paths to be stripped, and applies only the first one which matches. For example:

cabarc -p -P mysrc\ -P yoursrc\ n mycab.cab c:\mysrc\myproj\*.* d:\yoursrc\yourproj\*.c

The trailing slash at the end of the path name is important; entering -P mysrc instead of -P mysrc\ would cause files to be added as "\myproj\<filename>".
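
The stripping rules described above can be sketched in C. This is only an illustration of the documented behavior, not cabarc's actual source; the function names skip_absolute_prefix and stored_name are hypothetical, and the comparison is assumed to be case-sensitive because the text describes only "a simple text match".

```c
#include <stddef.h>
#include <string.h>

/* Skip a drive prefix such as "c:" and any leading backslashes.
 * Hypothetical helper, not part of cabarc. */
static const char *skip_absolute_prefix(const char *path)
{
    if (path[0] != '\0' && path[1] == ':')
        path += 2;
    while (*path == '\\')
        path++;
    return path;
}

/* Return the name that would be stored when -p is used together with
 * the given list of -P prefixes. Only the first matching prefix is
 * applied, as the text above describes. */
static const char *stored_name(const char *path,
                               const char *const *strip, size_t nstrip)
{
    size_t i;

    path = skip_absolute_prefix(path);
    for (i = 0; i < nstrip; i++) {
        size_t n = strlen(strip[i]);
        if (strncmp(path, strip[i], n) == 0)
            return path + n;
    }
    return path;
}
```

With the prefix list { "mysrc\", "yoursrc\" }, the path c:\mysrc\myproj\prog.c maps to myproj\prog.c. With the prefix mysrc (no trailing backslash) it maps to \myproj\prog.c, which is the pitfall noted above.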

Recursive Directory Search

Cabarc can archive files in a directory and all of its subdirectories, by use of the -r option. For example, the command shown below will archive all files ending in .h that are in c:\msdev\include\, c:\msdev\include\sys, and c:\msdev\include\gl (assuming these directories exist on your system).

cabarc -r -p n mycab.cab c:\msdev\include\*.h

The -p option is used here to preserve the path information when the files are added to the cabinet; without it, only the filename components would be stored. Omitting -p may be the desired behavior in some cases.

Reserve Space for Code Signature

Cabarc can reserve space in the cabinet for a code signature. This is done using the -s option, which reserves a specified amount of empty space in the cabinet. For code signing, 6144 bytes need to be reserved:

cabarc -s 6144 n mycab.cab test.exe

Note that the -s option does not actually write the code signature; it merely reserves space for it in the cabinet. The appropriate code signing utility must be used to fill out the code signature.

Set Cabinet ID

Cabinet files have a 16-bit cabinet ID field that is designed for application use. The default value of this field is zero; however, the -i option of cabarc can be used to set this field to any 16-bit value:

cabarc -i 12345 n mycab.cab test.exe

Set Compression Type

The default compression type for a cabinet is MSZIP. However, the compression type can be changed with the -m option. Currently only MSZIP compression (-m MSZIP) and no compression (-m NONE) are supported.

The following command stores files in the cabinet with no compression:

cabarc -m NONE n mycab.cab *.*

File List From a File

Cabarc can input its list of files from a text file, instead of from the command line, by using @files ("at files"). This is done by prefixing with the @ symbol the name of the file which contains the file list. For example:

cabarc n mycab.cab @filelist.txt

The text file must list the physical file names of the files to be added, one per line. As is the case when specifying filenames on the command line, the plus (+) symbol can be used as a filename to specify the beginning of a new folder. If a filename contains any embedded spaces, it must be enclosed in quotes, as shown below:

test.c

myapp.exe

"output file.exe"

The reason for requiring quotes is that each physical filename may be followed on the same line by an optional logical filename, which specifies the name under which the file will be stored in the cabinet:

test.c myapp.c

myapp.exe

"output file.exe" foobar.exe

If the logical filename contains spaces, then it must also be enclosed in quotes. Note that the logical filename overrides the -p (preserve path names) and -P (strip path name) options; the file will be added to the cabinet exactly as indicated. Wildcards are allowed in the physical filename, but in this situation a logical filename is not allowed.

The "@" feature may be used multiple times, to retrieve file lists from multiple files. Cabarc does not check for the presence of duplicate files, so if the same physical file appears in multiple file lists, it will be added to the cabinet multiple times.

The "@" feature may be combined with filenames on the command line. Files are added in the order in which they are parsed on the command line. Example:

cabarc n mycab.cab @filelist1.txt *.c @filelist2.txt *.h

Note: The "@" feature is available only when creating cabinets, not when extracting or listing cabinets.

List Cabinet Contents

It is possible to view the contents of a cabinet using the L (list) command, as shown below:

cabarc l mycab.cab

Cabarc will display the set ID of the cabinet (see the -i option for cabinet creation), as well as the name of each file in the cabinet, along with its file size, file date, file time, and file attributes.

Extracting Cabinets

The X (extract) command extracts files from a cabinet. The simplest use of the X command is shown below, which causes all files to be extracted from the cabinet:

cabarc x mycab.cab

Alternatively, it is possible to selectively extract files, by providing a list of filenames and/or wildcards:

cabarc x mycab.cab readme.txt *.exe *.c

By default, full path names (if they are present in the cabinet) are not preserved upon extraction. For example, if a file named mysrc\myproj\test.c is present in the cabinet, then the command cabarc x mycab.cab will cause the file test.c to be extracted into the current directory. In order to preserve path names upon extraction, the -p option must be used. This option will cause any required directories to be created.

Only the filename component is considered in the matching process; the pathname is discounted. For example, cabarc x mycab.cab test.c will cause the file mysrc\myproj\test.c to be extracted to the current directory as test.c, as will cabarc x mycab.cab *.c (which will also extract any other files matching *.c).

By default, the extracted files are stored in the current directory (and its subdirectories, if -p is used). However, it is possible to specify a destination directory for the extracted files. This is accomplished by appending a directory name to the command line. The directory name must end in a backslash ( \ ). Examples:

cabarc x mycab.cab c:\somedir\

cabarc x mycab.cab *.exe c:\somedir\

Microsoft Cabinet File Format

Copyright © 1997 Microsoft Corporation. All rights reserved.

Topics in this section

Introduction

Specification

Sample Cabinet File

Notes

Introduction

This specification defines the Microsoft cabinet file format. Cabinet files are compressed packages containing a number of related files. The format of a cabinet file is optimized for maximum compression. Cabinet files support a number of compression formats, including MSZIP, LZX, or uncompressed. This document does not define these internal compression formats. For data compression formats, refer to the documents titled Microsoft MSZIP Data Compression Format and Microsoft LZX Data Compression Format.

Specification

This segment of the documentation includes the following topics:

Conventions

Overview

Detailed Structure Specification

Conventions

The types u1, u2, and u4 are used to represent unsigned 8-, 16-, and 32-bit integer values, respectively. All multi-byte quantities are stored in little-endian order, where the least significant byte comes first.

The cabinet file format is described here using a C-like structure notation, where successive fields appear in the structure sequentially without padding or alignment. Header fields followed by (optional) may or may not be present, depending on the values in the CFHEADER flags field.
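
Because the fields are packed without padding and stored little-endian, a portable reader would typically assemble each value from individual bytes rather than overlaying a C struct on the file. A minimal sketch, where the helper names read_u2 and read_u4 are assumptions and not part of the specification:

```c
#include <stdint.h>

/* Read an unsigned 16-bit little-endian value (the spec's u2). */
static uint16_t read_u2(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read an unsigned 32-bit little-endian value (the spec's u4). */
static uint32_t read_u4(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, the byte pair 22 06 reads as the u2 value 0x0622, and the bytes FD 00 00 00 read as the u4 value 0x000000FD.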

Overview

Each file stored in a cabinet is stored completely within a single folder. A cabinet file may contain one or more folders, or portions of a folder. A folder can span across multiple cabinets. Such a series of cabinet files forms a set. Each cabinet file contains name information for the logically adjacent cabinet files. Each folder contains one or more files. Throughout this discussion, cabinets are said to contain "files". This is for semantic purposes only. Cabinet files actually store streams of bytes, each with a name and some other common attributes. Whether these byte streams are actually files or some other kind of data is application-defined.

A cabinet file contains a cabinet header (CFHEADER), followed by one or more cabinet folder (CFFOLDER) entries, a series of one or more cabinet file (CFFILE) entries, and the actual compressed file data in CFDATA entries. The compressed file data in the CFDATA entry is stored in one of several compression formats, as indicated in the corresponding CFFOLDER structure. The compression encoding formats used are detailed in separate documents.

Detailed Structure Specification

This segment of the documentation includes the following topics:

CFHEADER

CFFOLDER

CFFILE

CFDATA

CFHEADER

The CFHEADER structure provides information about this cabinet file.

struct CFHEADER
{
  u1  signature[4];     /* cabinet file signature */
  u4  reserved1;        /* reserved */
  u4  cbCabinet;        /* size of this cabinet file in bytes */
  u4  reserved2;        /* reserved */
  u4  coffFiles;        /* offset of the first CFFILE entry */
  u4  reserved3;        /* reserved */
  u1  versionMinor;     /* cabinet file format version, minor */
  u1  versionMajor;     /* cabinet file format version, major */
  u2  cFolders;         /* number of CFFOLDER entries in this */
                        /*    cabinet */
  u2  cFiles;           /* number of CFFILE entries in this cabinet */
  u2  flags;            /* cabinet file option indicators */
  u2  setID;            /* must be the same for all cabinets in a */
                        /*    set */
  u2  iCabinet;         /* number of this cabinet file in a set */
  u2  cbCFHeader;       /* (optional) size of per-cabinet reserved */
                        /*    area */
  u1  cbCFFolder;       /* (optional) size of per-folder reserved */
                        /*    area */
  u1  cbCFData;         /* (optional) size of per-datablock reserved */
                        /*    area */
  u1  abReserve[];      /* (optional) per-cabinet reserved area */
  u1  szCabinetPrev[];  /* (optional) name of previous cabinet file */
  u1  szDiskPrev[];     /* (optional) name of previous disk */
  u1  szCabinetNext[];  /* (optional) name of next cabinet file */
  u1  szDiskNext[];     /* (optional) name of next disk */
};
u1 signature[4]
Contains the characters 'M','S','C','F' (bytes 0x4D, 0x53, 0x43, 0x46). This field is used to assure that the file is a cabinet file.

u4 reserved1
Reserved field, set to zero.

u4 cbCabinet
Total size of this cabinet file in bytes.

u4 reserved2
Reserved field, set to zero.

u4 coffFiles
Absolute file offset of first CFFILE entry.

u4 reserved3
Reserved field, set to zero.

u1 versionMinor
u1 versionMajor
Cabinet file format version.
Currently, versionMajor = 1 and versionMinor = 3.

u2 cFolders
The number of CFFOLDER entries in this cabinet file.

u2 cFiles
The number of CFFILE entries in this cabinet file.

u2 flags
Bit-mapped values that indicate the presence of optional data:
#define cfhdrPREV_CABINET       0x0001
#define cfhdrNEXT_CABINET       0x0002
#define cfhdrRESERVE_PRESENT    0x0004

flags.cfhdrPREV_CABINET is set if this cabinet file is not the first in a set of cabinet files. When this bit is set, the szCabinetPrev and szDiskPrev fields are present in this CFHEADER.

flags.cfhdrNEXT_CABINET is set if this cabinet file is not the last in a set of cabinet files. When this bit is set, the szCabinetNext and szDiskNext fields are present in this CFHEADER.

flags.cfhdrRESERVE_PRESENT is set if this cabinet file contains any reserved fields. When this bit is set, the cbCFHeader, cbCFFolder, and cbCFData fields are present in this CFHEADER.

Other bit positions in the flags field are reserved.
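
Since the flags determine which optional fields follow the fixed part of the header, a reader has to test them before it can locate the first CFFOLDER entry. The sketch below shows one way to do that under the layout described in this section; the names first_folder_offset, skip_sz, and CFHEADER_FIXED_SIZE are illustrative assumptions, and a return value of 0 is used here to signal a malformed or truncated header.

```c
#include <stddef.h>
#include <stdint.h>

#define cfhdrPREV_CABINET    0x0001
#define cfhdrNEXT_CABINET    0x0002
#define cfhdrRESERVE_PRESENT 0x0004

#define CFHEADER_FIXED_SIZE  36u   /* signature through iCabinet */

/* Skip one NUL-terminated string starting at off.
 * Returns the offset just past the NUL, or 0 if unterminated. */
static size_t skip_sz(const uint8_t *buf, size_t len, size_t off)
{
    while (off < len) {
        if (buf[off++] == 0)
            return off;
    }
    return 0;
}

/* Returns the offset of the first CFFOLDER entry, or 0 on error. */
static size_t first_folder_offset(const uint8_t *buf, size_t len)
{
    size_t off = CFHEADER_FIXED_SIZE;
    uint16_t flags;
    int i, nstrings = 0;

    if (len < CFHEADER_FIXED_SIZE)
        return 0;
    flags = (uint16_t)(buf[0x1E] | (buf[0x1F] << 8));

    if (flags & cfhdrRESERVE_PRESENT) {
        uint16_t cbCFHeader;
        if (len - off < 4)
            return 0;
        cbCFHeader = (uint16_t)(buf[off] | (buf[off + 1] << 8));
        off += 4;                  /* cbCFHeader, cbCFFolder, cbCFData */
        if (len - off < cbCFHeader)
            return 0;
        off += cbCFHeader;         /* abReserve[cbCFHeader] */
    }
    if (flags & cfhdrPREV_CABINET)
        nstrings += 2;             /* szCabinetPrev, szDiskPrev */
    if (flags & cfhdrNEXT_CABINET)
        nstrings += 2;             /* szCabinetNext, szDiskNext */
    for (i = 0; i < nstrings; i++) {
        off = skip_sz(buf, len, off);
        if (off == 0)
            return 0;
    }
    return off;
}
```

When flags is zero, the function returns 36 (0x24), so the first CFFOLDER entry immediately follows the fixed header fields.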

u2 setID
An arbitrarily derived (random) value that binds a collection of linked cabinet files together. All cabinet files in a set will contain the same setID. This field is used by cabinet file extractors to assure that cabinet files are not inadvertently mixed. This value has no meaning in a cabinet file that is not in a set.

u2 iCabinet
Sequential number of this cabinet in a multi-cabinet set. The first cabinet has iCabinet=0. This field, along with setID, is used by cabinet file extractors to assure that this cabinet is the correct continuation cabinet when spanning cabinet files.

u2 cbCFHeader(optional)
If flags.cfhdrRESERVE_PRESENT is not set, this field is not present, and the value of cbCFHeader defaults to zero. Indicates the size in bytes of the abReserve field in this CFHEADER. Values for cbCFHeader range from 0 to 60,000.

u1 cbCFFolder(optional)
If flags.cfhdrRESERVE_PRESENT is not set, then this field is not present, and the value of cbCFFolder defaults to zero. Indicates the size in bytes of the abReserve field in each CFFOLDER entry. Values for cbCFFolder range from 0 to 255.

u1 cbCFData(optional)
If flags.cfhdrRESERVE_PRESENT is not set, then this field is not present, and the value of cbCFData defaults to zero. Indicates the size in bytes of the abReserve field in each CFDATA entry. Values for cbCFData range from 0 to 255.

u1 abReserve[cbCFHeader](optional)
If flags.cfhdrRESERVE_PRESENT is set and cbCFHeader is non-zero, then this field contains per-cabinet-file application information. This field is defined by the application and used for application-defined purposes.

u1 szCabinetPrev[](optional)
If flags.cfhdrPREV_CABINET is not set, then this field is not present. NUL-terminated ASCII string containing the file name of the logically previous cabinet file. May contain up to 255 bytes plus the NUL byte. Note that this gives the name of the most-recently-preceding cabinet file that contains the initial instance of a file entry. This might not be the immediately previous cabinet file, when the most recent file spans multiple cabinet files. If searching in reverse for a specific file entry, or trying to extract a file that is reported to begin in the "previous cabinet", szCabinetPrev would give the name of the cabinet to examine.

u1 szDiskPrev[](optional)
If flags.cfhdrPREV_CABINET is not set, then this field is not present. NUL-terminated ASCII string containing a descriptive name for the media containing the file named in szCabinetPrev, such as the text on the diskette label. This string can be used when prompting the user to insert a diskette. May contain up to 255 bytes plus the NUL byte.

u1 szCabinetNext[](optional)
If flags.cfhdrNEXT_CABINET is not set, then this field is not present. NUL-terminated ASCII string containing the file name of the next cabinet file in a set. May contain up to 255 bytes plus the NUL byte. Files extending beyond the end of the current cabinet file are continued in the named cabinet file.

u1 szDiskNext[](optional)
If flags.cfhdrNEXT_CABINET is not set, then this field is not present. NUL-terminated ASCII string containing a descriptive name for the media containing the file named in szCabinetNext, such as the text on the diskette label. May contain up to 255 bytes plus the NUL byte. This string can be used when prompting the user to insert a diskette.

CFFOLDER

Each CFFOLDER structure contains information about one of the folders or partial folders stored in this cabinet file. The first CFFOLDER entry immediately follows the CFHEADER entry and subsequent CFFOLDER records for this cabinet are contiguous. CFHEADER.cFolders indicates how many CFFOLDER entries are present.

Folders may start in one cabinet, and continue on to one or more succeeding cabinets. When the cabinet file creator detects that a folder has been continued into another cabinet, it will complete that folder as soon as the current file has been completely compressed. Any additional files will be placed in the next folder. Generally, this means that a folder would span at most two cabinets, but if the file is large enough, it could span more than two cabinets.

CFFOLDER entries actually refer to folder fragments, not necessarily complete folders. A CFFOLDER structure is the beginning of a folder if the iFolder value in the first file referencing the folder does not indicate the folder is continued from the previous cabinet file.

The typeCompress field may vary from one folder to the next, unless the folder is continued from a previous cabinet file.

struct CFFOLDER
{
  u4  coffCabStart;  /* offset of the first CFDATA block in this */
                     /*    folder */
  u2  cCFData;       /* number of CFDATA blocks in this folder */
  u2  typeCompress;  /* compression type indicator */
  u1  abReserve[];   /* (optional) per-folder reserved area */
};
u4 coffCabStart
Absolute file offset of first CFDATA block for this folder. For a standard cabinet file this value should be less than CFHEADER.cbCabinet.

u2 cCFData
Number of CFDATA structures for this folder that are actually in this cabinet. A folder can continue into another cabinet and have more CFDATA blocks in that cabinet, and a folder may have started in a previous cabinet. This number represents only the CFDATA structures for this folder that are at least partially recorded in this cabinet.

u2 typeCompress
Indicates the compression method used for all CFDATA entries in this folder. The valid values are defined in each compression format's specification.

u1 abReserve[CFHEADER.cbCFFolder](optional)
If CFHEADER.flags.cfhdrRESERVE_PRESENT is set and cbCFFolder is non-zero, then this field contains per-folder application information. This field is defined by the application and used for application-defined purposes.

CFFILE

Each CFFILE entry contains information about one of the files stored (or at least partially stored) in this cabinet. The first CFFILE entry in each cabinet is found at absolute offset CFHEADER.coffFiles. In a standard cabinet file the first CFFILE entry immediately follows the last CFFOLDER entry. Subsequent CFFILE records for this cabinet are contiguous.

CFHEADER.cFiles indicates how many of these entries are in the cabinet. The CFFILE entries in a standard cabinet are ordered by iFolder value, then by uoffFolderStart. Entries for files continued from the previous cabinet will be first, and entries for files continued to the next cabinet will be last.

struct CFFILE
{
  u4  cbFile;           /* uncompressed size of this file in bytes */
  u4  uoffFolderStart;  /* uncompressed offset of this file in the folder */
  u2  iFolder;          /* index into the CFFOLDER area */
  u2  date;             /* date stamp for this file */
  u2  time;             /* time stamp for this file */
  u2  attribs;          /* attribute flags for this file */
  u1  szName[];         /* name of this file */
};
u4 cbFile
Uncompressed size of this file in bytes.

u4 uoffFolderStart
Uncompressed byte offset of the start of this file's data. For the first file in each folder, this value will usually be zero. Subsequent files in the folder will have offsets that are typically the running sum of the cbFile values.

u2 iFolder
Index of the folder containing this file's data. A value of zero indicates this is the first folder in this cabinet file. The special iFolder values ifoldCONTINUED_FROM_PREV and ifoldCONTINUED_PREV_AND_NEXT indicate that the folder index is actually zero, but that extraction of this file would have to begin with the cabinet named in CFHEADER.szCabinetPrev. The special iFolder values ifoldCONTINUED_PREV_AND_NEXT and ifoldCONTINUED_TO_NEXT indicate that the folder index is actually one less than CFHEADER.cFolders, and that extraction of this file will require continuation to the cabinet named in CFHEADER.szCabinetNext.
#define ifoldCONTINUED_FROM_PREV      (0xFFFD)
#define ifoldCONTINUED_TO_NEXT        (0xFFFE)
#define ifoldCONTINUED_PREV_AND_NEXT  (0xFFFF)
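
The special values can be mapped back to a real folder index as described above. The following is a sketch only: the struct folder_ref type and the resolve_ifolder function are hypothetical names, and the code assumes cFolders is at least 1 whenever a continuation value appears.

```c
#include <stdint.h>

#define ifoldCONTINUED_FROM_PREV      (0xFFFD)
#define ifoldCONTINUED_TO_NEXT        (0xFFFE)
#define ifoldCONTINUED_PREV_AND_NEXT  (0xFFFF)

struct folder_ref {
    uint16_t index;      /* actual index into the CFFOLDER area */
    int      needs_prev; /* extraction starts in szCabinetPrev */
    int      needs_next; /* extraction continues in szCabinetNext */
};

static struct folder_ref resolve_ifolder(uint16_t iFolder, uint16_t cFolders)
{
    struct folder_ref r = { 0, 0, 0 };
    uint16_t last = (uint16_t)(cFolders ? cFolders - 1 : 0);

    switch (iFolder) {
    case ifoldCONTINUED_FROM_PREV:
        r.needs_prev = 1;          /* folder index is zero */
        break;
    case ifoldCONTINUED_TO_NEXT:
        r.index = last;            /* last folder in this cabinet */
        r.needs_next = 1;
        break;
    case ifoldCONTINUED_PREV_AND_NEXT:
        r.needs_prev = 1;          /* index zero, which is also the last */
        r.needs_next = 1;
        break;
    default:
        r.index = iFolder;         /* ordinary folder index */
        break;
    }
    return r;
}
```

Note that ifoldCONTINUED_PREV_AND_NEXT is described as both index zero and index cFolders - 1, which is consistent only when the cabinet holds a single folder fragment.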

u2 date
Date of this file, in the format ((year-1980) << 9)+(month << 5)+(day), where month={1..12} and day={1..31}. This "date" is typically considered the "last modified" date in local time, but the actual definition is application-defined.

u2 time
Time of this file, in the format (hour << 11)+(minute << 5)+(seconds/2), where hour={0..23}. This "time" is typically considered the "last modified" time in local time, but the actual definition is application-defined.
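
The date and time encodings can be unpacked with shifts and masks. A minimal sketch, in which the struct cab_timestamp type and the decode_cab_datetime function are illustrative names rather than anything defined by the format:

```c
#include <stdint.h>

struct cab_timestamp {
    int year, month, day;
    int hour, minute, second;
};

/* Unpack the CFFILE date and time fields into calendar components. */
static struct cab_timestamp decode_cab_datetime(uint16_t date, uint16_t time)
{
    struct cab_timestamp t;

    t.year   = (date >> 9) + 1980;    /* top 7 bits */
    t.month  = (date >> 5) & 0x0F;    /* next 4 bits, 1..12 */
    t.day    = date & 0x1F;           /* low 5 bits, 1..31 */
    t.hour   = time >> 11;            /* top 5 bits, 0..23 */
    t.minute = (time >> 5) & 0x3F;    /* next 6 bits */
    t.second = (time & 0x1F) * 2;     /* low 5 bits hold seconds/2 */
    return t;
}
```

Because seconds are stored divided by two, only even second values can be represented.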

u2 attribs
Attributes of this file; may be used in any combination:
#define  _A_RDONLY       (0x01)  /* file is read-only */
#define  _A_HIDDEN       (0x02)  /* file is hidden */
#define  _A_SYSTEM       (0x04)  /* file is a system file */
#define  _A_ARCH         (0x20)  /* file modified since last backup */
#define  _A_EXEC         (0x40)  /* run after extraction */
#define  _A_NAME_IS_UTF  (0x80)  /* szName[] contains UTF */

All other attribute bit values are reserved.

u1 szName[]
NUL-terminated name of this file. Note that this string may include path separator characters. When attribs._A_NAME_IS_UTF is set, this string can be converted directly to Unicode, avoiding locale-specific dependencies. See "UTF Encoding Method" for more information. When attribs._A_NAME_IS_UTF is not set, this string is subject to interpretation depending on locale.

CFDATA

Each CFDATA record describes some amount of compressed data. The first CFDATA entry for each folder is located using CFFOLDER.coffCabStart. Subsequent CFDATA records for this folder are contiguous. In a standard cabinet all the CFDATA entries are contiguous and in the same order as the CFFOLDER entries that refer to them.

struct CFDATA
{
  u4  csum;         /* checksum of this CFDATA entry */
  u2  cbData;       /* number of compressed bytes in this block */
  u2  cbUncomp;     /* number of uncompressed bytes in this block */
  u1  abReserve[];  /* (optional) per-datablock reserved area */
  u1  ab[cbData];   /* compressed data bytes */
};
u4 csum
Checksum of this CFDATA structure, from CFDATA.cbData through CFDATA.ab[cbData-1]. See "Checksum Method" for more information. May be set to zero if the checksum is not supplied.

u2 cbData
Number of bytes of compressed data in this CFDATA record. When cbUncomp is zero, this field indicates only the number of bytes that fit into this cabinet file.

u2 cbUncomp
The uncompressed size of the data in this CFDATA entry. When this CFDATA entry is continued in the next cabinet file, cbUncomp will be zero, and cbUncomp in the first CFDATA entry in the next cabinet file will report the total uncompressed size of the data from both CFDATA blocks.

u1 abReserve[CFHEADER.cbCFData](optional)
If CFHEADER.flags.cfhdrRESERVE_PRESENT is set and cbCFData is non-zero, then this field contains per-datablock application information. This field is defined by the application and used for application-defined purposes.

u1 ab[cbData]
The compressed data bytes, compressed using the CFFOLDER.typeCompress method. When cbUncomp is zero, these data bytes must be combined with the data bytes from the next cabinet's first CFDATA entry before decompression.

When CFFOLDER.typeCompress indicates that the data is not compressed, this field contains the uncompressed data bytes. In this case, cbData and cbUncomp will be equal unless this CFDATA entry crosses a cabinet file boundary.
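
Because the CFDATA records of a folder are contiguous, a reader can step from one block to the next using cbData and the per-datablock reserve size. A sketch under those assumptions follows; cfdata_next_offset and CFDATA_FIXED_SIZE are hypothetical names, and a return value of 0 is used here to report a block that does not fit in the buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define CFDATA_FIXED_SIZE 8u   /* csum (4) + cbData (2) + cbUncomp (2) */

/* Given the offset of one CFDATA record, return the offset just past
 * its data bytes, which is where the next record (if any) begins.
 * cbCFData is CFHEADER.cbCFData, or 0 when no reserve area is present. */
static size_t cfdata_next_offset(const uint8_t *cab, size_t len,
                                 size_t block_off, uint8_t cbCFData)
{
    size_t payload;
    uint16_t cbData;

    if (block_off > len || len - block_off < CFDATA_FIXED_SIZE + cbCFData)
        return 0;
    cbData  = (uint16_t)(cab[block_off + 4] | (cab[block_off + 5] << 8));
    payload = block_off + CFDATA_FIXED_SIZE + cbCFData;  /* start of ab[] */
    if (len - payload < cbData)
        return 0;
    return payload + cbData;
}
```

Calling this CFFOLDER.cCFData times, starting from CFFOLDER.coffCabStart, visits every block of the folder that is stored in this cabinet.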

A Sample Cabinet File

       0   1   2   3   4   5   6    7    8   9   A   B   C   D   E   F
000   4D   53  43  46  00  00  00 00-FD  00  00  00  00  00  00  00  MSCF
010   2C   00  00  00  00  00  00 00-03  01  01  00  02  00  00  00  
020   22   06  00  00  5E  00  00 00-01  00  00  00  4D  00  00  00  
030   00   00  00  00  00  00  6C 22-BA  59  20  00  68  65  6C  6C  hell
040   6F   2E  63  00  4A  00  00 00-4D  00  00  00  00  00  6C  22  o.c
050   E7   59  20  00  77  65  6C 63-6F  6D  65  2E  63  00  BD  5A  welcome.c
060   A6   30  97  00  97  00  23 69-6E  63  6C  75  64  65  20  3C  #include <
070   73   74  64  69  6F  2E  68 3E-0D  0A  0D  0A  76  6F  69  64  stdio.h>    void
080   20   6D  61  69  6E  28  76 6F-69  64  29  0D  0A  7B  0D  0A  main(void)  {
090   20   20  20  20  70  72  69 6E-74  66  28  22  48  65  6C  6C  printf("Hell
0A0   6F   2C  20  77  6F  72  6C 64-21  5C  6E  22  29  3B  0D  0A  o, world!\n");
0B0   7D   0D  0A  23  69  6E  63 6C-75  64  65  20  3C  73  74  64  }  #include <std
0C0   69   6F  2E  68  3E  0D  0A 0D-0A  76  6F  69  64  20  6D  61  io.h>    void ma
0D0   69   6E  28  76  6F  69  64 29-0D  0A  7B  0D  0A  20  20  20  in(void)  {
0E0   20   70  72  69  6E  74  66 28-22  57  65  6C  63  6F  6D  65  printf("Welcome
0F0   21   5C  6E  22  29  3B  0D 0A-7D  0D  0A  0D  0A              !\n");  }

This is a very simple example of a cabinet file which contains two small text files, stored uncompressed for clarity.

   Offset   Description
   00..23   CFHEADER
   00..03   signature = 0x4D, 0x53, 0x43, 0x46
   04..07   reserved1
   08..0B   cbCabinet = 0x000000FD (253)
   0C..0F   reserved2
   10..13   coffFiles = 0x0000002C
   14..17   reserved3
   18..19   versionMinor = 3, versionMajor = 1 (version 1.3)
   1A..1B   cFolders = 1
   1C..1D   cFiles = 2
   1E..1F   flags = 0 (no reserve, no previous or next cabinet)
   20..21   setID = 0x0622
   22..23   iCabinet = 0

   24..2B   CFFOLDER[0]
   24..27   coffCabStart = 0x0000005E
   28..29   cCFData = 1
   2A..2B   typeCompress = 0 (none)

   2C..43   CFFILE[0]
   2C..2F   cbFile = 0x0000004D (77 bytes)
   30..33   uoffFolderStart = 0x00000000
   34..35   iFolder = 0
   36..37   date = 0x226C = 0010001 0011 01100 = March 12, 1997
   38..39   time = 0x59BA = 01011 001101 11010 = 11:13:52 AM
   3A..3B   attribs = 0x0020 = _A_ARCHIVE
   3C..43   szName = "hello.c" + NUL

   44..5D   CFFILE[1]
   44..47   cbFile = 0x0000004A (74 bytes)
   48..4B   uoffFolderStart = 0x0000004D
   4C..4D   iFolder = 0
   4E..4F   date = 0x226C = 0010001 0011 01100 = March 12, 1997
   50..51   time = 0x59E7 = 01011 001111 00111 = 11:15:14 AM
   52..53   attribs = 0x0020 = _A_ARCHIVE
   54..5D   szName = "welcome.c" + NUL

   5E..FC   CFDATA[0]
   5E..61   csum = 0x30A65ABD
   62..63   cbData = 0x0097 (151 bytes)
   64..65   cbUncomp = 0x0097 (151 bytes)
   66..FC   ab[0x0097] = uncompressed file data
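
The date and time words in the table can be unpacked with a few shifts and masks. The following is a minimal sketch in C; the FatStamp type and the fat_unpack name are illustrative only and are not part of any cabinet library.

```c
#include <stdint.h>

// Hypothetical holder for an unpacked FAT-style timestamp.
typedef struct {
    int year, month, day;
    int hour, minute, second;
} FatStamp;

// date: bits 15..9 = year - 1980, bits 8..5 = month, bits 4..0 = day.
// time: bits 15..11 = hour, bits 10..5 = minute, bits 4..0 = seconds / 2.
FatStamp fat_unpack(uint16_t date, uint16_t time)
{
    FatStamp s;
    s.year   = 1980 + (date >> 9);
    s.month  = (date >> 5) & 0x0F;
    s.day    = date & 0x1F;
    s.hour   = time >> 11;
    s.minute = (time >> 5) & 0x3F;
    s.second = (time & 0x1F) * 2;
    return s;
}
```

Applied to CFFILE[0], fat_unpack(0x226C, 0x59BA) yields March 12, 1997 and 11:13:52, matching the table. Seconds are stored in two-second units, so odd values cannot be represented.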

Notes

Checksum Method

The computation and verification of checksums found in the CFDATA entries of cabinet files is done using a function named CSUMCompute. Its source code is provided below for reference. When checksums are not supplied by the application that creates the cabinet file, the checksum field is set to zero. Cabinet extracting applications do not compute or verify the checksum if the field is set to zero.

CHECKSUM CSUMCompute(void *pv, UINT cb, CHECKSUM seed)
{
    int         cUlong;                 // Number of ULONGs in block
    CHECKSUM    csum;                   // Checksum accumulator
    BYTE       *pb;
    ULONG       ul;

    cUlong = cb / 4;                    // Number of ULONGs
    csum = seed;                        // Init checksum
    pb = pv;                            // Start at front of data block

    //** Checksum integral multiple of ULONGs
    while (cUlong-- > 0) {
        //** NOTE: Build ULONG in big/little-endian independent manner
        ul = *pb++;                     // Get low-order byte
        ul |= (((ULONG)(*pb++)) <<  8); // Add 2nd byte
        ul |= (((ULONG)(*pb++)) << 16); // Add 3rd byte
        ul |= (((ULONG)(*pb++)) << 24); // Add 4th byte

        csum ^= ul;                     // Update checksum
    }

    //** Checksum remainder bytes
    ul = 0;
    switch (cb % 4) {
        case 3:
            ul |= (((ULONG)(*pb++)) << 16); // Add 3rd byte
            // fall through
        case 2:
            ul |= (((ULONG)(*pb++)) <<  8); // Add 2nd byte
            // fall through
        case 1:
            ul |= *pb++;                    // Get low-order byte
            // fall through
        default:
            break;
    }
    csum ^= ul;                         // Update checksum

    //** Return computed checksum
    return csum;
}

The checksums for non-split CFDATA blocks are computed first on the compressed data bytes, then on the CFDATA header area, starting at the CFDATA.cbData field:

CFDATA.cbData = cbCompressed;
CFDATA.cbUncomp = cbUncompressed;
csumPartial = CSUMCompute(&CFDATA.ab[0],CFDATA.cbData,0);
CFDATA.csum = CSUMCompute(&CFDATA.cbData,sizeof(CFDATA) -
sizeof(CFDATA.csum),csumPartial);
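
As a cross-check, the non-split computation can be applied to CFDATA[0] of the sample cabinet shown earlier. The sketch below restates CSUMCompute with fixed-width types (the names csum_compute and sample_cfdata_csum are illustrative, not part of the original source) and rebuilds the 151 data bytes from the two sample source files.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Restatement of CSUMCompute with fixed-width types (same byte handling).
static uint32_t csum_compute(const uint8_t *pb, size_t cb, uint32_t seed)
{
    uint32_t csum = seed;
    uint32_t ul;
    size_t   n;

    for (n = cb / 4; n > 0; n--, pb += 4) {
        ul  = (uint32_t)pb[0];
        ul |= (uint32_t)pb[1] << 8;
        ul |= (uint32_t)pb[2] << 16;
        ul |= (uint32_t)pb[3] << 24;
        csum ^= ul;
    }

    ul = 0;
    switch (cb % 4) {
        case 3: ul |= (uint32_t)(*pb++) << 16;  // falls through
        case 2: ul |= (uint32_t)(*pb++) << 8;   // falls through
        case 1: ul |= (uint32_t)(*pb);
        default: break;
    }
    return csum ^ ul;
}

// Checksum of CFDATA[0] in the sample cabinet: data bytes first, then the
// four header bytes that hold cbData and cbUncomp (both little-endian).
uint32_t sample_cfdata_csum(void)
{
    static const char hello[] =
        "#include <stdio.h>\r\n\r\nvoid main(void)\r\n{\r\n"
        "    printf(\"Hello, world!\\n\");\r\n}\r\n";
    static const char welcome[] =
        "#include <stdio.h>\r\n\r\nvoid main(void)\r\n{\r\n"
        "    printf(\"Welcome!\\n\");\r\n}\r\n\r\n";
    uint8_t  ab[sizeof hello - 1 + sizeof welcome - 1];   // 77 + 74 = 151
    uint8_t  hdr[4];
    uint16_t cb = (uint16_t)sizeof ab;

    memcpy(ab, hello, sizeof hello - 1);
    memcpy(ab + sizeof hello - 1, welcome, sizeof welcome - 1);

    hdr[0] = (uint8_t)(cb & 0xFF);  hdr[1] = (uint8_t)(cb >> 8);   // cbData
    hdr[2] = hdr[0];                hdr[3] = hdr[1];               // cbUncomp

    return csum_compute(hdr, sizeof hdr, csum_compute(ab, sizeof ab, 0));
}
```

sample_cfdata_csum() returns 0x30A65ABD, the value stored at offsets 5E..61 of the sample cabinet.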

When blocks are split across cabinet file boundaries, the checksum for the partial block at the end of a cabinet file is computed first on the partial field of compressed data bytes, then on the header:

CFDATA.cbData = cbPartialData;
CFDATA.cbUncomp = 0;
csumPartial = CSUMCompute(&CFDATA.ab[0],cbPartialData,0);
CFDATA.csum = CSUMCompute(&CFDATA.cbData,sizeof(CFDATA) -
sizeof(CFDATA.csum),csumPartial);

The checksum for the residual block in the next cabinet file is computed first on the remainder of the field of compressed data bytes, then on the header:

CFDATA.cbData = cbResidualData;
CFDATA.cbUncomp = cbUncompressed;
csumPartial = CSUMCompute(&CFDATA.ab[cbPartialData],cbResidualData,0);
CFDATA.csum = CSUMCompute(&CFDATA.cbData,sizeof(CFDATA) -
sizeof(CFDATA.csum),csumPartial);

UTF Encoding Method

UTF (universal text format) is used to compactly represent a broad range of Unicode characters while favoring size for the most common characters. Unicode characters are translated to sequences of one, two, or three bytes per character.

When a string containing Unicode characters larger than 0x007F is encoded in the CFFILE.szName field, the _A_NAME_IS_UTF attribute should be included in the file's attributes. When no characters larger than 0x007F are in the name, the _A_NAME_IS_UTF attribute should not be set. If byte values larger than 0x7F are found in CFFILE.szName, but the _A_NAME_IS_UTF attribute is not set, the characters should be interpreted according to the current locale.

Unicode characters with values 0x0000 through 0x007F are represented by a single byte of the same value.

The first byte emitted for Unicode characters 0x0080 through 0x07FF is 0xC0+(unicodevalue >> 6), and the second byte is 0x80+(unicodevalue & 0x003F).

Unicode characters 0x0800 through 0xFFFF are represented by byte1 = 0xE0+(unicodevalue >> 12), byte2 = 0x80+((unicodevalue >> 6) & 0x3F), and byte3 = 0x80+(unicodevalue & 0x3F).
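
The three rules above translate directly into code. The following is a minimal sketch for a single 16-bit Unicode value; the function name utf_encode_char is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

// Encode one 16-bit Unicode value into out[] using the rules above.
// Returns the number of bytes written (1, 2, or 3); out must hold 3 bytes.
size_t utf_encode_char(uint16_t ch, uint8_t *out)
{
    if (ch <= 0x007F) {
        out[0] = (uint8_t)ch;
        return 1;
    }
    if (ch <= 0x07FF) {
        out[0] = (uint8_t)(0xC0 + (ch >> 6));
        out[1] = (uint8_t)(0x80 + (ch & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 + (ch >> 12));
    out[1] = (uint8_t)(0x80 + ((ch >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 + (ch & 0x3F));
    return 3;
}
```

For example, 0x00E9 encodes as the two bytes C3 A9, and 0x20AC encodes as the three bytes E2 82 AC. A name that produces any multi-byte sequence would carry the _A_NAME_IS_UTF attribute.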

Microsoft FCI/FDI Library Description

Copyright © 1996-1997 Microsoft Corporation. All rights reserved.

Topics in this section

Introduction

FCI

FDI

Introduction

The FCI (File Compression Interface) and FDI (File Decompression Interface) libraries provide the ability to create and extract files from cabinets (also known as "CAB files"). In addition, the libraries provide compression and decompression capability to reduce the size of file data stored in cabinets.

The FCI and FDI functions are available through cabinet.dll.

FCI and FDI support multiple simultaneous contexts, so it is possible to create or extract multiple cabinets simultaneously within the same application. If the application is multi-threaded, it is also possible to run a different context in each thread; however, it is not permitted for the application to use the same context simultaneously in multiple threads (e.g. one cannot call FCIAddFile from two different threads, using the same FCI context).

FCI and FDI operate using the technique of function callbacks; some of the parameters of the FCI and FDI APIs are pointers to functions in the client application. The parameters and purpose of these functions are explained fully in this document. The fci_int.h and fdi_int.h header files provide macros for declaring the callback functions, and use keywords such as HUGE, FAR, and DIAMONDAPI, which ensure that the functions are properly defined for both 32-bit and 16-bit operation. For example, in the case of the memory allocation and memory free functions, the following definitions exist in fci_int.h:

#define FNFCIALLOC(fn) void HUGE * FAR DIAMONDAPI fn(ULONG cb)
#define FNFCIFREE(fn) void FAR DIAMONDAPI fn(void HUGE *pv)

These declarations can be used as follows:

FNFCIALLOC(mem_alloc)
{
      return malloc(cb);
}

FNFCIFREE(mem_free)
{
      free(pv);
}

some_function()
{
      hfci = FCICreate(
            &erf, 
            filedest, 
            mem_alloc, 
            mem_free,
            etc.
      );
}

It should be noted that the FCI callback function names all begin with the string "FCI". In addition, the FCI and FDI i/o functions (open, close, read, write, seek) take different parameters, and cannot be used interchangeably.

The FDI i/o functions take parameters which are identical to those of the C run-time library routines _open, _close, _read, _write, and _lseek. The FCI i/o functions take similar parameters, with the addition of an error pointer in which to return an i/o error, and the client's context pointer originally passed in to the FCICreate API.

Two example applications are provided; testfci and testfdi. These applications demonstrate how all of the FCI and FDI APIs, respectively, may be used.

FCI

The five FCI (File Compression Interface) APIs are:

   API               Description
   FCICreate         Create an FCI context
   FCIAddFile        Add a file to the cabinet under construction
   FCIFlushCabinet   Complete the current cabinet
   FCIFlushFolder    Complete the current folder and start a new folder
   FCIDestroy        Destroy an FCI context

FCICreate

HFCI DIAMONDAPI FCICreate(
      PERF               perf, 
      PFNFCIFILEPLACED   pfnfiledest, 
      PFNFCIALLOC        pfnalloc, 
      PFNFCIFREE         pfnfree, 
      PFNFCIOPEN         pfnopen, 
      PFNFCIREAD         pfnread, 
      PFNFCIWRITE        pfnwrite, 
      PFNFCICLOSE        pfnclose, 
      PFNFCISEEK         pfnseek, 
      PFNFCIDELETE       pfndelete, 
      PFNFCIGETTEMPFILE  pfnfcigtf, 
      PCCAB              pccab, 
      void FAR *         pv 
);

Parameters

perf

Pointer to an error structure

pfnfiledest

Function to call when a file is placed

pfnalloc

Memory allocation function

pfnfree

Memory free function

pfnopen

Function to open a file

pfnread

Function to read data from a file

pfnwrite

Function to write data to a file

pfnclose

Function to close a file

pfnseek

Function to seek to a new position in a file

pfntemp

Function to obtain a temporary file name (declared as pfnfcigtf in the prototype above)

pfndelete

Function to delete a file

pccab

Parameters for creating cabinet

pv

Client context parameter

Description

The FCICreate API creates an FCI context that is passed to other FCI APIs.

The perf parameter should point to a global or allocated ERF structure. Any errors returned by FCICreate or subsequent FCI APIs using the same context will cause the ERF structure to be filled out.

The pfnalloc and pfnfree parameters should point to memory allocation and memory free functions which will be called by FCI to allocate and free memory. These two functions take parameters identical to the standard C malloc and free functions.

The pfnopen, pfnread, pfnwrite, pfnclose, pfnseek, and pfndelete parameters should point to functions which perform file open, file read, file write, file close, file seek, and file delete operations respectively. These functions must accept parameters similar to those for the standard _open, _read, _write, _close, _lseek, and remove functions, with two additional parameters appended to the list: err and pv. The err parameter is an int *, and upon entry into the function, *err will equal zero. However, if the function returns failure, *err should be set to an error code of the application's choosing, which will be returned via perf (the error code is not used by FCI, and is not required to conform to C run-time library errno conventions). The pv parameter will equal the client's context parameter passed in to FCICreate.

The pfntemp parameter should point to a function which returns the name of a suitable temporary file. Three parameters will be passed to this function; pszTempName, an area of memory to store the filename, cbTempName, the size of the memory area, and pv, the client's context pointer. The filename returned by this function should not occupy more than cbTempName bytes. FCI may open several temporary files at once, so it is important to ensure that a different filename is returned each time, and that the file does not already exist. The function should return TRUE for success, or FALSE for failure.
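
A temporary-file callback along these lines might look as follows. This is a sketch under stated assumptions: the function is written with plain C types instead of the library's declaration macros, the TempCtx type and the "fci%05u.tmp" naming scheme are invented for illustration, and a name counts as free when fopen cannot open it for reading.

```c
#include <stdio.h>

// Hypothetical client context: only a counter used to vary the name.
typedef struct { unsigned next_id; } TempCtx;

// Returns 1 (TRUE) and fills pszTempName on success, 0 (FALSE) on failure.
int get_temp_file(char *pszTempName, int cbTempName, void *pv)
{
    TempCtx *ctx = (TempCtx *)pv;
    int tries;

    if (cbTempName <= 0)
        return 0;

    for (tries = 0; tries < 1000; tries++) {
        FILE *probe;
        int n = snprintf(pszTempName, (size_t)cbTempName,
                         "fci%05u.tmp", ctx->next_id++);
        if (n < 0 || n >= cbTempName)
            return 0;                 // name does not fit in the buffer
        probe = fopen(pszTempName, "rb");
        if (probe == NULL)
            return 1;                 // could not open it: treat name as free
        fclose(probe);                // file exists already: try the next id
    }
    return 0;
}
```

Keeping the counter in the client context means each FCI context produces its own sequence of names. The existence probe is only a simplification; another process could still create the file between the check and its later use.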

The pfnfiledest parameter should point to a function which will be called whenever the location of a file or file segment on a particular cabinet has been finalized. This information is useful only when files are being stored across multiple cabinets. The parameters passed to this function are pccab, a pointer to the CCAB structure of the cabinet on which the file has been stored, pszFile, the filename of the file which has been placed, cbFile, the file size, and fContinuation, a Boolean which signifies whether the file is a later segment of a file which has been split across cabinets. In addition, the client context value, pv, is also passed as a parameter.

The pccab parameter should point to an initialized CCAB structure, which will provide FCI with details on how to build the cabinet. The CCAB fields are explained below:

The cb field, the media size, specifies the maximum size of a cabinet which will be created by FCI. If necessary, multiple cabinets will be created. To ensure that only one cabinet is created, a sufficiently large number should be used for this parameter.

The cbFolderThresh field specifies the maximum number of compressed bytes which may reside in a folder before a new folder is created. A higher folder threshold improves compression performance (since creating a new folder resets the compression history), but increases random access time to the folder.

The iCab field is used by FCI to count the number of cabinets that have been created so far. This value can also be read by the application to determine the name of a cabinet. See the GetNextCab parameter of the FCIAddFile API for details.

The iDisk field is used in a similar manner to iCab. See the GetNextCab parameter of the FCIAddFile API for details.

The setID field is for the use of the application, and can be initialized with any number. The set ID is stored in the cabinet.

The szDisk field should contain a disk-specific string (such as "Disk1", "Disk2", etc.) corresponding to the disk on which the cabinet is placed. Alternatively, if cabinets are not spanning multiple disks, the string can simply be a null string. This field is stored in the cabinet and is used upon extraction to prompt the user to insert the correct disk. See the FCIAddFile API for details.

The szCab field should contain a string which contains the name of the first cabinet to be created (e.g. "APP1.CAB"). In the event of multiple cabinets being created, the GetNextCab function called by the FCIAddFile API allows subsequent cabinet names to be specified.

The szCabPath field should contain the complete path of where to create the cabinet (e.g. "C:\MYFILES\").

The cbReserveCFHeader, cbReserveCFFolder, and cbReserveCFData fields can be set to create per-cabinet, per-folder, and per-datablock reserved sections in the cabinet. For example, setting cbReserveCFHeader to 6144 is commonly used to reserve a 6 KB space in the cabinet file as needed for code signing. The other reserved sections are not commonly used.
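
The field descriptions above can be summarized in a short initialization sketch. SketchCCAB is a stand-in that lists only the fields named in the text; its field types and array sizes, the function name, and the chosen values (including the starting value of iCab) are assumptions, and the real CCAB definition in the FCI header should be used in practice.

```c
#include <string.h>

// Stand-in for the CCAB structure, limited to the fields described above.
// Field types and array sizes are assumptions made for this sketch.
typedef struct {
    unsigned long  cb;
    unsigned long  cbFolderThresh;
    unsigned int   cbReserveCFHeader;
    unsigned int   cbReserveCFFolder;
    unsigned int   cbReserveCFData;
    int            iCab;
    int            iDisk;
    unsigned short setID;
    char           szDisk[256];
    char           szCab[256];
    char           szCabPath[256];
} SketchCCAB;

// Fill in parameters for a single, non-spanning cabinet.
void init_single_cabinet(SketchCCAB *p)
{
    memset(p, 0, sizeof *p);
    p->cb                = 0x7FFFFFFF;  // large media size: one cabinet only
    p->cbFolderThresh    = 0x7FFFFFFF;  // no automatic folder breaks
    p->cbReserveCFHeader = 6144;        // room reserved for code signing
    p->iCab              = 1;           // assumed starting cabinet count
    p->iDisk             = 0;
    p->setID             = 12345;       // any application-chosen number
    strcpy(p->szDisk, "");              // cabinets do not span disks
    strcpy(p->szCab, "APP1.CAB");
    strcpy(p->szCabPath, "C:\\MYFILES\\");
}
```

A very large folder threshold favors compression ratio; an application that needs faster random access would choose a smaller value.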

Returns

If successful, a non-NULL HFCI context pointer is returned. If unsuccessful, NULL is returned, and the error structure pointed to by perf is filled out.

FCIAddFile

BOOL DIAMONDAPI FCIAddFile(
      HFCI                  hfci, 
      char                 *pszSourceFile, 
      char                 *pszFileName, 
      BOOL                  fExecute, 
      PFNFCIGETNEXTCABINET  GetNextCab, 
      PFNFCISTATUS          pfnProgress, 
      PFNFCIGETOPENINFO     pfnOpenInfo, 
      TCOMP                 typeCompress 
);

Parameters

hfci

FCI Context pointer originally returned by FCICreate

pszSourceFile

Name of file to add (should include path information)

pszFileName

Name under which to store the file in the cabinet

fExecute

Boolean indicating whether the file should be executed when it is extracted

GetNextCab

Function called to obtain specifications on the next cabinet to create

pfnProgress

Progress function called to update the user

pfnOpenInfo

Function called to open a file and return file date, time and attributes

typeCompress

Compression type to use

Description

The FCIAddFile API adds a file to the cabinet under construction.

The hfci parameter must be the context pointer returned by a previous call to FCICreate.

The pszSourceFile parameter specifies the location of the file to be added to the cabinet, and should therefore include as much path information as possible (e.g. "C:\MYFILES\TEST.EXE").

The pszFileName parameter specifies the name of the file inside the cabinet, and should not include any path information (e.g. "TEST.EXE").

The fExecute parameter specifies whether the file should be executed automatically when the cabinet is extracted. When set, the _A_EXEC attribute will be added to the file entry in the CAB. This mechanism is used in some Microsoft self-extracting executables, and could be used for this purpose in any custom extract application.

The GetNextCab parameter should point to a function which is called whenever FCI wishes to create a new cabinet, which will happen whenever the size of the cabinet is about to exceed the media size as specified in the cb field of the CCAB structure passed to FCICreate. The GetNextCab function is called with three parameters which are explained below:

The first parameter, pccab, is a pointer to a copy of the CCAB structure of the cabinet which has just been completed. However, the iCab field will have been incremented by one. When this function returns, the next cabinet will be created using the fields in this structure, so these fields should be modified as necessary. In particular, the szCab field (the cabinet name) should be changed. If creating multiple cabinets, typically the iCab field is used to create the name; for example, the GetNextCab function might include a line that does:

sprintf(pccab->szCab, "FOO%d.CAB", pccab->iCab);

Similarly, the disk name, media size, folder threshold, etc. parameters may also be modified.

The second parameter, cbPrevCab, is an estimate of the size of the cabinet which has just been completed.

The last parameter, pv, is the application-defined value originally passed to FCICreate.

The GetNextCab function should return TRUE for success, or FALSE to abort cabinet creation.
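
Putting the three parameters and the return value together, a GetNextCab function might be sketched as below. MiniCCAB is a stand-in holding only the two CCAB fields this sketch touches, and the function is written with plain C types instead of the library's declaration macros; both are assumptions for illustration.

```c
#include <stddef.h>
#include <stdio.h>

// Stand-in with only the CCAB fields used here; layout is an assumption.
typedef struct {
    int  iCab;
    char szCab[256];
} MiniCCAB;

// Name the next cabinet after its (already incremented) iCab counter.
// Returns 1 (TRUE) to continue, 0 (FALSE) to abort cabinet creation.
int get_next_cab(MiniCCAB *pccab, unsigned long cbPrevCab, void *pv)
{
    int n;

    (void)cbPrevCab;    // size estimate of the finished cabinet: unused here
    (void)pv;           // client context: unused here

    n = snprintf(pccab->szCab, sizeof pccab->szCab, "FOO%d.CAB", pccab->iCab);
    return (n > 0 && (size_t)n < sizeof pccab->szCab) ? 1 : 0;
}
```

Using snprintf rather than sprintf keeps the generated name within the szCab buffer and lets the function report failure if it would not fit.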

The pfnProgress parameter should point to a function that is called periodically by FCI so that the application may send a progress report to the user. The progress function has four parameters; typeStatus, which specifies the type of status message, cb1 and cb2, which are numbers, the meaning of which is dependent upon typeStatus, and pv, the application-specific context pointer.

The typeStatus parameter may take on values of statusFile, statusFolder, or statusCabinet. If typeStatus equals statusFile then it means that FCI is compressing data blocks into a folder. In this case, cb1 is either zero, or the compressed size of the most recently compressed block, and cb2 is either zero, or the uncompressed size of the most recently read block (which is usually 32K, except for the last block in a folder, which may be smaller). There is no direct relation between cb1 and cb2; FCI may read several blocks of uncompressed data before emitting any compressed data; if this happens, some statusFile messages may contain, for example, cb1 = 0 and cb2 = 32K, followed later by other messages which contain cb1 = 20K and cb2 = 0.

If typeStatus equals statusFolder then it means that FCI is copying a folder to a cabinet, and cb1 is the amount copied so far, and cb2 is the total size of the folder. Finally, if typeStatus equals statusCabinet, then it means that FCI is writing out a completed cabinet, and cb1 is the estimated cabinet size that was previously passed to GetNextCab, and cb2 is the actual resulting cabinet size.

The progress function should return 0 for success, or -1 for failure, with an exception in the case of statusCabinet messages, where the function should return the desired cabinet size (cb2), or possibly a value rounded up to slightly higher than that.
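
The return-value rules above can be illustrated with a small progress function. In this sketch the status constants are local stand-ins whose numeric values are assumptions, ProgressTotals is a hypothetical client context, and the function uses plain C types instead of the library's declaration macros.

```c
// Local stand-ins for the status codes named in the text; the numeric
// values are assumptions, not taken from the FCI header.
enum { statusFile = 0, statusFolder = 1, statusCabinet = 2 };

// Hypothetical client context accumulating running totals.
typedef struct {
    unsigned long compressed;
    unsigned long uncompressed;
} ProgressTotals;

long progress(unsigned int typeStatus, unsigned long cb1,
              unsigned long cb2, void *pv)
{
    ProgressTotals *t = (ProgressTotals *)pv;

    switch (typeStatus) {
    case statusFile:
        // cb1 and cb2 arrive independently; either may be zero.
        t->compressed   += cb1;
        t->uncompressed += cb2;
        return 0;
    case statusFolder:
        return 0;               // cb1 of cb2 bytes copied so far
    case statusCabinet:
        return (long)cb2;       // accept the actual cabinet size
    default:
        return 0;
    }
}
```

Because cb1 and cb2 are not paired in statusFile messages, the sketch accumulates them separately rather than computing a per-message ratio.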

The pfnOpenInfo parameter should point to a function which opens a file and returns its datestamp, timestamp, and attributes. The function will receive five parameters; pszName, the complete pathname of the file to open; pdate, a memory location to return a FAT-style date code; ptime, a memory location to return a FAT-style time code; pattribs, a memory location to return FAT-style attributes; and pv, the application-specific context pointer originally passed to FCICreate. The function should open the file using a file open function compatible with those passed in to FCICreate, and return the resulting file handle, or -1 if unsuccessful.
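
The FAT-style date and time codes returned through pdate and ptime can be built from a broken-down time. The following is a minimal sketch; fat_pack is an illustrative name, and obtaining the struct tm from the file's modification time is left to the application.

```c
#include <stdint.h>
#include <time.h>

// Pack a broken-down time into FAT-style date and time words.
// date: ((year - 1980) << 9) | (month << 5) | day
// time: (hour << 11) | (minute << 5) | (seconds / 2)
void fat_pack(const struct tm *t, uint16_t *pdate, uint16_t *ptime)
{
    *pdate = (uint16_t)(((t->tm_year + 1900 - 1980) << 9) |
                        ((t->tm_mon + 1) << 5) |
                        t->tm_mday);
    *ptime = (uint16_t)((t->tm_hour << 11) |
                        (t->tm_min << 5) |
                        (t->tm_sec / 2));
}
```

For March 12, 1997 at 11:13:52 this produces date 0x226C and time 0x59BA. Years before 1980 cannot be represented and would need to be rejected or clamped by the caller.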

The typeCompress parameter specifies the type of compression to use, which may be either tcompTYPE_NONE for no compression, or tcompTYPE_MSZIP for Microsoft ZIP compression. Other compression formats may be supported in the future.

Returns

If successful, TRUE is returned. If unsuccessful, FALSE is returned, and the error structure pointed to by perf (from FCICreate) is filled out.

FCIFlushCabinet

BOOL DIAMONDAPI FCIFlushCabinet(
      HFCI                  hfci, 
      BOOL                  fGetNextCab, 
      PFNFCIGETNEXTCABINET  GetNextCab, 
      PFNFCISTATUS          pfnProgress 
);

Parameters

hfci

FCI Context pointer originally returned by FCICreate

fGetNextCab

Specifies whether the function pointed to by the supplied GetNextCab parameter will be called

GetNextCab

Function called to obtain specifications on the next cabinet to create

pfnProgress

Progress function called to update the user

Description

The FCIFlushCabinet API forces the current cabinet under construction to be completed immediately and written to disk. Further calls to FCIAddFile will cause files to be added to another cabinet. It is also possible that pending data in FCI's internal buffers will require spillover into another cabinet, if the current cabinet has reached the application-specified media size limit.

The hfci parameter must be the context pointer returned by a previous call to FCICreate.

The fGetNextCab flag determines whether the function pointed to by the supplied GetNextCab parameter will be called. If fGetNextCab is TRUE, then GetNextCab will be called to obtain continuation information. Otherwise, if fGetNextCab is FALSE, then GetNextCab will be called only if the cabinet overflows.

The pfnProgress parameter should point to a function which is called periodically by FCI so that the application may send a progress report to the user. This function works in an identical manner to the progress function passed to FCIAddFile.

Returns

If successful, TRUE is returned. If unsuccessful, FALSE is returned, and the error structure pointed to by perf (from FCICreate) is filled out.

FCIFlushFolder

BOOL DIAMONDAPI FCIFlushFolder(
      HFCI                  hfci, 
      PFNFCIGETNEXTCABINET  GetNextCab, 
      PFNFCISTATUS          pfnProgress 
);

Parameters

hfci

FCI Context pointer originally returned by FCICreate

GetNextCab

Function called to obtain specifications on the next cabinet to create

pfnProgress

Progress function called to update the user

Description

The FCIFlushFolder API forces the current folder under construction to be completed immediately, effectively resetting the compression history at this point (if compression is being used).

The hfci parameter must be the context pointer returned by a previous call to FCICreate.

The supplied GetNextCab function will be called if the cabinet overflows, which is a possibility if the pending data buffered inside FCI causes the application-specified cabinet media size to be exceeded.

The pfnProgress parameter should point to a function which is called periodically by FCI so that the application may send a progress report to the user. This function works in an identical manner to the progress function passed to FCIAddFile.

FCIDestroy

BOOL DIAMONDAPI FCIDestroy(
      HFCI  hfci
);

Parameters

hfci

FCI Context pointer originally returned by FCICreate

Description

The FCIDestroy API destroys an hfci context, freeing any memory and temporary files associated with the context.

Returns

If successful, TRUE is returned. If unsuccessful, FALSE is returned. The only reason for failure is that the hfci passed in was not a proper context handle.

FDI

The four FDI (File Decompression Interface) APIs are:

   API            Description
   FDICreate      Create an FDI context
   FDIIsCabinet   Determines whether a file is a cabinet, and returns information if so
   FDICopy        Extracts files from cabinets
   FDIDestroy     Destroy an FDI context

FDICreate

HFDI DIAMONDAPI FDICreate(
      PFNALLOC  pfnalloc, 
      PFNFREE   pfnfree, 
      PFNOPEN   pfnopen, 
      PFNREAD   pfnread, 
      PFNWRITE  pfnwrite, 
      PFNCLOSE  pfnclose, 
      PFNSEEK   pfnseek, 
      int       cpuType, 
      PERF      perf 
);

Parameters

pfnalloc

Memory allocation function

pfnfree

Memory free function

pfnopen

Function to open a file

pfnread

Function to read data from a file

pfnwrite

Function to write data to a file

pfnclose

Function to close a file

pfnseek

Function to seek to a new position in a file

cpuType

Type of CPU

perf

Pointer to an error structure

Description

The FDICreate API creates an FDI context that is passed to other FDI APIs.

The pfnalloc and pfnfree parameters should point to memory allocation and memory free functions which will be called by FDI to allocate and free memory. These two functions take parameters identical to the standard C malloc and free functions.

The pfnopen, pfnread, pfnwrite, pfnclose, and pfnseek parameters should point to functions which perform file open, file read, file write, file close, and file seek operations respectively. These functions should accept parameters identical to those for the standard _open, _read, _write, _close, and _lseek functions, and should likewise have identical return codes. Note that the FDI i/o functions do not take the same parameters as the FCI i/o functions.

It is not necessary for these functions to actually call _open etc.; these functions could instead call fopen, fread, fwrite, fclose, and fseek, or CreateFile, ReadFile, WriteFile, CloseHandle, and SetFilePointer, etc. However, the parameters and return codes will have to be translated appropriately (e.g. the file open mode passed in to pfnopen).
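
As one example of such a translation, an open callback built on fopen has to turn the _open-style flag word into a mode string. The sketch below covers only a few common combinations; the function name is hypothetical, the flag names are those of the POSIX <fcntl.h> header (assumed here to correspond to the flags FDI passes), and O_RDONLY is assumed to be zero as it is on common platforms.

```c
#include <fcntl.h>
#include <stddef.h>

// Map a subset of _open-style flags to an fopen mode string.
// Returns NULL for combinations this sketch does not handle.
const char *fopen_mode_for(int oflag)
{
    int rw       = oflag & (O_WRONLY | O_RDWR);   // O_RDONLY assumed zero
    int truncate = (oflag & O_CREAT) && (oflag & O_TRUNC);

    if (rw == 0)
        return "rb";                              // read an existing file
    if (truncate)
        return (rw == O_RDWR) ? "w+b" : "wb";     // create or overwrite
    if (rw == O_RDWR)
        return "r+b";                             // update an existing file
    return NULL;                                  // no direct stdio equivalent
}
```

The same kind of mapping is needed for return values, since fread and fwrite report item counts while _read and _write report byte counts or -1.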

The cpuType parameter should equal one of cpu80386 (indicating that 80386 instructions may be used), cpu80286 (indicating that only 80286 instructions may be used), or cpuUNKNOWN (indicating that FDI should determine the CPU type). The cpuType parameter is looked at only by the 16-bit version of FDI; it is ignored by the 32-bit version of FDI.

The perf parameter should point to a global or allocated ERF structure. Any errors returned by FDICreate or subsequent FDI APIs using the same context will cause the ERF structure to be filled out.

Returns

If successful, a non-NULL HFDI context pointer is returned. If unsuccessful, NULL is returned, and the error structure pointed to by perf is filled out.

FDIIsCabinet

BOOL DIAMONDAPI FDIIsCabinet(
      HFDI             hfdi, 
      int              hf, 
      PFDICABINETINFO  pfdici 
);

Parameters

hfdi

FDI Context pointer originally returned by FDICreate

hf

File handle returned by a call to the application's file open function

pfdici

Pointer to a cabinet info structure

Description

The FDIIsCabinet API determines whether a given file is a cabinet, and if so, returns information about the cabinet in the provided FDICABINETINFO structure.

The hfdi parameter is the context pointer returned by a previous call to FDICreate.

The hf parameter must be a file handle on the file being examined. The file handle must be of the same type as those used by the file i/o functions passed to FDICreate.

The pfdici parameter should point to an FDICABINETINFO structure, which will receive the cabinet details if the file is indeed a cabinet. The fields of this structure are as follows:

The cbCabinet field contains the length of the cabinet file, in bytes.

The cFolders field contains the number of folders in the cabinet.

The cFiles field contains the total number of files in the cabinet.

The setID field contains the set ID (an application-defined magic number) of the cabinet.

The iCabinet field contains the number of this cabinet in the set (0 for the first cabinet, 1 for the second, and so forth).

The fReserve field is a Boolean indicating whether there is a reserved area present in the cabinet.

The hasprev field is a Boolean indicating whether this cabinet is chained to the previous cabinet, by way of having a file continued from the previous cabinet into the current one.

The hasnext field is a Boolean indicating whether this cabinet is chained to the next cabinet, by way of having a file continued from this cabinet into the next one.

Returns

If the file is a cabinet, then TRUE is returned and the FDICABINETINFO structure is filled out. If the file is not a cabinet, or some other error occurred, then FALSE is returned. In either case, it is the responsibility of the application to close the file handle passed to this function.

FDICopy

BOOL FAR DIAMONDAPI FDICopy(
      HFDI           hfdi, 
      char FAR      *pszCabinet, 
      char FAR      *pszCabPath, 
      int            flags, 
      PFNFDINOTIFY   pfnfdin, 
      PFNFDIDECRYPT  pfnfdid, 
      void FAR      *pvUser 
);

Parameters

hfdi

FDI Context pointer originally returned by FDICreate

pszCabinet

Name of cabinet file, excluding path information

pszCabPath

File path to cabinet file

flags

Flags to control the extract operation

pfnfdin

Pointer to a notification (status update) function

pfnfdid

Pointer to a decryption function

pvUser

Application-specified value to pass to notification function

Description

The FDICopy API extracts one or more files from a cabinet. Information on each file in the cabinet is passed back to the supplied pfnfdin function, at which point the application may decide to extract or not extract the file.

The hfdi parameter is the context pointer returned by a previous call to FDICreate.

The pszCabinet parameter should be the name of the cabinet file, excluding any path information, from which to extract files. If a file is split over multiple cabinets, FDICopy allows subsequent cabinets to be opened.

The pszCabPath parameter should be the file path of the cabinet file (e.g. "C:\MYCABS\"). The contents of pszCabPath and pszCabinet will be strung together to create the full pathname of the cabinet.

The flags parameter is used to set flags for the decoder. At this time there are no flags defined, and the flags parameter should be set to zero.

The pfnfdin parameter should point to a file notification function, which will be called periodically to update the application on the status of the decoder. The pfnfdin function takes two parameters; fdint, an integral value indicating the type of notification message, and pfdin, a pointer to an FDINOTIFICATION structure.

The fdint parameter may equal one of the following values; fdintCABINET_INFO (general information about the cabinet), fdintPARTIAL_FILE (the first file in the cabinet is a continuation from a previous cabinet), fdintCOPY_FILE (asks the application if this file should be copied), fdintCLOSE_FILE_INFO (close the file and set file attributes, date, etc.), or fdintNEXT_CABINET (file continued on next cabinet).

The pfdin parameter will point to an FDINOTIFICATION structure with some or all of the fields filled out, depending on the value of the fdint parameter. Four of the fields are used for general data; cb (a long integer), and psz1, psz2, and psz3 (pointers to strings), the meanings of which are highly dependent on the fdint value. The pv field will be the value the application originally passed in as the pvUser parameter to FDICopy.

The pfnfdin function must return a value to FDI, which tells FDI whether to continue, abort, skip a file, or perform some other operation. The values that can be returned depend on fdint, and are explained below.

Note that it is possible that future versions of FDI will have additional notification messages. Therefore, the application should ignore values of fdint it does not understand, and return zero to continue (preferably), or -1 (negative one) to abort.

If fdint equals fdintCABINET_INFO then the following fields will be filled out; psz1 will point to the name of the next cabinet (excluding path information); psz2 will point to the name of the next disk; psz3 will point to the cabinet path name; setID will equal the set ID of the current cabinet; and iCabinet will equal the cabinet number within the cabinet set (0 for the first cabinet, 1 for the second cabinet, etc.). The application should return 0 to indicate success, or -1 to indicate failure, which will abort FDICopy. An fdintCABINET_INFO notification will be provided exactly once for each cabinet opened by FDICopy, including continuation cabinets opened due to files spanning cabinet boundaries.

If fdint equals fdintCOPY_FILE then the following fields will be filled out; psz1 will point to the name of a file in the cabinet; cb will equal the uncompressed size of the file; date will equal the file's 16-bit FAT date; time will equal the file's 16-bit FAT time; and attribs will equal the file's 16-bit FAT attributes. The application may return one of three values; 0 (zero) to skip (i.e. not copy) the file; -1 (negative one) to abort FDICopy; or a non-zero (and non-negative-one) file handle for the destination to which to write the file. The file handle returned must be compatible with the PFNCLOSE function supplied to FDICreate. The fdintCOPY_FILE notification is called for each file that starts in the current cabinet, providing the opportunity for the application to request that the file be copied or skipped.

If fdint equals fdintCLOSE_FILE_INFO then the following fields will be filled out; psz1 will point to the name of a file in the cabinet; hf will be a file handle (which originated from fdintCOPY_FILE); date will equal the file's 16-bit FAT date; time will equal the file's 16-bit FAT time; attribs will equal the file's 16-bit FAT attributes (minus the _A_EXEC bit); and cb will equal either zero (0) or one (1), indicating whether the file should be executed after extract (one), or not (zero). It is the responsibility of the application to execute the file if cb equals one. The fdintCLOSE_FILE_INFO notification is called after all of the data has been written to a target file. The application must close the file (using the provided hf handle), and set the file date, time, and attributes. The application should return TRUE for success, or FALSE or -1 (negative one) to abort FDICopy. FDI assumes that the target file was closed, even if this callback returns failure; FDI will not attempt to use PFNCLOSE to close the file.
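
The date and time fields delivered with fdintCOPY_FILE and fdintCLOSE_FILE_INFO are packed 16-bit FAT values. The following is a minimal sketch of unpacking them, assuming the standard MS-DOS/FAT layout (date: bits 15-9 hold the year since 1980, bits 8-5 the month, bits 4-0 the day; time: bits 15-11 hold hours, bits 10-5 minutes, bits 4-0 seconds divided by two). The fat_stamp type and fat_unpack function are hypothetical names used for illustration and are not part of FDI.

```c
#include <stdint.h>

/* Hypothetical holder for an unpacked FAT timestamp (not an FDI type). */
typedef struct {
    int year, month, day;      /* calendar date */
    int hour, minute, second;  /* time of day; seconds have 2-second resolution */
} fat_stamp;

/* Unpack 16-bit FAT date and time words (standard MS-DOS layout assumed). */
static fat_stamp fat_unpack(uint16_t date, uint16_t time)
{
    fat_stamp s;
    s.year   = 1980 + ((date >> 9) & 0x7F);  /* bits 15-9: years since 1980 */
    s.month  = (date >> 5) & 0x0F;           /* bits 8-5:  month 1-12 */
    s.day    = date & 0x1F;                  /* bits 4-0:  day 1-31 */
    s.hour   = (time >> 11) & 0x1F;          /* bits 15-11: hours */
    s.minute = (time >> 5) & 0x3F;           /* bits 10-5:  minutes */
    s.second = (time & 0x1F) * 2;            /* bits 4-0:   seconds / 2 */
    return s;
}
```

On Win32, an application would more likely pass the two words straight to DosDateTimeToFileTime and then set the file time from the result; the manual unpacking above is mainly useful on other platforms or for display.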

If fdint equals fdintPARTIAL_FILE then the following fields will be filled out; psz1 will point to the name of the file continued from a previous cabinet; psz2 will point to the name of the cabinet on which the first segment of the file exists; psz3 will point to the name of the disk on which the first segment of the file exists. The fdintPARTIAL_FILE notification is called for files at the beginning of a cabinet which are continued from a previous cabinet. This notification will occur only when FDICopy is started on the second or subsequent cabinet in a series, which has files continued from a previous cabinet. The application should return zero (0) for success, or -1 (negative one) for failure, which will abort FDICopy.

If fdint equals fdintNEXT_CABINET then the following fields will be filled out; psz1 will point to the name of the next cabinet on which the current file is continued; psz2 will point to the name of the next disk on which the current file is continued; psz3 will point to the cabinet path information; and fdie will equal a success or error value. The fdintNEXT_CABINET notification is called only when fdintCOPY_FILE was instructed to copy a file in the current cabinet that is continued in a subsequent cabinet.

It is important that the cabinet path name, psz3, be validated before returning (psz3, which points to a 256 byte array, may be modified by the application; however, it is not permissible to modify psz1 or psz2). The application should ensure that the cabinet exists and is readable before returning; if necessary, the application should issue a disk change prompt and ensure that the cabinet file exists.

When this function returns to FDI, FDI will verify that the setID and iCabinet fields of the supplied cabinet match the expected values for that cabinet. If not, FDI will continue to send fdintNEXT_CABINET notification messages with the fdie field set to FDIERROR_WRONG_CABINET, until the correct cabinet file is specified, or until this function returns -1 (negative one) to abort the FDICopy call. If after returning from this function, the cabinet file is not present and readable, or has been damaged, then the fdie field will equal one of the following values; FDIERROR_CABINET_NOT_FOUND, FDIERROR_NOT_A_CABINET, FDIERROR_UNKNOWN_CABINET_VERSION, FDIERROR_CORRUPT_CABINET, FDIERROR_BAD_COMPR_TYPE, FDIERROR_RESERVE_MISMATCH, FDIERROR_WRONG_CABINET. If there was no error, fdie will equal FDIERROR_NONE. The application should return 0 (zero) to indicate success, or -1 (negative one) to indicate failure, which will abort FDICopy.

The pfnfdid parameter is reserved for encryption, and is currently not used by FDI. This parameter should be set to NULL.

The pvUser parameter should contain an application-defined value that will be passed back as a field in the FDINOTIFICATION structure of the notification function. If not required, this field may be safely set to NULL.

Returns

If successful, TRUE is returned. If unsuccessful, FALSE is returned, and the error structure pointed to by perf (from FDICreate) is filled out.

FDIDestroy

BOOL DIAMONDAPI FDIDestroy(
      HFDI  hfdi
);

Parameters

hfdi

FDI Context pointer originally returned by FDICreate

Description

The FDIDestroy API destroys an hfdi context, freeing any memory and temporary files associated with the context.

Returns

If successful, TRUE is returned. If unsuccessful, FALSE is returned. The only reason for failure is that the hfdi passed in was not a proper context handle.

Microsoft LZX Data Compression Format

Copyright © 1997 Microsoft Corporation. All rights reserved.

Topics in this section

Introduction

Concepts

LZ77
Bitstream
Window Size
Trees
Repeated Offsets
Constants

LZX Compressed Data Format

Cabinet Block Size
Header Structure
Encoder Preprocessing
Block Structure
Uncompressed Block Format
Verbatim Block
Aligned Offset Block
Encoding the Trees and Pre-Trees
Compressed Literals
Match Offset => Formatted Offset
Formatted Offset => Position Slot, Position Footer
Position Footer => Verbatim Bits, Aligned Offset Bits
Match Length => Length Header, Length Footer
Length Header, Position Slot => Length/Position Header
Encoding a Match
Decoding a Match or an Uncompressed Character

Introduction

This document is a design specification for the format of LZX compressed data used in the LZX compression mode of Microsoft's CAB file format. The purpose of this document is to allow anyone to encode or decode LZX compressed data. This document describes only the format of the output; it does not provide any specific algorithms for match location, tree generation, etc.

Before proceeding with the design specification itself, a few important concepts are described in the following pages.

Concepts

This section includes:

LZ77
Bitstream
Window Size
Trees
Repeated Offsets
Constants

LZ77

LZX is an LZ77 based compressor that uses static Huffman encoding and a sliding window of selectable size. Data symbols are encoded either as an uncompressed symbol, or as an (offset, length) pair indicating that length symbols should be copied from a displacement of -offset symbols from the current position in the output stream. The value of offset is constrained to be less than the size of the sliding window.

Bitstream

An LZX bitstream is a sequence of 16 bit integers stored in the order least-significant-byte most-significant-byte. Given an input stream of bits named a, b, c, ..., x, y, z, A, B, C, D, E, F, the output byte stream (with byte boundaries highlighted) would be as shown below.

[Figure: Output byte stream]
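
Reading such a bitstream can be sketched as follows. The text states only that the stream is a sequence of 16-bit integers stored least-significant byte first; this sketch additionally assumes that bits are consumed from the most significant end of each 16-bit word, which is the ordering the figure is understood to depict. The lzx_bits type and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader over a buffer of little-endian 16-bit words. */
typedef struct {
    const uint8_t *data;  /* compressed bytes */
    size_t size;          /* number of bytes in data */
    size_t pos;           /* index of the next unread byte */
    uint32_t bitbuf;      /* unread bits live in the low 'bitcount' bits */
    int bitcount;         /* number of valid bits in bitbuf */
} lzx_bits;

static void lzx_bits_init(lzx_bits *b, const uint8_t *data, size_t size)
{
    b->data = data;
    b->size = size;
    b->pos = 0;
    b->bitbuf = 0;
    b->bitcount = 0;
}

/* Read n bits (0 <= n <= 16). Assumption: bits are taken from the most
 * significant end of each 16-bit word. Reading past the end of the buffer
 * yields zero bits rather than an error, to keep the sketch short. */
static uint32_t lzx_read_bits(lzx_bits *b, int n)
{
    while (b->bitcount < n) {
        /* Fetch one 16-bit word: low byte first, then high byte. */
        uint32_t lo = b->pos < b->size ? b->data[b->pos] : 0;
        uint32_t hi = b->pos + 1 < b->size ? b->data[b->pos + 1] : 0;
        b->pos += 2;
        b->bitbuf = (b->bitbuf << 16) | (hi << 8) | lo;
        b->bitcount += 16;
    }
    b->bitcount -= n;
    return (b->bitbuf >> b->bitcount) & ((1u << n) - 1u);
}
```

With the bytes 34 12 CD AB (hex), the two words are 0x1234 and 0xABCD, so reading 4, 8, 8, and 12 bits in turn yields 0x1, 0x23, 0x4A, and 0xBCD under the stated assumption.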

Window Size

The window size must be a power of 2, from 2^15 to 2^21. The window size is not stored in the compressed data stream, and must instead be passed to the decoder before decoding begins.

The window size determines the number of window subdivisions, or "position slots", as shown in the following table:

Window Size / Position Slot Table

Window Size    Position Slots Required
32K            30
64K            32
128K           34
256K           36
512K           38
1 MB           42
2 MB           50

Trees

LZX uses canonical Huffman tree structures to represent elements. Huffman trees are well known in data compression and are not described here. Since an LZX decoder uses only the path lengths of the Huffman tree to reconstruct the identical tree, the following constraints are made on the tree structure:

  1. For any two elements with the same path length, the lower-numbered element must be further left on the tree than the higher numbered element. An alternative way of stating this constraint is that lower-numbered elements must have lower path traversal values; for example, 0010 (left-left-right-left) is lower than 0011 (left-left-right-right).
  2. For each level, starting at the deepest level of the tree and then moving upwards, leaf nodes must start as far left as possible. An alternative way of stating this constraint is that if any tree node has children then all tree nodes to the left of it with the same path length must also have children.
  3. Zero length Huffman codes are not permitted, therefore a tree must contain at least 2 elements. In the case where all tree elements are zero frequency, or all but one tree element is zero frequency, the resulting tree must consist of the two Huffman codes "0" and "1". In the latter case, constraint #1 still applies.

LZX uses several Huffman tree structures. The most important tree is the main tree, which comprises 256 elements corresponding to all possible ASCII characters, plus 8 * NUM_POSITION_SLOTS (see above) elements corresponding to matches. The second most important tree is the length tree, which comprises 249 elements.

Other trees, such as the aligned offset tree (comprising 8 elements), and the pre-trees (comprising 20 elements each), have a smaller role.

Repeated Offsets

LZX extends the conventional LZ77 format in several ways, one of which is in the use of repeated offset codes. Three match offset codes, named the repeated offset codes, are reserved to indicate that the current match offset is the same as that of one of the three previous matches which is not itself a repeated offset.

The three special offset codes are encoded as offset values 0, 1, and 2 (i.e. encoding an offset of 0 means "use the most recent non-repeated match offset", an offset of 1 means "use the second most recent non-repeated match offset", etc.). All remaining offset values are displaced by +2, as is shown in the table below, which prevents matches at offsets WINDOW_SIZE, WINDOW_SIZE-1, and WINDOW_SIZE-2.

Correlation Between Encoded Offset and Real Offset

Encoded Offset                      Real Offset
0                                   Most recent non-repeated match offset
1                                   Second most recent non-repeated match offset
2                                   Third most recent non-repeated match offset
3                                   1 (closest allowable)
4                                   2
5                                   3
6                                   4
7                                   5
8                                   6
500                                 498
x+2                                 x
WINDOW_SIZE-1 (maximum possible)    WINDOW_SIZE-3

The three most recent non-repeated match offsets are kept in a list, the behavior of which is explained below:

Let R0 be defined as the most recent non-repeated offset
Let R1 be defined as the second most recent non-repeated offset
Let R2 be defined as the third most recent non-repeated offset

The list is managed similarly to an LRU (least recently used) queue, with the exception of the cases when R1 or R2 is output. In these cases, which are fairly uncommon, R1 or R2 is simply swapped with R0, which requires fewer operations than would an LRU queue. The compression penalty from doing so is essentially zero and it removes a small computational overhead from the decoder.

The initial state of R0, R1, R2 is (1, 1, 1).

Management of the Repeated Offsets List

Match Offset X where...           Operation
X ≠ R0 and X ≠ R1 and X ≠ R2      R2 ← R1; R1 ← R0; R0 ← X
X = R0                            None
X = R1                            Swap R0 ⇔ R1
X = R2                            Swap R0 ⇔ R2
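
The list management in the table above can be sketched as follows. The lzx_repeats type and function names are hypothetical; the sketch tests R0, R1, and R2 in that order, which matters only when two of them hold the same value (as in the initial state).

```c
#include <stdint.h>

/* Hypothetical container for the three repeated offsets. */
typedef struct {
    uint32_t r0, r1, r2;
} lzx_repeats;

/* Initial state of R0, R1, R2 is (1, 1, 1). */
static void lzx_repeats_init(lzx_repeats *r)
{
    r->r0 = r->r1 = r->r2 = 1;
}

/* Update the list after a match at real offset x. */
static void lzx_repeats_update(lzx_repeats *r, uint32_t x)
{
    uint32_t tmp;
    if (x == r->r0) {
        /* X = R0: nothing to do */
    } else if (x == r->r1) {
        /* X = R1: swap R0 and R1 */
        tmp = r->r0; r->r0 = r->r1; r->r1 = tmp;
    } else if (x == r->r2) {
        /* X = R2: swap R0 and R2 */
        tmp = r->r0; r->r0 = r->r2; r->r2 = tmp;
    } else {
        /* New offset: shift the list down and insert at the front */
        r->r2 = r->r1;
        r->r1 = r->r0;
        r->r0 = x;
    }
}
```

Starting from (1, 1, 1), matches at offsets 10, 20, 10, and 1 leave the list as (10, 1, 1), (20, 10, 1), (10, 20, 1), and (1, 20, 10) respectively.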

Constants

The following named constants are used frequently in this document:

Constant                 Description                               Value
MIN_MATCH                Smallest allowable match length           2
MAX_MATCH                Largest allowable match length            257
NUM_CHARS                Number of uncompressed character types    256
WINDOW_SIZE              Window size                               Varies
NUM_POSITION_SLOTS       Number of window subdivisions             Dependent upon WINDOW_SIZE
MAIN_TREE_ELEMENTS       Number of elements in main tree           NUM_CHARS + NUM_POSITION_SLOTS*8
NUM_SECONDARY_LENGTHS    Number of elements in length tree         249
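
For reference, the fixed constants might be declared in C as sketched below. The LZX_ prefix and the helper function are hypothetical. NUM_PRIMARY_LENGTHS is used by the decoding pseudocode later in this document but does not appear in the table; the value 7 is an inference from the length header description (seven directly coded lengths, 2 through 8).

```c
/* Fixed LZX constants from the table above (LZX_ prefix is hypothetical). */
enum {
    LZX_MIN_MATCH             = 2,    /* smallest allowable match length */
    LZX_MAX_MATCH             = 257,  /* largest allowable match length */
    LZX_NUM_CHARS             = 256,  /* number of uncompressed character types */
    LZX_NUM_PRIMARY_LENGTHS   = 7,    /* inferred; not listed in the table */
    LZX_NUM_SECONDARY_LENGTHS = 249   /* number of elements in length tree */
};

/* MAIN_TREE_ELEMENTS = NUM_CHARS + NUM_POSITION_SLOTS * 8 */
static int lzx_main_tree_elements(int num_position_slots)
{
    return LZX_NUM_CHARS + num_position_slots * 8;
}
```

For a 32K window (30 position slots) the main tree therefore has 256 + 240 = 496 elements. The inferred constant is consistent with the others: MIN_MATCH + NUM_PRIMARY_LENGTHS + NUM_SECONDARY_LENGTHS - 1 equals MAX_MATCH.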

LZX Compressed Data Format

LZX compressed data consists of a header indicating the file translation size (which is described later), followed by a sequence of compressed blocks. A stream of uncompressed input may be output as multiple compressed LZX blocks to improve compression, since each compressed block contains its own statistical tree structures.

This section includes:

Cabinet Block Size
Header Structure
Encoder Preprocessing
Block Structure
Uncompressed Block Format
Verbatim Block
Aligned Offset Block
Encoding the Trees and Pre-Trees
Compressed Literals
Match Offset ⇒ Formatted Offset
Formatted Offset ⇒ Position Slot, Position Footer
Position Footer ⇒ Verbatim Bits, Aligned Offset Bits
Match Length ⇒ Length Header, Length Footer
Length Header, Position Slot ⇒ Length/Position Header
Encoding a Match
Decoding a Match or an Uncompressed Character

Cabinet Block Size

The cabinet file format requires that for any particular CFDATA block, the indicated number of compressed input bytes must represent exactly the indicated number of uncompressed output bytes. Furthermore, each CFDATA block must represent 32768 uncompressed bytes, with the exception of the last CFDATA block in a folder, which may represent less than 32768 uncompressed bytes.

The LZX block size is independent of the CFDATA block size; an LZX block can represent 200,000 uncompressed bytes, for example. In order to ensure that an exact number of input bytes represent an exact number of output bytes, after each 32768th uncompressed byte is represented, the output bit buffer is aligned on a 16-bit boundary by outputting 0-15 zero bits. The bit buffer is flushed in an identical manner after the final CFDATA block in a folder. Furthermore, the compressor may not emit any matches that span a 32768-byte boundary in the input (for example, at position 65528 in the input, the compressor cannot emit a match with a length of 50; the maximum allowable match length at this point would be 8, since only 8 bytes remain before the boundary at 65536).

One additional constraint is that, for any given CFDATA block, the compressed size of a CFDATA block may not occupy more than 32768+6144 bytes (i.e. 32K of uncompressed input may not grow by more than 6K when compressed).
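
The boundary and alignment rules above reduce to simple arithmetic, sketched below. All names are hypothetical; positions are zero-based byte offsets into the uncompressed data of the folder, and the bit count is the number of bits written to the output so far.

```c
#include <stdint.h>

#define CAB_FRAME_SIZE     32768u  /* uncompressed bytes per CFDATA block */
#define CAB_MAX_GROWTH     6144u   /* allowed growth of a compressed CFDATA block */
#define LZX_MAX_MATCH_LEN  257u    /* MAX_MATCH */

/* Longest match that may start at input position pos without spanning a
 * 32768-byte boundary. */
static uint32_t lzx_max_match_at(uint64_t pos)
{
    uint32_t left = CAB_FRAME_SIZE - (uint32_t)(pos % CAB_FRAME_SIZE);
    return left < LZX_MAX_MATCH_LEN ? left : LZX_MAX_MATCH_LEN;
}

/* Number of zero bits (0-15) needed to reach the next 16-bit boundary. */
static unsigned lzx_align_padding_bits(uint64_t bits_written)
{
    return (unsigned)((16 - bits_written % 16) % 16);
}

/* Nonzero if a compressed CFDATA block of this size is within the limit. */
static int cab_compressed_size_ok(uint32_t compressed_bytes)
{
    return compressed_bytes <= CAB_FRAME_SIZE + CAB_MAX_GROWTH;
}
```

For instance, a match starting at position 65530 can be at most 6 bytes long, because positions 65530 through 65535 are the last six bytes before the boundary at 65536.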

Header Structure

The header consists of either a zero bit indicating no encoder preprocessing, or a one bit followed by a file translation size, a value which is used in encoder preprocessing.

Header bit    Fields that follow
0             None
1             Most significant 16 bits of file translation size, then least significant 16 bits of file translation size

Encoder Preprocessing

The encoder may optionally perform a preprocessing stage on all CFDATA input blocks (size <= 32K) which improves compression on 32-bit Intel 80x86 code. The translation is performed before the data is passed to the compressor, and therefore an appropriate reverse translation must be performed on the output of the decompressor. A bit indicating whether preprocessing was used is stored in the compression header (see above).

The preprocessing stage translates 80x86 CALL instructions, which begin with the E8 (hex) opcode, to use absolute offsets instead of relative offsets.

Preprocessing is disabled after the 32768th CAB input frame in a folder (where a CAB input frame is 32768 bytes) in order to avoid signed/unsigned arithmetic complexity. This change can obviously occur only when a folder represents at least 1 gigabyte of uncompressed data.

CALL Byte Sequence (E8 followed by 32 bit offset)

E8 r0 r1 r2 r3

Performing the Relative-to-Absolute Conversion

relative_offset ← r0 + r1*2^8 + r2*2^16 + r3*2^24
new_value ← conversion_function(current_location, relative_offset)
a0 ← bits 0-7 of new_value
a1 ← bits 8-15 of new_value
a2 ← bits 16-23 of new_value
a3 ← bits 24-31 of new_value

Translated CALL Byte Sequence

E8 a0 a1 a2 a3

The diagram below illustrates the relative-to-absolute conversion function, where curpos is the current offset within all uncompressed data seen in the current cabinet folder, and file_size is the file translation size from the compression header (file_size is unrelated to the size of the actual file being decompressed).

The translation is performed "in place" on the input data without using extra codes to indicate whether a translation occurred (i.e. there is a direct mapping from a 32-bit value to a 32-bit value), therefore there is a one-to-one correlation between pre- and post- translated values.

[Figure: Offset Translation Diagram]

From the diagram one can see that values in the range of 0x80000000 (-2^31) to -curpos, and file_size to 0x7FFFFFFF (2^31 - 1) are left unchanged. The translation algorithm operates as follows on an input block of size input_size, where 0 <= input_size <= 32768. No translation may be performed on the last 6 bytes of the input block.

if (input_size < 6)
   return         /* don't perform translation if < 6 input bytes */

for (i = 0; i < input_size; i++)
   if (input_data[i] == 0xE8)
      if (i >= input_size-6)
         break;
      endif

      ... perform translation illustrated above …
   endif
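
Because the conversion diagram is not reproduced here, the following is only a sketch of a conversion function that is consistent with the text: values below -curpos or at and above file_size pass through unchanged, and the remaining range is mapped one-to-one onto itself. The exact treatment of the boundary value -curpos, the function names, and the assumption that curpos and file_size are non-negative are all assumptions of this sketch.

```c
#include <stdint.h>

/* Encoder side: relative CALL offset -> stored (absolute) value.
 * curpos is the position of the E8 byte within the folder's data. */
static int32_t e8_rel_to_abs(int32_t rel, int32_t curpos, int32_t file_size)
{
    int64_t r = rel, c = curpos, f = file_size;  /* 64-bit to avoid overflow */
    if (r < -c || r >= f)
        return rel;                /* outside the translated range: unchanged */
    if (r < f - c)
        return (int32_t)(r + c);   /* result lies in [0, file_size) */
    return (int32_t)(r - f);       /* result lies in [-curpos, 0) */
}

/* Decoder side: stored (absolute) value -> original relative offset. */
static int32_t e8_abs_to_rel(int32_t abs_val, int32_t curpos, int32_t file_size)
{
    int64_t a = abs_val, c = curpos, f = file_size;
    if (a < -c || a >= f)
        return abs_val;            /* outside the translated range: unchanged */
    if (a >= 0)
        return (int32_t)(a - c);   /* undo the "+ curpos" case */
    return (int32_t)(a + f);       /* undo the "- file_size" case */
}
```

The two functions are inverses of each other for any fixed curpos and file_size, which is what allows the translation to be performed in place without extra codes.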

Block Structure

Each block of compressed data begins with a 3 bit header describing the block type, followed by the block itself. The allowable block types are:

Value    Block type
0        Undefined
1        Verbatim block
2        Aligned offset block
3        Uncompressed block
4-7      Undefined

Uncompressed Block Format

An uncompressed block begins with 1 to 16 bits of zero padding to align the bit buffer on a 16-bit boundary. At this point, the bitstream ends, and a bytestream begins. The data that follows is encoded as bytes for performance. Following the zero padding, new values for R0, R1, and R2 are output in little-endian form, followed by the uncompressed data bytes themselves.

Size         Contents
1-16 bits    Zero padding
4 bytes      R0 (LSB first)
4 bytes      R1 (LSB first)
4 bytes      R2 (LSB first)
n bytes      Uncompressed data
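
A minimal sketch of reading this header is shown below. It assumes the caller has already skipped the zero padding and holds a pointer to the first byte of the bytestream; the function names are hypothetical.

```c
#include <stdint.h>

/* Read a 32-bit little-endian (LSB first) value. */
static uint32_t lzx_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of padding bits (1-16) that precede the bytestream, given how many
 * bits of the current 16-bit word have been consumed so far. */
static unsigned lzx_uncompressed_padding_bits(uint64_t bits_consumed)
{
    return 16u - (unsigned)(bits_consumed % 16);
}

/* Read R0, R1, R2 from the 12 bytes at p; returns a pointer to the first
 * uncompressed data byte. */
static const uint8_t *lzx_read_uncompressed_header(const uint8_t *p,
                                                   uint32_t *r0,
                                                   uint32_t *r1,
                                                   uint32_t *r2)
{
    *r0 = lzx_read_le32(p);
    *r1 = lzx_read_le32(p + 4);
    *r2 = lzx_read_le32(p + 8);
    return p + 12;
}
```

Note that when the bit buffer is already aligned, the padding formula above yields 16 rather than 0, matching the stated range of 1 to 16 bits.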

Verbatim Block

A verbatim block consists of the following:

Entry                                                       Comments                   Size
Number of uncompressed bytes accounted for in this block    Range of 1...2^24          24 bits
Pre-tree for first 256 elements of main tree                20 elements, 4 bits each   80 bits
Path lengths of first 256 elements of main tree             Encoded using pre-tree     Variable
Pre-tree for remainder of main tree                         20 elements, 4 bits each   80 bits
Path lengths of remaining elements of main tree             Encoded using pre-tree     Variable
Pre-tree for length tree                                    20 elements, 4 bits each   80 bits
Path lengths of elements in length tree                     Encoded using pre-tree     Variable
Compressed literals                                         Described later            Variable

Aligned Offset Block

An aligned offset block consists of the following:

Entry                                                       Comments                   Size
Number of uncompressed bytes accounted for in this block    Range of 1...2^24          24 bits
Pre-tree for first 256 elements of main tree                20 elements, 4 bits each   80 bits
Path lengths of first 256 elements of main tree             Encoded using pre-tree     Variable
Pre-tree for remainder of main tree                         20 elements, 4 bits each   80 bits
Path lengths of remaining elements of main tree             Encoded using pre-tree     Variable
Aligned offset tree                                         8 elements, 3 bits each    24 bits
Compressed literals                                         Described later            Variable

The aligned offset tree comprises only 8 elements, each of which is encoded as a 3 bit path length. Since the size of this tree is so small, no additional compression is performed on it.

Encoding the Trees and Pre-Trees

Since all trees used in LZX are created in the form of a canonical Huffman tree, the path length of each element in the tree is sufficient to reconstruct the original tree. The main tree and the length tree are each encoded using the method described below. However, the main tree is encoded in two components as if it were two separate trees, the first tree corresponding to the first 256 tree elements (uncompressed symbols), and the second tree corresponding to the remaining elements (matches).

Since trees are output several times during compression of large amounts of data, LZX optimizes compression by encoding only the delta path lengths between the current and previous trees. In the case of the very first such tree, the delta is calculated against a tree in which all elements have a zero path length.

Each tree element may have a path length from 0 to 16 (inclusive) where a zero path length indicates that the element has a zero frequency and is not present in the tree. Tree elements are output in sequential order starting with the first element. Elements may be encoded in one of two ways: if several consecutive elements have the same path length, then run length encoding is employed; otherwise the element is output by encoding the difference between the current path length and the previous path length of the tree, mod 17. These output methods are described below:

Tree Codes

Code    Operation
0-16    Len[x] = (prev_len[x] + code) mod 17
17      Zeroes = getbits(4)
        Len[x] = 0 for next (4 + Zeroes) elements
18      Zeroes = getbits(5)
        Len[x] = 0 for next (20 + Zeroes) elements
19      Same = getbits(1)
        Decode new Code
        Value = (prev_len[x] + Code) mod 17
        Len[x] = Value for next (4 + Same) elements

Each of the 17 possible values of (len[x] - prev_len[x]) mod 17, plus three additional codes used for run-length encoding, are not output directly as 5 bit numbers, but are instead encoded via a Huffman tree called the pre-tree. The pre-tree is generated dynamically according to the frequencies of the 20 allowable tree codes. The structure of the pre-tree is encoded in a total of 80 bits by using 4 bits to output the path length of each of the 20 pre-tree elements. Once again, a zero path length indicates a zero frequency element.

Pre-Tree

Length of tree code 0     4 bits
Length of tree code 1     4 bits
Length of tree code 2     4 bits
...                       ...
Length of tree code 18    4 bits
Length of tree code 19    4 bits

The "real" tree is then encoded using the pre-tree Huffman codes.

Compressed Literals

The compressed literals that make up the bulk of either a verbatim block or an aligned offset block immediately follow the tree data (as shown in the diagram for each block type). These literals, which comprise matches and unmatched characters, will, when decompressed, correspond to exactly the number of uncompressed bytes indicated in the block header.

The representation of an unmatched character in the output is simply the appropriate element 0…(NUM_CHARS-1) Huffman-encoded using the main tree.

The representation of a match in the output involves several transformations, as shown in the following diagram. At the top of the diagram are the match length (MIN_MATCH…MAX_MATCH) and the match offset (0…WINDOW_SIZE-4). The match offset and match length are split into sub-components and encoded separately.

As mentioned previously, in order to remain compatible with the cabinet file format, the compressor may not emit any matches that span a 32768-byte boundary in the input.

[Figure: Diagram of Match Sub-Components]

Match Offset ⇒ Formatted Offset

The match offset, range 1...(WINDOW_SIZE-4), is converted into a formatted offset by determining whether the offset can be encoded as a repeated offset, as shown below. It is acceptable to not encode a match as a repeated offset even if it is possible to do so.

Converting a Match Offset to a Formatted Offset

if offset == R0 then
   formatted offset ← 0
else if offset == R1 then
   formatted offset ← 1
else if offset == R2 then
   formatted offset ← 2
else
   formatted offset ← offset + 2
endif

Formatted Offset ⇒ Position Slot, Position Footer

The formatted offset is subdivided into a position slot and a position footer. The position slot defines the most significant bits of the formatted offset in the form of a base position as shown in the table below. The position footer defines the remaining least significant bits of the formatted offset. As the table shows, the number of bits dedicated to the position footer grows as the formatted offset becomes larger, meaning that each position slot addresses a larger and larger range.

The number of position slots available depends on the window size. The position slot table for the maximum window size of 2 megabytes is shown below.

Position Slot Table

Position Slot    Base        Number of Position    Range of Base Position
Number           Position    Footer Bits           and Position Footer
0                0           0                     0
1                1           0                     1
2                2           0                     2
3                3           0                     3
4                4           1                     4-5
5                6           1                     6-7
6                8           2                     8-11
7                12          2                     12-15
8                16          3                     16-23
9                24          3                     24-31
10               32          4                     32-47
11               48          4                     48-63
12               64          5                     64-95
13               96          5                     96-127
14               128         6                     128-191
15               192         6                     192-255
16               256         7                     256-383
17               384         7                     384-511
18               512         8                     512-767
19               768         8                     768-1023
20               1024        9                     1024-1535
21               1536        9                     1536-2047
22               2048        10                    2048-3071
23               3072        10                    3072-4095
24               4096        11                    4096-6143
25               6144        11                    6144-8191
26               8192        12                    8192-12287
27               12288       12                    12288-16383
28               16384       13                    16384-24575
29               24576       13                    24576-32767
30               32768       14                    32768-49151
31               49152       14                    49152-65535
32               65536       15                    65536-98303
33               98304       15                    98304-131071
34               131072      16                    131072-196607
35               196608      16                    196608-262143
36               262144      17                    262144-393215
37               393216      17                    393216-524287
38               524288      17                    524288-655359
39               655360      17                    655360-786431
40               786432      17                    786432-917503
41               917504      17                    917504-1048575
42               1048576     17                    1048576-1179647
43               1179648     17                    1179648-1310719
44               1310720     17                    1310720-1441791
45               1441792     17                    1441792-1572863
46               1572864     17                    1572864-1703935
47               1703936     17                    1703936-1835007
48               1835008     17                    1835008-1966079
49               1966080     17                    1966080-2097151

In order to determine the position footer, it is first necessary to determine the position slot. Then, a simple lookup can be performed on the position slot to determine the number of bits, B, in the position footer. The B least significant bits of the formatted offset are the position footer. Pseudocode for obtaining the position slot and position footer is shown below, as is the lookup array (named extra_bits).

n (position slot)    extra_bits[n] (number of position footer bits)
0                    0
1                    0
2                    0
3                    0
4                    1
5                    1
6                    2
7                    2
8                    3
9                    3
10                   4
11                   4
12                   5
13                   5
14                   6
15                   6
16                   7
17                   7
18                   8
19                   8
20                   9
21                   9
22                   10
23                   10
24                   11
25                   11
26                   12
27                   12
28                   13
29                   13
30                   14
31                   14
32                   15
33                   15
34                   16
35                   16
36-49                17

Converting the Position Slot and Position Footer

position_slot ← calculate_position_slot(formatted_offset)
position_footer_bits ← extra_bits[ position_slot ]
if position_footer_bits > 0
   position_footer ← formatted_offset & ((2^position_footer_bits)-1)
else
   position_footer ← null
endif
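
The pseudocode leaves calculate_position_slot undefined. One straightforward way to realize it, sketched below, is to derive each slot's base position by summing the ranges of the preceding slots and then search for the last slot whose base does not exceed the formatted offset. The closed-form expression for extra_bits and all function names are choices made for this sketch; a production decoder would more likely use precomputed tables.

```c
#include <stdint.h>

#define LZX_MAX_POSITION_SLOTS 50  /* slots 0-49, as in the table above */

/* extra_bits[n]: 0 for slots 0-3, then one more bit every two slots,
 * capped at 17 from slot 36 onwards. */
static int lzx_extra_bits(int slot)
{
    if (slot < 4)
        return 0;
    if (slot >= 36)
        return 17;
    return slot / 2 - 1;
}

/* Base position of a slot: sum of the range sizes of all earlier slots. */
static uint32_t lzx_base_position(int slot)
{
    uint32_t base = 0;
    for (int i = 0; i < slot; i++)
        base += 1u << lzx_extra_bits(i);
    return base;
}

/* Last slot whose base position is <= formatted_offset (linear search). */
static int lzx_position_slot(uint32_t formatted_offset)
{
    int slot = 0;
    while (slot + 1 < LZX_MAX_POSITION_SLOTS &&
           lzx_base_position(slot + 1) <= formatted_offset)
        slot++;
    return slot;
}

/* Position footer: the extra_bits[slot] least significant bits. Returns 0
 * when the slot has no footer bits ("null" in the pseudocode). */
static uint32_t lzx_position_footer(uint32_t formatted_offset, int slot)
{
    return formatted_offset & ((1u << lzx_extra_bits(slot)) - 1u);
}
```

For example, a formatted offset of 500 falls in slot 17 (base 384, 7 footer bits), and its position footer is 500 - 384 = 116.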

Position Footer ⇒ Verbatim Bits, Aligned Offset Bits

The position footer may be further subdivided into verbatim bits and aligned offset bits if the current block uses aligned offsets. If the current block is not an aligned offset block then there are no aligned offset bits, and the verbatim bits are the position footer.

If aligned offsets are used, then the lower 3 bits of the position footer are the aligned offset bits, while the remaining portion of the position footer is the verbatim bits. In the case where there are fewer than 3 bits in the position footer (i.e. formatted offset is <= 15) it is not possible to take the "lower 3 bits of the position footer" and therefore there are no aligned offset bits, and the verbatim bits and the position footer are the same.

Pseudocode for Splitting Position Footer into Verbatim Bits and Aligned Offset

if block_type = aligned_offset_block then
   if formatted_offset <= 15 then
      verbatim_bits ← position_footer
      aligned_offset ← null
   else
      aligned_offset ← position_footer & 7   /* lower 3 bits */
      verbatim_bits ← position_footer >> 3
   endif
else
   verbatim_bits ← position_footer
   aligned_offset ← null
endif 
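
The split can be sketched in C as follows, taking the lower 3 bits of the position footer as the aligned offset, as the text describes. The footer_split type and the function name are hypothetical, and the has_aligned flag stands in for the "null" value of the pseudocode.

```c
#include <stdint.h>

/* Hypothetical result of splitting a position footer. */
typedef struct {
    int has_aligned;          /* nonzero if aligned_offset is meaningful */
    uint32_t verbatim_bits;   /* bits output verbatim */
    uint32_t aligned_offset;  /* lower 3 bits, coded with the aligned offset tree */
} footer_split;

static footer_split lzx_split_position_footer(uint32_t formatted_offset,
                                              uint32_t position_footer,
                                              int is_aligned_offset_block)
{
    /* Default: the whole footer is verbatim, with no aligned offset bits. */
    footer_split s = { 0, position_footer, 0 };

    /* Offsets above 15 have at least 3 footer bits to split off. */
    if (is_aligned_offset_block && formatted_offset > 15) {
        s.has_aligned = 1;
        s.aligned_offset = position_footer & 7u;
        s.verbatim_bits = position_footer >> 3;
    }
    return s;
}
```

With a formatted offset of 500 (position footer 116, binary 1110100) in an aligned offset block, the aligned offset is 4 (binary 100) and the verbatim bits are 14 (binary 1110).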

Match Length ⇒ Length Header, Length Footer

The match length is converted into a length header and a length footer. The length header may have one of eight possible values, from 0...7 (inclusive), indicating a match of length 2, 3, 4, 5, 6, 7, 8, or a length greater than 8. If the match length is 8 or less, then there is no length footer. Otherwise the value of the length footer is equal to the match length minus 9.

Pseudocode for Obtaining the Length Header and Footer

if match_length <= 8
   length_header ← match_length-2
   length_footer ← null
else
   length_header ← 7
   length_footer ← match_length-9
endif

Example Conversions of Some Match Lengths to Header and Footer Values

Match length       Length header    Length footer value
2 (MIN_MATCH)      0                None
3                  1                None
4                  2                None
5                  3                None
6                  4                None
7                  5                None
8                  6                None
9                  7                0
10                 7                1
50                 7                41
257 (MAX_MATCH)    7                248
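
The conversion and its inverse can be sketched in C as follows. The length_code type and function names are hypothetical; a footer value of -1 stands in for "None".

```c
/* Hypothetical pair of length header and length footer. */
typedef struct {
    int header;  /* 0...7 */
    int footer;  /* 0...248, or -1 when there is no length footer */
} length_code;

/* Match length (2...257) -> length header and footer. */
static length_code lzx_encode_length(int match_length)
{
    length_code c;
    if (match_length <= 8) {
        c.header = match_length - 2;
        c.footer = -1;               /* no length footer */
    } else {
        c.header = 7;
        c.footer = match_length - 9;
    }
    return c;
}

/* Length header and footer -> match length. */
static int lzx_decode_length(length_code c)
{
    if (c.header < 7)
        return c.header + 2;
    return c.footer + 9;
}
```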

Length Header, Position Slot ⇒ Length/Position Header

The Length/Position header is the stage which correlates the match position with the match length (using only the most significant bits), and is created by combining the length header and the position slot as shown below:

len_pos_header ← (position_slot << 3) + length_header

This operation creates a unique value for every combination of match length 2, 3, 4, 5, 6, 7, 8 with every possible position slot. The remaining match lengths greater than 8 are all lumped together, and as a group are correlated with every possible position slot.
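
As a small sketch, the combination and its inverse might look like this in C. The function names are hypothetical; NUM_CHARS (256) is added to obtain the main tree element, as described in the next section.

```c
/* Combine position slot and length header (0...7) into one value. */
static int lzx_len_pos_header(int position_slot, int length_header)
{
    return (position_slot << 3) + length_header;
}

/* Main tree element for a match: len_pos_header + NUM_CHARS. */
static int lzx_match_main_element(int position_slot, int length_header)
{
    return lzx_len_pos_header(position_slot, length_header) + 256;
}

/* Recover the two components from a main tree element >= NUM_CHARS. */
static int lzx_element_position_slot(int main_element)
{
    return (main_element - 256) >> 3;
}

static int lzx_element_length_header(int main_element)
{
    return (main_element - 256) & 7;
}
```

For position slot 17 and length header 7, the header value is 17*8 + 7 = 143, which corresponds to main tree element 399.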

Encoding a Match

The match is finally output in up to four components, as follows:

  1. Output element (len_pos_header + NUM_CHARS) from the main tree
  2. If length_footer != null, then output element length_footer from the length tree
  3. If verbatim_bits != null, then output verbatim_bits
  4. If aligned_offset != null, then output element aligned_offset from the aligned offset tree


Decoding a Match or an Uncompressed Character

Decoding is performed by first decoding an element using the main tree and then, if the item is a match, determining which additional components are necessary to reconstruct the match. Pseudocode for decoding a match or an uncompressed character is shown below:

main_element = main_tree.decode_element()

if (main_element < NUM_CHARS) /* is an uncompressed character */

   window[ curpos ] ← (byte) main_element
   curpos ← curpos + 1

else /* is a match */

      length_header ← (main_element - NUM_CHARS) & NUM_PRIMARY_LENGTHS

      if (length_header == NUM_PRIMARY_LENGTHS) 
            match_length ← length_tree.decode_element() + NUM_PRIMARY_LENGTHS + MIN_MATCH
      else
            match_length ← length_header + MIN_MATCH /* no length footer */
      endif

      position_slot ← (main_element - NUM_CHARS) >> 3

      /* check for repeated offsets (positions 0,1,2) */
      if (position_slot == 0)
            match_offset ← R0
      else if (position_slot == 1)
            match_offset ← R1
            swap(R0 ⇔ R1)
      else if (position_slot == 2)
            match_offset ← R2
            swap(R0 ⇔ R2)
      else /* not a repeated offset */
            extra ← extra_bits[ position_slot ]

            if (block_type == aligned_offset_block)
                  if (extra > 3) /* this means there are some aligned bits */
                        verbatim_bits ← (readbits(extra-3)) << 3
                        aligned_bits  ← aligned_offset_tree.decode_element()
                  else if (extra > 0) /* just some verbatim bits */
                        verbatim_bits ← readbits(extra)
                        aligned_bits  ← 0
                  else /* no verbatim bits */
                        verbatim_bits ← 0
                        aligned_bits  ← 0
                  endif

                  formatted_offset ← base_position[ position_slot ] + verbatim_bits + aligned_bits
            else /* block_type == verbatim_block */
                  if (extra > 0) /* if there are any extra bits */
                        verbatim_bits ← readbits(extra)
                  else
                        verbatim_bits ← 0
                  endif

                  formatted_offset ← base_position[ position_slot ] + verbatim_bits
            endif

            match_offset ← formatted_offset - 2

            /* update repeated offset LRU queue */
            R2 ← R1
            R1 ← R0
            R0 ← match_offset
      endif

      /* copy match data */
      for (i = 0; i < match_length; i++)
            window[curpos + i] ← window[curpos + i - match_offset]

      curpos ← curpos + match_length
endif
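
The last steps of the pseudocode (the repeated offset queue and the match copy) can be sketched in Python as below. The helper names are invented for this example, the three repeated offsets are kept in a plain list [R0, R1, R2], and window wrap-around is deliberately ignored, so this is a simplified illustration rather than a complete decoder.

```python
def use_repeated_offset(r, position_slot):
    """Return (match_offset, new_r) for position slots 0, 1 and 2."""
    r = list(r)
    match_offset = r[position_slot]
    # Slot 0 swaps R0 with itself, which leaves the queue unchanged.
    r[0], r[position_slot] = r[position_slot], r[0]
    return match_offset, r


def push_new_offset(r, match_offset):
    """Shift the queue for a non-repeated offset: R2<-R1, R1<-R0, R0<-new."""
    return [match_offset, r[0], r[1]]


def copy_match(window, curpos, match_offset, match_length):
    """Copy byte by byte so that overlapping matches repeat recent output."""
    for i in range(match_length):
        window[curpos + i] = window[curpos + i - match_offset]
    return curpos + match_length


if __name__ == "__main__":
    window = bytearray(b"ab" + bytes(6))
    print(copy_match(window, 2, 2, 6), bytes(window))
```

The byte-by-byte loop matters: when match_length exceeds match_offset, the source region overlaps the bytes being written, and the copy must read bytes it has just produced.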


Microsoft MakeCAB User's Guide

Copyright © 1997 Microsoft Corporation. All rights reserved.

Topics in this Section

Overview

Case 1: MakeCAB for Setup Programs
   Characteristics of a Setup Program
   MakeCAB Application
Case 2: MakeCAB for a 200MB Source Code Archive
   Characteristics of a Source Code Archive
   MakeCAB Application
Case 3: Self-extracting Cabinet File(s)
MakeCAB Deliverables
MakeCAB Goals

MakeCAB Optimizing and Tuning

Saving Diskettes
Tuning Access Time vs. Compression Ratio
Piecemeal DDFs for Localization and Different Disk Sizes
Creation Time

MakeCAB Concepts

Decoupling File Layout and INF Layout

MAKECAB.EXE

MAKECAB.EXE Syntax
MAKECAB.EXE Directive File Syntax
Command Summary
Variable Summary
InfDisk/Cabinet/FileLineFormat Syntax and Semantics
INF Parameters
Command Details
Variable Details

EXTRACT.EXE

Overview

MakeCAB is a lossless data compression tool that can be used for a wide variety of purposes. Although it was originally designed for use by setup programs, it can also be used in almost any situation where lossless data compression is required.

MakeCAB has three key features: 1) storing multiple files in a single cabinet ("CAB") file, 2) performing compression across file boundaries, and 3) permitting files to span cabinets. While existing products such as PKZIP, LHARC, and ARJ support some of these features, combining all three does not appear to be common practice. MakeCAB also supports self-extracting archives, by simply concatenating a cabinet file to EXTRACT.EXE.

Depending upon the number of files to be compressed, and the access patterns expected (sequential or random access; whether most of the files will be requested at once or only a small portion of them), MakeCAB can be instructed to build cabinet files in different ways. One key concept in MakeCAB is the folder. A folder is a collection of one or more files which are compressed together, as a single entity.

The cabinet file format is capable of supporting multiple forms of compression. At this time, MSZIP and LZX are the compression formats supported by Microsoft. Other compression formats are possible in the future.

The following sections provide case studies of several possible ways that MakeCAB might be used. These are only provided to stimulate your imagination -- they are not the only ways in which MakeCAB can be used!


Case 1: MakeCAB for Setup Programs

Since MakeCAB was designed with setup programs in mind, it has a great deal of power and flexibility to trade off compressed size against speed of random access to files. The primary impact of MakeCAB is to minimize the number of diskettes required to distribute a product, thereby minimizing the Cost of Goods Sold (COGS).

In order for MakeCAB to build the disk images for a product, you must create a directive file (DDF) that specifies the list of files in the product and any constraints on which disks certain files should be located. The same directive file can even be used for all the various localized versions of a product, since directive files support parameterization.

This section includes:

Characteristics of a Setup Program

MakeCAB Application


Characteristics of a Setup Program

  1. Minimizing disk count is very important, since it saves money in production costs.
  2. Files are accessed sequentially.
  3. Most files are accessed.


MakeCAB Application

The distribution disks for a typical application product produced by MakeCAB might look similar to the following:

[Figure: Distribution disk layout]

SETUP.EXE is the setup program, and SETUP.INF is a file generated by MakeCAB which guides the operation of the setup program (which files are needed for which options, and on which disk and in which cabinet file a file is contained). All of the remaining product files are contained in the cabinet files EXCEL.1 through EXCEL.N (N might be 7, for example).

To produce this disk layout with MakeCAB, a DDF is prepared which lists all of the files for the product, along with some optional MakeCAB settings to control parameters such as: 1) the capacity of the disks which are being used, 2) the naming convention of the cabinet files, 3) the visible (user-readable) labels on each disk, and 4) how much random access is desired for files within a cabinet. The following is an example of a DDF that might be appropriate:

;*** MakeCAB Directive file example
;
.OPTION EXPLICIT                     ; Generate errors on variable typos

.Set DiskLabel1=Setup                ; Label of first disk
.Set DiskLabel2=Program              ; Label of second disk
.Set DiskLabel3="Program Continued"  ; Label of third disk
.Set CabinetNameTemplate=EXCEL.*     ; EXCEL.1, EXCEL.2, etc.
.set DiskDirectoryTemplate=Disk*     ; disk1, disk2, etc.
.Set MaxDiskSize=1.44M               ; 3.5" disks

;** Setup.exe and setup.inf are placed uncompressed in the first disk
.Set Cabinet=off
.Set Compress=off
.Set InfAttr=                        ; Turn off read-only, etc. attrs
bin\setup.exe                        ; Just copy SETUP.EXE as is
bin\setup.inf                        ; Just copy SETUP.INF as is

;** The rest of the files are stored, compressed, in cabinet files
.Set Cabinet=on
.Set Compress=on
bin\excel.exe                        ; Big EXE, will span cabinets
bin\excel.hlp
bin\olecli.dll
bin\olesrv.dll
;...                                 ; Many more files
;*** <the end>                       ; That's it

Now, you run MakeCAB to create the disk layout:

MakeCAB /f excel.ddf

MakeCAB will create directories Disk1, Disk2, etc. to hold the files for each disk, and will copy uncompressed files or create cabinet files (as appropriate) in each directory. The file SETUP.RPT will be written to the current directory (this can be overridden) with a summary of what MakeCAB did, and the file SETUP.INF will contain details on every disk and cabinet created, including a list of where each file was placed.


Case 2: MakeCAB for a 200MB Source Code Archive

The Microsoft Developer Network (MSDN) CD includes over 200 MB of source code. Although this is only about one third of the CD when uncompressed, that is still too much space, so tight compression is desired. This is slightly different from the Setup case, however, since there is a front-end tool that allows users to select sample programs and expand them onto the hard disk.

This section includes:

Characteristics of a Source Code Archive

MakeCAB Application


Characteristics of a Source Code Archive

  1. Minimizing space usage is slightly less important
  2. Files are accessed somewhat randomly, though in groups
  3. Only a small portion of the files will be accessed at any one time


MakeCAB Application

The cabinet files produced for the source archive need to be big enough to provide good compression, but not so big that random access speed is sacrificed. The challenge is to obtain a good tradeoff between compression and access time.

;*** MSDN Sample Source Code MakeCAB Directive file example
;
.OPTION EXPLICIT                  ; Generate errors on variable typos

.Set CabinetNameTemplate=MSDN.*   ; MSDN.1, MSDN.2, etc.
.set DiskDirectoryTemplate=CDROM  ; All cabinets go in a single directory
.Set MaxDiskFileCount=1000        ; Limit file count per cabinet, so that
                                  ; scanning is not too slow
.Set FolderSizeThreshold=200000   ; Aim for ~200K per folder
.Set CompressionType=MSZIP

;** All files are compressed in cabinet files
.Set Cabinet=on
.Set Compress=on
foo.c
foo.h
....
;*** <the end>                    ; That's it


Case 3: Self-extracting Cabinet File(s)

Many times, a software developer will want to ship executables, libraries, or the like across an intranet or the Internet. They need a small package and an easy way for users to extract data. For example, Java™ developers may want to ship large libraries of classes, so that home and business developers can use those classes in their software.

EXTRACT.EXE, which extracts files from CAB files, recognizes when it has been copied to the front of a cabinet file, and will automatically extract the files in that cabinet file (and any continuation cabinet files). Here is how this is accomplished:

  1. Create a cabinet file (or set of cabinet files).
  2. Prepend EXTRACT.EXE to the first cabinet file (do not prepend EXTRACT.EXE to any other cabinet files in the set).
  3. Distribute the self-extracting cabinet (and any subsequent cabinets).

Example:

MakeCAB /f self.ddf                     ; Build cabinet file set self1.cab, self2.cab
copy /b extract.exe+self1.cab self.exe  ; self.exe is self-extracting
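
For environments where the copy /b command is not available, the same binary concatenation can be sketched in Python. The function name and its parameters are hypothetical; the only behavior taken from the text above is that the extractor comes first and the first cabinet of the set is appended to it.

```python
import shutil


def make_self_extracting(extractor_path, cabinet_path, output_path):
    """Write the extractor followed by the first cabinet into output_path."""
    with open(output_path, "wb") as out:
        for part in (extractor_path, cabinet_path):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)  # raw byte copy, no translation


if __name__ == "__main__":
    # File names follow the example above; they must exist for this to run.
    make_self_extracting("extract.exe", "self1.cab", "self.exe")
```

Only the first cabinet in a set is combined this way; any continuation cabinets are distributed unchanged.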


MakeCAB Deliverables

The following table is a list of all the libraries and programs that are part of MakeCAB:

File          Contents
MAKECAB.EXE   Command-line tool to perform disk layout (uses FCI.LIB)
FDI.LIB       File Decompression Interface library.
EXTRACT.EXE   Command-line tool to expand files (uses FDI.LIB)
FCI.LIB       File Compression Interface library.


MakeCAB Goals

  • Provide excellent compression ratio and decompression speed
  • Simplify production of disk layouts for products
  • Provide command-line tools and link libraries for all Microsoft platforms


MakeCAB Optimizing and Tuning

This section includes:

Saving Diskettes

Tuning Access Time vs. Compression Ratio

Piecemeal DDFs for Localization and Different Disk Sizes

Creation Time


Saving Diskettes

For a product shipped on floppy disks, it is very important to minimize the number of disks shipped per product! As a back-of-the-envelope calculation, if each disk cost a dollar and one million units were shipped, then each disk saved would save $1 million. The following pseudo-code suggests a process you might follow as you strive to keep your Cost of Goods Sold (COGS) to a minimum:

get initial product files;
while (have not yet shipped)
   Compress file set using:
      CompressionType=LZX
      CompressionMemory=21
   If near a disk boundary
      Consider tossing files to save a disk (especially clipart & samples!)
   If near shipping
      Relax FolderSizeThreshold to
      improve access time at decompress.
end-while
Ship it!
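
The back-of-the-envelope calculation above can be written out as a small Python sketch. The function names, the sample sizes, and the disk capacity used in the demonstration are all made-up figures for illustration; substitute the real compressed size and the usable capacity of your media.

```python
def disks_needed(total_compressed_bytes, disk_capacity_bytes):
    """Smallest number of disks that can hold the compressed product."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_compressed_bytes // disk_capacity_bytes)


def savings(disks_saved, cost_per_disk, units_shipped):
    """Money saved by shipping fewer disks with every unit."""
    return disks_saved * cost_per_disk * units_shipped


if __name__ == "__main__":
    capacity = 1_440_000  # hypothetical usable bytes per disk
    before = disks_needed(10_200_000, capacity)
    after = disks_needed(9_900_000, capacity)
    print(before, after, savings(before - after, 1, 1_000_000))
```

A product sitting just over a disk boundary is the case the pseudocode calls out: trimming a small amount of content can remove an entire disk from every unit shipped.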


Tuning Access Time vs. Compression Ratio

MakeCAB introduces the concept of a folder to refer to a contiguous set of compressed bytes. To decompress a file from a cabinet, FDI.LIB (called by your SETUP.EXE and EXTRACT.EXE) finds the folder that the file starts in, and then must read and decompress all the bytes in that folder from the start up through and including the desired file.

For example, if the file FOO.EXE is at the end of a 1.44 MB folder on a 1.44 MB diskette, then FDI.LIB must read the entire diskette and decompress all the data. This is about the worst access time possible. By contrast, if FOO.EXE were at the start of a folder (regardless of how large the folder is), then it would be read and decompressed with no extra overhead.

So, why would one not always Set FolderFileCountThreshold=1? Because doing so would reset the compression history after each file, resulting in a poor compression ratio. MakeCAB provides several variables and directives to provide very fine control over these issues:

Variable/Directive          More Compression;     Less Compression;
                            Slower Access Time    Faster Access Time
CabinetFileCountThreshold   Bigger numbers        Lower numbers
FolderFileCountThreshold    Bigger numbers        Lower numbers
FolderSizeThreshold         Bigger numbers        Lower numbers
MaxCabinetSize              Bigger numbers        Lower numbers
.New Folder                 Don't use             Use often
.New Cabinet                Don't use             Use often

The MakeCAB defaults are configured for a floppy disk layout, with the assumption that the most common scenario is a full setup that will extract most of the files, so these are the settings:

Variable/Directive          Value
CabinetFileCountThreshold   2000 (Since we have to call FDICopy() on a cabinet and walk through all the FILE headers, we want this small enough so that isn't too much overhead, but large enough to keep the number of cabinets down.)
FolderFileCountThreshold    Unlimited (Let FolderSizeThreshold control folder size!)
FolderSizeThreshold         200K (Represents 600K-800K of source, assuming a 3:1 or 4:1 compression ratio)
MaxCabinetSize              Unlimited (Let CabinetFileCountThreshold control the cabinet size!)

Of course, if you are tight for space on your CD-ROM, you'll probably boost the FolderSizeThreshold and CompressionMemory settings!


Piecemeal DDFs for Localization and Different Disk Sizes

MAKECAB.EXE was designed to minimize the amount of duplicate information needed to generate product layouts for different languages and disk sizes. A key feature is the ability to specify more than one DDF on the MAKECAB.EXE command line. For example:

acme.ddf      Some standard definitions to control the format of the output INF file
lang.ddf      Sets language-specific settings (SourceDir, for example)
disk.ddf      Sets the diskette sizes (CDROM, 1.2M, 1.44M, etc.)
product.ddf   Lists all the files in the product, and uses variables set in the previous DDFs to customize its operation

The following command line would be used to process this set of DDFs:

MakeCAB /f acme.ddf /f lang.ddf /f disk.ddf /f product.ddf


Creation Time

MakeCAB is generally used in scenarios where the time required for creation is not as important as the size or layout of the output. This is especially true when the output is created once and consumed many times. However, better compression ratios and increasingly complex layouts will result in longer MakeCAB execution times. Overall creation time can be minimized by reducing LZX CompressionMemory, using MSZIP, or turning off Compression. Because MakeCAB attempts to create Cabinets as optimally as possible, a greater number of Cabinets/disks may result in greatly increased creation times. Additionally, creation time can be reduced by reducing the complexity of the layout.


MakeCAB Concepts

The key feature of MakeCAB is that it takes a set of files and produces a disk layout while at the same time attempting to minimize the number of disks required. In order to understand how MakeCAB does this, three terms need to be defined: cabinet, folder, and file. MakeCAB takes all of the files in the product or application being compressed, lays the bytes down as one continuous byte stream, compresses the entire stream, chopping it up into folders as appropriate, and then fills up one or more cabinets with the folders.

Cabinet
A normal file that contains pieces of one or more files, usually compressed. Also known as a "CAB file".
Folder
A decompression boundary. Large folders enable higher compression, because the compressor can refer back to more data in finding patterns. However, to retrieve a file at the end of a folder, the entire folder must be decompressed. So there is a tradeoff between achieved compression and the quickness of random access to individual files.
File
A file to be placed in the layout.
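
The way a stream of files is chopped into folders can be pictured with the following simplified Python sketch. It assumes that an estimated compressed size is already known for each file and that a folder is closed at the first file boundary where the accumulated size reaches the threshold. The function name and the sample sizes are hypothetical, and the sketch does not model cabinets or files that span them.

```python
def split_into_folders(files, folder_size_threshold):
    """Group (name, compressed_size) pairs into folders, in layout order."""
    folders, current, size = [], [], 0
    for name, compressed_size in files:
        current.append(name)
        size += compressed_size
        if size >= folder_size_threshold:  # close the folder at a file boundary
            folders.append(current)
            current, size = [], 0
    if current:  # whatever is left forms the final folder
        folders.append(current)
    return folders


if __name__ == "__main__":
    layout = [("a", 120), ("b", 90), ("c", 50), ("d", 300), ("e", 10)]
    print(split_into_folders(layout, 200))
```

A larger threshold yields fewer, bigger folders (better compression, slower random access); a smaller one yields the opposite.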


Decoupling File Layout and INF Layout

MakeCAB has two "modes" for generating the INF file: unified mode and relational mode. In unified mode, the INF file is generated as file copy commands are processed in the DDF file. This is the default, and minimizes the amount of effort needed to construct a DDF file. However, this forces the INF file to list the files in the layout in exactly the same order as they are placed on disks/cabinets.

Example of a Unified DDF:

;** Set up INF formats before we do the disk layout, because MakeCAB
;   writes Disk and Cabinet information out as it is generated.
.OPTION EXPLICIT            ; Generate errors for undefined variables

.Set InfDiskHeader="[disk list]"
.Set InfDiskHeader1=";<disk number>,<disk label>"
.Set InfDiskLineFormat="*disk#*,*label*"

.Set InfCabinetHeader="[cabinet list]"
.Set InfCabinetHeader1=";<cabinet number>,<disk number>,<cabinet file name>"
.Set InfCabinetLineFormat="*cab#*,*disk#*,*cabfile*"

.Set InfFileHeader=";*** File List ***"
.Set InfFileHeader1=";<disk number>,<cabinet number>,<filename>,<size>"
.Set InfFileHeader2=";Note: File is not in a cabinet if cab# is 0"
.Set InfFileHeader3=""
.Set InfFileLineFormat="*disk#*,*cab#*,*file*,*date*,*size*"


.set GenerateInf=ON         ; Unified mode - create the INF file as we go

;** Setup files.  These don't need to be in the INF file, so we put
;   /inf=NO on these lines so that MakeCAB won't generate an error when
;   it finds that these files are not mentioned in the INF portion of
;   the DDF.

.set Compress=OFF
.set Cabinet=OFF
setup.exe /inf=NO           ; This file doesn't show up in INF
setup.inf /inf=NO           ; This file doesn't show up in INF

;** Files in cabinets
.set Compress=ON
.set Cabinet=ON

;* Put all bitmaps together to help compression
a1.bmp                      ; Bitmap for client1.exe
b1.bmp                      ; Bitmap for client1.exe
c1.bmp                      ; Bitmap for client1.exe
d1.bmp                      ; Bitmap for client1.exe
a2.bmp                      ; Bitmap for client2.exe
b2.bmp                      ; Bitmap for client2.exe
c2.bmp                      ; Bitmap for client2.exe
d2.bmp                      ; Bitmap for client2.exe
shared.dll  /date=10/12/93  ; File needed by client1.exe and client2.exe
client1.exe                 ; needs shared.dll
client2.exe                 ; needs shared.dll

;*** The End

In relational mode the DDF has file reference lines to specify the exact placement of file information lines, including the ability to list the same file multiple times. This feature is important for INF structures which use section headers (e.g. "[clipart]", "[screen savers]") to identify sets of files for particular functionality, and for which the same file may need to be included in more than one section. For example, a product may have several optional features, all of which require a DLL file named "shared.dll". Rather than having "shared.dll" stored multiple times (once for each section which uses the file), a waste of disk space, a single copy of the file can be stored, and then referenced by all of the sections which require it.

A relational mode DDF is similar to a unified mode DDF, with the exception that a ".set GenerateInf=OFF" line must be inserted before the product's files are listed (as shown below). Once all of the files have been listed, the INF file generating portion of the DDF begins, and a ".set GenerateInf=ON" line must be inserted, followed by the section definitions.

Example of a Relational DDF:

   ;** Set up INF formats before we do the disk layout, because MakeCAB
   ;   writes Disk and Cabinet information out as it is generated.
   .OPTION EXPLICIT            ; Generate errors for undefined variables

   .Set InfDiskHeader="[disk list]"
   .Set InfDiskHeader1=";<disk number>,<disk label>"
   .Set InfDiskLineFormat="*disk#*,*label*"

   .Set InfCabinetHeader="[cabinet list]"
   .Set InfCabinetHeader1=";<cabinet number>,<disk number>,<cabinet file name>"
   .Set InfCabinetLineFormat="*cab#*,*disk#*,*cabfile*"

   .Set InfFileHeader=";*** File List ***"
   .Set InfFileHeader1=";<disk number>,<cabinet number>,<filename>,<size>"
   .Set InfFileHeader2=";Note: File is not in a cabinet if cab# is 0"
   .Set InfFileHeader3=""
   .Set InfFileLineFormat="*disk#*,*cab#*,*file*,*date*,*size*"


;
; *** Here is where we list all the files
;
   .set GenerateInf=OFF        ; RELATIONAL MODE - Do disk layout first

   ;** Setup files.  These don't need to be in the INF file, so we put
   ;   /inf=NO on these lines so that MakeCAB won't generate an error when
   ;   it finds that these files are not mentioned in the INF portion of
   ;   the DDF.

   .set Compress=OFF
   .set Cabinet=OFF
   setup.exe /inf=NO           ; This file doesn't show up in INF
   setup.inf /inf=NO           ; This file doesn't show up in INF

   ;** Files in cabinets
   ;
   .set Compress=ON
   .set Cabinet=ON

   ;* Put all bitmaps together to help compression
   a1.bmp                      ; Bitmap for client1.exe
   b1.bmp                      ; Bitmap for client1.exe
   c1.bmp                      ; Bitmap for client1.exe
   d1.bmp                      ; Bitmap for client1.exe
   a2.bmp                      ; Bitmap for client2.exe
   b2.bmp                      ; Bitmap for client2.exe
   c2.bmp                      ; Bitmap for client2.exe
   d2.bmp                      ; Bitmap for client2.exe
   shared.dll  /date=10/12/93  ; File needed by client1.exe and client2.exe
   client1.exe                 ; needs shared.dll
   client2.exe                 ; needs shared.dll


;
; *** Now we're generating the INF file
;
   .set GenerateInf=ON         

   ;** Feature One files
   .InfBegin File
   [feature One]
   ;Files for feature one
   .InfEnd
   client1.exe
   shared.dll  /date=04/01/94  ; Override date
   a1.bmp
   b1.bmp
   c1.bmp
   d1.bmp

   ;** Feature Two files
   .InfBegin File

   [feature Two]
   ;Files for feature Two
   ;Note that shared.dll is also required by Feature One
   .InfEnd
   client2.exe
   shared.dll
   a2.bmp
   b2.bmp
   c2.bmp
   d2.bmp

   ;*** The End

The generated INF file would look something like this:

[disk list]
;<disk number>,<disk label>
1,"Disk 1"

[cabinet list]
;<cabinet number>,<disk number>,<cabinet file name>
1,1,cabinet.1

;*** File List ***
;<disk number>,<cabinet number>,<filename>,<size>
;Note: File is not in a cabinet if cab# is 0

[feature One]
;Files for feature one
1,1,client1.exe,12/12/93,1234
1,1,shared.dll,04/01/94,1234
1,1,a1.bmp,12/12/93,573
1,1,b1.bmp,12/12/93,573
1,1,c1.bmp,12/12/93,573
1,1,d1.bmp,12/12/93,573

[feature Two]
;Files for feature Two
;Note that shared.dll is also required by Feature One
1,1,client2.exe,12/12/93,1234
1,1,shared.dll,10/12/93,1234
1,1,a2.bmp,12/12/93,643
1,1,b2.bmp,12/12/93,643
1,1,c2.bmp,12/12/93,643
1,1,d2.bmp,12/12/93,643

Notes:

  1. In "relational" mode, only the last setting of a particular InfXxx default parameter variable (both standard parameters like InfDate, InfTime, etc. and custom parameters) in the layout portion (i.e. the first part) of the DDF is respected.

    Example:

    If you did ".set InfDate=12/05/92" at the start of the layout portion, and then did ".set InfDate=01/01/94" in the middle of the layout portion, the latter value would be used for the entire INF file.

  2. Any parameters on a reference line will override parameters on the corresponding file copy line.

    Example:

    ;* layout portion
    bar /x=1
    
    ;* INF portion
    bar /x=2            ; INF file will have value 2
    
  3. In "relational" mode, each file copy command in the layout portion of the DDF must be referenced at least once in a reference command in the INF portion of the DDF. Any files that are not referenced will cause an error during pass 1. The /inf=no parameter must be specified on any file copy commands for files which are going to be omitted from the INF file (such as SETUP.EXE and SETUP.INF).
  4. In "relational" mode, UniqueFiles must be ON, because the destination file name is used in the INF portion of the DDF to refer back to file information.
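
Notes 2 and 3 describe two rules that are easy to check mechanically. The Python sketch below models them with plain dictionaries; it is not how MakeCAB is implemented, and the function names and data shapes are assumptions made for the example.

```python
def effective_params(copy_params, reference_params):
    """Parameters on a reference line override those on the file copy line."""
    return {**copy_params, **reference_params}


def unreferenced_files(copy_commands, referenced_names):
    """List files that would trigger the pass 1 error described in note 3.

    copy_commands: list of (destination_name, params) from the layout portion.
    referenced_names: destination names used in the INF portion.
    """
    referenced = set(referenced_names)
    return [
        name
        for name, params in copy_commands
        if params.get("inf", "yes").lower() != "no" and name not in referenced
    ]


if __name__ == "__main__":
    layout = [
        ("setup.exe", {"inf": "NO"}),
        ("shared.dll", {"date": "10/12/93"}),
        ("client2.exe", {}),
    ]
    print(unreferenced_files(layout, ["shared.dll"]))
    print(effective_params({"date": "10/12/93"}, {"date": "04/01/94"}))
```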


MAKECAB.EXE

MAKECAB.EXE is designed to produce the final distribution files and cabinets for an entire product in a single run. The most common way to use MAKECAB.EXE is to supply a directives file that controls how files are compressed and stored into one or more cabinets.

This section includes:

MAKECAB.EXE Syntax

MAKECAB.EXE Directive File Syntax


MAKECAB.EXE Syntax

There are two primary forms of MAKECAB.EXE usage. The first is used for compressing a single file, while the second is used for compressing multiple files.

MAKECAB  [/Vn] [/D variable=value ...] [/L directory] source [destination]
MAKECAB  [/Vn] [/D variable=value ] /F directives_file [...]

The parameters are described below.

Parameter            Description
source               A file to be compressed.
destination          The name of the file to receive the compressed version of the source file. If not supplied, a default destination name is constructed from the source file name according to the rules defined by the CompressedFileExtensionChar variable. You can use /D CompressedFileExtensionChar=c on the command line to change the appended character.
/D variable=value    Set variable to be equal to value. Equivalent to using the .Set command in the directives file. For example, a single directive file could be used to produce layouts for different disk sizes by running MakeCAB once with different values of MaxDiskSize defined: /D MaxDiskSize=1.44M. Both standard MakeCAB variables and custom variables may be defined in this way. If .Option Explicit is specified in a directive file, then variable must be defined with a .Define command in a directive file.
/L directory         Specifies an output directory where the compressed file will be placed (most useful when destination is not supplied).
/F directives_file   A file containing commands for MAKECAB.EXE to execute. If more than one directive file is specified (/F file1 /F file2 ...), they are processed in the order (left to right) specified on the command line. Variable settings, open cabinets, open disks, etc. are all carried forward from one directive file to the next (just as if all of the files had been concatenated together and presented as a single file to MakeCAB). For example, this is intended to simplify the work for a product shipped in multiple languages. There would be a short, language-specific directives file, and then a single, large master directives file that covers the bulk of the product.
/Vn                  Set debugging verbosity level (0=none,...,3=full)
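
When MAKECAB.EXE is driven from a build script, the second form of the command line can be assembled programmatically. The helper below is a hypothetical Python sketch that only builds the argument list from the parameters described above; it assumes that several /D settings may be given, and it does not run the tool.

```python
def makecab_args(directive_files, variables=None, verbosity=None):
    """Build an argument list for the /F form of the MAKECAB command line."""
    args = ["MAKECAB"]
    if verbosity is not None:
        args.append(f"/V{verbosity}")
    for name, value in (variables or {}).items():
        args += ["/D", f"{name}={value}"]
    for ddf in directive_files:  # processed left to right by MakeCAB
        args += ["/F", ddf]
    return args


if __name__ == "__main__":
    print(" ".join(makecab_args(["lang.ddf", "product.ddf"],
                                {"MaxDiskSize": "1.44M"}, verbosity=1)))
```

The resulting list can be passed to a process launcher such as subprocess.run on a system where MAKECAB.EXE is installed.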


MAKECAB.EXE Directive File Syntax

Before diving into the details of the syntax of the directives file, here is an example of what the Excel directives file might look like:

;*** EXCEL MAKECAB Directive file
;
.Set DiskLabel1=Setup                ; Label of first disk
.Set DiskLabel2=Program              ; Label of second disk
.Set DiskLabel3="Program Continued"  ; Label of third disk
.Set CabinetNameTemplate=EXCEL*.CAB  ; EXCEL1.CAB, EXCEL2.CAB, etc.
.Set MaxDiskSize=1.44M               ; 3.5" disks

;** Setup.exe and setup.inf are placed uncompressed in the first disk
.Set Cabinet=off
.Set Compress=off
bin\setup.exe                        ; Just copy SETUP.EXE as is
bin\setup.inf                        ; Just copy SETUP.INF as is
;** The rest of the files are stored, compressed, in cabinet files
.Set Cabinet=on
.Set Compress=on
bin\excel.exe                        ; Big EXE, will span cabinets
bin\excel.hlp
bin\olecli.dll
bin\olesrv.dll
...

Here are some additional notes on the general syntax and behavior of MakeCAB Directive Files:

  1. MakeCAB will place files on disks (and in cabinets) in the order they are specified in the directive file(s).
  2. Whenever a filename or directory is called for, you may supply either a relative (e.g., foo\bar, ..\foo) or an absolute (e.g., c:\banana, x:\slm\src\bin) path.
  3. Optimal compression is achieved when files with similar types of data are grouped together.
  4. MakeCAB is controlled in large part by setting variables. MakeCAB has many predefined variables, all of which have default values chosen to represent the most common case. You can modify these variables, and you can define your own variables as well.
  5. The value of a variable is retrieved by enclosing the variable name in percent (%) signs. If the variable is not defined, an error is generated. If you want an explicit percent sign, use two adjacent percent signs (%%). MakeCAB will collapse this to a single percent sign (%).
  6. Variable substitution is only done once. For example, .Set A=One (A is "One"); .Set B=%%A%% (B is "%A%"); .Set C=%B% (C is "%A%", not "One").
  7. Variable substitution is done before any other line parsing, so variables can be used anywhere.
  8. Variable values may include blanks. Quote (") or apostrophe (') marks may be used in .Set statements to capture blanks. If you want an explicit quote (") or apostrophe ('), you can intermix these two marks (use one for bracketing so that you may specify the other), or, as with the percent sign above, you can specify two adjacent marks ("") and MakeCAB will collapse this to a single mark (").
  9. All sizes are specified in bytes.
  10. There are a few special values for common disk sizes (CDROM, 1.44M, 1.2M, 720K, 360K) that can be used for any of the predefined MakeCAB variables that describe the attributes of a disk (MaxDiskSize, ClusterSize, MaxDiskFileCount). MakeCAB has built-in knowledge about the correct values of these attributes for these common disk sizes.
  11. MakeCAB does not check for 8.3 filename limitations directly, but rather depends upon the underlying operating system to do filename validity checking (this will allow MakeCAB to work with long file names.)
  12. MakeCAB makes two passes of the directive file(s). On the first pass, MakeCAB checks for syntax errors and makes sure that all of the files can be found. This is very fast, and reduces the chance that the second pass, where the actual data compression occurs, will have any problems. This is important because compression is very time consuming, so MakeCAB wants to avoid, for example, spending an hour compressing files only to find that a file toward the end of the directive file(s) cannot be found.
  13. There is a limit of 1024 characters per line in a Directive File.
  14. There is a limit of 1024 lines in a Directive File.
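
The substitution rules in notes 5 and 6 can be modeled with a short Python sketch. This is an illustration of the documented behavior, not MakeCAB's actual implementation; the function name substitute is hypothetical, and treating variable names as case-sensitive is an assumption made only for simplicity.

```python
def substitute(text, variables):
    """Expand %name% references exactly once; '%%' becomes a literal '%'."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != "%":
            out.append(text[i])
            i += 1
        elif text.startswith("%%", i):
            out.append("%")                  # two adjacent percent signs collapse to one
            i += 2
        else:
            end = text.find("%", i + 1)
            if end == -1:
                raise ValueError("unterminated variable reference")
            name = text[i + 1:end]
            if name not in variables:
                raise KeyError("undefined variable: " + name)
            out.append(variables[name])      # inserted as-is, never re-scanned
            i = end + 1
    return "".join(out)


variables = {"A": "One"}
variables["B"] = substitute("%%A%%", variables)  # B is "%A%"
variables["C"] = substitute("%B%", variables)    # C is "%A%", not "One"
print(variables)
```

Because the substituted value is appended to the output and never re-scanned, C ends up as the literal text %A%, which matches the example in note 6.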

This section includes:

Command Summary

Variable Summary

InfDisk/Cabinet/FileLineFormat Syntax and Semantics

INF Parameters

Command Details

Variable Details

Command Summary

The following table provides a summary of the MakeCAB Directive File syntax. Directives begin with a period ("."), followed by a command name, and possibly by blank delimited arguments. Note that a File Copy command is distinguished from a File Reference command by the setting of the GenerateInf variable.

Syntax                                                 Description
;                                                      Comment (anywhere on a DDF line)
src [dest] [/inf=yes|no] [/unique=yes|no] [/x=y ...]   File Copy command
dest [/x=y ...]                                        File Reference command
.Define variable=[value]                               Define variable to be equal to value (see .Option Explicit)
.Delete variable                                       Delete a variable definition
.Dump                                                  Display all variable definitions
.InfBegin Disk | Cabinet | File                        Copy lines to specified INF file section
.InfEnd                                                End an .InfBegin section
.InfWrite string                                       Write "string" to file section of INF file
.InfWriteCabinet string                                Write "string" to cabinet section of INF file
.InfWriteDisk string                                   Write "string" to disk section of INF file
.New Disk | Cabinet | Folder                           Start a new Disk, Cabinet, or Folder
.Option Explicit                                       Require .Define first time for user-defined variables
.Set variable=[value]                                  Set variable to be equal to value
%variable%                                             Substitute value of variable
<blank line>                                           Blank lines are ignored

Variable Summary

Standard Variables                      Description
Cabinet=ON | OFF                        Turns Cabinet Mode on or off
CabinetFileCountThreshold=count         Threshold count of files per Cabinet
CabinetNamen=filename                   Cabinet file name for cabinet number n
CabinetNameTemplate=template            Cabinet file name template; * is replaced by Cabinet number
ChecksumWidth=1 | 2 | ... | 8           Max low-order hex digits displayed by INF csum parameter
ClusterSize=bytesPerCluster             Cluster size on diskette (default is 512 bytes)
Compress=ON | OFF                       Turns compression on or off
CompressedFileExtensionChar=char        Last character of the file extension for compressed files
CompressionMemory=15 | 16 | ... | 21    The window size for LZX compression
CompressionType=MSZIP | LZX             Compression engine
DestinationDir=path                     Default path for destination files (stored in cabinet file)
DiskDirectoryn=directory                Output directory name for disk n
DiskDirectoryTemplate=template          Output directory name template; * is replaced by disk number
DiskLabeln=label                        Printed disk label name for disk n
DiskLabelTemplate=template              Printed disk label name template; * is replaced by disk number
DoNotCopyFiles=ON | OFF                 Controls whether files are actually copied (ACME ADMIN.INF)
FolderFileCountThreshold=count          Threshold count of files per Folder
FolderSizeThreshold=size                Threshold folder size for current folder
GenerateInf=ON | OFF                    Control Unified vs. Relational INF generation mode
InfXxx=string                           Set default value for INF Parameter Xxx
InfCabinetHeader[n]=string              INF cabinet section header text
InfCabinetLineFormat[n]=format string   INF cabinet section detail line format
InfCommentString=string                 INF comment string
InfDateFormat=yyyy-mm-dd | mm/dd/yy     INF date format
InfDiskHeader[n]=string                 INF disk section header text
InfDiskLineFormat[n]=format string      INF disk section detail line format
InfFileHeader[n]=string                 INF file section header text
InfFileLineFormat[n]=format string      INF file section detail line format
InfFileName=filename                    Name of INF file
InfFooter[n]=string                     INF footer text
InfHeader[n]=string                     INF header text
InfSectionOrder=[D | C | F]*            INF section order (disk, cabinet, file)
MaxCabinetSize=size                     Maximum cabinet file size for current cabinet
MaxDiskFileCount=count                  Maximum count of files per Disk
MaxDiskSize[n]=size                     Maximum disk size
MaxErrors=count                         Maximum errors allowed before pass 1 terminates
ReservePerCabinetSize=size              Base amount of space to reserve for FCRESERVE data
ReservePerDataBlockSize=size            Amount of space to reserve in each data block
ReservePerFolderSize=size               Amount of additional space in FCRESERVE for each folder
RptFileName=filename                    Name of RPT file
SourceDir=path                          Default path for source files
UniqueFiles=ON | OFF                    Control whether duplicate destination file names are allowed

InfDisk/Cabinet/FileLineFormat Syntax and Semantics

The InfDiskLineFormat, InfCabinetLineFormat, and InfFileLineFormat variables are used to control the formatting of the "detail" lines in the INF file. The syntax of the values assigned to these variables is as follows:

  1. The "*" character is used to bracket replaceable parameters.
  2. Two "*" characters in a row ("**") are replaced by a single "*".
  3. A replaceable parameter name may be one of the standard ones defined by MakeCAB, or it may be a custom parameter. The value used for a parameter is found in the following order:
    1. If the parameter is specified on a File Copy or File Reference command, the specified value is used.
    2. If a variable InfXxx is defined for this parameter, its value is used.
    3. If the parameter is a standard parameter, its defined value is used.
  4. Braces "{}" may be used to indicate portions of text plus exactly one parameter that are omitted if the parameter value is blank. For example, "{*id*,}*file*,*size*" will generate the following strings, depending upon the values of id, file, and size:
    id        file      size   Output String
    (blank)   foo.dat   23     foo.dat,23
    17        foo.dat   23     17,foo.dat,23
    17        (blank)   23     17,,23
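
The formatting rules above can be sketched in Python as follows. This is a simplified model under stated assumptions, not MakeCAB's code: the function names are hypothetical, parameter values are passed in as an already-resolved dictionary, and a missing parameter is treated as blank (MakeCAB itself reports an error when no matching InfXxx variable exists).

```python
def _expand(segment, params):
    """Expand *name* references; '**' is a literal '*'. Returns (text, values)."""
    out, values, i = [], [], 0
    while i < len(segment):
        if segment[i] != "*":
            out.append(segment[i])
            i += 1
        elif segment.startswith("**", i):
            out.append("*")
            i += 2
        else:
            end = segment.index("*", i + 1)
            value = params.get(segment[i + 1:end], "")  # assumption: missing means blank
            out.append(value)
            values.append(value)
            i = end + 1
    return "".join(out), values


def format_inf_line(fmt, params):
    """Build one INF detail line; a {...} group is dropped if its parameter is blank."""
    out, i = [], 0
    while i < len(fmt):
        if fmt[i] == "{":
            end = fmt.index("}", i)
            text, values = _expand(fmt[i + 1:end], params)
            if any(v.strip() for v in values):
                out.append(text)
            i = end + 1
        else:
            nxt = fmt.find("{", i)
            if nxt == -1:
                nxt = len(fmt)
            out.append(_expand(fmt[i:nxt], params)[0])
            i = nxt
    return "".join(out)


fmt = "{*id*,}*file*,*size*"
print(format_inf_line(fmt, {"id": "", "file": "foo.dat", "size": "23"}))
print(format_inf_line(fmt, {"id": "17", "file": "foo.dat", "size": "23"}))
print(format_inf_line(fmt, {"id": "17", "file": "", "size": "23"}))
```

The three calls correspond to the three rows of the table: the braced group disappears only when id is blank, while a blank file outside the braces still leaves its comma in place.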

INF Parameters

The following table lists the standard parameters that may be specified in INF line formats and on File Copy and File Reference commands. The Disk, Cab, and File columns indicate which parameters are supported in the InfDiskLineFormat, InfCabinetLineFormat, and InfFileLineFormat, respectively. In addition, the File column also indicates which parameters may be specified on the File Copy and File Reference commands.

Parameter  Disk  Cab  File  Description
attr       -     -    Yes   File attributes (A=archive, R=read-only, H=hidden, S=system)
cab#       -     Yes  Yes   Cabinet number (0 means not in cabinet, 1 or higher is cabinet number)
cabfile    -     Yes  -     Cabinet file name
csum       -     -    Yes   Checksum
date       -     -    Yes   File date (mm/dd/yy or yyyy-mm-dd, depending upon InfDateFormat)
disk#      Yes   Yes  Yes   Disk number (1-based)
file       -     -    Yes   Destination file name in layout (in cabinet or on a disk)
file#      -     -    Yes   Destination file number in layout (first file is 1, second file is 2, ...); the order of File Copy Commands controls the file number, so in relational INF mode the order of File Reference Commands has no effect on the file number.
label      Yes   -    -     Disk user-readable label (value comes from DiskLabeln, if defined, and otherwise is constructed from DiskLabelTemplate).
lang       -     -    Yes   Language (i.e., VER.DLL info) in base 10, blank separated if multiple values
size       -     -    Yes   File size (only affects value written to INF file)
time       -     -    Yes   File time (hh:mm:ss[a|p])
ver        -     -    Yes   Binary File version (n.n.n.n base 10 format)
vers       -     -    Yes   String File version -- can be different from ver!

Just as custom INF parameters can be defined by using the .Define and .Set commands (e.g., .Set InfCustom=default value), the .Set command can also be used to override the values of the standard parameters. This is most obviously useful for the date and time parameters, as it provides a simple way to "date stamp" all the files in a layout. For the attr parameter, it provides a way to force a consistent set of file attributes (commonly used to clear the read-only and archive attribute bits).

Command Details

;
A comment line.
A comment may appear anywhere in a directive file. In addition, any line may include a comment at the end. Any text on the line following the semicolon is ignored.

source [destination] [/INF= YES | NO] [/UNIQUE=YES | NO] [/x=y [/x=y ...]]
A File Copy Command; specifies a file to be placed onto a disk or cabinet. If GenerateInf is OFF, then lines without leading periods are interpreted as File Copy Commands.

source is a file name, and may include a relative or absolute path specification. The SourceDir variable is applied first, if specified.

destination is the name to store in the cabinet file (if Cabinet is On), or the name for the destination file (if Cabinet is Off). The DestinationDir variable is used as a prefix.

/INF=YES | NO controls whether destination must be specified in a Reference command in the INF section of the DDF. If YES is specified (the default), then destination must be specified in at least one Reference command. If NO is specified, then destination does not have to be specified in any Reference command. This parameter is used only if Relational INF mode is selected (see the GenerateInf variable), as Unified mode does not support Reference commands.

/UNIQUE=YES | NO controls whether destination must be unique throughout the layout. Specifying this parameter on the file copy command overrides the default setting controlled by the UniqueFiles variable (which defaults to YES). If Relational INF mode is selected (see the GenerateInf variable), then UniqueFiles must be YES.

/x=y permits standard and custom INF parameters to be applied to a file copy command. These parameters are carried along with the file by MakeCAB and used to format file detail lines in the INF file. In addition, the /Date, /Time, and /Attr parameters also control the values that are placed in the cabinet files or on the disk layout (for files outside of a cabinet). This permits a great deal of flexibility in customizing the INF file format. A parameter "x" is defined to have the value "y" (which may be empty). Quotes can be used in "y" to include blanks or other special characters. If a parameter "x" is also defined on a File Reference command, that setting overrides any setting for "x" specified on the referred to File Copy command. See INF Parameters for a list of standard parameters.

NOTE: You must define a variable InfX if you are going to use /X=y on a File Copy (or File Reference) command. If no such variable is defined, then /X=y will generate an error. This behavior ensures that there is a default value for every parameter, and makes it easier to catch inadvertent typing errors.

If the destination is not specified, its default value depends upon the Cabinet and Compress variables, as indicated by the following table, using BIN\EXCEL.EXE as a sample source file name. Note that the variable CompressedFileExtensionChar controls the actual character used to indicate a compressed file. Note also that the DestinationDir variable is prefixed to the destination name before it is stored in the cabinet file.

                Compress = OFF                                 Compress = ON
Cabinet = OFF   EXCEL.EXE -- uncompressed, not in a cabinet.   This scenario is not supported -- see note below.
Cabinet = ON    EXCEL.EXE -- uncompressed, in a cabinet.       EXCEL.EXE -- compressed, in a cabinet.

NOTE: Compressing a single file is generally not a good idea, as better compression is achieved by compressing across file boundaries (hence cabinet files). For this reason, MakeCAB does not support this case.

Examples:

.Set Compress=OFF             ; Turn off compression
.Set Cabinet=OFF              ; No cabinet file
setup.exe /inf=no             ; Setup is put on disk 1, won't be in INF
setup.inf                     ; Classic chicken & the egg problem

.Set Compress=ON              ; Turn compression on
readme.txt                    ; Placed on disk 1 as README.TX_
.Set Cabinet=ON               ; Turn cabinet file creation on
bin\excel.exe                 ; Placed in cabinet as EXCEL.EXE
msdraw.exe msapps\msdraw.exe  ; Placed in cabinet as MSAPPS\MSDRAW.EXE
a.txt dup.txt /unique=no      ; Another dup.txt is allowed
b.txt dup.txt /unique=no      ; And here it is

destination [/x=y [/x=y ...]]
A File Reference Command; specifies that information for a file (previously specified in a File Copy command) is to be written to the File section of the INF file. This command is only supported in Relational INF mode. If GenerateInf is ON, then lines without leading periods are interpreted as File Reference Commands.

destination is the name of a file previously specified in a File Copy command as the destination in the layout (not the source!). Therefore, UniqueFiles is required to be ON.

/x=y permits standard and custom INF parameters to be applied to a file reference command. These parameters are merged with any parameters specified on the referenced File Copy command, with parameters on the File Reference command taking precedence.

A parameter "x" is defined to have the value "y" (which may be empty). Quotes can be used in "y" to include blanks or other special characters. See INF Parameters for a list of standard parameters.

NOTE: You must define a variable InfX if you are going to use /X=y on a File Reference (or File Copy) command. If no such variable is defined, then /X=y will generate an error. This behavior ensures that there is a default value for every parameter, and makes it easier to catch inadvertent typing errors.

Examples:

.Set GenerateInf=OFF     ; Relational INF mode; file layout
setup.exe /inf=no        ; Setup is put on disk 1, won't be in INF
readme.txt
shared.dll /special=yes  ; Custom parameter

.Set GenerateInf=ON      ; INF section of DDF
.InfWrite [Common]
readme.txt
.InfWrite [One]
shared.dll /special=no   ; Override parm on file copy command
.InfWrite [Two]
shared.dll               ; Use /special value from file copy
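
The precedence among InfX defaults, File Copy parameters, and File Reference parameters can be modeled as a dictionary merge. The Python sketch below is illustrative only; the function name resolve_parameters is hypothetical, and parameter names are assumed to be already normalized to lower case.

```python
def resolve_parameters(defaults, copy_params, reference_params=None):
    """Merge parameter values; later sources override earlier ones.

    defaults          values of the InfX variables (one per known parameter)
    copy_params       /x=y settings from the File Copy command
    reference_params  /x=y settings from the File Reference command
    """
    reference_params = reference_params or {}
    for name in list(copy_params) + list(reference_params):
        if name not in defaults:
            # Mirrors the NOTE: /X=y without a variable InfX is an error.
            raise KeyError("no variable Inf%s defined for /%s" % (name, name))
    return {**defaults, **copy_params, **reference_params}


defaults = {"special": ""}             # as if .Set InfSpecial= had been given
copy = {"special": "yes"}              # shared.dll /special=yes
print(resolve_parameters(defaults, copy, {"special": "no"}))  # section [One]
print(resolve_parameters(defaults, copy))                     # section [Two]
```

In the example above, the reference under [One] resolves special to no, while the reference under [Two] falls back to the yes value given on the File Copy command.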

.Define variable=[value]
Define variable to be equal to value.

To use variable, surround it with percent signs (%) -- %variable%.

Using an undefined variable is an error, and will cause MakeCAB to stop before pass 2.

value may include references to other variables.

Leading and trailing blanks in value are discarded.

Blanks may be enclosed in quote (") or apostrophe (') marks.

Explicit percent signs (%), quotes ("), or apostrophes (') must be specified twice.

NOTE: If .Option Explicit is specified, then you must first use .Define to define any user-defined variables before you can use .Set to modify them. For standard MakeCAB variables, .Define is not permitted, and only .Set may be used on them. If .Option Explicit is not specified, then .Define is equivalent to .Set.

Examples:

.Define lang=ENGLISH                ; Set language
.Define country=USA                 ; Set country
.Define SourceDir=%lang%\%country%  ; SourceDir = [ENGLISH\USA]
.Define join=%lang%%country%        ; join = [ENGLISHUSA]
.Define success=100%%               ; success = [100%]
.Define SourceDir=                  ; SourceDir = []
.Define contraction="don't"         ; contraction = [don't]
.Define contraction=don''t          ; contraction = [don't]
.Define someSpaces=  hi there       ; someSpaces = [hi there]
.Define someMore="  blue dog  "     ; someMore = [  blue dog  ]

.Delete variable
Delete a variable definition.

You may only delete variables that have been created by .Define or .Set commands. Standard MakeCAB variables may not be deleted.

Examples:

.Set myVariable=raisin
.Delete myVariable      ; Delete myVariable

.Dump
Display the entire MakeCAB variable table.

This command can be used to aid debugging of complicated (or not so complicated) MakeCAB directive files. Note that the dump will be displayed during pass 1 and again during pass 2.

Examples:

.Dump               ; Dump variable table to stdout

.InfBegin DISK | CABINET | FILE
Start a block of one or more lines to write to the specified area of the INF file.

The lines in the block will be copied unmodified to the specified section of the INF file, so no MakeCAB variable substitution will be performed. Similarly, MakeCAB will not strip comments.

Use .InfWrite, .InfWriteCabinet, or .InfWriteDisk if you need variable substitution.

Examples:

.InfBegin disk                ; Text for disk section of INF file
;This is a comment for the disk section.  MakeCAB will not process
;this line, so, for example, %var% will not be substituted.
.InfEnd

.InfEnd
Terminate an .InfBegin block.

Examples:

.InfEnd            ; Close an .InfBegin block

.InfWrite string
Write string to the file area of the INF file.

Note that lines will have MakeCAB comments removed and variable values substituted. If you want to avoid this processing, use the .InfBegin File command. Leading whitespace is normally removed, but you can override this by placing whitespace in quotes (see examples below).

Examples:

.InfWrite [A Section Header]  ; Text for file section, this comment
                              ;    will not appear.

.InfWrite ;<disk>,<file>      ; MakeCAB strips off the comments, so this
                              ;    command just writes a blank line!

.InfWrite ";<disk>,<file>"    ; Get that comment in the INF file

.InfWrite "  "%someVar%       ; Get leading space on the INF line

.InfWriteCabinet string
Write string to the cabinet area of the INF file.

Note that lines will have MakeCAB comments removed and variable values substituted. If you want to avoid this processing, use the .InfBegin Cabinet command.

Examples:

.InfWriteCabinet 40%% off your favorite furniture  ; %% collapse down to
                     ; one %, because MakeCAB does variable
                     ; substitution on the string.

.InfWriteDisk string
Write string to the disk area of the INF file.

Note that lines will have MakeCAB comments removed and variable values substituted. If you want to avoid this processing, use the .InfBegin Disk command.

Examples:

.InfWriteDisk The Rain in Spain falls Mainly on the Plain

.New Disk | Cabinet | Folder
Force a disk, cabinet, or folder break.

This is used to complete the current disk, cabinet, or folder, and start a new one.

Examples:

.New Disk     ; Start a new disk
.New Cabinet  ; Start a new cabinet
.New Folder   ; Start a new folder

.Set variable=[value]
Set variable to be equal to value.

To use variable, surround it with percent signs (%) -- %variable%.

Using an undefined variable is an error, and will cause MakeCAB to stop before pass 2.

value may include references to other variables.

value may be empty, in which case variable is set to the empty string.

Leading and trailing blanks in value are discarded.

Blanks may be enclosed in quote (") or apostrophe (') marks.

Explicit percent signs (%), quotes ("), or apostrophes (') must be specified twice.

NOTE: If .Option Explicit is specified, then you must first use .Define to define any user-defined variables before you can use .Set to modify them. For standard MakeCAB variables, .Define is not permitted, and only .Set may be used on them.

Examples:

.Set lang=ENGLISH                ; Set language
.Set country=USA                 ; Set country
.Set SourceDir=%lang%\%country%  ; SourceDir = [ENGLISH\USA]
.Set join=%lang%%country%        ; join = [ENGLISHUSA]
.Set success=100%%               ; success = [100%]
.Set SourceDir=                  ; SourceDir = []
.Set contraction="don't"         ; contraction = [don't]
.Set contraction=don''t          ; contraction = [don't]
.Set someSpaces=  hi there       ; someSpaces = [hi there]
.Set someMore="  blue dog  "     ; someMore = [  blue dog  ]

Variable Details

The standard MakeCAB variables are listed below. These variables are predefined, and each has a default value. The default applies unless you set the variable from the command line (/D var=value), and otherwise until you explicitly set it with a .Define or .Set command in a directive file.

You can create your own variables as well, using the .Define command if you specify .Option Explicit, and the .Set command otherwise.

Cabinet=On | Off
Turns cabinet mode on or off.

Default: .Set Cabinet=On ; Cabinet mode is ON

When cabinet mode is On, the following applies:

  1. Files are stored in a cabinet, whose name is taken from the CabinetNameTemplate variable.
  2. If the compressed size of a file would cause the current Cabinet to exceed the current MaxCabinetSize variable, then as much of the compressed file as possible is stored in the current Cabinet, that Cabinet is closed, and a new Cabinet is created. Note that it is possible for a large file to span multiple Cabinets!
  3. If the compressed size of a file (or set of files, if the files are small) would cause the current Folder to exceed the current FolderSizeThreshold variable, these files are the last ones added to the current Folder, and a new Folder is started for any subsequent files. (See note below.) Note that if the current Folder cannot fit in the current Cabinet, as much as possible of the Folder is stored in the current Cabinet, and the remainder of the Folder is stored in the next Cabinet. This means that it is possible for several files to be continued from one Cabinet file to the next Cabinet file!
NOTE: A Folder is a decompression boundary, so FolderSizeThreshold is advisory rather than a hard limit. To access a file in a Folder, you must start decompressing from the beginning of the Folder, potentially decompressing (and discarding) many files until you arrive at the desired file. The larger a Folder grows, the longer the files near its end take to access. In general, the FolderSizeThreshold variable should be several times larger than 32K to be of any utility.
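
The access cost described in the note can be illustrated with a small Python sketch. It models the trade-off only and is not part of MakeCAB; the function name and the sample sizes are hypothetical.

```python
def bytes_decoded_to_reach(file_sizes, index):
    """Uncompressed bytes a decoder must produce to extract file `index`.

    `file_sizes` lists the uncompressed sizes of the files in one folder,
    in the order they were added. Decoding always starts at the beginning
    of the folder, so every preceding file is decoded as well.
    """
    return sum(file_sizes[: index + 1])


folder = [40_000, 10_000, 250_000, 5_000]   # hypothetical file sizes
for i in range(len(folder)):
    print(i, bytes_decoded_to_reach(folder, i))
```

Extracting the last 5,000-byte file requires decoding everything before it, which is why very large folders trade access time for compression ratio.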

When cabinet mode is Off, the following applies:

  1. The Compress mode must be Off
  2. Files are stored in individual files
  3. If the destination file is not supplied, the default name is controlled by the compression mode (see the Compress variable)

Examples:

.Set Cabinet=OFF   ; Files not in cabinets...
.Set Compress=OFF  ; ...and no compression.
setup.exe          ; Setup program is simply copied to disk.
.Set Cabinet=ON    ; Use a cabinet...
.SET Compress=ON   ; ...and compress remaining files.

CabinetFileCountThreshold=count
Sets a goal for the maximum number of files in a cabinet.

Default: .Set CabinetFileCountThreshold=0 ; Default is no threshold

count is a threshold for the number of files to store in a cabinet. Once this count has been reached, MakeCAB will close the current cabinet as soon as possible. Due to the blocking of files for compression purposes, it is possible that the cabinet will contain more files than specified by this variable.

If count is 0, then there is no limit on the number of files per cabinet.

Examples:

.Set CabinetFileCountThreshold=100  ; Shoot for 100 files per cabinet

CabinetNamen=filename
The cabinet file name for the specified cabinet.

Default: ; By default none of these variables are defined

If this variable is not defined for a particular cabinet, then MakeCAB uses the CabinetNameTemplate to construct the cabinet name.

Examples:

.Set CabinetName1=one.cab

CabinetNameTemplate=template
Sets the cabinet file name template.

Default: .Set CabinetNameTemplate=*.CAB ; 1.CAB, 2.CAB, ...

This template is used to construct the file name of each cabinet. The "*" in this template is replaced by the cabinet number (1, 2, etc.). This variable is used only if no variable CabinetNamen exists for cabinet n.

NOTE: Be sure that the expanded cabinet name does not exceed the limits for your file system! For example, if you used "CABINET*.CAB", and MakeCAB had to create 10 or more cabinets, then you would have cabinet names like CABINET10.CAB, which has a 9.3 form and is therefore an invalid name in the FAT file system. Unfortunately, MakeCAB would not detect this until it had already created 9 cabinets!

Examples:

.Set CabinetNameTemplate=EXCEL*.DIA  ; EXCEL1.DIA, EXCEL2.DIA, etc.
.Set CabinetNameTemplate=*.          ; 1, 2, 3, etc.
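
Since MakeCAB does not detect an over-long cabinet name until it tries to create that cabinet, it can help to preview the names a template will produce. The Python sketch below is a hypothetical helper, not a MakeCAB feature; it assumes the template contains a single "*" and applies a simplified 8.3 length check only.

```python
def cabinet_name(number, template="*.CAB", explicit_names=None):
    """Name for cabinet `number`: CabinetNamen if given, else the template."""
    explicit_names = explicit_names or {}
    if number in explicit_names:
        return explicit_names[number]
    return template.replace("*", str(number))


def fits_8_3(name):
    """Simplified check: base of 1-8 characters, extension of at most 3."""
    base, _, ext = name.partition(".")
    return 1 <= len(base) <= 8 and len(ext) <= 3 and "." not in ext


for n in (9, 10):
    name = cabinet_name(n, "CABINET*.CAB")
    print(name, "ok" if fits_8_3(name) else "too long for 8.3")
```

Running the check for the highest cabinet number you expect catches the CABINET10.CAB problem before any compression work is done.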

ChecksumWidth=1 | 2 | ... | 8
Sets the maximum number of low-order hex digits displayed by the csum parameter of InfFileLineFormat.

Default: .Set ChecksumWidth=8 ; Default is all 8 hex digits (csum is a 32-bit value)

The presence of the csum parameter in the InfFileLineFormat variable causes MakeCAB to compute a 32-bit CRC for each file and write that checksum to the INF file. While leading zeros are not written out, the presence of these checksums can significantly increase the size of the INF file. You can use ChecksumWidth to restrict the size of the checksum written to the INF file. If a value less than 8 is specified, then MakeCAB will mask off the high-order bits of the 32-bit checksum to produce a value for the INF file that is at most the number of hex digits specified.

Examples:

.Set ChecksumWidth=4  ; Only display the low order 4 hex digits
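
The masking described above amounts to keeping the low-order 4 x width bits of the 32-bit value. A minimal Python sketch follows; the function name is hypothetical, and the use of upper-case hex digits is an assumption, since the text does not state the case MakeCAB writes.

```python
def format_csum(checksum, width=8):
    """Return the low-order `width` hex digits of a 32-bit checksum.

    Leading zeros are not written, so the result may be shorter than `width`.
    """
    if not 1 <= width <= 8:
        raise ValueError("ChecksumWidth must be between 1 and 8")
    masked = checksum & ((1 << (4 * width)) - 1)   # mask off the high-order bits
    return format(masked, "X")


print(format_csum(0x12345678))      # all 8 digits
print(format_csum(0x12345678, 4))   # low-order 4 digits only
print(format_csum(0x12340078, 4))   # leading zeros dropped
```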

ClusterSize=bytesPerCluster
The cluster size of the distribution media.

Default: .Set ClusterSize=512 ; 1.44M and 1.2M floppies have 512-byte clusters

This is used by MakeCAB to round up the sizes of files and cabinets to a cluster boundary, so it can determine when to switch to the next disk.

You can use a standard disk size from the following list, and MakeCAB will supply the known cluster size for that disk size:

  • 1.44M
  • 1.25M (Japanese NEC 3.5" drive capacity)
  • 1.2M
  • 720K
  • 360K
  • CDROM

Examples:

.Set ClusterSize=1.44M  ; Use known 1.44M floppy info
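
The rounding that ClusterSize controls can be expressed in one line. The Python sketch below shows only the round-up arithmetic, under the assumption that sizes are plain byte counts; how MakeCAB accounts for the rest of the disk is not modeled, and the function name is hypothetical.

```python
def round_up_to_cluster(size, cluster_size=512):
    """Space a file of `size` bytes occupies when allocated in whole clusters."""
    if cluster_size <= 0:
        raise ValueError("cluster size must be positive")
    return -(-size // cluster_size) * cluster_size   # ceiling division


for size in (1, 512, 513, 100_000):
    print(size, round_up_to_cluster(size))
```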

Compress=ON | OFF
Turn file compression on or off.

Default: .Set Compress=On ; Compression is on

While compression is usually on, you generally turn it off for the first few files on disk 1 (SETUP.EXE, for example). If the Cabinet setting is Off, then compression must also be Off; it is invalid to create single-file cabinets when processing multiple files.

Examples:

.Set Cabinet=OFF   ; Files not in cabinets...
.Set Compress=OFF  ; ...and no compression.
setup.exe          ; Setup program is simply copied to disk.
.Set Cabinet=ON    ; Use a cabinet...
.SET Compress=ON   ; ...and compress remaining files.

CompressedFileExtensionChar=char
Last character in file name used when compressing an individual file.

Default: .Set CompressedFileExtensionChar=_ ; Default is an underscore ("_")

When compressing an individual file through the command line, the output file name is constructed from the source file name. If the extension has three characters, its last character is replaced by the setting of this variable; if the extension is shorter or missing, the character is appended instead (see the examples below). You can use /D CompressedFileExtensionChar=c on the command line to change the character. If it is not supplied, then the default value is used.

This variable is ignored if used in a directive file.

Examples:

MAKECAB.EXE /D CompressedFileExtensionChar=$  SAMPLE.EXE ; SAMPLE.EXE => SAMPLE.EX$
                                                         ; SAMPLE.EX  => SAMPLE.EX$
                                                         ; SAMPLE.E   => SAMPLE.E$
                                                         ; SAMPLE.    => SAMPLE.$
                                                         ; SAMPLE     => SAMPLE.$
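
The naming pattern in the examples above can be reproduced with a short Python sketch. The rule is inferred from those five examples only; the function name is hypothetical, the input is assumed to be a bare file name without a directory, and the handling of extensions longer than three characters is an assumption.

```python
def compressed_name(filename, char="_"):
    """Derive the compressed file name following the examples above."""
    base, dot, ext = filename.rpartition(".")
    if not dot:                      # no extension at all: SAMPLE -> SAMPLE.$
        return filename + "." + char
    if len(ext) >= 3:                # full extension: replace its last character
        ext = ext[:-1] + char
    else:                            # short or empty extension: append instead
        ext = ext + char
    return base + "." + ext


for name in ("SAMPLE.EXE", "SAMPLE.EX", "SAMPLE.E", "SAMPLE.", "SAMPLE"):
    print(name, "=>", compressed_name(name, "$"))
```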

CompressionMemory=15 | 16 | ... | 21
Controls the size of the LZX sliding window.

Default: .Set CompressionMemory=18 ; Default window size 2^18 bytes (256KB).

LZX can use a variety of window sizes from 2^15 bytes (32KB) to 2^21 bytes (2MB). Larger values will generally produce better compression ratios, but will require more time and memory during both compression and decompression.

This variable is ignored unless CompressionType is LZX.

Examples:

.Set CompressionMemory=21  ; 2MB LZX Window Size
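
The CompressionMemory value is an exponent, so the window size in bytes is 2 raised to that value. The following Python sketch simply tabulates the documented range; the function name is hypothetical.

```python
def lzx_window_bytes(compression_memory):
    """Window size in bytes for a CompressionMemory setting of 15 through 21."""
    if not 15 <= compression_memory <= 21:
        raise ValueError("CompressionMemory must be between 15 and 21")
    return 1 << compression_memory


for setting in range(15, 22):
    print(setting, lzx_window_bytes(setting) // 1024, "KB")
```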

CompressionType=MSZIP | LZX
Select compression engine.

Default: .Set CompressionType=MSZIP ; Default is MSZIP compressor

MSZIP is the default compression type supported by Microsoft. This version of MakeCAB.EXE also supports the LZX compression method, which can achieve higher compression ratios.

Using MSZIP compression and FolderSizeThreshold=1 will generate a cabinet file approximately the same size as the archive that a PKZIP-compatible compression engine would produce. LZX compression requires more time, but LZX decompression is typically faster.

Examples:

.Set CompressionType=MSZIP  ; MSZIP compressor

DestinationDir=path
Path prefix to store in cabinet file for each file in the cabinet.

Default: .Set DestinationDir= ; Default is no path prefix

path is concatenated with a path separator ("\") and the target file name on File Copy Commands to produce the file name that is stored in the cabinet file. EXTRACT.EXE will use this file name as the default name when the file is extracted.

Examples:

.Set DestinationDir=SYSTEM  ; Following files get SYSTEM prefix
bin\ARIAL.TTF               ; Name in cabinet is SYSTEM\ARIAL.TTF
.Set DestinationDir=        ; No prefix
bin\ARIAL.TTF               ; Name in cabinet is ARIAL.TTF
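
The concatenation rule can be sketched in Python as follows. This is an illustration rather than MakeCAB's actual code: the function name is hypothetical, and it assumes that when no explicit destination is given on the File Copy Command, the target file name is the base name of the source path, as the example above suggests.

```python
import ntpath


def name_in_cabinet(destination_dir: str, source_path: str) -> str:
    """Build the file name stored in the cabinet for one File Copy Command."""
    # Assumption: the target file name defaults to the base name of the source.
    target = ntpath.basename(source_path)
    # An empty DestinationDir means no prefix is added.
    if not destination_dir:
        return target
    # Otherwise join the prefix and the target name with a backslash.
    return destination_dir + "\\" + target
```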


DiskDirectoryn=directory
The output directory name for the specified disk.

Default: ; By default none of these variables are defined

If this variable is not defined for a particular disk, then MakeCAB uses the DiskDirectoryTemplate to construct the disk directory.

Examples:

.Set DiskDirectory1=disk.one


DiskDirectoryTemplate=template
Set the output directory name template. One directory is created for each disk of the layout.

Default: .Set DiskDirectoryTemplate=DISK* ; Default is DISK1, DISK2, etc.

As MakeCAB processes a directive file, it will create one or more disk "images". Rather than using some specific disk format, however, MakeCAB simply creates one subdirectory for each disk and places the files for each disk in the appropriate directory. If a "*" exists in this variable, then it is replaced with the disk number. If no "*" is specified, then all files are placed in the single directory specified by this variable.

This variable is used only if no variable DiskDirectoryn exists for disk n.
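
The lookup order between DiskDirectoryn and DiskDirectoryTemplate can be modeled with a short Python sketch. The function and the dictionary of variables are hypothetical; the sketch also assumes that every "*" in the template is replaced, which the text does not state explicitly.

```python
def disk_directory(disk_number: int, variables: dict) -> str:
    """Return the output directory for a disk, given the .Set variables."""
    # A disk-specific DiskDirectoryn variable takes precedence.
    specific = variables.get(f"DiskDirectory{disk_number}")
    if specific is not None:
        return specific
    # Otherwise fall back to the template (documented default: DISK*).
    template = variables.get("DiskDirectoryTemplate", "DISK*")
    # Assumption: each "*" is replaced by the disk number; a template
    # without "*" names one directory shared by all disks.
    return template.replace("*", str(disk_number))
```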


Examples:

.Set DiskDirectoryTemplate=C:\EXCEL6\DISK*  ; Put files in separate dirs
.Set DiskDirectoryTemplate=C:\EXCEL6        ; Put all files in C:\EXCEL6
.Set DiskDirectoryTemplate=                 ; Put all files in current dir


DiskLabeln=label
The user-readable text string for the specified disk.

Default: ; By default none of these variables are defined

This label is stored in cabinet files that contain files that are split across disk boundaries, to simplify prompting for the appropriate disk to insert into the drive. For example, if EXCEL.EXE started in 1.CAB and finished in 2.CAB, and a user asked to extract EXCEL.EXE from 2.CAB, EXTRACT.EXE can retrieve the printed label for the disk containing 1.CAB (say, Excel Program Disk 1) and tell the user to insert that disk and try again.

If this variable is not defined for a particular disk, then MakeCAB uses the DiskLabelTemplate to construct the disk label.

Examples:

.Set DiskLabel1="Excel Setup Disk 1"
.Set DiskLabel2="Excel Setup Disk 2"


DiskLabelTemplate=template
Set the printed disk label. Used if individual DiskLabeln variables are not defined.

Default: .Set DiskLabelTemplate="Disk *" ; Default is "Disk 1", "Disk 2", etc.

Sets the default user-readable disk label. If a "*" exists in this variable, then it is replaced with the disk number. This variable is used only if no variable DiskLabeln exists for disk n.

Examples:

.Set DiskLabelTemplate="Excel Disk *"


DoNotCopyFiles=On | Off
Controls whether File Copy Commands actually copy files.

Default: .Set DoNotCopyFiles=Off ; Files are copied

This option is intended to be used when Cabinet is OFF and Compress is OFF, as a means of generating an INF file very quickly. It has no effect when Cabinet is ON or Compress is ON.

Examples:

.Set DoNotCopyFiles=ON      ; Make MakeCAB create the INF file quickly


FolderFileCountThreshold=count
Set the threshold on the number of files to store in a folder.

Default: .Set FolderFileCountThreshold=0 ; Default to no limit on count of files in a folder

Sets the threshold file count for the current folder. When this threshold is exceeded, then the current folder is closed. If any more files are to be processed, they will go into a new folder.

If Cabinet is OFF, this variable is ignored.

If count is 0, then there is no limit on the count of files in a folder.

Examples:

.Set FolderFileCountThreshold=50  ; No more than 50 files per folder


FolderSizeThreshold=size
Set the threshold size for the current folder.

Default: .Set FolderSizeThreshold=0 ; Default to the maximum cabinet size

Sets the threshold size for the current folder. When this threshold is exceeded, then the current folder is closed. If any more files are to be processed, they will go into a new folder. MakeCAB attempts to limit folders to the size specified by this variable, but in most cases folders will be a bit larger than this threshold.

If Cabinet is OFF, this variable is ignored. If size is 0, then the threshold is the same as the maximum cabinet size.

Folders are compression/encryption boundaries. The states of the compressor and cryptosystem are reset at folder boundaries. To access a file in a folder, the folder must be decrypted and decompressed starting from the front of the folder and continuing through to the desired file. Thus, smaller folder thresholds are appropriate for a layout where a small number of files needs to be randomly accessed quickly from a cabinet. On the other hand, larger folder thresholds permit the compressor to examine more data, and so generally yield better compression results. For a layout where the files will be accessed sequentially and most of the files will be accessed, a larger folder threshold is best.
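
The threshold behavior can be illustrated with a simplified Python model. It is a sketch under stated assumptions, not MakeCAB's algorithm: sizes are taken to be uncompressed file sizes, a file is never split between folders, and a threshold of 0 is treated as "no limit" rather than being tied to the maximum cabinet size.

```python
def group_into_folders(file_sizes, size_threshold):
    """Group file sizes into folders, closing a folder once the threshold is exceeded."""
    folders, current, total = [], [], 0
    for size in file_sizes:
        # A file always joins the current folder first ...
        current.append(size)
        total += size
        # ... and the folder is closed only after the threshold is exceeded,
        # which is why folders tend to end up slightly over the threshold.
        if size_threshold and total > size_threshold:
            folders.append(current)
            current, total = [], 0
    if current:
        folders.append(current)
    return folders
```

With four 400-byte files and a threshold of 1000, the first folder holds three files (1200 bytes) and the fourth file starts a new folder.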

Examples:

.Set FolderSizeThreshold=1M  ; Aim for 1MB folders


GenerateInf=ON | OFF
Controls Unified vs. Relational INF generation mode.

Default: .Set GenerateInf=ON ; Default to "unified" INF mode

If GenerateInf is ON when the first file copy command is encountered, then Unified INF mode is selected. In this mode, file detail lines are written to the INF file as file copy commands are processed, so the order of file lines in the INF is exactly the same as the order of the files in the layout.

If GenerateInf is OFF when the first file copy command is encountered, then Relational INF mode is selected. In this mode, file copy commands are processed, but INF file generation is delayed until GenerateInf is set to ON, and File Reference commands are used to select information on files in the layout to be placed in the INF file.

Unified mode is easier to use, since each file is specified only once, and is most appropriate for quick usage of MakeCAB.

Relational mode is more complicated, since each file must be specified (at least) twice, but it provides very fine control of both the disk layout and the format of the INF file. In particular, some INF files want to have sections to list the files associated with a certain feature, there may be many such sections, and some files may be required in more than one section. Unified mode does not provide any method to generate such an INF file, but Relational mode does via the File Reference command.

By separating the disk layout order from the INF file order, MakeCAB permits optimization of the file layout for compression vs. access time. The layout section of the DDF contains file copy commands that control precisely where files are in the layout. The INF section of the DDF contains INF formatting information, including File Reference commands to pull in information about specific files from earlier File Copy commands in the layout section.

NOTE: Once GenerateInf is set to ON and at least one File Copy command has been processed, GenerateInf may not be set to OFF (i.e., in Relational Mode, all File Copy commands must be processed before any File Reference commands).

Examples:

;** Layout section - File Copy commands
.Set GenerateInf=OFF
foo.exe
bar.exe other.exe
foo.exe foo1.exe
....

;** INF section -- File Reference commands
.Set GenerateInf=ON
.WriteInf "[a section]"
foo.exe
other.exe
foo1.exe /rename=sys\foo.exe   ; pass custom parameter
....


InfXxx=string
Sets the default value for an INF parameter.

Default: [Not applicable]

Variables of this form (other than the standard ones in this list) can be used for two purposes:

  1. To override the usual value of a standard INF parameter (like date, time, attr, etc.) for all the files (or a set of files) in the layout.
  2. To define a custom INF parameter, and specify its default value.
NOTE: When in Relational INF mode, only the last value for a particular InfXxx variable will be carried over from the layout section to the INF section of the DDF. In the following example:
;** Layout section - File Copy commands
.Set GenerateInf=OFF    ; Select Relational INF
.Set InfCustom=apple
file.1
.Set InfCustom=pear
file.2
;** INF section - File Reference commands
.Set GenerateInf=ON
file.1                  ; *custom* value is "pear", not "apple"!
file.2

Examples:

.Set InfDate=05/02/94   ; Date stamp all files
.Set InfTime=06:00:00a  ; Time stamp all files
.Set InfAttr=           ; Turn off all attributes (esp. read-only)
.Set InfCustom=yes      ; Define custom INF parameter


InfCabinetHeader[n]=string
Sets the header text for the cabinet section of the INF file.

Default: .Set InfCabinetHeader="[cabinet list]"

This string is written to the INF prior to any cabinet detail lines. MakeCAB will also use any variables of the form InfCabinetHeadern where n is an integer with no leading zeros (0). These additional lines will be printed out in increasing order after the InfCabinetHeader line. Any .InfBegin Cabinet/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.

Examples:

.Set InfCabinetHeader=";Lots o' cabinets"

.Set InfCabinetHeader=                 ; No cabinet header

.Set InfCabinetHeader=";Line 1 of cabinets"
.Set InfCabinetHeader1=";Line 2 of cabinets"
.Set InfCabinetHeader2=";Line 3 of cabinets"


InfCabinetLineFormat[n]=format string
Sets the detail line format for the cabinet section of the INF file.

Default: .Set InfCabinetLineFormat=*cab#*,*disk#*,*cabfile*

This format is used to generate a line in the "cabinet" section of the INF. If a numeric suffix n is specified in the variable name, then the specified format is used for cabinet number n. If no such cabinet number-specific format is defined, then the value of the InfCabinetLineFormat variable is used.
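
The suffix lookup and parameter substitution can be sketched in Python. Both function names are hypothetical, and the sketch handles only the plain *name* form seen in the default value; the full format string syntax is described in the section referenced below and may include features this sketch ignores.

```python
import re


def line_format_for(number: int, variables: dict, base: str = "InfCabinetLineFormat") -> str:
    """Pick the number-specific format if one is defined, else the general one."""
    return variables.get(f"{base}{number}", variables[base])


def render_line(fmt: str, params: dict) -> str:
    """Replace each *name* placeholder with the matching parameter value."""
    # Assumption: placeholders are plain names delimited by asterisks.
    return re.sub(r"\*([^*]+)\*", lambda m: str(params[m.group(1)]), fmt)
```

For example, rendering the default format with cab#=1, disk#=1 and cabfile=1.CAB produces the line 1,1,1.CAB.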

See InfDisk/Cabinet/FileLineFormat Syntax and Semantics for details on the format string.

See INF Parameters for a list of the allowed parameter names.


InfCommentString=string
Sets the line comment string for the INF file.

Default: .Set InfCommentString=";"

This is the string MakeCAB will use to prefix comment lines that it generates in the INF (the autogenerated MakeCAB version/date/time lines, for example).


InfDateFormat=YYYY-MM-DD | MM/DD/YY
Sets the date format used for dates written to the INF file.

Default: .Set InfDateFormat=MM/DD/YY ; Default to normal US convention

This format is used to format the date parameter for the InfFileLineFormat used to write file detail lines to the INF file.
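
For readers who want to reproduce the two layouts outside MakeCAB, the following Python sketch maps each InfDateFormat value to an equivalent strftime pattern. The function name and the mapping table are illustrative assumptions.

```python
from datetime import date

# Hypothetical mapping from InfDateFormat values to strftime patterns.
_STRFTIME_PATTERNS = {
    "MM/DD/YY": "%m/%d/%y",
    "YYYY-MM-DD": "%Y-%m-%d",
}


def format_inf_date(value: date, inf_date_format: str = "MM/DD/YY") -> str:
    """Format a date the way the chosen InfDateFormat would display it."""
    if inf_date_format not in _STRFTIME_PATTERNS:
        raise ValueError("InfDateFormat must be YYYY-MM-DD or MM/DD/YY")
    return value.strftime(_STRFTIME_PATTERNS[inf_date_format])
```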

Examples:

.Set InfDateFormat=YYYY-MM-DD       ; Use the preferred ACME format


InfDiskHeader[n]=string
Sets the header text for the disk section of the INF file.

Default: .Set InfDiskHeader="[disk list]"

This string is written to the INF prior to any disk detail lines. MakeCAB will also use any variables of the form InfDiskHeadern where n is an integer with no leading zeros (0). These additional lines will be printed out in increasing order after the InfDiskHeader line. Any .InfBegin Disk/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.

Examples:

.Set InfDiskHeader=";Lots o' Disks"

.Set InfDiskHeader=      ; No Disk header

.Set InfDiskHeader=";Line 1 of Disks"
.Set InfDiskHeader1=";Line 2 of Disks"
.Set InfDiskHeader2=";Line 3 of Disks"


InfDiskLineFormat[n]=format string
Sets the detail line format for the disk section of the INF file.

Default: .Set InfDiskLineFormat=*disk#*,*label*

This format is used to generate a line in the "disks" section of the INF. If a numeric suffix n is specified in the variable name, then the specified format is used for disk number n. If no such disk number-specific format is defined, then the value of the InfDiskLineFormat variable is used.

See InfDisk/Cabinet/FileLineFormat Syntax and Semantics for details on the format string.

See INF Parameters for a list of the allowed parameter names.


InfFileHeader[n]=string
Sets the header text for the file section of the INF file.

Default: .Set InfFileHeader="[file list]"

This string is written to the INF prior to any file detail lines. MakeCAB will also use any variables of the form InfFileHeadern where n is an integer with no leading zeros (0). These additional lines will be printed out in increasing order after the InfFileHeader line. Any .InfBegin File/.InfEnd lines will be printed as they are encountered, but in any event after all of these header lines.


InfFileLineFormat[n]=format string
Sets the detail line format for the file section of the INF file.

Default: .Set InfFileLineFormat=*disk#*,*cab#*,*file*,*size*

This format is used to generate a line in the "file" section of the INF. If a numeric suffix n is specified in the variable name, then the specified format is used for file number n (file numbers start at 1, and are based on the File Copy Commands, not the File Reference Commands). If no such file number-specific format is defined, then the value of the InfFileLineFormat variable is used.

See InfDisk/Cabinet/FileLineFormat Syntax and Semantics for details on the format string.

See INF Parameters for a list of the allowed parameter names.


InfFileName=filename
Sets the name of the INF output file.

Default: .Set InfFileName=SETUP.INF ; Default file name is SETUP.INF

Defines the file name for the INF file. This file has disk, cabinet, and file information that is intended for use by a setup program during the setup process.

Examples:

.Set InfFileName=EXCEL.INF


InfFooter[n]=string
Sets the footer text for the end of the INF file.

Default: // Run MakeCAB and use the .Dump command to see the default footer

These strings are written to the INF file after all other information. To disable this footer text, set InfFooter to the empty string (.Set InfFooter=). MakeCAB will also use any variables of the form InfFootern where n is an integer with no leading zeros (0). These additional lines will be printed out in increasing order after the InfFooter line, starting with InfFooter1.

The following special strings may be specified in InfFooter[n] values (note that the two percent signs are required, so that MakeCAB does not interpret these as variable references):

String  Description
%%1     The comment string -- each InfFooter[n] line should probably start with %%1.
%%2     The date and time MakeCAB was run to produce the INF file.
%%3     The version of MakeCAB used to produce the INF file.

Examples:

.Set InfFooter=               ; Disable INF footer text
.Set InfFooter="%%1 %%2 %%3"  ; Short footer
.Set InfFooter="%%1*****"     ; Long footer
.Set InfFooter1="%%1* %%2"    ; Long footer continued
.Set InfFooter2="%%1* %%3"    ; Long footer continued
.Set InfFooter3="%%1*****"    ; Long footer continued


InfHeader[n]=string
Sets the header text for the beginning of the INF file.

Default: // Run MakeCAB and use the .Dump command to see the default header.

These strings are written to the INF file prior to any other information. To disable this header text, set InfHeader to the empty string (.Set InfHeader=). MakeCAB will also use any variables of the form InfHeadern where n is an integer with no leading zeros (0). These additional lines will be printed out in increasing order after the InfHeader line, starting with InfHeader1.

The following special strings may be specified in InfHeader[n] values (note that the two percent signs are required, so that MakeCAB does not interpret these as variable references):

String  Description
%%1     The comment string -- each InfHeader[n] line should probably start with %%1.
%%2     The date and time MakeCAB was run to produce the INF file.
%%3     The version of MakeCAB used to produce the INF file.
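
The ordering of numbered header variables and the expansion of the special strings can be modeled in Python. This is only a sketch: both functions are hypothetical, expansion is applied to the text as written in the directive file, and an empty InfHeader is assumed to suppress the numbered lines as well, which the text does not confirm.

```python
import re


def header_lines(variables: dict, prefix: str = "InfHeader") -> list:
    """Return the base line followed by numbered lines in increasing order."""
    # Assumption: an empty or missing base variable disables the whole header.
    if not variables.get(prefix):
        return []
    numbered = []
    for name, value in variables.items():
        # Accept only integer suffixes with no leading zeros.
        match = re.fullmatch(re.escape(prefix) + r"([1-9][0-9]*)", name)
        if match:
            numbered.append((int(match.group(1)), value))
    numbered.sort(key=lambda pair: pair[0])
    return [variables[prefix]] + [value for _, value in numbered]


def expand_special(line: str, comment: str, stamp: str, version: str) -> str:
    """Substitute the comment string, run date/time, and MakeCAB version."""
    for token, value in (("%%1", comment), ("%%2", stamp), ("%%3", version)):
        line = line.replace(token, value)
    return line
```

The same pattern would apply to InfFooter[n] by passing prefix="InfFooter".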

Examples:

.Set InfHeader=               ; Disable INF header text
.Set InfHeader="%%1 %%2 %%3"  ; Short header
.Set InfHeader="%%1*****"     ; Long header
.Set InfHeader1="%%1* %%2"    ; Long header continued
.Set InfHeader2="%%1* %%3"    ; Long header continued
.Set InfHeader3="%%1*****"    ; Long header continued

InfSectionOrder=[D | C | F]*
Set the generation and relative order of the Disk, Cabinet, and File sections in the INF file.

Default: .Set InfSectionOrder=DCF ; Disk, then Cabinet, and then File

This variable controls what sections of the INF file are generated, and the order in which they appear. Each of the letters "C" (cabinet), "D" (disk), and "F" (file) may be used at most once. Any or all of these letters may be omitted, and the corresponding section of the INF file will not be generated.
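
A small Python validator illustrates the rules above. The function name is hypothetical, and treating the letters as case-insensitive is an assumption not stated in the text.

```python
SECTIONS = {"D": "disk", "C": "cabinet", "F": "file"}


def parse_section_order(value: str) -> list:
    """Turn an InfSectionOrder value into an ordered list of section names."""
    order = []
    # Assumption: letters are accepted in either case.
    for letter in value.upper():
        if letter not in SECTIONS:
            raise ValueError(f"unknown section letter: {letter}")
        if SECTIONS[letter] in order:
            raise ValueError(f"section letter used more than once: {letter}")
        order.append(SECTIONS[letter])
    # Sections whose letters were omitted are simply not generated.
    return order
```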

Examples:

.Set InfSectionOrder=DF  ; Disks, then files, omit the cabinet section


MaxCabinetSize=size
Set the maximum size for the current cabinet.

Default: .Set MaxCabinetSize=0 ; No limit, except MaxDiskSize

size is the maximum size for the current cabinet. If Cabinet is ON when this maximum is exceeded, then the current folder being processed will be split between the current cabinet and the next cabinet. If Cabinet is OFF, then this variable is ignored.

Note that MaxDiskSize (or MaxDiskSizen, if specified) takes precedence over this variable. MakeCAB never splits a cabinet file across a disk boundary, so a cabinet file will be no larger than the amount of free space available on the disk at the time the cabinet is created, even if this size is less than MaxCabinetSize.

If size is 0, then the cabinet size is limited only by the disk size (MaxDiskSize or MaxDiskSizen).
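
The precedence between MaxCabinetSize and the disk size can be summarized in a short Python sketch. The function is hypothetical; it uses None to stand for "unlimited" free space, corresponding to a MaxDiskSize of 0.

```python
from typing import Optional


def cabinet_size_limit(max_cabinet_size: int, disk_free: Optional[int]) -> Optional[int]:
    """Return the effective size limit for the next cabinet, or None if unlimited."""
    limits = []
    # MaxCabinetSize=0 means the cabinet itself imposes no limit.
    if max_cabinet_size:
        limits.append(max_cabinet_size)
    # A cabinet is never split across disks, so free disk space also caps it.
    if disk_free is not None:
        limits.append(disk_free)
    return min(limits) if limits else None
```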

Examples:

.Set MaxCabinetSize=0  ; Use disk size as limit


MaxDiskFileCount=count
Sets the maximum number of files that can be stored on a disk.

Default: .Set MaxDiskFileCount=0 ; Default is no limit

count is the maximum number of files to store on a disk. Once this count has been reached, MakeCAB will close the current disk, even if space remains on the disk. This variable is most useful when cabinet files are not being used (say, to simulate the old style setup where each file is individually compressed), and MakeCAB needs to understand the limit of the number of files that can be stored in the root directory of a floppy.

If count is 0, then there is no limit on the number of files per disk.

You can use a standard disk size from the following list, and MakeCAB will supply the known FAT root directory limits for that disk size:

  • 1.44M
  • 1.25M (Japanese NEC 3.5" drive capacity)
  • 1.2M
  • 720K
  • 360K
  • CDROM

The file count does not include any files inside cabinets. Each cabinet counts as a single file for purposes of this count.

Examples:

.Set MaxDiskFileCount=256    ; Limit of 256 files per disk
.Set MaxDiskFileCount=1.44M  ; Use limit for 1.44M FAT floppy disk


MaxDiskSize[n]=size
Set the maximum default size for a disk.

Default: .Set MaxDiskSize=1.44M ; Default is 1.44M floppy

size is the maximum default size for a disk. This variable is used only for disks for which a variable MaxDiskSizen is not defined.

If Cabinet is OFF, and the next file to be laid out cannot fit on the current disk, then MakeCAB will move to the next disk. If Cabinet is ON, then the current cabinet will use as much space on the current disk as possible.

If size is 0, then the disk size is unlimited.

You can use a standard disk size from the following list, and MakeCAB will use the correct disk size, down to the byte:

  • 1.44M
  • 1.25M (Japanese NEC 3.5" drive capacity)
  • 1.2M
  • 720K
  • 360K
  • CDROM

Examples:

.Set MaxDiskSize=0      ; No limit
.Set MaxDiskSize=CDROM  ; All files are being placed on a CD-ROM

.Set MaxDiskSize1=720K  ; First disk is 720K
.Set MaxDiskSize=1.44M  ; ... rest are 1.44M


MaxErrors=count
Set the maximum number of errors allowed before pass 1 terminates.

Default: .Set MaxErrors=20 ; Default is 20 errors

count is the maximum number of errors to permit before terminating pass 1.

If count is 0, then an unlimited number of errors is allowed.

Examples:

.Set MaxErrors=0  ; No limit
.Set MaxErrors=5  ; Limit to just a few


ReservePerCabinetSize=size
Sets a fixed size to reserve in a cabinet for the FCRESERVE structure.

Default: .Set ReservePerCabinetSize=0 ; Default is to reserve no space

size is the amount of space to reserve in a cabinet for the FCRESERVE structure. The total size of the FCRESERVE structure is the value of this variable plus the number of folders in the cabinet times the value of the ReservePerFolderSize variable.

size must be a multiple of 4 (to ensure memory alignment on certain systems).
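
The size calculation described above can be written out as a small Python function. The function name and argument names are illustrative; the arithmetic and the multiple-of-4 rule come from the text.

```python
def fcreserve_size(per_cabinet: int, per_folder: int, folder_count: int) -> int:
    """Total FCRESERVE size: per-cabinet space plus per-folder space for each folder."""
    for name, value in (("ReservePerCabinetSize", per_cabinet),
                        ("ReservePerFolderSize", per_folder)):
        # Both reserve sizes must be multiples of 4.
        if value < 0 or value % 4 != 0:
            raise ValueError(f"{name} must be a non-negative multiple of 4")
    return per_cabinet + folder_count * per_folder
```

For example, with 8 bytes reserved per cabinet, 8 bytes per folder, and 3 folders, the total is 8 + 3 * 8 = 32 bytes.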

A common use for this variable is to reserve space to store per-cabinet cryptosystem information, in the case where the cabinet is encrypted. For example, some sort of checksum value might be stored here to permit validation that the key being used to decrypt the cabinet is actually the one that was used to encrypt the cabinet.

MakeCAB fills this reserved section with zeros.

Examples:

.Set ReservePerCabinetSize=8  ; For use as a cryptosystem key checksum


ReservePerDataBlockSize=size
Sets the amount of space to reserve in each Data Block header.

Default: .Set ReservePerDataBlockSize=0 ; Default is to reserve no space

size is the amount of space to reserve in each Data Block header. This space is located after the standard Data Block header and before the data for the data block.

size must be a multiple of 4 (to ensure memory alignment on certain systems).

One possible use for this variable is to reserve space to store per-data-block cryptosystem information, in the case where the cabinet is encrypted. (See note below.)

NOTE: [6/6/94] Ali Baba is not using this value, so even though it has been implemented and tested, there are no known customers.

MakeCAB fills this reserved section with zeros.

Examples:

.Set ReservePerDataBlockSize=4  ; Reserve 4 bytes per data block


ReservePerFolderSize=size
Sets the amount of additional space to reserve in the FCRESERVE structure for each folder in the cabinet.

Default: .Set ReservePerFolderSize=0 ; Default is to reserve no space

size is the amount of space to reserve in the FCRESERVE structure for each folder in the cabinet. The total size of the FCRESERVE structure is the value of this variable times the value of the number of folders in the cabinet, plus the value of the ReservePerCabinetSize variable.

size must be a multiple of 4 (to ensure memory alignment on certain systems).

A common use for this variable is to reserve space to store a per-folder cryptosystem key, in the case where the cabinet is encrypted.

MakeCAB fills this reserved section with zeros.

Examples:

.Set ReservePerFolderSize=8  ; Size of an RC4 cryptosystem key


RptFileName=filename
Sets the name of the RPT output file.

Default: .Set RptFileName=SETUP.RPT ; Default file name is SETUP.RPT

Defines the file name for the RPT file. This file has summary information on the MakeCAB run.

Examples:

.Set RptFileName=EXCEL.RPT


SourceDir=path
The default path used to locate source files specified in File Copy Commands.

Default: .Set SourceDir= ; Default is to look in the current directory

path is concatenated with a path separator ("\") and the source file name on the File Copy Command to produce the file name used to find the source file.

If path is empty, then the source file name specified on the File Copy Command is not modified.

Examples:

.Set SourceDir=C:\PROJECT  ; Find all source files in c:\project


UniqueFiles=ON | OFF
Controls whether destination file names in a layout must be unique.

Default: .Set UniqueFiles="ON" ; File names must be unique

If UniqueFiles is ON, MakeCAB checks that all destination file names (names stored on disks or in cabinets) are unique, and generates an error (during pass 1) if they are not. ON is the default, since using the same filename twice usually means that the same file was accidentally included twice, and this would be a waste of disk space.

If UniqueFiles is OFF, MakeCAB permits duplicate destination file names.
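
The uniqueness check can be approximated in Python as shown below. The function is hypothetical, it ignores per-file /UNIQUE overrides, and it assumes names are compared without regard to case, which the text does not state.

```python
from collections import Counter


def duplicate_destinations(names: list) -> list:
    """Return destination file names that occur more than once in a layout."""
    # Assumption: comparison is case-insensitive, as on FAT file systems.
    counts = Counter(name.upper() for name in names)
    return sorted(name for name, count in counts.items() if count > 1)
```

A non-empty result corresponds to the situation in which MakeCAB would report an error during pass 1 when UniqueFiles is ON.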

The /UNIQUE parameter may be specified on individual File Copy commands to override the value of UniqueFiles.

If the GenerateInf variable is used to select Relational INF generation, then UniqueFiles must always be ON, since MakeCAB uses the destination filename as the unique key to link File Reference commands back to File Copy commands.


EXTRACT.EXE

Extract supports command-line extraction of files from cabinet files.

extract [/y] [/A] [/D | /E] [/L location] cabinet_file [file_spec ...]
extract [/y] compressed_file [destination_file]

Switches:

/A  Process all files in a cabinet set, starting with the cabinet_file.
/D  Only produce a directory listing (do not extract).
/E  Force extraction.
/L  Use the directory specified by location, instead of the current directory, as the default location to place extracted files.
/Y  Overwrite destination without prompting. The default is to prompt if the destination file already exists, and allow the customer to: a) overwrite the file, b) skip the file, c) overwrite this file and all subsequent files that may already exist, or d) exit.

Parameters:

compressed_file
This is a cabinet file that contains a single file (for example, FOO.EX_ containing FOO.EXE). If destination_file is not specified, then the file is extracted and given its original name in the current directory.
destination_file
This can be either a relative path (".", "..", "c:foo", etc.) or a fully qualified path, and may specify either a file (or files, if wild cards are included) or a directory. If a directory is specified, then the file name stored in the cabinet is used. Otherwise, destination_file is used as the complete file name for the extracted file.
cabinet_file
This is a cabinet file that contains two or more files. If no file_spec parameter is specified, then a list of the files in the cabinet is displayed. If one or more file_spec parameters are specified, then these are used to select which files are to be extracted from the cabinet (or cabinets). Wild cards are allowed to specify multiple cabinets.
location
Specifies the directory where extracted files should be placed.
file_spec
Specifies files to be extracted from the cabinet(s). May contain ? and * wild cards. Multiple file_specs may be supplied.
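
The effect of one or more file_spec parameters can be approximated in Python with the standard fnmatch module. This is a rough model only: the function name is hypothetical, matching is assumed to be case-insensitive, and fnmatch also gives "[" and "]" a special meaning, so its behavior may differ from EXTRACT.EXE for unusual names.

```python
from fnmatch import fnmatchcase


def select_files(cabinet_names: list, file_specs: list) -> list:
    """Return the cabinet entries matched by at least one file_spec."""
    # Assumption: wild card matching ignores case.
    specs = [spec.upper() for spec in file_specs]
    return [
        name for name in cabinet_names
        if any(fnmatchcase(name.upper(), spec) for spec in specs)
    ]
```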

Examples:

Command                  Behavior
EXTRACT foo.ex_          Assuming foo.ex_ contained just the single file foo.exe, then foo.exe would be extracted and placed in the current directory.
EXTRACT foo.ex_ bar.exe  Assuming foo.ex_ contained just the single file foo.exe, then foo.exe would be extracted and placed in the current directory in the file bar.exe.
EXTRACT cabinet.1        Assuming cabinet.1 contains multiple files, then a list of the files stored in the cabinet would be displayed.
EXTRACT cabinet.1 *.exe  Extract all *.EXE files from cabinet.1 and place them in the current directory.


Microsoft MSZIP Data Compression Format

Copyright © 1997 Microsoft Corporation. All rights reserved.

Topics in this section

Introduction
Implementation Details
Where to Find the 'Deflate' Specifications

Introduction

This document describes the format of MSZIP compressed data as used in the MSZIP compression mode of Microsoft's cabinet files. The purpose of this document is to allow anyone to encode or decode MSZIP compressed data.


Implementation Details

MSZIP compression has only minor variations from Phil Katz's 'deflate' method. Rather than re-document this method, this document will explain these variations and refer the reader to publicly available 'deflate' documents. Some 'deflate' implementations may contain extensions to the original specifications, but MSZIP uses only the three basic modes of deflate: stored, fixed Huffman tree, and dynamic Huffman tree.

Each MSZIP data block is the result of a complete 'deflate' compression operation. Each block is flushed out of the compressor before the next block begins, so the last sub-block in each block will be marked as the 'end' of the stream. Any decoding trees are discarded after each block, with only the history buffer surviving from one block to the next. Each data block represents 32k uncompressed, except that the last block in a folder may be smaller. A two-byte MSZIP signature precedes the compressed encoding in each block, consisting of the bytes 0x43, 0x4B.

The maximum compressed size of each MSZIP block is 32k + 12 bytes. This allows for the data to be passed as two separate "stored" sub-blocks, which each have a 5-byte overhead, plus the 2-byte signature. The Microsoft MSZIP compressor will emit "stored" sub-blocks with a length of exactly 32k, while some implementations do not exceed 32k-1.

Whenever a cabinet folder boundary is reached, the compression history is discarded, so that decoding any folder does not require any prior data.
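
The block structure described above can be exercised with Python's zlib module, which implements raw 'deflate' streams when wbits is negative. The sketch below is not Microsoft's implementation. It assumes that the history carried between blocks is the previous block's uncompressed data, passed to zlib as a preset dictionary (zdict). The encoder is included only to produce test input; zlib's output is not guaranteed to respect the 32k + 12 byte limit. All function names are hypothetical.

```python
import zlib

MSZIP_SIGNATURE = b"CK"            # the bytes 0x43, 0x4B
BLOCK_SIZE = 32 * 1024             # uncompressed bytes per block
MAX_COMPRESSED = BLOCK_SIZE + 2 * 5 + 2  # two stored sub-blocks plus signature


def mszip_compress_folder(data: bytes) -> list:
    """Split one folder's data into MSZIP-style blocks (illustrative encoder)."""
    blocks, history = [], b""
    for start in range(0, len(data), BLOCK_SIZE):
        chunk = data[start:start + BLOCK_SIZE]
        # Each block is a complete raw deflate stream; only history survives.
        if history:
            comp = zlib.compressobj(9, zlib.DEFLATED, -15, zdict=history)
        else:
            comp = zlib.compressobj(9, zlib.DEFLATED, -15)
        blocks.append(MSZIP_SIGNATURE + comp.compress(chunk) + comp.flush())
        history = chunk
    return blocks


def mszip_decompress_folder(blocks: list) -> bytes:
    """Decode the blocks of one folder, starting with an empty history."""
    out, history = [], b""
    for block in blocks:
        if block[:2] != MSZIP_SIGNATURE:
            raise ValueError("missing MSZIP signature")
        # A fresh decoder per block discards the decoding trees.
        if history:
            decomp = zlib.decompressobj(-15, zdict=history)
        else:
            decomp = zlib.decompressobj(-15)
        chunk = decomp.decompress(block[2:])
        out.append(chunk)
        history = chunk
    return b"".join(out)
```

Because each folder is decoded by a separate call that starts with an empty history, no data from an earlier folder is needed.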


Where to Find the "Deflate" Specifications

The "deflate" algorithm was originally documented by Phil Katz in APPNOTE.TXT, which accompanied the PKZip software. Its most recent description can be found in RFC 1951. (Try ftp://ftp.uu.net/graphics/png/documents/zlib/zdoc-index.html for pointers to obtain this RFC.)

