Category: C/C++

2008-02-01 12:28:26

Introduction

With this project, C++ and .NET programmers get a very versatile library for compression and extraction of Microsoft CAB files.

.NET 1.1 does not offer compression functionality.

.NET 2.0 offers the System.IO.Compression.GZipStream class, but it is awkward to use and very primitive: it can only compress a single stream and cannot compress folders containing files and subfolders.

If you search the internet for more comfortable compression libraries you find, for example, ICSharpCode.SharpZipLib.dll, which offers ZIP compression. But this library is awkward to use and so buggy that I found it unusable. Although the bugs have been known for years, the author has not fixed them.

I asked myself why I should search for another open source library (which will again have other bugs) when Windows itself has supported CAB archives since its early days. Microsoft's Cabinet.dll (in the System32 directory) is well tested: many Microsoft installers (like the installer for Internet Explorer or Windows patches) use it. Additionally, in my tests CAB reached a much better compression ratio than ZIP. Finally I found a project by Luuk Weltevreden on CodeProject. He created a very versatile wrapper around Microsoft's Cabinet.dll, consisting of C++ templates.

But he did only half of the work. He wrote good extraction classes, but the compression class was completely missing. I worked on his code, fixed a serious bug, simplified some awkward code, and added the missing compression functionality, encryption, Unicode support and more. Additionally, I added all files you need to compile the project. There is no longer any need to download anything from Microsoft, as was the case for Luuk's project.

Features

  • This library is VERY easy to use.
  • This library is lightweight and fast.
  • This library can be extended very easily.
  • One project is for C++ developers.
  • One project is for .NET developers.
  • Both projects compile on Visual Studio .NET (7.0), .NET 2003 (7.1) and .NET 2005 (8.0)
  • The C++ project additionally compiles on Visual Studio 6
  • Encryption / Decryption of CAB files.
  • CAB files can contain trees of subfolders and files.
  • File dates and file attributes are preserved when compressing / extracting.
  • Extraction of CAB files which are embedded in the resources of your Win32 project or .NET project.
  • In .NET you can additionally extract CAB files from a stream.
  • The compression can split large CAB files into multiple pieces. (Pack1.cab, Pack2.cab, Pack3.cab etc)
  • An event handler lets you display the compression / extraction progress in the GUI of your application.
  • Many event handlers are called during compression and extraction, which lets you interact with the process (for example, to filter specific files).
  • Both projects come with a demo application which shows how to compress and extract files and embedded CAB resources. Encryption and Decryption is also included in the demo.
  • Can be compiled as MBCS or UNICODE
  • This project makes use of Microsoft's Cabinet.dll in your System(32) directory, which has been part of the operating system since Windows NT/98.
  • Cabinet.dll will be loaded only when it is needed and unloaded afterwards.
  • No additional Microsoft downloads required to compile and run this project.
  • The latest version (Jan 2007) does NOT require Msvcp70/71/80.DLL anymore.
  • The latest version (Jan 2007) of the .NET library supports Unicode in paths and filenames. (e.g. Japanese)
  • The latest version (Jan 2007) of the .NET library is thread safe.

Limitations

  • The size of files which are to be packed into the CAB file must not exceed 2 GB.
  • The resulting CAB file must not exceed 2 GB.
  • This project cannot compress or extract InstallShield CAB files. (see below)
  • You cannot add files to or delete files from an existing CAB archive.
  • For Windows 95 you have to deliver Microsoft's "Cabinet.dll" yourself; it has been part of the operating system only since Windows NT/98.

Different CAB file formats

There are two completely different types of CAB files. The ones which this project supports are the "Microsoft CAB" files (also called "MS-CAB"). MSZIP is one of the compression methods that can be used inside an MS-CAB file.

Some years later, InstallShield created the "InstallShield CAB" files. But these are absolutely incompatible with the MS-CAB files although they use the same file extension!

If you open an MS-CAB file with a hex editor, you will notice that the first four bytes are "MSCF" (MicroSoft Cab File), while the first three bytes of an InstallShield CAB file are "ISc" (InstallShield Cab). You cannot open or create InstallShield CAB files with this project. Only very few tools are capable of handling InstallShield CAB files; one example is the tool WinPack, which you can download from my homepage.
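
The signature check described above can be sketched in a few lines of portable C++. This is only an illustration and not part of the library: the names CabKind and DetectCabKind are hypothetical, and the function looks at nothing but the leading bytes mentioned in the text.

```cpp
#include <cstddef>
#include <cstring>

// Kinds of cabinet file that can be told apart by their leading bytes.
enum class CabKind { MicrosoftCab, InstallShieldCab, Unknown };

// Classifies a buffer that holds the first bytes of a file.
// "MSCF" (4 bytes) marks an MS-CAB file, "ISc" (3 bytes) an InstallShield CAB.
// Hypothetical helper for illustration; it does not validate the rest of the file.
inline CabKind DetectCabKind(const unsigned char* p_Data, std::size_t u_Size)
{
    if (u_Size >= 4 && std::memcmp(p_Data, "MSCF", 4) == 0)
        return CabKind::MicrosoftCab;

    if (u_Size >= 3 && std::memcmp(p_Data, "ISc", 3) == 0)
        return CabKind::InstallShieldCab;

    return CabKind::Unknown;
}
```

A caller would read the first four bytes of the file into a buffer and pass them in. A result of CabKind::MicrosoftCab only says that the signature matches, not that the archive is intact.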

Compression ratio

MS-CAB files have a very good compression ratio.
To test this I packed a bunch of about a hundred text files. This is the result of my test:

Pack format    Packed file size
CAB            139 kB
TAR + GZ       142 kB
ARJ            174 kB
TAR + LZH      189 kB
RAR            197 kB
TAR + JAR      242 kB
ZIP            242 kB

Intelligent installers

Microsoft intended CAB files to be used for installations:

  • They are used in the Internet Explorer 6 Setup.
  • You see plenty of CAB files on the Windows 95/98/ME setup CD.
  • All files ending with an underscore like "Kernel32.dl_" on your Windows 2000/XP setup CD are CAB files with the wrong file extension.

Many installers are stupid. If you start an unintelligent installer like the one of Nero 6 you will see that:

  1. It first extracts ALL files from the packed EXE installer into a temp directory. This is slow and the user has to wait until the first dialog opens.
  2. It wastes disk space, and if the space on the user's drive C: is low, the user gets a "no disk space" error.
  3. It carries the risk that temporary files remain on the disk when the installation is aborted.

In contrast, with this Cabinet library you can build an intelligent installer:

Scenario 1. You need two files: a tiny EXE file and a huge CAB file.

Put a huge CAB file on a local server, CD or DVD and let the user start only a tiny EXE setup file. The installer will start immediately and extract only those files from the CAB file which are really needed. This Cabinet library can of course extract the whole CAB file, but it can also extract only specific files directly from the CAB file on the server/CD/DVD to the hard disk. The data transfer is compressed and, if you like, encrypted.

Scenario 2. You need only one huge EXE file

You can also embed the CAB file into the Setup.exe and extract specific files directly from the embedded resource in memory without creating temporary files. This scenario only makes sense for small setups, otherwise users with little RAM will get problems if they start a 100 MB EXE file!

The C++ project

To add CAB support to your C++ project download the project CabinetT at the top of this page (a demo application is included) and add the four small header files to your solution.

The .NET project

The second project is for .NET developers. I wrote a wrapper in Managed C++ around this C++ project. The result compiles into a .NET DLL. You simply add the .NET assembly CabLib.dll to the references of your .NET project (C# or Visual Basic .NET or Managed C++) and you get CAB support. In the second download at the top of this page you will find CabLib.DLL already compiled and ready to use. (A demo application is included)

You will notice that the .NET assembly contains more functions than the ones described here. Don't call them; they are for internal use only.

Cabinet.dll

Microsoft's tiny Cabinet.dll, which has been located in your System(32) directory since Windows NT/98, offers the following compression API:
FCICreate FCIAddFile FCIFlushCabinet FCIFlushFolder FCIDestroy

And the extraction API:
FDICreate FDIIsCabinet FDICopy FDIDestroy

You get a detailed description of these functions in the file Cabinet Doku.doc, which you find in both projects, and the files FCI.H and FDI.H contain plenty of comments.

The API in Cabinet.dll uses a bunch of callbacks which are called while a CAB file is created or extracted. The C++ project wraps these callbacks and you can override each of the callback functions to modify the behaviour. The .NET project offers events which you can use to handle these callbacks in your .NET application.

You can use these callbacks / events to filter specific files, or to read the data from a stream or from memory instead of from a file on disk. This makes the library extremely versatile. (For examples see below.)


Using the Compression functions

File Compression

The following sample compresses into a file "C:\Temp\Packed.cab".
The file "C:\Windows\Explorer.exe" will be packed into a subfolder "FileManager" in the CAB file.
The file "C:\Windows\Notepad.exe" will be packed into a subfolder "TextManager" in the CAB file.

C++

Cabinet::CCompress i_Compress;
if (!i_Compress.CreateFCIContext("C:\\Temp\\Packed.cab"))
    { /* Error handling... */ }

// second parameter: path and name of the file inside the CAB
if (!i_Compress.AddFile("C:\\Windows\\Explorer.exe",
    "FileManager\\Explorer.exe", 0))
    { /* Error handling... */ }

if (!i_Compress.AddFile("C:\\Windows\\Notepad.exe",
    "TextManager\\Notepad.exe", 0))
    { /* Error handling... */ }

// write the CAB file to disk
if (!i_Compress.FlushCabinet(FALSE))
    { /* Error handling... */ }

C#

ArrayList i_Files = new ArrayList();
i_Files.Add(new string[] { @"C:\Windows\Explorer.exe",
                           @"FileManager\Explorer.exe" });
i_Files.Add(new string[] { @"C:\Windows\Notepad.exe",  
                           @"TextManager\Notepad.exe"  });

CabLib.Compress i_Compress = new CabLib.Compress();
i_Compress.CompressFileList(i_Files, @"C:\Temp\Packed.cab", 0);

You can also easily compress all HTM files in the folder "C:\Web" and all its subfolders into a CAB file which will reflect the folder structure found on harddisk:

C#

CabLib.Compress i_Compress = new CabLib.Compress();
i_Compress.CompressFolder(@"C:\Web", @"C:\Temp\Packed.cab", "*.htm", 0);


Compression splitting

If you want to deliver your data on a medium with limited size or offer it for download on a webpage, you can split the CAB file into pieces which the extraction functions will automatically put together again.
The following sample will create CAB files of 200 kB.
In this case the file name MUST contain a %d, as in the samples below!

C++

Cabinet::CCompress i_Compress;
if (!i_Compress.CreateFCIContext("C:\\Temp\\Packed_%d.cab", 200000))
    { /* Error handling... */ }

// etc. (AddFile / FlushCabinet as in the sample above)

C#

i_Compress.CompressFileList(i_Files, @"C:\Temp\Packed_%d.cab", 200000);
// or
i_Compress.CompressFolder(@"C:\Web", @"C:\Temp\Packed_%d.cab", "*.htm",
                          200000);
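
To make the role of the %d placeholder concrete, here is a minimal sketch of how such a name template could expand into the names of the individual pieces. The helper SplitCabName is hypothetical and not part of the library; that the numbering starts at 1 is taken from the feature list (Pack1.cab, Pack2.cab, ...), while the exact expansion the library performs is an assumption.

```cpp
#include <string>

// Expands a name template such as "C:\\Temp\\Packed_%d.cab" for one piece.
// Hypothetical helper for illustration: it replaces the first "%d" with the
// given index and returns an empty string if the template has no "%d".
inline std::string SplitCabName(const std::string& s_Template, int s32_Index)
{
    std::string s_Name = s_Template;

    std::string::size_type u_Pos = s_Name.find("%d");
    if (u_Pos == std::string::npos)
        return std::string();   // no placeholder: unusable for a split sequence

    s_Name.replace(u_Pos, 2, std::to_string(s32_Index));
    return s_Name;
}
```

Called with "C:\\Temp\\Packed_%d.cab" and the indexes 1, 2 and 3, this yields Packed_1.cab, Packed_2.cab and Packed_3.cab in C:\Temp. Without the placeholder every piece would get the same file name, which is why the %d is required.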


Setting the compression TEMP directory

During compression Cabinet.DLL will create some temporary files which will be automatically deleted afterwards.

By default it uses the TEMP directory which Windows specifies. If you want to compress huge files and the space on drive C: is low you should specify a TEMP directory on another drive. It is possible to use the same directory as output folder for the CAB file and as TEMP directory.

C++ and C#

i_Compress.SetTempDirectory("E:\\Temp");

Encryption

You can encrypt the CAB file with a key. The key may be a string or zero-terminated binary data. The longer the key, the more secure the encryption. The encryption algorithm is not as secure as PGP, but much better than the algorithm used for classic ZIP encryption. Because a proprietary algorithm is used, standard CAB tools cannot open an encrypted CAB file; only software which uses this library and knows the key can open it.

C++ and C#

i_Compress.SetEncryptionKey(
               "KHzt/(90aresD$%§&UGjhgoh89äÖLÜnkjjkbIUH(I/H809z9z");

Unicode paths

The underlying Cabinet.DLL does not support Unicode, but the .NET project uses a trick to allow Unicode paths and filenames to be compressed: all files are stored under an ANSI filename in the cabinet, and an additional text file in the CAB stores the original Unicode filenames, which are restored after extraction.
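
The idea behind this trick can be sketched as follows. This is NOT the library's implementation: the stand-in naming scheme ("Unicode_1.dat", ...), the helper names and the use of UTF-8 strings are all assumptions made for illustration. The sketch only shows the principle of storing a plain name in the CAB and remembering the original name in a separate list.

```cpp
#include <string>
#include <utility>
#include <vector>

// One line of the rename list: (stand-in name stored in the CAB, original name).
using RenameEntry = std::pair<std::string, std::string>;

// True if the UTF-8 encoded name contains at least one non-ASCII byte.
inline bool NeedsStandIn(const std::string& s_Utf8Name)
{
    for (unsigned char u8_Char : s_Utf8Name)
        if (u8_Char >= 0x80)
            return true;
    return false;
}

// Returns the names under which the files would be stored in the CAB and
// appends to i_Renames the pairs that would go into the extra text file.
inline std::vector<std::string> MakeStoredNames(
    const std::vector<std::string>& i_Names,
    std::vector<RenameEntry>&       i_Renames)
{
    std::vector<std::string> i_Stored;
    for (const std::string& s_Name : i_Names)
    {
        if (!NeedsStandIn(s_Name))
        {
            i_Stored.push_back(s_Name);   // plain ASCII name: store unchanged
            continue;
        }
        // Hypothetical stand-in scheme: fixed prefix plus a running number.
        std::string s_StandIn =
            "Unicode_" + std::to_string(i_Renames.size() + 1) + ".dat";
        i_Renames.emplace_back(s_StandIn, s_Name);
        i_Stored.push_back(s_StandIn);
    }
    return i_Stored;
}
```

After extraction, the rename list is read back and every stand-in file is renamed to its original name. That final step is the one which needs Unicode support in the operating system.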


More compression functions

Normally you will not need the following C++ functions:
With i_Compress.AbortOperation() you can abort a lengthy compression. Obviously this must be called from another thread.
With i_Compress.FlushFolder() you can force the current folder to be finished.
With i_Compress.FlushCabinet() you force the current CAB file to be closed; any files added afterwards will be written to the next CAB file in the split sequence.
For details see the file Cabinet Doku.doc and the many comments in the file FCI.H


Compression callbacks / events

CCompress.OnFilePlaced() (C++ callback)
CabLib.Compress.FilePlaced (.NET event)
This is called whenever the compression has successfully placed a file into the cabinet.

CCompress.OnUpdateStatus() (C++ callback)
CabLib.Compress.UpdateStatus (.NET event)
This can be used to update your GUI to display the progress during a lengthy compression.

ATTENTION:
If you want to update the GUI you should run the compression from a NON-GUI thread. In C# the event handler must then call Form.Invoke() to update the controls, otherwise you will run into cross-thread trouble!

For details see the file Cabinet Doku.doc and the many comments in the file FCI.H

Extensions / Modifications

If you want a different behaviour for compression, do NOT modify the existing compression class CCompressT. Instead derive a new template class from CCompressT and override the functions you want to change.


Using the Extraction functions

File Extraction

During extraction there will be NO temporary files created.

The following sample extracts a file "C:\Temp\Packed.cab" into the folder "E:\ExtractFolder". The required subfolders will be created automatically if the CAB file contains subfolders.

C++

Cabinet::CExtract i_Extract;
if (!i_Extract.CreateFDIContext())
    { /* Error handling... */ }

if (!i_Extract.ExtractFile("C:\\Temp\\Packed.cab", "E:\\ExtractFolder"))
    { /* Error handling... */ }

C#

CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractFile(@"C:\Temp\Packed.cab", @"E:\ExtractFolder");



Win32 Resource Extraction

The following sample extracts a Cabinet file which is stored in the Win32 resources of a DLL or EXE file.
You can extract files DIRECTLY from a CAB file in memory!

There are some rules to respect when you add a CAB file to the resources of your project:

In the file Cabinet.rc of the C++ project and in CabLib.rc of the .NET project you find this line:

ID_CAB_TEST             CABFILE                 "Res\\Test.cab"

and in the file Resource.h you find this line:

#define ID_CAB_TEST                     101

IMPORTANT:
If you define ID_CAB_TEST in Resource.h, the resource will be stored under:
ResourceName = 101 (integer)
ResourceType = "CABFILE" (string)

If you do NOT define ID_CAB_TEST in Resource.h, the resource will be stored under:
ResourceName = "ID_CAB_TEST" (string)
ResourceType = "CABFILE" (string)

To extract the embedded resource Test.cab (which I added to both projects) write:

C++

Cabinet::CExtractResource i_Extract;
if (!i_Extract.CreateFDIContext())
    { /* Error handling... */ }

if (!i_Extract.ExtractResource("Cabinet.exe", ID_CAB_TEST, "CABFILE",
                               "C:\\ExtractFolder"))
    { /* Error handling... */ }

C#

CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractResource("CabLib.dll", 101, "CABFILE", @"C:\ExtractFolder");

The first parameter specifies the filename (without path) of the module from which to extract the Win32 CAB resource. You can pass 0 (null) if the resource is inside the EXE which has created the process.

You can use this functionality to extract a CAB file from ANY DLL currently loaded into the process or from the application EXE itself. To explore the resources of files which are already compiled, use a resource viewer tool.

Most of the Windows Update patches contain a CAB file inside.

.NET Resource Extraction / Stream Extraction

.NET stores resources in a completely different way, so you cannot see them in a Win32 resource viewer.
Under the resource's properties you must set the Build Action = "Embedded Resource".
To extract a file Test.cab which is located in a project named MyProject in a subfolder named Resources, write:

C#

System.Reflection.Assembly i_Ass  = 
    System.Reflection.Assembly.GetExecutingAssembly();
System.IO.Stream           i_Strm = 
    i_Ass.GetManifestResourceStream("MyProject.Resources.Test.cab");

CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractStream(i_Strm, @"E:\ExtractFolder");


More extraction functions

Normally you will not need the following C++ functions:

With i_Extract.AbortOperation() you can abort a lengthy extraction. Obviously this must be called from another thread.

With i_Extract.IsCabinet() you can check whether the specified file is a valid CAB file. If you try to extract a corrupt file you will get an error anyway, so calling this is not necessary.

Decryption

Similar to the encryption you can decrypt an archive:

C++ and C#

i_Extract.SetDecryptionKey(
                      "KHzt/(90aresD$%§&UGjhgoh89äÖLÜnkjjkbIUH(I/H809z9z");

Unicode paths

If the CAB file contains files which had Unicode path or file names before compression, the .NET project will restore the original Unicode filenames after extraction. This is not possible on Windows 98 or ME, because these operating systems do not support Unicode. In this case the files will be extracted but not renamed to their original Unicode names, and an exception will be thrown.

How to extract only one file from the CAB file/resource/stream

C#

CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.SetSingleFile("File_1.txt");
i_Extract.ExtractFile(@"C:\Temp\Packed.cab", @"E:\ExtractFolder");

This will create a file E:\ExtractFolder\File_1.txt. The file to be extracted MUST be located in the root folder of the CAB archive. If the file does not exist in the archive, nothing will happen (no error). If a single file is extracted, the event evBeforeCopyFile will not be fired. (see below)

Extraction callbacks / events

CExtract.OnBeforeCopyFile() (C++ callback)
CabLib.Extract.BeforeCopyFile (.NET event)
This is called before Cabinet.dll copies an extracted file to disk. You get detailed information about the file to be extracted. If you don't want this file to be copied to disk you can return FALSE here and the file will be skipped. (For examples see below.)
You can use this callback to display progress information in your GUI.

CExtract.OnAfterCopyFile() (C++ callback)
CabLib.Extract.AfterCopyFile (.NET event)
This is called after Cabinet.dll has placed a new file onto disk.
You can use this callback to display progress information in your GUI.

CExtract.OnCabinetInfo() (C++ callback)
CabLib.Extract.CabinetInfo (.NET event)
This function will be called exactly once for each cabinet when it is opened. It passes information about the CAB file.

CExtract.OnNextCabinet() (C++ callback)
CabLib.Extract.NextCabinet (.NET event)
This function will be called when the next cabinet file in a sequence of split cabinets needs to be opened.
Here you can display a message like "Please insert disk 2!"


ATTENTION:

If you want to update the GUI you should run the extraction from a NON-GUI thread. In C# the event handler must then call Form.Invoke() to update the controls, otherwise you will run into cross-thread trouble!

For details see the file Cabinet Doku.doc and the many comments in the file FDI.H

Manipulating the extraction process

With the callback OnBeforeCopyFile() you can control exactly what you want to extract from the CAB file. The callback/event passes a structure kCabinetFileInfo which tells you details of the file to be extracted: file name, subfolder, full path, file size, file date/time and file attributes.

With this information you can decide if you want the file to be extracted and return false if not.

In C# you must attach an event handler first:

C#

CabLib.Extract.delBeforeCopyFile i_Delegate = 
                   new CabLib.Extract.delBeforeCopyFile(OnBeforeCopyFile);
i_Extract.evBeforeCopyFile += i_Delegate;

i_Extract.ExtractResource("CabLib.dll", 101, "CABFILE", @"E:\ExtractFolder");

i_Extract.evBeforeCopyFile -= i_Delegate;


How to extract only files with a specific file extension

If you want to extract only the files from the CAB which have the extension ".DLL" (including all subfolders) you can write:

C++

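BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{
    int Len = (int)strlen(k_Info.s8_File);   // length of filename
    if (Len < 4)
        return FALSE;                        // too short to end with ".dll"

    // case-insensitive comparison of the last four characters
    return (stricmp(k_Info.s8_File + Len - 4, ".dll") == 0);
}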

C#

private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    return k_Info.s_File.ToUpper().EndsWith(".DLL");
}
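
The C++ filter above relies on stricmp(), which is not part of standard C++. As a minimal sketch, the same test can be written with the standard library only. EndsWithNoCase is a hypothetical helper and not part of the library; it also copes with file names that are shorter than the extension.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if s_Text ends with s_Suffix, ignoring ASCII upper/lower case.
// Hypothetical helper for illustration, e.g. EndsWithNoCase(name, ".dll").
inline bool EndsWithNoCase(const std::string& s_Text, const std::string& s_Suffix)
{
    if (s_Suffix.size() > s_Text.size())
        return false;   // text is too short to contain the suffix

    std::size_t u_Offset = s_Text.size() - s_Suffix.size();
    for (std::size_t i = 0; i < s_Suffix.size(); i++)
    {
        // cast to unsigned char first: tolower() is undefined for negative values
        unsigned char u8_A = static_cast<unsigned char>(s_Text[u_Offset + i]);
        unsigned char u8_B = static_cast<unsigned char>(s_Suffix[i]);
        if (std::tolower(u8_A) != std::tolower(u8_B))
            return false;
    }
    return true;
}
```

Inside OnBeforeCopyFile() the callback would then simply return the result of EndsWithNoCase(k_Info.s8_File, ".dll"), assuming s8_File is a zero-terminated char string as in the sample above.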


How to extract only files within a specific subfolder in the CAB

If you want to extract only one folder from a CAB with the name "Setup\" and all its subfolders, write:

C++

BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{ 
    return (strnicmp(k_Info.s8_SubFolder, "Setup\\", 6) == 0);
}

C#

private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    return k_Info.s_SubFolder.ToUpper().StartsWith(@"SETUP\");
}

How to extract only newer files

If you want to update existing files, and only files on disk which are older than their counterparts in the CAB should be overwritten, you can write:

C++

BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{
    // try to open the file on disk
    HANDLE h_File = CreateFile(k_Info.s8_FullPath, GENERIC_READ,
                               FILE_SHARE_READ, 0, OPEN_EXISTING, 
                               FILE_ATTRIBUTE_NORMAL, 0);
    
    // The file does not yet exist --> copy it!
    if (h_File == INVALID_HANDLE_VALUE)
        return TRUE;
    
    FILETIME k_FileTime, k_LocalTime;
    BOOL b_OK = GetFileTime(h_File, 0, 0, &k_FileTime);
    
    CloseHandle(h_File);
    
    if (!b_OK)
        return TRUE;
    
    // Last write time UTC --> Local time
    FileTimeToLocalFileTime(&k_FileTime,    &k_LocalTime);
    return (CompareFileTime(&k_Info.k_Time, &k_LocalTime) > 0);
}

C#

private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    if (!System.IO.File.Exists(k_Info.s_FullPath)) 
        return true;
    
    // retrieve local file time
    System.DateTime k_FileTime = 
         System.IO.File.GetLastWriteTime(k_Info.s_FullPath);

    return (k_Info.k_Time.CompareTo(k_FileTime) > 0);
}


The Extraction class hierarchy

The class hierarchy of the C++ classes which are used in both projects is as follows: CExtractT is the base, CExtractMemoryT builds on it, and CExtractResourceT and CExtractStreamT are derived from CExtractMemoryT.

The classes ending with "T" (like CExtractT) are C++ templates. The others are real classes.

If you want a different behaviour, do NOT modify the existing classes. Instead derive a new template class from the existing classes and override the functions you want to change.
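
The advice to derive instead of modify can be illustrated with a small, self-contained sketch. It does NOT use the library's classes: CDemoExtractT, CDemoExtract and CTextOnlyExtract are hypothetical names, and passing the derived class as the template argument is just one common way for a template base to call an overridden function. Whether CExtractT works exactly like this is an assumption; check the headers.

```cpp
#include <string>
#include <vector>

// Hypothetical base template (NOT the library's CExtractT). The derived class
// is passed as T, so the call to OnBeforeCopyFile() below resolves at compile
// time to the derived version if the derived class provides one.
template <class T>
class CDemoExtractT
{
public:
    // Default behaviour: accept every file.
    bool OnBeforeCopyFile(const std::string&) { return true; }

    // Stand-in for an extraction: returns the files which pass the filter.
    std::vector<std::string> Extract(const std::vector<std::string>& i_Files)
    {
        std::vector<std::string> i_Copied;
        for (const std::string& s_File : i_Files)
            if (static_cast<T*>(this)->OnBeforeCopyFile(s_File))
                i_Copied.push_back(s_File);
        return i_Copied;
    }
};

// A "real" class with the default behaviour.
class CDemoExtract : public CDemoExtractT<CDemoExtract> {};

// A derived class which changes the behaviour without touching the base.
class CTextOnlyExtract : public CDemoExtractT<CTextOnlyExtract>
{
public:
    bool OnBeforeCopyFile(const std::string& s_File)
    {
        return s_File.size() >= 4 &&
               s_File.compare(s_File.size() - 4, 4, ".txt") == 0;
    }
};
```

With the file list "a.txt", "b.dll", "c.txt", CDemoExtract accepts all three files while CTextOnlyExtract accepts only the two text files. The base template stays untouched, which is the point of the recommendation.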

CExtractT contains the functions to extract a "real" CAB file from disk.

CExtractT contains the following callbacks which are called from Cabinet.dll:
Open() to open a file
Read() to read from a file
Write() to write to a file
Seek() to set the file pointer or ask its position
Close() to close a file

IMPORTANT: These callbacks are called from Cabinet.dll to read the CAB file AND to write all the extracted files to disk.

CExtractMemoryT is a class which overrides the file access functions and replaces them with functions which read the CAB data from memory instead of disk.

CExtractMemoryT itself cannot be instantiated. Other classes must be derived from it.
It provides these additional callbacks:

OpenMem() to open the memory which represents the CAB file
ReadMem() to read from the memory of the CAB file
SeekMem() to set the memory pointer or ask its position
CloseMem() to release the memory which holds the CAB file

IMPORTANT: These callbacks are ONLY called when Cabinet.dll wants to read the CAB file.

CExtractResourceT is derived from CExtractMemoryT to read data from a Win32 resource.
CExtractStreamT is derived from CExtractMemoryT to read data from a .NET stream.

You can easily derive your own classes to read data from a pipe, from the network / internet or from whatever you like. The data source must be capable of seeking (random access).
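
What "capable of seeking" means for such a data source can be sketched with a small in-memory reader. CMemorySource is a hypothetical class and not part of the library; its Read() and Seek() only mirror the roles described for ReadMem() and SeekMem(), and the real callback signatures in the headers may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>    // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>

// Hypothetical stand-in for a seekable CAB source held in memory.
class CMemorySource
{
public:
    CMemorySource(const unsigned char* p_Data, std::size_t u_Size)
        : mp_Data(p_Data), mu_Size(u_Size), mu_Pos(0) {}

    // Copies up to u_Count bytes into p_Buffer and advances the position.
    // Returns the number of bytes actually copied (0 at the end of the data).
    std::size_t Read(void* p_Buffer, std::size_t u_Count)
    {
        std::size_t u_Copy = std::min(u_Count, mu_Size - mu_Pos);
        std::memcpy(p_Buffer, mp_Data + mu_Pos, u_Copy);
        mu_Pos += u_Copy;
        return u_Copy;
    }

    // Moves the position like fseek(). Seek(0, SEEK_CUR) asks for the position.
    // Returns the new position, or -1 if the target lies outside the data.
    long Seek(long s32_Offset, int s32_Origin)
    {
        long s32_Base;
        switch (s32_Origin)
        {
            case SEEK_SET: s32_Base = 0;                          break;
            case SEEK_CUR: s32_Base = static_cast<long>(mu_Pos);  break;
            case SEEK_END: s32_Base = static_cast<long>(mu_Size); break;
            default:       return -1;
        }
        long s32_New = s32_Base + s32_Offset;
        if (s32_New < 0 || static_cast<std::size_t>(s32_New) > mu_Size)
            return -1;

        mu_Pos = static_cast<std::size_t>(s32_New);
        return s32_New;
    }

private:
    const unsigned char* mp_Data;   // start of the CAB data (not owned)
    std::size_t          mu_Size;   // total number of bytes
    std::size_t          mu_Pos;    // current read position
};
```

A source that can only be read forward, such as a raw network socket, cannot satisfy Seek() with a negative offset. In that case the data would first have to be buffered in memory or in a file.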

Debugging

If you want to debug the whole compression/extraction process with a tool like DebugView, you can set one of the following in the file CompressT.hpp / ExtractT.hpp:

#define _TraceCompress   1
or
#define _TraceExtract    1

IMPORTANT:
To see anything in DebugView you must start the compiled application from Visual Studio with CTRL + F5 (start without debugging).

Signals and Slots

If you are a C++ programmer and are looking for a comfortable way to signal your application when an event occurs (for example, when a file has been extracted), have a look at my article about signals and slots on CodeProject.

And that's not all...

I have described only the basic functionality, and much more is possible, but this article has now reached a length at which I will stop. Study the source code and you will see that this tiny library is really versatile!


From my homepage you can download free C++ books in compiled HTML format.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
