Category: C/C++
2008-02-01 12:28:26
With this project, C++ and .NET programmers get a very versatile library for compression and extraction of Microsoft CAB files.
.NET 1.1 does not offer compression functionality.
.NET 2.0 offers the System.IO.Compression.GZipStream class, but it is awkward to use and very primitive: it can only compress a single stream, so it cannot compress folders containing files and subfolders.
If you search the internet for more comfortable compression libraries, you find, for example, ICSharpCode.SharpZipLib.dll which offers ZIP compression. But this library is awkward to use and buggy, and so is unusable. Although the bugs have been known for years, the author has not fixed them.
I asked myself why I should search for another open source library (which will again have other bugs) when Windows itself has supported CAB archives since its early days. Microsoft's Cabinet.dll (in the System32 directory) is not buggy. Many Microsoft installers (like the installer for Internet Explorer or Windows patches) use it. Additionally, CAB reaches a much better compression ratio than ZIP. Finally I found a project by Luuk Weltevreden on CodeProject. He created a very versatile wrapper around Microsoft's Cabinet.dll, consisting of C++ templates.
But he did only half of the work. He wrote good extraction classes, but the compression class was completely missing. I worked on his code, fixed a serious bug, simplified some awkward code, and added the missing compression functionality, encryption, Unicode support and more. Additionally, I added all the files you need to compile the project. There is no longer any need to download anything from Microsoft, as there was for Luuk's project.
There are two completely different types of CAB files: The ones which this project supports are the "Microsoft CAB" files (also called "MS-CAB"). The Microsoft pack format is also known as MSZIP.
Some years later, InstallShield created the "InstallShield CAB" files. But these are absolutely incompatible with the MS-CAB files although they use the same file extension!
If you open an MS-CAB file with a hex editor, you will notice that the first four bytes are "MSCF" (MicroSoft Cab File), while the first three bytes of an InstallShield CAB file are "ISc" (InstallShield Cab). You cannot open or create InstallShield CAB files with this project. Only very few tools are capable of managing InstallShield CAB files; one example is the tool WinPack, which you can download from my homepage.
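The signature check described above can be sketched in a few lines. This is only an illustration: the names CabKind and ClassifyCabHeader are hypothetical and not part of the library, and the caller is assumed to have already read the first bytes of the file into a buffer.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of the Cabinet library:
// tells the two incompatible CAB flavours apart by their leading bytes.
enum class CabKind { Microsoft, InstallShield, Unknown };

CabKind ClassifyCabHeader(const unsigned char* data, std::size_t size)
{
    // MS-CAB files start with the four bytes "MSCF"
    if (size >= 4 && std::memcmp(data, "MSCF", 4) == 0)
        return CabKind::Microsoft;

    // InstallShield CAB files start with the three bytes "ISc"
    if (size >= 3 && std::memcmp(data, "ISc", 3) == 0)
        return CabKind::InstallShield;

    return CabKind::Unknown;
}
```

A setup program could call such a check before handing a file to the extraction classes, so that an InstallShield CAB is rejected with a clear message.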
MS-CAB files have a very good compression ratio.
To test this I packed a bunch of about a hundred text files. This is the result of my test:
Pack format | Packed file size |
CAB | 139 kB |
TAR + GZ | 142 kB |
ARJ | 174 kB |
TAR + LZH | 189 kB |
RAR | 197 kB |
TAR + JAR | 242 kB |
ZIP | 242 kB |
Microsoft intended CAB files to be used for installations:
Many installers are stupid. If you start an unintelligent installer like the one for Nero 6 you will see:
In contrast, with this Cabinet library you can build an intelligent installer:
Scenario 1. You need two files: a tiny EXE file and a huge CAB file.
Put a huge CAB file on a local server, CD or DVD and let the user start only a tiny EXE setup file. The installer will start immediately and extract only those files from the CAB file which are really needed. This cabinet library can obviously extract the whole CAB file, but it is also possible to extract only specific files directly from the CAB file on the server/CD/DVD to the hard disk. The data transfer is compressed and, if you like, encrypted.
Scenario 2. You need only one huge EXE file.
You can also embed the CAB file into the Setup.exe and directly extract specific files from the embedded resource in memory without creating temporary files. This scenario only makes sense for small setups; otherwise users with little RAM will get problems if they start a 100 MB EXE file!
To add CAB support to your C++ project download the project CabinetT at the top of this page (a demo application is included) and add the four small header files to your solution.
The second project is for .NET developers. I wrote a wrapper in Managed C++ around this C++ project. The result compiles into a .NET DLL. You simply add the .NET assembly CabLib.dll to the references of your .NET project (C# or Visual Basic .NET or Managed C++) and you get CAB support. In the second download at the top of this page you will find CabLib.DLL already compiled and ready to use. (A demo application is included)
You will notice that there are more functions than these in the .NET assembly. Don't call them as they are for internal use only.
Microsoft's tiny Cabinet.dll, which has been located in your System(32) directory since Windows NT/98, offers the following compression API: FCICreate, FCIAddFile, FCIFlushCabinet, FCIFlushFolder, FCIDestroy.
And the extraction API: FDICreate, FDIIsCabinet, FDICopy, FDIDestroy.
You get a detailed description of these functions in the file Cabinet Doku.doc, which you find in both projects, and the files FCI.H and FDI.H contain plenty of comments.
The API in Cabinet.Dll uses a bunch of Callbacks which are called while a CAB file is created or extracted. The C++ project wraps these callbacks and you can override each of the callback functions to modify the behaviour. The .NET project offers events which you can use to handle these callbacks in your .NET application.
You can use these callbacks / events to filter specific files, or you can read compression data from a stream or from memory instead of a file on disk. This makes the library extremely versatile (see the examples below).
Using the Compression functions
The following sample compresses into a file "C:\Temp\Packed.cab".
The file "C:\Windows\Explorer.exe" will be packed into a subfolder "FileManager" in the CAB file.
The file "C:\Windows\Notepad.exe" will be packed into a subfolder "TextManager" in the CAB file.
C++
Cabinet::CCompress i_Compress;
if (!i_Compress.CreateFCIContext("C:\\Temp\\Packed.cab"))
    { /* Error handling... */ }
if (!i_Compress.AddFile("C:\\Windows\\Explorer.exe",
                        "FileManager\\Explorer.exe", 0))
    { /* Error handling... */ }
if (!i_Compress.AddFile("C:\\Windows\\Notepad.exe",
                        "TextManager\\Notepad.exe", 0))
    { /* Error handling... */ }
if (!i_Compress.FlushCabinet(FALSE))
    { /* Error handling... */ }
C#
ArrayList i_Files = new ArrayList();
i_Files.Add(new string[] { @"C:\Windows\Explorer.exe",
@"FileManager\Explorer.exe" });
i_Files.Add(new string[] { @"C:\Windows\Notepad.exe",
@"TextManager\Notepad.exe" });
CabLib.Compress i_Compress = new CabLib.Compress();
i_Compress.CompressFileList(i_Files, @"C:\Temp\Packed.cab", 0);
You can also easily compress all HTM files in the folder "C:\Web" and all its subfolders into a CAB file which will reflect the folder structure found on the hard disk:
C#
CabLib.Compress i_Compress = new CabLib.Compress();
i_Compress.CompressFolder(@"C:\Web", @"C:\Temp\Packed.cab", "*.htm", 0);
If you want to deliver your data on a medium with limited size or for download on a webpage, you can split the CAB file into pieces which the extraction functions will automatically put together afterwards.
The following sample will create CAB files of 200 kB each.
In this case the file name MUST contain a %d at the end of the name (before the extension), as in Packed_%d.cab!
C++
Cabinet::CCompress i_Compress;
if (!i_Compress.CreateFCIContext("C:\\Temp\\Packed_%d.cab", 200000))
    { /* Error handling... */ }
// etc.
C#
i_Compress.CompressFileList(i_Files, @"C:\Temp\Packed_%d.cab", 200000);
or
i_Compress.CompressFolder(@"C:\Web", @"C:\Temp\Packed_%d.cab", "*.htm",
200000);
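To illustrate what the %d placeholder is for, here is a minimal sketch of how a name pattern can be turned into the names of the individual parts. It assumes that the placeholder is replaced by the sequence number of each part and that numbering starts at 1; both are assumptions, and SplitCabName is a hypothetical helper, not a function of the library.

```cpp
#include <string>

// Hypothetical helper, not part of the Cabinet library.
// Replaces the first "%d" in the pattern with the given part number.
// Returns an empty string if the pattern contains no placeholder,
// because the parts could then not be given distinct names.
std::string SplitCabName(const std::string& pattern, int index)
{
    const std::string::size_type pos = pattern.find("%d");
    if (pos == std::string::npos)
        return std::string();

    std::string name = pattern;
    name.replace(pos, 2, std::to_string(index));
    return name;
}
```

With the pattern "C:\\Temp\\Packed_%d.cab" and the part numbers 1, 2, 3 this yields Packed_1.cab, Packed_2.cab and Packed_3.cab in C:\Temp. The placeholder is replaced by hand rather than passed to printf so that a malformed pattern cannot be misread as a format string.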
During compression Cabinet.DLL will create some temporary files which will be automatically deleted afterwards.
By default it uses the TEMP directory which Windows specifies. If you want to compress huge files and the space on drive C: is low you should specify a TEMP directory on another drive. It is possible to use the same directory as output folder for the CAB file and as TEMP directory.
C++ and C#
i_Compress.SetTempDirectory("E:\\Temp");
You can encrypt the CAB file with a key. The key may be a string or zero-terminated binary data. The longer the key, the more secure the encryption. The encryption algorithm is not as secure as PGP but much better than the algorithm used for ZIP encryption. Because a proprietary algorithm is used, standard CAB tools will not be able to open your encrypted CAB file. Only your software will be able to open it.
C++ and C#
i_Compress.SetEncryptionKey(
"KHzt/(90aresD$%§&UGjhgoh89äÖLÜnkjjkbIUH(I/H809z9z");
The underlying Cabinet.DLL does not support Unicode, but the .NET project uses a trick to allow Unicode paths and filenames to be compressed: all files are stored under an ANSI filename in the cabinet, and an additional text file in the CAB stores the original Unicode filenames, which will be restored after extraction.
With i_Compress.AbortOperation() you can abort a lengthy compression. Obviously this must be called from another thread.
With i_Compress.FlushFolder() you can force the current folder to be finished.
With i_Compress.FlushCabinet() you can force the current CAB file to be closed; any further files to be added will be written to the next CAB file in the split sequence.
CCompress.OnFilePlaced() (C++ callback) / CabLib.Compress.FilePlaced (.NET event): This is called whenever the compression has successfully placed a file into the cabinet.
CCompress.OnUpdateStatus() (C++ callback) / CabLib.Compress.UpdateStatus (.NET event): This can be used to update your GUI to display the progress during a lengthy compression.
ATTENTION:
If you want to update the GUI you should execute the compression action from a NON-GUI thread. In C# you must then call Form.Invoke() in the event handler routine, because the event arrives on the worker thread and controls may only be accessed from the thread that created them.
For details see the file Cabinet Doku.doc and the many comments in the file FCI.H.
If you want a different behaviour for compression, do NOT modify the existing compression class CCompressT. Instead derive a new template class from CCompressT and override the functions you want to change.
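The derive-and-override idea can be sketched as follows. This is not the library's code: it assumes the templates follow the curiously recurring template pattern, and the names CPackerT, CLoggingPacker and the simplified OnFilePlaced signature are hypothetical stand-ins chosen only for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a library template such as CCompressT.
template <class T>
class CPackerT
{
public:
    // Default callback: accept the file and do nothing else.
    bool OnFilePlaced(const std::string& /*name*/) { return true; }

    // Dispatches to the callback of the derived class at compile time,
    // so no virtual functions are needed.
    bool AddFile(const std::string& name)
    {
        return static_cast<T*>(this)->OnFilePlaced(name);
    }
};

// The derived class replaces only the callback it wants to change;
// the base template itself stays untouched.
class CLoggingPacker : public CPackerT<CLoggingPacker>
{
public:
    std::vector<std::string> placed; // names seen by the callback

    bool OnFilePlaced(const std::string& name)
    {
        placed.push_back(name);
        return true;
    }
};
```

Calling AddFile("a.txt") on a CLoggingPacker runs the overridden callback and records the name, while any class that does not override the callback keeps the default behaviour.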
Using the Extraction functions
During extraction there will be NO temporary files created.
The following sample extracts a file "C:\Temp\Packed.cab" into the folder "E:\ExtractFolder". The required subfolders will be created automatically if the CAB file contains subfolders.
C++
Cabinet::CExtract i_Extract;
if (!i_Extract.CreateFDIContext())
    { /* Error handling... */ }
if (!i_Extract.ExtractFile("C:\\Temp\\Packed.cab", "E:\\ExtractFolder"))
    { /* Error handling... */ }
C#
CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractFile(@"C:\Temp\Packed.cab", @"E:\ExtractFolder");
The following sample extracts a Cabinet file which is stored in the Win32 resources of a DLL or EXE file.
You can extract files DIRECTLY from a CAB file in memory!
There are some rules to respect when you add a CAB file to the resources of your project:
In the file Cabinet.rc of the C++ project and in CabLib.rc of the .NET project you find this line:
ID_CAB_TEST CABFILE "Res\\Test.cab"
and in the file Resource.h you find this line:
#define ID_CAB_TEST 101
IMPORTANT:
If you define ID_CAB_TEST in Resource.h, the resource will be stored under:
ResourceName = 101 (integer)
ResourceType = "CABFILE" (string)
If you do NOT define ID_CAB_TEST in Resource.h, the resource will be stored under:
ResourceName = "ID_CAB_TEST" (string)
ResourceType = "CABFILE" (string)
To extract the embedded resource Test.cab (which I added to both projects) write:
C++
Cabinet::CExtractResource i_Extract;
if (!i_Extract.CreateFDIContext())
    { /* Error handling... */ }
if (!i_Extract.ExtractResource("Cabinet.exe", ID_CAB_TEST, "CABFILE",
                               "C:\\ExtractFolder"))
    { /* Error handling... */ }
C#
CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractResource("CabLib.dll", 101, "CABFILE", @"C:\ExtractFolder");
The first parameter specifies the filename (without path) from which to extract the Win32 CAB resource. You can set this = 0 (null) if the resource is inside the EXE which has created the process.
You can use this functionality to extract a CAB file from ANY DLL currently loaded into the process or from the application EXE itself. To explore the resources of files which are already compiled, download a resource viewer tool.
Most of the Windows Update patches contain a CAB file inside.
.NET stores resources in a completely different way, so you cannot see them in such a tool.
Under the resource's properties you must set the Build Action = "Embedded Resource".
To extract a file Test.cab which is located in a project named MyProject in a subfolder named Resources write:
C#
System.Reflection.Assembly i_Ass =
    System.Reflection.Assembly.GetExecutingAssembly();
System.IO.Stream i_Strm =
    i_Ass.GetManifestResourceStream("MyProject.Resources.Test.cab");
CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.ExtractStream(i_Strm, @"E:\ExtractFolder");
Normally you will not need the following C++ functions:
With i_Extract.AbortOperation() you can abort a lengthy extraction. Obviously this must be called from another thread.
With i_Extract.IsCabinet() you can check if the specified CAB file is corrupt. If you try to extract a corrupt file you will get an error anyway, so calling this is not necessary.
Similar to the encryption you can decrypt an archive:
C++ and C#
i_Extract.SetDecryptionKey(
"KHzt/(90aresD$%§&UGjhgoh89äÖLÜnkjjkbIUH(I/H809z9z");
If the CAB file contains files which had Unicode paths or filenames before compression, the .NET project will restore the original Unicode filenames after extraction. This is not possible on Windows 98 or ME, because these operating systems do not support Unicode. In this case the files will be extracted but not renamed to their original Unicode names, and an exception will be thrown.
C#
CabLib.Extract i_Extract = new CabLib.Extract();
i_Extract.SetSingleFile("File_1.txt");
i_Extract.ExtractFile(@"C:\Temp\Packed.cab", @"E:\ExtractFolder");
The event evBeforeCopyFile will not be fired in this case (see below).
CExtract.OnBeforeCopyFile() (C++ callback) / CabLib.Extract.BeforeCopyFile (.NET event): This is called before Cabinet.dll copies an extracted file to disk. You get detailed information about the file to be extracted. If you don't want this file to be copied to disk you can return FALSE here and the file will be skipped (see the examples below).
You can use this callback to display progress information in your GUI.
CExtract.OnAfterCopyFile() (C++ callback) / CabLib.Extract.AfterCopyFile (.NET event): This is called after Cabinet.dll has placed a new file onto disk.
You can use this callback to display progress information in your GUI.
CExtract.OnCabinetInfo() (C++ callback) / CabLib.Extract.CabinetInfo (.NET event): This function will be called exactly once for each cabinet when it is opened. It passes information about the CAB file.
CExtract.OnNextCabinet() (C++ callback) / CabLib.Extract.NextCabinet (.NET event): This function will be called when the next cabinet file in the sequence of split cabinets needs to be opened.
Here you can display a message like "Please insert disk 2!"
ATTENTION:
If you want to update the GUI you should execute the extraction action from a NON-GUI thread. In C# you must then call Form.Invoke() in the event handler routine, because the event arrives on the worker thread and controls may only be accessed from the thread that created them.
For details see the file Cabinet Doku.doc and the many comments in the file FDI.H.
With the callback OnBeforeCopyFile() you can control exactly what you want to extract from the CAB file. The callback/event passes a structure kCabinetFileInfo which tells you details of the file to be extracted: file name, subfolder, full path, file size, file date/time and file attributes.
With this information you can decide if you want the file to be extracted and return false if not.
In C# you must attach an event handler first:
C#
CabLib.Extract.delBeforeCopyFile i_Delegate =
    new CabLib.Extract.delBeforeCopyFile(OnBeforeCopyFile);
i_Extract.evBeforeCopyFile += i_Delegate;
i_Extract.ExtractResource("CabLib.dll", 101, "CABFILE", @"E:\ExtractFolder");
i_Extract.evBeforeCopyFile -= i_Delegate;
If you want to extract only the files from the CAB which have the extension ".DLL" (including all subfolders) you can write:
C++
BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{
    int Len = (int)strlen(k_Info.s8_File); // length of filename
    // names shorter than the extension cannot match (avoids reading before the buffer)
    return (Len >= 4 && stricmp(k_Info.s8_File +Len -4, ".Dll") == 0);
}
C#
private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    return k_Info.s_File.ToUpper().EndsWith(".DLL");
}
If you want to extract only one folder from a CAB with the name "Setup\" and all its subfolders, write:
C++
BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{
    return (strnicmp(k_Info.s8_SubFolder, "Setup\\", 6) == 0);
}
C#
private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    return k_Info.s_SubFolder.ToUpper().StartsWith(@"SETUP\");
}
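The C++ filters above rely on stricmp() and strnicmp(), which are not part of standard C++. As a hedged alternative, the same two tests can be written with the standard library only. The helper names EqualNoCase, HasExtension and IsInFolder are hypothetical, and the sketch only handles single-byte (ANSI) names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Compares len characters of a and b, ignoring case (single-byte names only).
bool EqualNoCase(const char* a, const char* b, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
    {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// True if file ends with ext, e.g. HasExtension("Kernel32.DLL", ".dll").
bool HasExtension(const std::string& file, const std::string& ext)
{
    if (file.size() < ext.size())
        return false; // too short to contain the extension
    return EqualNoCase(file.c_str() + file.size() - ext.size(),
                       ext.c_str(), ext.size());
}

// True if subFolder starts with prefix, e.g. IsInFolder("setup\\Lang\\", "Setup\\").
bool IsInFolder(const std::string& subFolder, const std::string& prefix)
{
    if (subFolder.size() < prefix.size())
        return false; // too short to contain the prefix
    return EqualNoCase(subFolder.c_str(), prefix.c_str(), prefix.size());
}
```

Inside OnBeforeCopyFile() such helpers could be called with k_Info.s8_File or k_Info.s8_SubFolder; the explicit length checks make sure that very short names are simply rejected.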
If you want to make an update of existing files and you want only files on disk to be overwritten which have an older date than the files in the CAB, you can write:
C++
BOOL OnBeforeCopyFile(kCabinetFileInfo &k_Info, void* p_Param)
{
    // try to open the file on disk
    HANDLE h_File = CreateFile(k_Info.s8_FullPath, GENERIC_READ,
                               FILE_SHARE_READ, 0, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, 0);
    // The file does not yet exist --> copy it!
    if (h_File == INVALID_HANDLE_VALUE)
        return TRUE;
    FILETIME k_FileTime, k_LocalTime;
    BOOL b_OK = GetFileTime(h_File, 0, 0, &k_FileTime);
    CloseHandle(h_File);
    if (!b_OK)
        return TRUE;
    // Last write time UTC --> Local time
    FileTimeToLocalFileTime(&k_FileTime, &k_LocalTime);
    return (CompareFileTime(&k_Info.k_Time, &k_LocalTime) > 0);
}
C#
private bool OnBeforeCopyFile(CabLib.Extract.kCabinetFileInfo k_Info)
{
    if (!System.IO.File.Exists(k_Info.s_FullPath))
        return true;
    // retrieve local file time
    System.DateTime k_FileTime =
        System.IO.File.GetLastWriteTime(k_Info.s_FullPath);
    return (k_Info.k_Time.CompareTo(k_FileTime) > 0);
}
This diagram demonstrates the C++ classes which are used in both projects:
The classes ending with "T" (like CExtractT) are C++ templates. The others are real classes.
If you want a different behaviour, do NOT modify the existing classes. Instead derive a new template class from the existing classes and override the functions you want to change.
CExtractT contains the functions to extract a "real" CAB file from disk.
CExtractT contains the following callbacks which are called from Cabinet.dll:
Open() to open a file
Read() to read from a file
Write() to write to a file
Seek() to set the file pointer or ask its position
Close() to close a file
IMPORTANT: These callbacks are called from Cabinet.dll to read the CAB file AND to write all the extracted files to disk.
CExtractMemoryT is a class which overrides the file access functions and replaces them with functions which read the CAB data from memory instead of disk.
CExtractMemoryT itself cannot be instantiated. Other classes must be derived from it.
It provides these additional callbacks:
OpenMem() to open the memory which represents the CAB file
ReadMem() to read from the memory of the CAB file
SeekMem() to set the memory pointer or ask its position
CloseMem() to release the memory which holds the CAB file
IMPORTANT: These callbacks are ONLY called when Cabinet.dll wants to read the CAB file.
CExtractResourceT is derived from CExtractMemoryT to read data from a Win32 resource.
CExtractStreamT is derived from CExtractMemoryT to read data from a .NET stream.
You can easily derive your own classes to read data from a pipe, from the network / internet or whatever you like. The data stream must be capable of seeking (random access).
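To show what "capable of seeking" means in practice, here is a minimal sketch of a data source that offers file-like read and seek operations on a block of memory. CMemoryReader is a hypothetical class written for illustration; it is not the implementation of CExtractMemoryT, and its method names and return conventions are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>   // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>

// Hypothetical seekable reader over a memory block.
class CMemoryReader
{
public:
    CMemoryReader(const unsigned char* data, std::size_t size)
        : m_data(data), m_size(size), m_pos(0) {}

    // Copies up to count bytes into buffer and advances the position.
    // Returns the number of bytes actually read (0 at the end of the data).
    std::size_t Read(void* buffer, std::size_t count)
    {
        const std::size_t n = std::min(count, m_size - m_pos);
        std::memcpy(buffer, m_data + m_pos, n);
        m_pos += n;
        return n;
    }

    // Moves the read position like fseek() would.
    // Returns the new position, or -1 if the target is out of range.
    // Seek(0, SEEK_CUR) simply asks for the current position.
    long Seek(long offset, int origin)
    {
        long base;
        switch (origin)
        {
            case SEEK_SET: base = 0;                         break;
            case SEEK_CUR: base = static_cast<long>(m_pos);  break;
            case SEEK_END: base = static_cast<long>(m_size); break;
            default:       return -1;
        }
        const long target = base + offset;
        if (target < 0 || target > static_cast<long>(m_size))
            return -1;
        m_pos = static_cast<std::size_t>(target);
        return target;
    }

private:
    const unsigned char* m_data; // start of the CAB data (not owned)
    std::size_t          m_size; // total number of bytes
    std::size_t          m_pos;  // current read position
};
```

A source that can only deliver bytes in order, such as a raw network pipe, cannot implement Seek() this way; its data would first have to be buffered completely before it can be read like this.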
If you want to debug the whole compression/extraction process with a tool like DebugView, you can set in the file CompressT.hpp / ExtractT.hpp:
#define _TraceCompress 1
or
#define _TraceExtract 1
If you are a C++ programmer and are searching for a comfortable way to signal your application when an event occurs (for example, when a file has been extracted), then have a look at my related article on CodeProject.
Although I have described only the basic functionality and much more is possible, this article has now reached a nice length, which makes me stop here. Study the source code and you will see that this tiny library is really versatile!
From my homepage you can download free C++ books in compiled HTML format.
This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.