Thursday, 16 May 2013

Use ASP.NET and DotNetZip to Create and Extract ZIP Files




This article shows how to use DotNetZip to create and extract ZIP files in an ASP.NET application, and covers advanced features like password protection and encryption.


Introduction

In 1989, Phil Katz created the ZIP file format. Over the decades, it grew into one of the most popular and widely used data compression and archival file formats. Windows has long supported the ZIP file format, offering built-in support for creating, reading, and extracting ZIP files. There are both commercial and free third-party applications for working with ZIP files, including WinZip, WinRAR, and 7-Zip.
Over the years I've worked on a number of ASP.NET projects that required the ability to create and extract ZIP files. Unfortunately, the .NET Framework provides spotty support for working with ZIP files. The ZipPackage class, added to the .NET Framework version 3.0, can be used to create ZIP files; however, the ZIP files created by this class contain an additional manifest file as part of the archive. What's more, the ZipPackage class cannot read or extract a ZIP file unless this manifest file is present. To boot, the ZipPackage class provides suboptimal compression ratios and lacks support for features like passwords, comments, and AES encryption.
The good news is that there is a feature-rich, free, open source ZIP implementation for .NET - DotNetZip. Using DotNetZip and a dash of C# code you can:
  • Create a new ZIP file and add one or more files or folders,
  • Read the contents of a ZIP file,
  • Extract all (or some) of the contents of a ZIP file to a specified folder,
  • Use advanced ZIP file format features, such as encrypting the contents of the ZIP and protecting them with a password.
To get started with DotNetZip you need to get your hands on the DotNetZip assembly, which can be downloaded from the DotNetZip project page on CodePlex. From the project’s Releases page you can download the runtime, the developer's kit, or the source code. The runtime and developer's kit downloads include the DotNetZip assembly, Ionic.Zip.dll, along with examples and documentation. (If you download the source code you will need to compile DotNetZip yourself.)
NOTE: You can also get your hands on the DotNetZip assembly by downloading the demo application for this article, which uses DotNetZip version 1.9. You'll find this assembly in the demo application's Bin folder.

Creating a ZIP File

Creating a ZIP file with DotNetZip involves three steps:
  1. Create a ZipFile object,
  2. Add one or more entries to the ZipFile object, and
  3. Save the ZIP file to disk or to a stream.
The ZipFile class implements the IDisposable interface; consequently, when creating a ZipFile object you should do so with a using block, as the following code snippet illustrates:

Listing 1: Create a new ZipFile object.

using (var zip = new ZipFile())
{
    ...
}
The ZipFile class offers a variety of methods for adding entries to the ZIP. The AddFile and AddFiles methods allow you to add one or more existing files; use the AddDirectory method to add all of the files in a specified directory to the ZIP. You can also add programmatically-generated content to the ZIP using the AddEntry method.
The code snippet below adds two entries to the ZIP file:
  • The image file named Sam.jpg and located in the website's Images folder, which is added via the AddFile method, and
  • An entry named ZIPInfo.txt, whose content is specified at runtime via the AddEntry method.

Listing 2: Two entries have been added to the ZIP file.

using (var zip = new ZipFile())
{
    // Add an existing file from the website's Images folder.
    var imagePath = Server.MapPath("~/Images/Sam.jpg");
    zip.AddFile(imagePath);

    // Add an entry whose content is generated at runtime.
    zip.AddEntry("ZIPInfo.txt", "This ZIP file was created on " + DateTime.Now);
    ...
}
After adding all of the entries to the ZIP file, the final step is to save it. You can save the ZIP file to a file on disk or to any stream using the ZipFile class's Save method. The code snippet below saves the ZIP to a file named PictureOfSam.zip in the website's root folder.

Listing 3: The ZIP file has been saved to a file named PictureOfSam.zip.

using (var zip = new ZipFile())
{
    ...
    var saveToFilePath = Server.MapPath("~/PictureOfSam.zip");
    zip.Save(saveToFilePath);
}
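
Saving to a stream follows the same three steps. The sketch below is not part of the demo application; it builds a ZIP entirely in memory and returns its bytes. The class name InMemoryZipSketch and the method BuildZipBytes are hypothetical names chosen for illustration, and the code assumes a reference to Ionic.Zip.dll.

```csharp
using System.IO;
using Ionic.Zip; // DotNetZip assembly (Ionic.Zip.dll)

public static class InMemoryZipSketch
{
    // Hypothetical helper: creates a ZIP with a single text entry
    // and returns the archive as a byte array.
    public static byte[] BuildZipBytes(string entryName, string content)
    {
        using (var zip = new ZipFile())
        using (var buffer = new MemoryStream())
        {
            // Step 2: add an entry whose content is supplied at runtime.
            zip.AddEntry(entryName, content);

            // Step 3: save to a stream instead of a file on disk.
            zip.Save(buffer);
            return buffer.ToArray();
        }
    }
}
```

A caller could write the returned bytes to a database column, a file, or an HTTP response.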

Determining Where Entries are Located Within the ZIP File

If you are following along at your computer and have created the PictureOfSam.zip file, you may be surprised when opening the ZIP and examining its structure. There are two entries in the ZIP file - Sam.jpg and ZIPInfo.txt - but the Sam.jpg entry is likely contained within one (or more) subfolders. For instance, when I ran the above code from my computer it created a ZIP file that contained a My Projects folder, which contained a subfolder named Writings, with another subfolder named DotNetSlackers, and so on, all the way down to an Images folder where the Sam.jpg entry is located. (The ZIPInfo.txt entry is at the root of the ZIP folder, as you'd expect.) What's going on here?
By default, DotNetZip's AddFile, AddFiles, and AddDirectory methods create a folder structure in the ZIP that mirrors the files' folder structure on disk. On my computer, the Sam.jpg file resides in the folder D:\My Projects\Writings\DotNetSlackers\...\Images, which explains the folder structure in the ZIP file.
You can override the default behavior and specify the folder the file should be added to in the ZIP by specifying the folder name as the second input parameter to the AddFile, AddFiles, or AddDirectory methods. Use string.Empty to have the file added to the root of the ZIP.
The following code snippet adds the Sam.jpg file to the ZIP twice: once in the root of the ZIP and once in a folder named My Images.

Listing 4: The Sam.jpg image is twice added to the ZIP file - once in the root and once in a folder named My Images.

using (var zip = new ZipFile())
{
    var imagePath = Server.MapPath("~/Images/Sam.jpg");

    // Once in the root of the ZIP, once in the My Images folder.
    zip.AddFile(imagePath, string.Empty);
    zip.AddFile(imagePath, "My Images");

    zip.AddEntry("ZIPInfo.txt", "This ZIP file was created on " + DateTime.Now);

    var saveToFilePath = Server.MapPath("~/PictureOfSam2.zip");
    zip.Save(saveToFilePath);
}
Figure 1 shows a screen shot of the corresponding ZIP file when opened using WinRAR, a third-party ZIP program. Note how the ZIP's root folder contains two entries - Sam.jpg and ZIPInfo.txt - as well as a folder named My Images, which also houses a copy of Sam.jpg.

Figure 1: The ZIP file now contains both Sam.jpg and ZIPInfo.txt in its root folder. There’s also a My Images folder.

To indicate where entries added by the AddEntry method should go, specify the folder path in the entry name. For example, in the ZIP file above the ZIPInfo.txt entry is added to the root. To have it placed in the My Images folder instead, you would specify the entry name as My Images\ZIPInfo.txt like so:
zip.AddEntry(@"My Images\ZIPInfo.txt", ...);
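
The same folder parameter works with AddFiles and AddDirectory. The following sketch is not taken from the demo application; the class and method names, the Docs folder name, and the parameters are hypothetical. It places a set of files in My Images, the contents of a directory in Docs, and a generated entry alongside the images.

```csharp
using System;
using System.Collections.Generic;
using Ionic.Zip; // DotNetZip assembly (Ionic.Zip.dll)

public static class ZipFolderPlacementSketch
{
    // Hypothetical helper: imagePaths and docsFolderOnDisk are supplied by the caller.
    public static void SaveWithFolders(string zipPath,
                                       IEnumerable<string> imagePaths,
                                       string docsFolderOnDisk)
    {
        using (var zip = new ZipFile())
        {
            // All of the listed files go into the My Images folder in the ZIP.
            // Two files with the same name would collide in that folder.
            zip.AddFiles(imagePaths, "My Images");

            // The directory's contents go into a folder named Docs in the ZIP.
            zip.AddDirectory(docsFolderOnDisk, "Docs");

            // For AddEntry, the folder is part of the entry name.
            zip.AddEntry(@"My Images\ZIPInfo.txt",
                         "This ZIP file was created on " + DateTime.Now);

            zip.Save(zipPath);
        }
    }
}
```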

Password Protecting and Encrypting ZIP File Entries

By default, the entries in a ZIP file are neither encrypted nor password protected. You can secure the contents of a ZIP file by applying a password and encryption scheme to one or more ZIP file entries. This is accomplished in DotNetZip by setting the ZipFile object's Password and Encryption properties.
Keep in mind that while the Password and Encryption properties are defined on the ZipFile object, they apply to the individual entries added to the ZIP file. You can have a ZIP file that contains some entries that are not protected, some that are protected with a particular password and encryption scheme, and others that are protected with a different password and encryption scheme. What password and encryption scheme is applied to an entry depends on the values of the ZipFile object's Password and Encryption properties at the time the entry is added to the ZIP. Therefore, to have the entire ZIP file protected with the same password and encryption scheme, set these two properties prior to adding any entries.
The following snippet shows how to set a password and encryption scheme for a ZIP file. Here we are adding two entries, Sam.jpg and ZIPInfo.txt, but only Sam.jpg is protected. By design, the ZIPInfo.txt entry is not protected because it is added before the Password and Encryption properties are set.

Listing 5: The Sam.jpg entry is encrypted and protected with a password.

using (var zip = new ZipFile())
{
    // Added before a password is set, so this entry is not protected.
    zip.AddEntry("ZIPInfo.txt", "This ZIP file was created on " + DateTime.Now);

    zip.Password = "password";
    zip.Encryption = EncryptionAlgorithm.PkzipWeak;

    // Added after the password is set, so this entry is protected.
    var imagePath = Server.MapPath("~/Images/Sam.jpg");
    zip.AddFile(imagePath, string.Empty);

    var saveToFilePath = Server.MapPath("~/PictureOfSam3.zip");
    zip.Save(saveToFilePath);
}
Alternatively, I could have made the ZIPInfo.txt entry unprotected by:
  • Setting the Password and Encryption properties to password and EncryptionAlgorithm.PkzipWeak,
  • Adding the Sam.jpg entry,
  • Setting the Password property to null, and then
  • Adding the ZIPInfo.txt entry.
In short, whenever you add an entry to a ZIP file the current Password and Encryption property values are used to protect that entry; if Password is null then the entry is not protected. For more information on how password and encryption settings are applied to ZIP file entries, refer to the Password property technical documentation.
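
The alternative ordering listed above can be sketched as follows. This is an illustration rather than code from the demo application; the class name, method name, and parameters are hypothetical.

```csharp
using System;
using Ionic.Zip; // DotNetZip assembly (Ionic.Zip.dll)

public static class SelectiveProtectionSketch
{
    // Hypothetical helper: protects the image entry but not ZIPInfo.txt.
    public static void SaveWithOneProtectedEntry(string zipPath, string imagePath, string password)
    {
        using (var zip = new ZipFile())
        {
            // Entries added from here on are protected.
            zip.Password = password;
            zip.Encryption = EncryptionAlgorithm.PkzipWeak;
            zip.AddFile(imagePath, string.Empty);

            // Clearing the password leaves subsequent entries unprotected.
            zip.Password = null;
            zip.AddEntry("ZIPInfo.txt", "This ZIP file was created on " + DateTime.Now);

            zip.Save(zipPath);
        }
    }
}
```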
All password protected entries are encrypted; which algorithm is used to encrypt a protected entry is determined by the value of the ZipFile object's Encryption property at the time the entry is added. By default, the Encryption property is set to the PkzipWeak value when a password is specified. This instructs DotNetZip to encrypt the entry using the Zip 2.0 encryption algorithm defined in the ZIP file format specification. DotNetZip also supports 128- and 256-bit AES encryption.
NOTE: Zip 2.0 is known to be a weak encryption algorithm. Unfortunately, Zip 2.0 is the only standard encryption algorithm that is supported across all ZIP file readers. For example, WinRAR can read and extract AES encrypted ZIP file entries, but Windows Explorer cannot. For further discussion, refer to the Encryption property technical documentation.
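
If every reader of the ZIP is known to support AES, switching algorithms is a one-line change. The sketch below uses the EncryptionAlgorithm.WinZipAes256 value; the class name, method name, and parameters are hypothetical and not part of the demo application.

```csharp
using Ionic.Zip; // DotNetZip assembly (Ionic.Zip.dll)

public static class AesZipSketch
{
    // Hypothetical helper: saves a single file protected with 256-bit AES.
    public static void SaveWithAes(string zipPath, string filePath, string password)
    {
        using (var zip = new ZipFile())
        {
            // Both properties must be set before the entry is added.
            zip.Password = password;
            zip.Encryption = EncryptionAlgorithm.WinZipAes256;

            zip.AddFile(filePath, string.Empty);
            zip.Save(zipPath);
        }
    }
}
```

Use EncryptionAlgorithm.WinZipAes128 in the same place for 128-bit AES.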

Putting It All Together: Building a Multi-File Download Service

The download for this article includes a demo application that provides a real-world use for creating ZIP files in a web application - allowing users to select and download multiple files as a single ZIP file.
The demo application contains a folder named TaxForms with six PDF documents I downloaded from the Internal Revenue Service’s website. From CreateZIP.aspx the user is prompted to select one or more forms to download. They can also choose to password protect the forms being downloaded by entering a password. Figure 2 shows a screen shot of CreateZIP.aspx after selecting the forms to download and entering a password. Clicking the "Download Forms" button brings up a Save As / Open dialog box, allowing them to save or open the ZIP file that contains the selected forms.

Figure 2: The user has selected four forms to download.

When the user clicks the "Download Forms" button there is a postback, at which point the ZIP file is created. The selected forms are added to the ZIP file in a folder named Requested Forms. Additionally, a README.txt file that provides details about the forms is added to the root of the ZIP file. This ZIP file is then sent back directly to the client (without being saved to disk), prompting the user's browser to display the Save As / Open dialog box.
The following code performs the tasks detailed above. To start, a Content-Disposition HTTP header is added to instruct the browser to display the Save As / Open dialog box. Next, the Response.ContentType property is set to application/zip, which tells the browser it is receiving a ZIP file.
The ZIP file is then created. Note how the Password and Encryption properties are set only if the user supplies a password. The items in the CheckBoxList of forms are enumerated; the corresponding form for each selected checkbox is added to the ZIP file. Once the requested forms have been added to the ZIP, the Password property is set to null and the README.txt entry is added. (Recall that setting the Password property to null ensures that the following entries will not be protected. Consequently, the README.txt file will not be protected, regardless of whether the user specified a password.)
Finally, the contents of the ZIP file are saved to the Response object's OutputStream. This has the effect of sending the contents of the ZIP file directly to the visitor's browser.

Listing 6: A ZIP file containing the requested tax forms is created and streamed back to the visitor’s browser.

var downloadFileName = string.Format("TaxForms-{0}.zip", DateTime.Now.ToString("yyyy-MM-dd-HH_mm_ss"));
Response.AddHeader("Content-Disposition", "attachment; filename=" + downloadFileName);
Response.ContentType = "application/zip";

using (var zip = new ZipFile())
{
    // Protect the forms only if the user supplied a password.
    if (!string.IsNullOrEmpty(txtZIPPassword.Text))
    {
        zip.Password = txtZIPPassword.Text;
        zip.Encryption = EncryptionAlgorithm.PkzipWeak;
    }

    var readMeContent = string.Format(
        "This ZIP file was created by DotNetZip at {0} and contains the following files:{1}{1}",
        DateTime.Now, Environment.NewLine);

    foreach (ListItem liForm in cblTaxForms.Items)
    {
        if (liForm.Selected)
        {
            var fullTaxFormFilePath = Server.MapPath("~/TaxForms/" + liForm.Value);
            zip.AddFile(fullTaxFormFilePath, "Requested Forms");
            readMeContent += string.Format("\t* {0} - {1}{2}", liForm.Text, liForm.Value, Environment.NewLine);
        }
    }

    // Do NOT protect this file
    zip.Password = null;
    zip.AddEntry("README.txt", readMeContent);

    // Stream the ZIP directly to the browser.
    zip.Save(Response.OutputStream);
}

Reading and Extracting ZIP Files

As with creating a ZIP file, reading or extracting a ZIP file's contents is accomplished using the ZipFile class. Use the Read method to load an existing ZIP file into a ZipFile instance. Using the Read method you can load a ZIP file from disk, from a byte array, or from a stream. The code snippet below shows how to read the contents of a ZIP file named LINQDemos.zip, which resides in the website's root folder.
var zipFileToRead = Server.MapPath("~/LINQDemos.zip");
using (var zip = ZipFile.Read(zipFileToRead))
{
    ...
}
The ZipFile object's Entries property returns a collection of ZipEntry objects, which model the entries in the ZIP file. The ZipEntry class offers properties like FileName, UncompressedSize, CompressedSize, CreationTime, and so on. Note that folders in the ZIP file are considered entries and therefore appear in the Entries collection. To determine if an entry is a folder or a compressed file, consult the ZipEntry object's IsDirectory property.
The following code snippet enumerates the entries in the LINQDemos.zip file. For each non-folder entry, the entry's name, compressed size, and original size are appended to a StringBuilder object, building up the markup for a bulleted list. Figure 3 shows this markup when viewed through a browser.

Listing 7: The non-folder entries in the LINQDemos.zip file are displayed in a bulleted list.

// output is the StringBuilder that accumulates the list markup.
var zipFileToRead = Server.MapPath("~/LINQDemos.zip");
using (var zip = ZipFile.Read(zipFileToRead))
{
    output.Append("<ul>");
    foreach (var entry in zip.Entries)
    {
        // Skip folder entries; list only the compressed files.
        if (!entry.IsDirectory)
        {
            output.AppendFormat("<li><b>{0}</b> - Packed Size / Original Size: {1:N0} / {2:N0}</li>",
                entry.FileName, entry.CompressedSize, entry.UncompressedSize);
        }
    }
    output.Append("</ul>");
}

Figure 3: The contents of the LINQDemos.zip file are displayed in a bulleted list.

To extract a file use the ZipEntry class's Extract method. You can extract a ZIP file entry to the file system or to a stream. By default, extracting an entry to the file system places it in the current working folder. However, you may optionally provide an alternate folder to place the extracted file in. Also, you can specify the action to take when extracting a file to a location where there's already an existing file with the same name.
For example, to extract the contents of LINQDemos.zip to the LINQDemos folder in the website, you could use the following code:

Listing 8: The contents of the LINQDemos.zip file are extracted, one-by-one, to the ~/LINQDemos folder.

var zipFileToRead = Server.MapPath("~/LINQDemos.zip");
var extractToFolder = Server.MapPath("~/LINQDemos");
using (var zip = ZipFile.Read(zipFileToRead))
{
    foreach (var entry in zip.Entries)
        entry.Extract(extractToFolder, ExtractExistingFileAction.OverwriteSilently);
}
The above code reads in the LINQDemos.zip ZIP file and enumerates its entries. Each entry is extracted to the target directory (~/LINQDemos) with instructions to silently overwrite any files that exist in that folder with the same name.
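
Extracting to a stream rather than to the file system might look like the following sketch, which reads a single entry into memory and returns it as text. The class and method names are hypothetical, and the code assumes the entry is not password protected and holds UTF-8 (or plain ASCII) text.

```csharp
using System.IO;
using System.Text;
using Ionic.Zip; // DotNetZip assembly (Ionic.Zip.dll)

public static class StreamExtractionSketch
{
    // Hypothetical helper: returns the entry's content as text,
    // or null if the ZIP has no entry with that name.
    public static string ReadEntryAsText(string zipPath, string entryName)
    {
        using (var zip = ZipFile.Read(zipPath))
        {
            // The indexer looks up an entry by its name within the ZIP.
            ZipEntry entry = zip[entryName];
            if (entry == null)
                return null;

            using (var buffer = new MemoryStream())
            {
                // Decompress the entry into the in-memory stream.
                entry.Extract(buffer);

                // Rewind and decode; UTF-8 is an assumption about the content.
                buffer.Position = 0;
                using (var reader = new StreamReader(buffer, Encoding.UTF8))
                    return reader.ReadToEnd();
            }
        }
    }
}
```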
The ZipFile class also offers methods for extracting one or more of its entries. For instance, the above code could be replaced with a call to the ZipFile object's ExtractAll method. The ExtractSelectedEntries method extracts all entries that match particular search criteria. The following code snippet shows how to extract only those entries from the LINQDemos.zip file that have the file extension .cs.

Listing 9: Only the .cs files are extracted from LINQDemos.zip

var zipFileToRead = Server.MapPath("~/LINQDemos.zip");
var extractToFolder = Server.MapPath("~/LINQDemos");
using (var zip = ZipFile.Read(zipFileToRead))
{
    zip.ExtractSelectedEntries("name=*.cs",
                               null,
                               extractToFolder,
                               ExtractExistingFileAction.OverwriteSilently);
}

Putting It All Together: Extracting User-Uploaded ZIP Files

The demo application includes a page named ExtractZIP.aspx that allows a user to upload a ZIP file from their computer to the website, where its contents are extracted and saved to a folder on the web server's file system. There's also a textbox on the page for the user to enter a password, in the case that the ZIP contains protected entries. After extracting the ZIP file's contents, a GridView control displays details about the uploaded ZIP file's entries.
The code that runs when the user uploads their ZIP file follows. Note that the ZipFile object is created by calling its Read method and passing in the contents of the uploaded file as a byte array. Next, the txtZIPPassword TextBox is examined; if the user supplied a password then the ZipFile object's Password property is set accordingly. Note that you do not need to specify the Encryption property when extracting protected files - DotNetZip can automatically determine how each entry was encrypted.
Finally, all of the files in the ZIP are extracted to the ~/UserUploads folder using the ExtractExistingFileAction.DoNotOverwrite option. This option instructs DotNetZip that if a file with the same name is found then it should leave the existing file and continue with extracting the remaining files from the ZIP. The GridView on the page is bound to the ZipFile object's Entries collection, showing the user the contents of the ZIP they uploaded.

Listing 10: The contents of the uploaded ZIP file are extracted to the ~/UserUploads folder.

var extractToFolder = Server.MapPath("~/UserUploads");
using (var zip = ZipFile.Read(fuZIP.FileBytes))
{
    if (!string.IsNullOrEmpty(txtZIPPassword.Text))
        zip.Password = txtZIPPassword.Text;

    zip.ExtractAll(extractToFolder, ExtractExistingFileAction.DoNotOverwrite);

    gvZIPContents.DataSource = zip.Entries;
    gvZIPContents.DataBind();
}
Figure 4 shows a screen shot of the ExtractZIP.aspx page after the user has uploaded a ZIP file.

Figure 4: A GridView displays the contents of the uploaded (and extracted) ZIP file.


Conclusion

The ZIP file format is one of the most popular file compression and archival formats. It is so common that support for creating, reading, and extracting ZIP files is built into both the Windows and Mac OS X operating systems.
Unfortunately, the .NET Framework offers incomplete support for working with ZIP files. If you need to work with ZIP files in an ASP.NET application your best bet is to turn to DotNetZip, a free, open source option. As we saw in this article, DotNetZip offers a straightforward API for creating, reading, and extracting ZIP files. And by setting a property or two, it's possible to create and read password protected ZIP files and to encrypt the entries using either the standard Zip 2.0 encryption or 128- or 256-bit AES encryption.
Happy Programming!
