Easily create an ISO from a CD/DVD [Releasing ExtractISO tool and source]

Thursday, November 6, 2008

A few years back I found myself in need of a tool to create an ISO image from a CD I owned. I searched around a bit, but the only tools I could find to do that were part of much larger programs and cost money I didn't want to spend. The concept seemed simple enough that I figured it would be fairly easy to write my own tool to do this. So I did. :)

ExtractISO is a native, console-mode application that does what its name suggests: extracts the contents of a CD or DVD and saves it to a file. The tool and complete source code are attached to this post as ExtractISO.zip. Here's the documentation that comes with it:

====================================
==  ExtractISO                    ==
==  http://blogs.msdn.com/Delay/  ==
====================================


Summary
=======

ExtractISO - Creates an ISO image file from a CD/DVD

ExtractISO reads raw sectors from the specified CD/DVD volume and writes them
to the specified ISO image file.  The resulting file can be burned to a blank
CD/DVD disc, mounted in a virtual CD/DVD drive, or opened in an ISO viewer.

For a discussion of the limitations of this process, please refer to:
   http://www.cdrfaq.org/faq02.html#S2-28

Note that direct access to a CD/DVD volume requires administrative privileges.

ExtractISO attempts to read damaged media by retrying failed reads a few times
before giving up. In the event of a persistent read failure, the partial output
file is not deleted so that subsequent extraction attempts (possibly after
cleaning the media and/or using a different drive to read it) can be attempted
without the loss of any successfully extracted data.

Syntax: ExtractISO D: File.iso
   D: is the CD/DVD drive to extract from
   File.iso is the file to write to


Version History
===============

Version 1.11, 2008-11-04
Improved formatting of "bytes extracted" display value
Initial public release

Version 1.10, 2006-03-12
Added support for damaged media recovery (see above)

Version 1.01, 2005-05-18
Fixed silly integer overflow problem with "bytes extracted" display
Note: Problem affected display only; no functional impact

Version 1.0, 2005-04-27
Initial release

ExtractISO has been available to anyone inside Microsoft since I wrote it. Last week, I got an unexpected request to make it public so certain customers could use it. I was happy to do so, but one thing had been bothering me for a while: the status display of the bytes extracted so far didn't group by thousands. Instead of "1,000,000", the counter displayed as "1000000" which I find more difficult to parse. This was such a small issue, I never got around to fixing it - but now seemed like the ideal time to do so!

Unfortunately, this formatting task which is so simple in .NET seems to be quite a bit more involved with the Win32 API. The first problem is that printf doesn't support outputting the thousands separator. I considered using something like the sample code the previous link suggests, but wanted something simpler and easier to understand. A bit of research turned up the GetNumberFormat API which seemed like the perfect solution. However, the GetNumberFormat API suffers from at least two notable shortcomings: it's not easy to customize the output and the input needs to be a string. I did a quick test and found that the output of GetNumberFormat for "1000000" is "1,000,000.00" (on my machine using the default United States settings). This is close to what I wanted, but displaying the integral byte count with two digits of decimal precision just seems silly to me. So I looked for an easy way to customize the output via NUMBERFMT, and ran into the same issues Michael talks about in the post I linked to. When further research turned up no better alternatives, I decided to use GetNumberFormat and then "fix" its output by calling GetLocaleInfo(LOCALE_SDECIMAL) and removing everything after the localized decimal character(s). I like this approach because it is almost all platform code (i.e., code I don't have to write/test) and should be correct for all cultures where numbers are written as "1,000,000.00", "1.000.000,00", etc..

Other than the formatting issue I've just discussed, I made no other changes to the ExtractISO implementation. The only things I did were to update some of the VERSIONINFO settings, tweak the Build.cmd script, and recompile with the latest Visual Studio 2008.

ExtractISO works reasonably quickly, but it's worth mentioning that performance was specifically not a goal when I wrote it. The code implements a simple read/write loop with a 1 MB buffer and makes no attempt to interleave the two operations. The large-ish buffer should help minimize the overhead of the read/write calls, but I suspect a different implementation that issues the reads and writes in parallel would be faster. However, achieving that parallelism comes at the price of complexity - and I didn't feel the time it would have taken to write and debug that code was justified here. Even at top speed, the extraction operation is going to take a minute or two - an extra minute on top of that seems a small price to pay for simpler, easier to maintain code. You are welcome to disagree, of course. :) If you develop a faster implementation, I'd love to hear about it!

ExtractISO is a simple tool with a simple purpose - and one that I've used quite happily for a number of years now. If you've got a hankering for ISOs and like free stuff, please have a look at ExtractISO!

PS - ExtractISO cannot be used to duplicate audio CDs or copy-protected DVDs: audio CDs use a different storage scheme than data CDs and copy-protected DVDs store their encryption key in an "inaccessible" location of the disk.

[ExtractISO.zip]

Tags: Technical Utilities