Posts tagged "Technical"

"Make things as simple as possible, but not simpler." [ManagedMsiExec sample app shows how to use the Windows Installer API from managed code]

Monday, January 9, 2012

Windows Installer is the installation and configuration service used by Windows to add and remove applications. (For more information, Wikipedia has a great overview of Windows Installer, covering products, components, setup phases, permissions, etc..) In addition to exposing a rich, native API, Windows Installer comes with msiexec.exe, a command-line tool that offers fine-grained control over the install/uninstall process.

I wanted to familiarize myself with the use of the Windows Installer API from .NET, so I wrote a wrapper class to expose it to managed applications and then a simple program to exercise it. Unlike msiexec.exe which can do all kinds of things, my ManagedMsiExec supports only UI-less install and uninstall of a .MSI Windows Installer package (i.e., /quiet mode). By default, ManagedMsiExec provides simple status reporting on the command line and renders a text-based progress bar that shows how each phase is going. In its "verbose" mode, ManagedMsiExec outputs the complete set of status/progress/diagnostic messages generated by Windows Installer (i.e., /l*) so any problems can be investigated.

Aside: Although I haven't tested ManagedMsiExec exhaustively, it's fundamentally just a thin wrapper around Windows Installer, so I'd expect it to work for pretty much any MSI out there.

Here's how it looks when run:

C:\T>ManagedMsiExec
SYNTAX: ManagedMsiExec <--Install|-i|--Uninstall|-u> Package.msi [--Verbose|-v]
Windows Installer result: 87 (INVALID_PARAMETER)

Doing a simple install:

C:\T>ManagedMsiExec -i Package.msi
ManagedMsiExec: Installing C:\T\Package.msi
Windows Installer result: 0 (SUCCESS)

Doing a simple uninstall:

C:\T>ManagedMsiExec -u Package.msi
ManagedMsiExec: Uninstalling C:\T\Package.msi
Windows Installer result: 0 (SUCCESS)

Using verbose mode to diagnose a failure:

C:\T>ManagedMsiExec -u Package.msi -v
ManagedMsiExec: Uninstalling C:\T\Package.msi
ACTIONSTART: Action 18:31:21: INSTALL.
INFO: Action start 18:31:21: INSTALL.
COMMONDATA: 1: 0 2: 1033 3: 1252
PROGRESS:
PROGRESS: 1: 2 2: 189440
COMMONDATA: 1: 0 2: 1033 3: 1252
INFO: This action is only valid for products that are currently installed.
C:\T\Package.msi
COMMONDATA: 1: 2 2: 0
COMMONDATA: 1: 2 2: 1
INFO: DEBUG: Error 2755:  Server returned unexpected error 1605 attempting to
  install package C:\T\Package.msi.
ERROR: The installer has encountered an unexpected error installing this package.
  This may indicate a problem with this package. The error code is 2755.
INFO: Action ended 18:31:21: INSTALL. Return value 3.
TERMINATE:
Windows Installer result: 1603 (INSTALL_FAILURE)

Of note:

Msi.cs, the class containing a set of .NET platform invoke definitions for interoperating with the native MSI.dll that exposes Windows Installer APIs. The collection of functions and constants in this file is not comprehensive, but it covers enough functionality to get simple scenarios working. Most of the definitions are straightforward, and all of them have XML documentation comments (via MSDN) explaining their purpose. For convenience, many of the relevant Windows error codes from winerror.h are exposed by the ERROR enumeration.
ManagedMsiExec.cs, the sample application itself which works by calling the relevant Msi APIs in the right order. Conveniently, a complete install can be done in as few as three calls: MsiOpenPackage, MsiDoAction, and MsiCloseHandle (with MsiSetProperty an optional fourth for uninstall or customization). To provide a better command-line experience, the default UI for status reporting is customized via MsiSetInternalUI, MsiSetExternalUI, and MsiSetExternalUIRecord. Implementing the handler for a MsiSetExternalUI callback is easy because it is passed pre-formatted strings; parsing record structures in the handler for the MsiSetExternalUIRecord callback requires a few more API calls (and a closer reading of the documentation!).

Note: When providing delegates to the MsiSetExternalUI and MsiSetExternalUIRecord methods, it's important to ensure they won't be garbage collected while still in use (or else the application will crash!). Because the managed instances are handed off to native code and have a lifetime that extends beyond the function call itself, it's necessary to maintain a reference for the life of the application (or until the callback is unregistered). A common technique for maintaining such a reference is the GC.KeepAlive method (though ManagedMsiExec simply stores its delegate references in a static member for the same effect). Conveniently, the callbackOnCollectedDelegate managed debugging assistant can help identify lifetime problems during debugging

[Click here to download a pre-compiled executable along with the complete source code for the ManagedMsiExec sample.]

The introduction of Windows Installer helped unify the way applications install on Windows and encapsulates a great deal of complexity that installers previously needed to deal with. The openness and comprehensiveness of the Windows Installer native API makes it easy for applications to orchestrate installs directly, and the power of .NET's interoperability support makes it simple for managed applications to do the same.

The Msi class I'm sharing here should allow developers to get started with the Windows Installer API a bit more simply - as the ManagedMsiExec sample demonstrates. The wrapper class and the sample are both pretty straightforward - I hope they're useful, informative, or at least somewhat interesting! :)

Tags: Technical Utilities

"You are now up to date." [IfModifiedSinceWebClient makes it easy to keep a local file in sync with online content - and do so asynchronously]

Tuesday, October 18, 2011

Recently, I was working with a program that needs to cache the content of a network file on the local disk for faster, easier access. The network file is expected to change periodically according to an unpredictable schedule; the objective is to keep the local file in "close" sync with the online copy without a lot of cost or complex protocols.

The network content is static (when it's not changing!), so hosting it on a web server and accessing it via HTTP seems like a reasonable start. Implementation-wise, it turns out the file is needed immediately when accessed, so downloading it on-demand is not an option due to the possibility of lengthy network delays or connectivity issues. Fortunately, it's okay to use an out-of-date version as long as there's a good chance the next access will use the latest one. So a reasonable approach seems to be to wait for the local file to be needed, use it immediately in its current state, then update it to the latest version by asynchronously downloading the network file and replacing the local copy in the background.

That approach satisfies the requirements of the scenario, but it's a little wasteful because the local file is likely to be used much more frequently than the online version gets updated - which means most of the downloads will end up being the same version that's already cached locally! Fortunately, we can improve things easily: the HTTP specification defines the If-Modified-Since header for exactly this purpose! By including that header with the HTTP request, the server "knows" whether the local file is out of date. If so, it returns the data for the network file as usual - but if the network file has not changed more recently, the web server returns a 304 Not Modified result and no content. This "short circuiting" of the HTTP response eliminates the need to send redundant data and reduces the network traffic to a single, short HTTP request/response pair.

When implementing a solution like this with the .NET Framework, two approaches spring to mind. The first is to use the low-level HttpWebRequest/HttpWebResponse classes and manage the entire operation directly. HttpWebRequest has an IfModifiedSince property that can be used to set the relevant HTTP header (and format it correctly), so this approach is straightforward and quite flexibile. However, it also requires the caller to manage the transfer of bits from the source to the destination (including reading, writing, buffering, etc.), and that's not really code we want to write. The second approach is to use something like the higher-level WebClient class's DownloadFileAsync method to do the entire download and call us back when everything has been taken care of. That seems preferable, so lets go ahead and set the WebClient.IfModifiedSince property and... umm... wait... WebClient doesn't have an IfModifiedSince property! And not only doesn't the property exist, you're not allowed to set it manually via the Headers property: "In addition, some other headers are also restricted when using a WebClient object. These restricted headers include, but are not limited to the following: ... If-Modified-Since ...".

Darn, I really wanted to use WebClient and avoid having to encode the If-Modified-Since header myself. If only it were possible to tweak the way WebClient initializes its underlying HttpWebRequest, we'd be set... Hey, what about the WebClient.GetWebRequest method? Isn't this exactly what it's for? Yes, it is! :)

To make this all work, I created IfModifiedSinceWebClient which is a WebClient subclass that adds an IfModifiedSince property and overrides GetWebRequest to set that DateTime value onto the underlying HttpWebRequest. Unfortunately, there are two issues: the destination file gets deleted before the download starts (so it ends up being 0 bytes when HTTP 304 is returned) and HTTP 304 is defined as a failure code, so WebClient thinks a successful (NOOP) download has failed. To address both issues and offer a seamless experience, IfModifiedSinceWebClient exposes a custom UpdateFileIfNewer method that's asynchronous (i.e., "fire and forget") and simple. Just pass it the path to a local file to create/update and a URI for the remote file. (You can optionally pass a "completed" method to be called with the result of the asynchronous update.) UpdateFileIfNewer sets the If-Modified-Since header and initiates a call to DownloadFileAsync, providing a temporary file path. If the remote file is not newer than the local one, no transfer occurs and the UpToDate result is passed to the completion method. If the remote file is newer (or the server doesn't support If-Modified-Since), the local file will be replaced with the just-downloaded copy and the Updated result will be returned. And if something goes wrong, the local file is left alone and the Error result is used.

Here's what a typical call looks like:

private void MyMethod()
{
    // ...

    var localFile = "LocalFile.txt";
    var uri = new Uri("http://example.com/NetworkFile.txt");
    IfModifiedSinceWebClient.UpdateFileIfNewer(localFile, uri, UpdateFileIfNewerCompleted);
    // Note: Download occurs asynchronously

    // ...
}

private void UpdateFileIfNewerCompleted(IfModifiedSinceWebClient.UpdateFileIfNewerResult result)
{
    switch (result)
    {
        case IfModifiedSinceWebClient.UpdateFileIfNewerResult.Updated:
            // ...
            break;
        case IfModifiedSinceWebClient.UpdateFileIfNewerResult.UpToDate:
            // ...
            break;
        case IfModifiedSinceWebClient.UpdateFileIfNewerResult.Error:
            // ...
            break;
    }
}

[Click here to download the source code for IfModifiedSinceWebClient and the sample application shown at the start of the post.]

(Don't forget to update the sample's localhost test URI to a valid URI for your environment.)

IfModifiedSinceWebClient is a simple subclass that adds If-Modified-Since functionality to the .NET Framework's WebClient. But that's only half the battle - the UpdateFileIfNewer method makes the "asynchronously update a file if necessary" scenario work by building on that with a temporary file, error detection, and result codes. The result is a seamless, unobtrusive way for an application to keep itself up to date with dynamically changing online content without incurring unnecessary network overhead. Although there are other, more sophisticated solutions to this problem, it's hard to beat the simplicity and compactness of IfModifiedSinceWebClient!

Aside: For bonus points, IfModifiedSinceWebClient could also set the local file's "last modified" time to the Last-Modified value returned by the server. I haven't done so in the sample because it doesn't seem like the subtle time skew (between the client and server clocks) will matter in most cases. However, I reserve the right to change my mind if practical experience contradicts me. :)

Tags: Technical

Don't make the audience sit through all your typo-ing [Overview of implementing an extension for WebMatrix - and the complete source code for the Snippets sample]

Wednesday, October 5, 2011

When I wrote about WebMatrix's new extensibility features a couple of posts back, I said I'd share the complete source code to Snippets, one of the sample extensions in the gallery. In this post, I'll be doing that along with explaining the overall extension model in a bit more detail. To begin with, here's the code so those of you who want to follow along at home can do so:

[Click here to download the complete source code for the Snippets extension]

Background

I created the Snippets extension by starting from the "WebMatrix Extension" Visual Studio Project template I wrote about previously. When expanded, the project template automatically set up the right project references (i.e., Microsoft.WebMatrix.Extensibility.dll) and build steps and created a functioning extension with a single Ribbon button. From there, creating Snippets was a simple matter of adding the functionality I wanted on top of that foundation. But more on that later - I want to go over the default code first.

The default project template extension

One key thing to notice is that the template-generated WebMatrixExtension class derives from ExtensionBase, a base class which implements WebMatrix's IExtension interface and simplifies some of the work of dealing with it. In the interest of generality (and because it's just an interface), IExtension doesn't do much beyond identifying a few required properties. ExtensionBase builds on that to offer concrete collections for the IEnumerable(T) properties, automatically MEF Imports the IWebMatrixHost interface, creates an OnWebMatrixHostChanged override, and so on. Of course, none of this is rocket science and the decision to use ExtensionBase is completely up to you. But the whole reason it exists to make your life easier, so I'd suggest at least giving it a chance. :)

In order for an extension to be loaded by WebMatrix, it needs to be located in the right directory (more on that in the previous post) and it needs to MEF Export the IExtension interface. You might think that ExtensionBase should do the latter for you, but it doesn't because that might restrict your own class hierarchy (i.e., intermediary classes would advertise themselves as extensions even though they're not). Therefore, subclasses like the template-generated WebMatrixExtension class need to explicitly export IExtension.

With the groundwork out of the way, the basic functionality of the example extension is to add a Ribbon button and handle activation of that button to open the current web site in the browser. Providing content for the Ribbon is as easy as adding instances implementing the relevant interfaces (IRibbonButton, IRibbonMenuButton, IRibbonGroup, etc.) to the generated class's RibbonItemsCollection. It's important to do this exactly once (typically in the extension's constructor) because subsequent changes are not honored (FYI, that may change in the future, but please don't count on it). Of course, you can show and hide Ribbon content whenever you wish; you just can't add or remove items after initialization. Again, there are simple, concrete subclasses for each of the relevant interfaces (IRibbonButton->RibbonButton, etc.) so you don't need to spend time implementing these simple things yourself. And just like before, using the "helper implementations" is completely optional.

Creating a RibbonButton requires a label, an ICommand implementation, and (optionally) a small/large image. The template's sample includes a couple of images already configured properly to provide a working example. Wiring up the images correctly is standard WPF, but the pack URI syntax can be a little tricky and it's common to forget to change the build type of the image files to "Resource" - so that's already been done for you as a helpful reminder. :) For its ICommand implementation, the template sample uses a simple DelegateCommand class (also included). The sample DelegateCommand is very much in line with other implementations of DelegateCommand or RelayCommand. (Feel free to use whatever version you'd like; the sample DelegateCommand exists simply to avoid introducing a dependency on a third-party library.) As you'd expect, the ICommand's CanExecute and CanExecuteChanged methods are used to dynamically enable/disable the button and its Execute method is called when the button is clicked.

Yeah, it takes a while to describe what's going on, but there's hardly any code at all! :)

The Snippets extension

With the foundation behind us, it's time to consider the Snippets extension itself - and for that, it's helpful to know what it does. Snippets was created with the typical demo scenario in mind: a presenter is showing off WebMatrix and wants to add a block of code to a document, but doesn't want to type it out in front of the audience because that can be slow and boring. Instead, he or she clicks on the Snippets button in the Ribbon, selects from a list of available snippets, and the relevant text is automatically inserted in the editor. Of course, individual snippets should be easy for the user to add or modify and they should allow small and large amounts of text as well as blank lines, etc..

There are probably a hundred ways you could build this extension; here's how I've done it:

Each snippet is stored as a text file (ex: "Empty DIV.txt") in the Snippets folder of the user's Documents folder (i.e., %USERPROFILE%\Documents\Snippets). The file's name identifies the snippet and its contents get added when the snippet is used. You can have as many or few snippet files as you'd like; they're read when the extension is loaded and cached. If there aren't any snippet files (or the Snippets folder doesn't exist), the extension provides a simple message with instructions for how to set things up.

Aside: WebMatrix needs to be restarted in order for snippet changes to take effect. An obvious improvement would be for the Snippets extension to monitor the Snippets directory for changes and apply them on the fly.
The Snippets user interface is a RibbonMenuButton which contains a collection of RibbonButton instances corresponding to each of the available snippets. When the RibbonMenuButton is clicked, it automatically shows the IRibbonButton instances from its Itemscollection in a small, drop-down menu. When a selection is made, the menu is automatically closed.

Aside: A RibbonSplitButton could have been used if there was a scenario where the user could click the top half of the button to perform a similar action (like inserting a default snippet).
The Snippets button only makes sense when the "Files" workspace is active (i.e., a document is being edited), so the extension listens to the IWebMatrixHost.WorkspaceChanged event and shows/hides its Ribbon button according to whether the new workspace is an instance of the IEditorWorkspace interface.

Aside: One of the bits of feedback we've gotten so far is that this scenario (i.e., "only available for a single workspace") is common and should be simplified. Yep, message received. :)
To insert text into the document, Snippets invokes the Paste command via the IWebMatrixHost.HostCommands property. While it's possible to get and set editor text directly for more advanced scenarios, the Paste command works nicely because the editor automatically updates the caret position, replaces selected text, etc.. The downside to using the (application-wide) Paste command is that if the input focus isn't in the body of an open file, then the paste action will be directed elsewhere and won't work as it's meant to.

Aside: The editor interfaces are almost rich enough to manage everything here and avoid using Paste. However, there were a couple of glitches when I tried which led me to use the simpler paste approach for the sample.
Input focus aside, one thing that's clear is that Snippets can never be added without an open document visible. To determine if that's the case, Snippets uses an unnamed host command to populate an instance of IEditorContainer. If that fails or the IEditorExt within is null, that means no document is open and Snippets knows to disable its Ribbon buttons.

Unnamed host commands work just like named commands (i.e., Paste), though they're a little harder to get to. The HostCommands.GetCommand method exists for this purpose and allows the caller to pass a GUID/ID for the command to return. The relevant code looks like this:
```
/// <summary>
/// Gets an IEditorExt instance if the editor is in use.
/// </summary>
/// <returns>IEditorExt reference.</returns>
private IEditorExt GetEditorExt()
{
    var editorContainer = new EditorContainer();
    _editorContainerCommand = WebMatrixHost.HostCommands.GetCommand(new Guid("27a0f541-c86c-4f0b-b436-0b50bf9f7ef8"), 10);
    if (_editorContainerCommand != null)
    {
        _editorContainerCommand.Execute(editorContainer);
    }

    return editorContainer.Editor;
}
```
Helper properties for this and other unnamed commands be added in the future. For now, FYI about a useful one. :)

Although the RibbonButton class supports a parameter parameter for passing to the ICommand's CanExecute/Execute methods to provide additional context, this value is not actually passed through in some cases. :( This bug was found too late to fix for the Beta, but the good news is that it's easy to work around by creating a closure to capture the relevant parameter information instead. If you're not familiar with this technique, it involves defining an anonymous method that references the desired data; the compiler automatically captures the necessary values and passes them along when the delegate gets called.

Here's what it looks like in the sample (using LINQ to create the collection):

// Create buttons for each snippet
var snippetsButtons = _snippets
    .Select(s =>
        new RibbonButton(
            s.Key,
            new DelegateCommand(
                (parameter) => GetEditorExt() != null,
                (parameter) =>
                {
                    // parameter is (incorrectly) null, so add this indirection for now
                    HandleSnippetInvoke(s.Value);
                }),
            s.Value,
            _snippetsImageSmall,
            _snippetsImageLarge));

It turns out there's a subtle bug because of the following code in the sample:
```
// Paste the snippet into the editor
var paste = WebMatrixHost.HostCommands.Paste;
if (paste.CanExecute(insertText))
{
    paste.Execute(insertText);
}
```
Note that Snippets is invoking a special flavor of the Paste command by providing the text as the parameter property for the CanExecute/Execute methods (meaning "paste this text, please"). However, the underlying editor code is returning a CanExecute result based on whether or not it could paste from the clipboard (i.e., it's not honoring the meaning of the text parameter). Therefore, if the clipboard is empty or contains non-text data like a file, CanExecute returns false and Snippets isn't able to insert text.

The easy workaround is to copy some text to the clipboard so the underlying implementation will return true for CanExecute and the specialized paste operation will be invoked.

Summary

Snippets is a small extension that builds on the default WebMatrix Extension project template to implement some useful (if limited) functionality. Its original purpose was to simplify demos, but if people find practical uses for it, that's great, too! :)

But whether or not people use it, the Snippets extension touches on enough interesting areas of extensibility that people can probably learn from it. If you're getting started with WebMatrix extensions and are looking for a "real world" sample, I hope Snippets can be helpful. If you have feedback or questions - about Snippets or more generally - please let me know!

Tags: Technical Web Platform

TextAnalysisTool.NET to Windows 8: "Y U no run me under .NET 4?" [How to avoid the "please install .NET 3.5" dialog when running older .NET applications on the Windows 8 Developer Preview]

Friday, September 23, 2011

I wrote TextAnalysisTool.NET a number of years ago to streamline the task of analyzing large log files by creating an interactive experience that combines searching, filtering, and tagging and allow the user to quickly identify interesting portions of a large log file. Although .NET 2.0 was out at the time, I targeted .NET 1.1 because it was more widely available, coming pre-installed on Windows Server 2003 (the "latest and greatest" OS then). In the years since, I've heard from folks around the world running TextAnalysisTool.NET on subsequent Windows operating systems and .NET Framework versions. Because Windows and the .NET Framework do a great job maintaining backwards compatibility, the same TextAnalysisTool.NET binary has continued to work as-is for the near-decade since its initial release.

But the story changes with Windows 8! Although Windows 8 has the new .NET 4.5 pre-installed (including .NET 4.0 upon which it's based), it does not include .NET 3.5 (and therefore .NET 2.0). While I can imagine some very sensible reasons for the Windows team to take this approach, it's inconvenient for existing applications because .NET 4 in this scenario does not automatically run applications targeting an earlier framework version. What is cool is that the public Windows 8 Developer Preview detects when an older .NET application is run and automatically prompts the user to install .NET 3.5:

That's pretty slick and is a really nice way to bridge the gap. However, it's still kind of annoying for users as it may not always be practical for them to perform a multi-megabyte download the first time they try to run an older .NET program. So it would be nice if there were an easy way for older .NET applications to opt into running under the .NET 4 framework that's already present on Windows 8...

And there is! :) One aspect of .NET's "side-by-side" support involves using .config files to specify which .NET versions an application is known to work with. (For more information, please refer to the MSDN article How to: Use an Application Configuration File to Target a .NET Framework Version.) Consequently, improving the Windows 8 experience for TextAnalisisTool.NET should be as easy as creating a suitable TextAnalysisTool.NET.exe.config file in the same folder as TextAnalysisTool.NET.exe.

Specifically, the following should do the trick:

<?xml version="1.0"?>
<configuration>
  <startup>
    <supportedRuntime version="v1.1.4322"/>
    <supportedRuntime version="v2.0.50727"/>
    <supportedRuntime version="v4.0"/>
  </startup>
</configuration>

And it does! :) With that TextAnalysisTool.NET.exe.config file in place, TextAnalysisTool.NET runs on a clean install of the Windows 8 Developer Preview as-is and without prompting the user to install .NET 3.5. I've updated the download ZIP to include this file so new users will automatically benefit; existing users should drop TextAnalysisTool.NET.exe.config in the right place, and they'll be set as well!

Aside: Although this trick will work in many cases, it isn't guaranteed to work. In particular, if there has been a breaking change in .NET 4, then attempting to run a pre-.NET 4 application in this manner might fail. Therefore, it's prudent to do some verification when trying a change like this!

[Click here to download a ZIP file containing TextAnalysisTool.NET, the relevant .config file, its documentation, and a ReadMe.]

TextAnalysisTool.NET has proven to be extremely popular with support engineers and it's always nice to hear from new users. I hope today's post extends the usefulness of TextAnalysisTool.NET by making the Windows 8 experience as seamless as people have come to expect!

Tags: Technical TextAnalysisTool Utilities

Preprocessor? .NET don't need no stinkin' preprocessor! [DebugEx.Assert provides meaningful assertion failure messages automagically!]

Thursday, July 28, 2011

If you use .NET's Debug.Assert method, you probably write code to validate assumptions like so:

Debug.Assert(args == null);

If the expression above evaluates to false at run-time, .NET immediately halts the program and informs you of the problem:

The stack trace can be really helpful (and it's cool you can attach a debugger right then!), but it would be even more helpful if the message told you something about the faulty assumption... Fortunately, there's an overload of Debug.Assert that lets you provide a message:

Debug.Assert(args == null, "args should be null.");

The failure dialog now looks like this:

That's a much nicer experience - especially when there are lots of calls to Assert and you're comfortable ignoring some of them from time to time (for example, if a network request failed and you know there's no connection). Some code analysis tools (notably StyleCop) go as far as to flag a warning for any call to Assert that doesn't provide a custom message.

At first, adding a message to every Assert seems like it ought to be pure goodness, but there turn out to be some drawbacks in practice:

The comment is often a direct restatement of the code - especially for very simple conditions. Redundant redundancy is redundant, and when I see messages like that, I'm reminded of code comments like this:
```
i++; // Increment i
```
It takes time and energy to type out all those custom messages, and the irony is that most of them will never be seen at all!

Because the code for the condition is often expressive enough as-is, it would be nice if Assert automatically used the code as the message!

Aside: This is hardly a new idea; C developers have been doing this for years by leveraging macro magic in the preprocessor to create a string from the text of the condition.

Further aside: This isn't even a new idea for .NET; I found a couple places on the web where people ask how to do this. And though nobody I saw seemed to have done quite what I show here, I'm sure there are other examples of this technique "in the wild".

As it happens, displaying the code for a condition can be accomplished fairly easily in .NET without introducing a preprocessor! However, it requires that calls to Assert be made slightly differently so as to defer execution of the condition. In a normal call to Assert, the expression passed to the condition parameter is completely evaluated before being checked. But by changing the type of the condition parameter from bool to Func<bool> and then wrapping it in the magic Expression<Func<bool>>, we're able to pass nearly complete information about the expression into the Assert method where it can be used to recreate the original source code at run-time!

To make this a little more concrete, the original "message-less" call I showed at the beginning of the post can be trivially changed to:

DebugEx.Assert(() => args == null);

And the DebugEx.Assert method I've written will automatically provide a meaningful message (by calling the real Debug.Assert and passing the condition and a message):

The message above is identical to the original code - but maybe that's because it's so simple... Let's try something more complex:

DebugEx.Assert(() => args.Select(a => a.Length).Sum() == 10);

Becomes:

Assertion Failed: (args.Select(a => a.Length).Sum() == 10)

Wow, amazing! So is it always perfect? Unfortunately, no:

DebugEx.Assert(() => args.Length == 5);

Becomes:

Assertion Failed: (ArrayLength(args) == 5)

The translation of the code to an expression tree and back seems to have lost a little fidelity along the way; the compiler translated the Length access into an expression tree that doesn't map back to code exactly the same. Similarly:

DebugEx.Assert(() => 5 + 3 + 2 >= 100);

Becomes:

Assertion Failed: False

In this case, the compiler evaluated the constant expression at compile time (it's constant, after all!), and the information about which numbers were used in the computation was lost.

Yep, the loss of fidelity in some cases is a bit of a shame, but I'll assert (ha ha!) that nearly all the original intent is preserved and that it's still quite easy to determine the nature of the failing code without having to provide a message. And of course, you can always switch an ambiguous DebugEx.Assert back to a normal Assert and provide a message parameter whenever you want. :)

[Click here to download the source code for DebugEx.Assert and the sample application used for the examples above.]

DebugEx.Assert was a fun experiment and a great introduction to .NET's powerful expression infrastructure. DebugEx.Assert is a nearly-direct replacement for Debug.Assert and (similarly) applies only when DEBUG is defined, so it costs nothing in release builds. It's worth noting there will be a bit of extra overhead due to the lambda, but it should be negligible - especially when compared to the time you'll save by not having to type out a bunch of unnecessary messages!

If you're getting tired of typing the same code twice, maybe DebugEx.Assert can help! :)

Notes:

The code for DebugEx.Assert turned out to be simple because nearly all the work is done by the Expression(T) class. The one bit of trickiness stems from the fact that in order to create a lambda to pass as the Func(T), the compiler creates a closure which introduces an additional class (though they're never exposed to the developer). Therefore, even simple statements like the original example become kind of hard to read: Assertion Failed: (value(Program+<>c__DisplayClass0).args == null).

To avoid that problem, I created an ExpressionVisitor subclass to rewrite the expression tree on the fly, getting rid of the references to such extra classes along the way. What I've done with SimplifyingExpressionVisitor is simple, but seems to work nicely for the things I've tried. However, if you find scenarios where it doesn't work as well, I'd love to know so I can handle them too!

Tags: Technical

Use it or lose it, part deux [New Delay.FxCop code analysis rule helps identify uncalled public or private methods and properties in a .NET assembly]

Thursday, July 14, 2011

Previous posts introduced the Delay.FxCop custom code analysis assembly and demonstrated the benefits of automated code analysis for easily identifying problem areas in an assembly. The Delay.FxCop project included two rules, DF1000: Check spelling of all string literals and DF1001: Resources should be referenced - today I'm introducing another! The new rule follows in the footsteps of DF1001 by identifying unused parts of an assembly that can be removed to save space and reduce complexity. But while DF1001 operated on resources, today's DF1002: Uncalled methods should be removed analyzes the methods and properties of an assembly to help find those stale bits of code that aren't being used any more.

Note: If this functionality seems familiar, it's because CA1811: Avoid uncalled private code is one of the standard FxCop rules. I've always been a big fan of CA1811, but frequently wished it could look beyond just private code to consider all code. Of course, limiting the scope of the "in-box" rule makes perfect sense from an FxCop point of view: you don't want the default rules to be noisy or else they'll get turned off and ignored. But the Delay.FxCop assembly isn't subject to the same restrictions, so I thought it would be neat to experiment with an implementation that analyzed all of an assembly's code.

Further note: One of the downsides of this increased scope is that DF1002 can't distinguish between methods that are part of a library's public API and those that are accidentally unused. As far as DF1002 is concerned, they're both examples of code that's not called from within the assembly. Therefore, running this rule on a library involves some extra overhead to suppress the warnings for public APIs. If it's just a little extra work, maybe it's still worthwhile - but if it's overwhelming, you can always disable DF1002 for library assemblies and restrict it to applications where it's more relevant.

Implementation-wise, DF1002: Uncalled methods should be removed isn't all that different from its predecessors - in fact, it extends and reuses the same assembly node enumeration helper introduced with DF1001. During analysis, every method of the assembly is visited and if it isn't "used" (more on this in a moment), a code analysis warning is output:

DF1002 : Performance : The method 'SilverlightApplication.MainPage.UnusedPublicMethod' does not appear to be used in code.

Of course, these warnings can be suppressed in the usual manner:

[assembly: SuppressMessage("Usage", "DF1001:ResourcesShouldBeReferenced",
           MessageId = "app.xaml", Scope = "resource", Target = "SilverlightApplication.g.resources",
           Justification = "Loaded by Silverlight for App.xaml.")]

It's interesting to consider what it means for a method or a property to be "used"... (Internally, properties are implemented as a pair of get/set methods.) Clearly, a direct call to a method means it's used - but that logic alone results in a lot of false positives! For example, a class implementing an interface must define all the relevant interface methods in order to compile successfully. Therefore, explicit and implicit interface method implementations (even if uncalled) do not result in a DF1002 warning. Similarly, a method override may not be directly called within an assembly, but can still be executed and should not trigger a warning. Other kinds of "unused" methods that do not result in a warning include: static constructors, assembly entry-points, and methods passed as parameters (ex: to a delegate for use by an event).

With all those special cases, you might think nothing would ever be misdiagnosed. :) But there's a particular scenario that leads to many DF1002 warnings in a perfectly correct application: reflection-based access to properties and methods. Granted, reflection is rare at the application level - but at the framework level, it forms the very foundation of data binding as implemented by WPF and Silverlight! Therefore, running DF1002 against a XAML application with data binding can result in warnings for the property getters on all model classes...

To avoid that problem, I've considered whether it would make sense to suppress DF1002 for classes that implement INotifyPropertyChanged (which most model classes do), but it seems like that would also mask a bunch of legitimate errors. The same reasoning applies to subclasses of DependencyObject or implementations of DependencyProperty (though the latter might turn out to be a decent heuristic with a bit more work). Another approach might be for the rule to also parse the XAML in an assembly and identify the various forms of data binding within. That seems promising, but goes way beyond the initial scope of DF1002! :)

Of course, there may be other common patterns which generate false positives - please let me know if you find one and I'll look at whether I can improve things for the next release.

[Click here to download the Delay.FxCop rule assembly, associated .ruleset files, samples, and the complete source code.]

For directions about running Delay.FxCop on a standalone assembly or integrating it into a project, please refer to the steps in my original post.

Unused code is an unnecessary tax on the development process. It's a distraction when reading, incurs additional costs during coding (ex: when refactoring), and it can mislead others about how an application really works. That's why there's DF1002: Uncalled methods should be removed - to help you easily identify unused methods. Try running it on your favorite .NET application; you might be surprised by what you find! :)

Tags: Technical

Use it or lose it! [New Delay.FxCop code analysis rule helps identify unused resources in a .NET assembly]

Wednesday, June 29, 2011

My previous post outlined the benefits of automated code analysis and introduced the Delay.FxCop custom code analysis assembly. The initial release of Delay.FxCop included only one rule, DF1000: Check spelling of all string literals, which didn't seem like enough to me, so today's update doubles the number of rules! :) The new rule is DF1001: Resources should be referenced - but before getting into that I'm going to spend a moment more on spell-checking...

What I planned to write for the second code analysis rule was something to check the spelling of .NET string resources (i.e., strings from a RESX file). This seemed like another place misspellings might occur and I'd heard of other custom rules that performed this same task (for example, here's a sample by Jason Kresowaty). However, in the process of doing research, I discovered rule CA1703: Resource strings should be spelled correctly which is part of the default set of rules!

To make sure it did what I expected, I started a new application, added a misspelled string resource, and ran code analysis. To my surprise, the misspelling was not detected... However, I noticed a different warning that seemed related: CA1824: Mark assemblies with NeutralResourcesLanguageAttribute "Because assembly 'Application.exe' contains a ResX-based resource file, mark it with the NeutralResourcesLanguage attribute, specifying the language of the resources within the assembly." Sure enough, when I un-commented the (project template-provided) NeutralResourcesLanguage line in AssemblyInfo.cs, the desired warning showed up:

CA1703 : Microsoft.Naming : In resource 'WpfApplication.Properties.Resources.resx', referenced by name
'SampleResource', correct the spelling of 'mispelling' in string value 'This string has a mispelling.'.

In my experience, a some people suppress CA1824 instead of addressing it. But as we've just discovered, they're also giving up on free spell checking for their assembly's string resources. That seems silly, so I recommend setting NeutralResourcesLanguageAttribute for its helpful side-effects!

Note: For expository purposes, I've included an example in the download: CA1703 : Microsoft.Naming : In resource 'WpfApplication.Properties.Resources.resx', referenced by name 'IncorrectSpelling', correct the spelling of 'mispelling' in string value 'This string has a single mispelling.'.

Once I realized resource spell checking was unnecessary, I decided to focus on a different pet peeve of mine: unused resources in an assembly. In much the same way stale chunks of unused code can be found in most applications, it's pretty common to find resources that aren't referenced and are just taking up valuable space. But while there's a built-in rule to detect certain kinds of uncalled code (CA1811: Avoid uncalled private code), I'm not aware of anything similar for resources... And though it's possible to perform this check manually (by searching for the use of each individual resource), this is the kind of boring, monotonous task that computers are made for! :)

Therefore, I've created the second Delay.FxCop rule, DF1001: Resources should be referenced, which compares the set of resource references in an assembly with the set of resources that are actually present. Any cases where a resource exists (whether it's a string, stream, or object), but is not referenced in code will result in an instance of the DF1001 warning during code analysis.

Aside: For directions about how to run the Delay.FxCop rules on a standalone assembly or integrate them into a project, please refer to the steps in my original post.

As a word of caution, there can be cases where DF1001 reports that a resource isn't referenced from code, but that resource is actually used by an assembly. While I don't think it will miss typical uses from code (either via the automatically-generated Resources class or one of the lower-level ResourceManager methods), the catch is that not all resource references show up in code! For example, the markup for a Silverlight or WPF application is included as a XAML/BAML resource which is loaded at run-time without an explicit reference from the assembly itself. DF1001 will (correctly; sort of) report this resource as unused, so please remember that global code analysis suppressions can be used to squelch false-positives:

[assembly: SuppressMessage("Usage", "DF1001:ResourcesShouldBeReferenced", MessageId = "mainwindow.baml",
    Scope = "resource", Target = "WpfApplication.g.resources", Justification = "Loaded by WPF for MainWindow.xaml.")]

Aside: There are other ways to "fool" DF1001, such as by loading a resource from a different assembly or passing a variable to ResourceManager.GetString. But in terms of how things are done 95% of the time, the rule's current implementation should be accurate. Of course, if you find cases where it misreports unused resources, please let me know and I'll look into whether it's possible to improve things in a future release!

[Click here to download the Delay.FxCop rule assembly, associated .ruleset files, samples, and the complete source code.]

Stale references are an unnecessary annoyance: they bloat an assembly, waste time and money (for example, when localized unnecessarily), confuse new developers, and generally just get in the way. Fortunately, detecting them in an automated fashion is easy with DF1001: Resources should be referenced! After making sure unused resources really are unused, remove them from your project - and enjoy the benefits of a smaller, leaner application!

Tags: Technical

Speling misteaks make an aplikation look sily [New Delay.FxCop code analysis rule finds spelling errors in a .NET assembly's string literals]

Thursday, June 16, 2011

No matter how polished the appearance of an application, web site, or advertisement is, the presence of even a single spelling error can make it look sloppy and unprofessional. The bad news is that spelling errors are incredibly easy to make - either due to mistyping or because one forgot which of the many, conflicting special cases applies in a particular circumstance. The good news is that technology to detect and correct spelling errors exists and is readily available. By making regular use of a spell-checker, you don't have to be a good speller to look like one. Trust me! ;)

Spell-checking of documents is pretty well covered these days, with all the popular word processors offering automated, interactive assistance. However, spell-checking of code is not quite so far along - even high-end editors like Visual Studio don't tend to offer interactive spell-checking support. Fortunately, it's possible - even easy! - to augment the capabilities of many development tools to integrate spell-checking into the development workflow. There are a few different ways of doing this: one is to incorporate the checking into the editing experience (like this plugin by coworker Mikhail Arkhipov) and another is to do the checking as part of the code analysis workflow (like code analysis rule CA1703: ResourceStringsShouldBeSpelledCorrectly). I'd already been toying with the idea of implementing my own code analysis rules, so I decided to experiment with the latter approach...

Aside: If you're not familiar with Visual Studio's code analysis feature, I highly recommend the MSDN article Analyzing Managed Code Quality by Using Code Analysis. Although the fully integrated experience is only available on higher-end Visual Studio products, the same exact code analysis functionality is available to everyone with the standalone FxCop tool which is free as part of the Microsoft Windows SDK for Windows 7 and .NET Framework 4. (FxCop has a dedicated download page with handy links, but it directs you to the SDK to do the actual install.)

Unrelated aside: In the ideal world, all of an application's strings would probably come from a resource file where they can be easily translated to other languages - and therefore string literals wouldn't need spell-checking. However, in the real world, there are often cases where user-visible text ends up in string literals (ex: exception messages) and therefore a rule like this seems to have practical value. If the string resources of your application are already perfectly separated, congratulations! However, if your application doesn't use resources (or uses them incompletely!), please continue reading... :)

As you might expect, it's possible to create custom code analysis rules and easily integrate them into your build environment; a great walk-through can be found on the Code Analysis Team Blog. If you still have questions after reading that, this post by Tatham Oddie is also quite good. And once you have an idea what you're doing, this documentation by Jason Kresowaty is a great resource for technical information.

Code analysis is a powerful tool and has a lot of potential for improving the development process. But for now, I'm just going to discuss a single rule I created: DF1000: CheckSpellingOfAllStringLiterals. As its name suggests, this rule checks the spelling of all string literals in an assembly. To be clear, there are other rules that check spelling (including some of the default FxCop/Visual Studio ones), but I didn't see any that checked all the literals, so this seemed like an interesting place to start.

Aside: Programs tend to have a lot of strings and those strings aren't always words (ex: namespace prefixes, regular expressions, etc.). Therefore, this rule will almost certainly report a lot of warnings run for the first time! Be prepared for that - and be ready to spend some time suppressing warnings that don't matter to you. :)

As I typically do, I've published a pre-compiled binary and complete source code, so you can see exactly how CheckSpellingOfAllStringLiterals works (it's quite simple, really, as it uses the existing introspection and spell-checking APIs). I'm not going to spend a lot of time talking about how this rule is implemented, but I did want to show how to use it so others can experiment with their own projects.

Important: Everything I show here was done with the Visual Studio 2010/.NET 4 toolset. Past updates to the code analysis infrastructure are such that things may not work with older (or newer) releases.

To add the Delay.FxCop rules to a project, you'll want to know a little about rule sets - the MSDN article Using Rule Sets to Group Managed Code Analysis Rules is a good place to start. I've provided two .ruleset files in the download: Delay.FxCop.ruleset which contains just the custom rule I've written and AllRules_Delay.FxCop.ruleset which contains my custom rule and everything in the shipping "Microsoft All Rules" ruleset. (Of course, creating and using your own .ruleset is another option!) Incorporating a custom rule set into a Visual Studio project is as easy as: Project menu, ProjectName Properties..., Code Analysis tab, Run this rule set:, Browse..., specify the path to the custom rule set, Build menu, Run Code Analysis on ProjectName.

Note: For WPF projects, you may also want to uncheck Suppress results from generated code in the "Code Analysis" tab above because the XAML compiler adds GeneratedCodeAttribute to all classes with an associated .xaml file and that automatically suppresses code analysis warnings for those classes. (Silverlight and Windows Phone projects don't set this attribute, so the default "ignore" behavior is fine.)

Assuming your project contains a string literal that's not in the dictionary, the Error List window should show one or more warnings like this:

DF1000 : Spelling : The word 'recieve' is not in the dictionary.

At this point, you have a few options (examples of which can be found in the TestProjects\ConsoleApplication directory of the sample):

Fix the misspelling.

Duh. :)
Suppress the instance.

If it's an isolated use of the word and is correct, then simply right-clicking the warning and choosing Suppress Message(s), In Source will add something like the following attribute to the code which will silence the warning:
```
[SuppressMessage("Spelling", "DF1000:CheckSpellingOfAllStringLiterals", MessageId = "leet")]
```
While you're at it, feel free to add a Justification message if the reason might not be obvious to someone else.
Suppress the entire method.

If a method contains no user-visible text, but has lots of strings that cause warnings, you can suppress the entire method by omitting the MessageId parameter like so:
```
[SuppressMessage("Spelling", "DF1000:CheckSpellingOfAllStringLiterals")]
```
Add the word to the custom dictionary.

If the "misspelled" word is correct and appears throughout the application, you'll probably want to add it to the project's custom dictionary which will silence all relevant warnings at once. MSDN has a great overview of the custom dictionary format as well as the exact steps to take to add a custom dictionary to a project in the article How to: Customize the Code Analysis Dictionary.

Alternatively, if you're a command-line junkie or don't want to modify your Visual Studio project, you can use FxCopCmd directly by running it from a Visual Studio Command Prompt like so:

C:\T>"C:\Program Files (x86)\Microsoft Visual Studio 10.0\Team Tools\Static Analysis Tools\FxCop\FxCopCmd.exe"
  /file:C:\T\ConsoleApplication\bin\Debug\ConsoleApplication.exe
  /ruleset:=C:\T\Delay.FxCop\Delay.FxCop.ruleset /console
Microsoft (R) FxCop Command-Line Tool, Version 10.0 (10.0.30319.1) X86
Copyright (C) Microsoft Corporation, All Rights Reserved.

[...]
Loaded delay.fxcop.dll...
Loaded ConsoleApplication.exe...
Initializing Introspection engine...
Analyzing...
Analysis Complete.
Writing 1 messages...

C:\T\ConsoleApplication\Program.cs(12,1) : warning  : DF1000 : Spelling : The word 'recieve' is not in the dictionary.
Done:00:00:01.4352025

Or else you can install the standalone FxCop tool to get the benefits of a graphical user interface without changing anything about your existing workflow!

[Click here to download the Delay.FxCop rule assembly, associated .ruleset files, samples, and the complete source code.]

Spelling is one of those things that's easy to get wrong - and also easy to get right if you apply the proper technology and discipline. I can't hope to make anyone a better speller ('i' before 'e', except after 'c'!), but I can help out a little on the technology front. I plan to add new code analysis rules to Delay.FxCop over time - but for now I hope people put DF1000: CheckSpellingOfAllStringLiterals to good use finding spelling mistakes in their applications!

Tags: Technical

Safe X (ml parsing with XLINQ) [XLinqExtensions helps make XML parsing with .NET's XLINQ a bit safer and easier]

Thursday, May 26, 2011

XLINQ (aka LINQ-to-XML) is a set of classes that make it simple to work with XML by exposing the element tree in a way that's easy to manipulate using standard LINQ queries. So, for example, it's trivial to write code to select specific nodes for reading, create well-formed XML fragments, or transform an entire document. Because of its query-oriented nature, XLINQ makes it easy to ignore parts of a document that aren't relevant: if you don't query for them, they don't show up! Because it's so handy and powerful, I encourage folks who aren't already familiar to find out more.

Aside: As usual, flexibility comes with a cost and it is often more efficient to read and write XML with the underlying XmlReader and XmlWriter classes because they don't expose the same high-level abstractions. However, I'll suggest that the extra productivity of developing with XLINQ will often outweigh the minor computational cost it incurs.

When I wrote the world's simplest RSS reader as a sample for my post on WebBrowserExtensions, I needed some code to parse the RSS feed for my blog and dashed off the simplest thing possible using XLINQ. Here's a simplified version of that RSS feed for reference:

<rss version="2.0">
  <channel>
    <title>Delay's Blog</title>
    <item>
      <title>First Post</title>
      <pubDate>Sat, 21 May 2011 13:00:00 GMT</pubDate>
      <description>Post description.</description>
    </item>
    <item>
      <title>Another Post</title>
      <pubDate>Sun, 22 May 2011 14:00:00 GMT</pubDate>
      <description>Another post description.</description>
    </item>
  </channel>
</rss>

The code I wrote at the time looked a lot like the following:

private static void NoChecking(XElement feedRoot)
{
    var version = feedRoot.Attribute("version").Value;
    var title = feedRoot.Element("channel").Element("title").Value;
    ShowFeed(version, title);
    foreach (var item in feedRoot.Element("channel").Elements("item"))
    {
        title = item.Element("title").Value;
        var publishDate = DateTime.Parse(item.Element("pubDate").Value);
        var description = item.Element("description").Value;
        ShowItem(title, publishDate, description);
    }
}

Not surprisingly, running it on the XML above leads to the following output:

Delay's Blog (RSS 2.0)
  First Post
    Date: 5/21/2011
    Characters: 17
  Another Post
    Date: 5/22/2011
    Characters: 25

That code is simple, easy to read, and obvious in its intent. However (as is typical for sample code tangential to the topic of interest), there's no error checking or handling of malformed data. If anything within the feed changes, it's quite likely the code I show above will throw an exception (for example: because the result of the Element method is null when the named element can't be found). And although I don't expect changes to the format of this RSS feed, I'd be wary of shipping code like that because it's so fragile.

Aside: Safely parsing external data is a challenging task; many exploits take advantage of parsing errors to corrupt a process's state. In the discussion here, I'm focusing mainly on "safety" in the sense of "resiliency": the ability of code to continue to work (or at least not throw an exception) despite changes to the format of the data it's dealing with. Naturally, more resilient parsing code is likely to be less vulnerable to hacking, too - but I'm not specifically concerned with making code hack-proof here.

Adding the necessary error-checking to get the above snippet into shape for real-world use isn't particularly hard - but it does add a lot more code. Consequently, readability suffers; although the following method performs exactly the same task, its implementation is decidedly harder to follow than the original:

private static void Checking(XElement feedRoot)
{
    var version = "";
    var versionAttribute = feedRoot.Attribute("version");
    if (null != versionAttribute)
    {
        version = versionAttribute.Value;
    }
    var channelElement = feedRoot.Element("channel");
    if (null != channelElement)
    {
        var title = "";
        var titleElement = channelElement.Element("title");
        if (null != titleElement)
        {
            title = titleElement.Value;
        }
        ShowFeed(version, title);
        foreach (var item in channelElement.Elements("item"))
        {
            title = "";
            titleElement = item.Element("title");
            if (null != titleElement)
            {
                title = titleElement.Value;
            }
            var publishDate = DateTime.MinValue;
            var pubDateElement = item.Element("pubDate");
            if (null != pubDateElement)
            {
                if (!DateTime.TryParse(pubDateElement.Value, out publishDate))
                {
                    publishDate = DateTime.MinValue;
                }
            }
            var description = "";
            var descriptionElement = item.Element("description");
            if (null != descriptionElement)
            {
                description = descriptionElement.Value;
            }
            ShowItem(title, publishDate, description);
        }
    }
}

It would be nice if we could somehow combine the two approaches to arrive at something that reads easily while also handling malformed content gracefully... And that's what the XLinqExtensions extension methods are all about!

Using the naming convention SafeGet* where "*" can be Element, Attribute, StringValue, or DateTimeValue, these methods are simple wrappers that avoid problems by always returning a valid object - even if they have to create an empty one themselves. In this manner, calls that are expected to return an XElement always do; calls that are expected to return a DateTime always do (with a user-provided fallback value for scenarios where the underlying string doesn't parse successfully). To be clear, there's no magic here - all the code is very simple - but by pushing error handling into the accessor methods, the overall experience feels much nicer.

To see what I mean, here's what the same code looks like after it has been changed to use XLinqExtensions - note how similar it looks to the original implementation that used the simple "write it the obvious way" approach:

private static void Safe(XElement feedRoot)
{
    var version = feedRoot.SafeGetAttribute("version").SafeGetStringValue();
    var title = feedRoot.SafeGetElement("channel").SafeGetElement("title").SafeGetStringValue();
    ShowFeed(version, title);
    foreach (var item in feedRoot.SafeGetElement("channel").Elements("item"))
    {
        title = item.SafeGetElement("title").SafeGetStringValue();
        var publishDate = item.SafeGetElement("pubDate").SafeGetDateTimeValue(DateTime.MinValue);
        var description = item.SafeGetElement("description").SafeGetStringValue();
        ShowItem(title, publishDate, description);
    }
}

Not only is the XLinqExtensions version almost as easy to read as the simple approach, it has all the resiliancy benefits of the complex one! What's not to like?? :)

[Click here to download the XLinqExtensions sample application containing everything shown here.]

I've found the XLinqExtensions approach helpful in my own projects because it enables me to parse XML with ease and peace of mind. The example I've provided here only scratches the surface of what's possible (ex: SafeGetIntegerValue, SafeGetUriValue, etc.), and is intended to set the stage for others to adopt a more robust approach to XML parsing. So if you find yourself parsing XML, please consider something similar!

PS - The complete set of XLinqExtensions methods I use in the sample is provided below. Implementation of additional methods to suit custom scenarios is left as an exercise to the reader. :)

/// <summary>
/// Class that exposes a variety of extension methods to make parsing XML with XLINQ easier and safer.
/// </summary>
static class XLinqExtensions
{
    /// <summary>
    /// Gets the named XElement child of the specified XElement.
    /// </summary>
    /// <param name="element">Specified element.</param>
    /// <param name="name">Name of the child.</param>
    /// <returns>XElement instance.</returns>
    public static XElement SafeGetElement(this XElement element, XName name)
    {
        Debug.Assert(null != element);
        Debug.Assert(null != name);
        return element.Element(name) ?? new XElement(name, "");
    }

    /// <summary>
    /// Gets the named XAttribute of the specified XElement.
    /// </summary>
    /// <param name="element">Specified element.</param>
    /// <param name="name">Name of the attribute.</param>
    /// <returns>XAttribute instance.</returns>
    public static XAttribute SafeGetAttribute(this XElement element, XName name)
    {
        Debug.Assert(null != element);
        Debug.Assert(null != name);
        return element.Attribute(name) ?? new XAttribute(name, "");
    }

    /// <summary>
    /// Gets the string value of the specified XElement.
    /// </summary>
    /// <param name="element">Specified element.</param>
    /// <returns>String value.</returns>
    public static string SafeGetStringValue(this XElement element)
    {
        Debug.Assert(null != element);
        return element.Value;
    }

    /// <summary>
    /// Gets the string value of the specified XAttribute.
    /// </summary>
    /// <param name="attribute">Specified attribute.</param>
    /// <returns>String value.</returns>
    public static string SafeGetStringValue(this XAttribute attribute)
    {
        Debug.Assert(null != attribute);
        return attribute.Value;
    }

    /// <summary>
    /// Gets the DateTime value of the specified XElement, falling back to a provided value in case of failure.
    /// </summary>
    /// <param name="element">Specified element.</param>
    /// <param name="fallback">Fallback value.</param>
    /// <returns>DateTime value.</returns>
    public static DateTime SafeGetDateTimeValue(this XElement element, DateTime fallback)
    {
        Debug.Assert(null != element);
        DateTime value;
        if (!DateTime.TryParse(element.Value, out value))
        {
            value = fallback;
        }
        return value;
    }
}

Tags: Technical

Images in a web page: meh... Images in a web page: cool! [Delay.Web.Helpers assembly now includes an ASP.NET web helper for data URIs (in addition to Amazon S3 blob/bucket support)]

Wednesday, April 6, 2011

The topic of "data URIs" came up on a discussion list I follow last week in the context of "I'm in the process of creating a page and have the bytes of an image from my database. Can I deliver them directly or must I go through a separate URL with WebImage?" And the response was that using a data URI would allow that page to deliver the image content inline. But while data URIs are fairly simple, there didn't seem to be a convenient way to use them from an ASP.NET MVC/Razor web page.

Which was kind of fortuitous for me because I've been interested in learning more about data URIs for a while and it seemed that creating a web helper for this purpose would be fun. Better yet, I'd already released the Delay.Web.Helpers assembly (with support for Amazon S3 blob/bucket access), so I had the perfect place to put the new DataUri class once I wrote it! :)

Aside: For those who aren't familiar, the WebImage class provides a variety of handy methods for dealing with images on the server - including a Write method for sending them to the user's browser. However, the Write method needs to be called from a dedicated page that serves up just the relevant image, so it isn't a solution for the original scenario.

In case you've not heard of them before, data URIs are a kind of URL scheme documented by RFC 2397. They're quite simple, really - here's the relevant part of the specification:

data:[<mediatype>][;base64],<data>

dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
mediatype  := [ type "/" subtype ] *( ";" parameter )
data       := *urlchar
parameter  := attribute "=" value

It takes two pieces of information to create a data URI: the data and its media type (ex: "image/png"). (Although the media type appears optional above, it defaults to "text/plain" when absent - which is unsuitable for most common data URI scenarios.) Pretty much the only interesting thing you can do with data URIs on the server is write them, so the DataUri web helper exposes a single Write method with five flavors. The media type is always passed as a string (feel free to use the MediaTypeNames class to help here), but the data can be provided as a file name string, byte[], IEnumerable<byte>, or Stream. That's four methods; the fifth one takes just the file name string and infers the media type from the file's extension (ex: ".png" -> "image/png").

Aside: Technically, it would be possible for the other methods to infer media type as well by examining the bytes of data. However, doing so would require quite a bit more work and would always be subject to error. On the other hand, inferring media type from the file's extension is computationally trivial and much more likely to be correct in practice.

For an example of the DataUri helper in action, here's the Razor code to implement the "image from the database" scenario that started it all:

@{
    IEnumerable<dynamic> databaseImages;
    using (var database = Database.Open("Delay.Web.Helpers.Sample.Database"))
    {
        databaseImages = database.Query("SELECT * FROM Images");
    }

    // ...

    foreach(var image in databaseImages)
    {
        <p>
            <img src="@DataUri.Write(image.Content, image.MediaType)" alt="@image.Name"
                 width="@image.Width" height="@image.Height" style="vertical-align:middle"/>
            @image.Name
        </p>
    }
}

Easy-peasy lemon-squeezy!

And here's an example of using the file name-only override for media type inference:

<script src="@DataUri.Write(Server.MapPath("Sample-Script.js"))" type="text/javascript"></script>

Which comes out like this in the HTML that's sent to the browser:

<script src="data:text/javascript;base64,77u/ZG9j...PicpOw==" type="text/javascript"></script>

Aside: This particular example (using a data URI for a script file) doesn't render in all browsers. Specifically, Internet Explorer 8 (and earlier) blocks script delivered like this because of security concerns. Fortunately, Internet Explorer 9 has addressed those concerns and renders as expected. :)

[Click here to download the Delay.Web.Helpers assembly, complete source code, automated tests, and the sample web site.]

[Click here to go to the NuGet page for Delay.Web.Helpers which includes the DLL and its associated documentation/IntelliSense XML file.]

[Click here to go to the NuGet page for the Delay.Web.Helpers.SampleWebSite which includes the sample site that demonstrates everything.]

Data URIs are pretty neat things - though it's important to be aware they have their drawbacks as well. Fortunately, the Wikipedia article does a good job discussing the pros and cons, so I highly recommend looking it over before converting all your content. :) Creating a data URI manually isn't rocket science, but it is the kind of thing ASP.NET web helpers are perfectly suited for. If you're a base-64 nut, maybe you'll continue doing this by hand - but for everyone else, I hope the new DataUri class in the Delay.Web.Helpers assembly proves useful!

Tags: Technical Web Platform