The blog of dlaa.me

Powerful log file analysis for everyone [Releasing TextAnalysisTool.NET!]

A number of years ago, the product team I was on spent a lot of team analyzing large log files. These log files contained thousands of lines of output tracing what the code was doing, what its current state was, and gobs of other diagnostic information. Typically, we were only interested in a handful of lines - but had no idea which ones at first. Often one would start by searching for a generic error message, get some information from that, search for some more specific information, obtain more context, and continue on in that manner until the problem was identified. It was usually the case that interesting lines were spread across the entire file and could only really be understood when viewed together - but gathering them all could be inconvenient. Different people had different tricks and tools to make different aspects of the search more efficient, but nothing really addressed the end-to-end scenario and I decided I'd try to come up with something better.

TextAnalysisTool was first released to coworkers in July of 2000 as a native C++ application written from scratch. It went through a few revisions over the next year and a half and served myself and others well during that time. Later, as the .NET Framework became popular, I decided it would be a good learning exercise to rewrite TextAnalysisTool to run on .NET as a way to learn the Framework and make some architectural improvements to the application. TextAnalysisTool.NET was released in February of 2003 as a fully managed .NET 1.0 C# application with the same functionality of the C++ application it replaced. TextAnalysisTool.NET has gone through a few revisions since then and has slowly made its way across parts of the company. (It's always neat to get an email from someone in a group I had no idea was using TextAnalysisTool.NET!) TextAnalysisTool.NET is popular enough among its users that I started getting requests to make it available outside the company so that customers could use it to help with investigations.

The effort of getting something posted to Microsoft.com seemed overwhelming at the time, so TextAnalysisTool.NET stayed internal until now. With the latest request, I realized my blog would be a great way to help internal groups and customers by making TextAnalysisTool.NET available to the public!

TextAnalysisTool.NET Demonstration

You can download the latest version of TextAnalysisTool.NET by clicking here (or on the image above).

In the above demonstration of identifying the errors and warnings from sample build output, note how the use of regular expression text filters and selective hiding of surrounding content make it easy to zoom in on the interesting parts of the file - and then zoom out to get context.

Additional information can be found in the TextAnalysisTool.NET.txt file that's included in the ZIP download (or from within the application via Help | Documentation). The first section of that file is a tutorial and the second section gives a more detailed overview of TextAnalysisTool.NET (excerpted below). The download also includes a ReadMe.txt with release notes and a few other things worth reading.

The Problem: For those times when you have to analyze a large amount of textual data, picking out the relevant line(s) of interest can be quite difficult. Standard text editors usually provide a generic "find" function, but the limitations of that simple approach quickly become apparent (e.g., when it is necessary to compare two or more widely separated lines). Some more sophisticated editors do better by allowing you to "bookmark" lines of interest; this can be a big help, but is often not enough.

The Solution: TextAnalysisTool.NET - a program designed from the start to excel at viewing, searching, and navigating large files quickly and efficiently. TextAnalysisTool.NET provides a view of the file that you can easily manipulate (through the use of various filters) to display exactly the information you need - as you need it.

Filters: Before displaying the lines of a file, TextAnalysisTool.NET passes the lines of that file through a set of user-defined filters, dimming or hiding all lines that do not satisfy any of the filters. Filters can select only the lines that contain a sub-string, those that have been marked with a particular marker type, or those that match a regular expression. A color can be associated with each filter so lines matching a particular filter stand out and so lines matching different filters can be easily distinguished. In addition to the normal "including" filters that isolate lines of text you DO want to see, there are also "excluding" filters that can be used to suppress lines you do NOT want to see. Excluding filters are configured just like including filters but are processed afterward and remove all matching lines from the set. Excluding filters allow you to easily refine your search even further.

Markers: Markers are another way that TextAnalysisTool.NET makes it easy to navigate a file; you can mark any line with one or more of eight different marker types. Once lines have been marked, you can quickly navigate between similarly marked lines - or add a "marked by" filter to view only those lines.

Find: TextAnalysisTool.NET also provides a flexible "find" function that allows you to search for text anywhere within a file. This text can be a literal string or a regular expression, so it's easy to find a specific line. If you decide to turn a find string into a filter, the history feature of both dialogs makes it easy.

Summary: TextAnalysisTool.NET was written with speed and ease of use in mind throughout. It saves you time by allowing you to save and load filter sets; it lets you import text by opening a file, dragging-and-dropping a file or text from another application, or by pasting text from the clipboard; and it allows you to share the results of your filters by copying lines to the clipboard or by saving the current lines to a file. TextAnalysisTool.NET supports files encoded with ANSI, UTF-8, Unicode, and big-endian Unicode and is designed to handle large files efficiently.

I maintain a TODO list with a variety of user requests, but I thought I'd see what kind of feedback I got from releasing TextAnalysisTool.NET to the public before I decide where to go with the next release. I welcome suggestions - and problem reports - so please share them with me if you've got any!

I hope you find TextAnalysisTool.NET useful as I have!