The blog of dlaa.me
Tag: "TextAnalysisTool"
  • Out of hibernation [A new home and a bunch of updates for TextAnalysisTool.NET]
    Monday, January 12th 2015

    TextAnalysisTool.NET is one of the first side projects I did at Microsoft, and one of the most popular. (Click here for relevant blog posts by tag.) Many people inside and outside the company have written me with questions, feature requests, or sometimes just to say "thank you". It's always great to hear from users, and they've provided a long list of suggestions and ideas for ways to make TextAnalysisTool.NET better.

    By virtue of changing teams and roles various times over the years, I don't find myself using TextAnalysisTool.NET as much as I once did. My time and interests are spread more thinly, and I haven't been updating the tool as aggressively. (Understatement of the year?)

    Various coworkers have asked for access to the code, but nothing much came of that - until recently, when a small group showed up with the interest, expertise, and motivation to drive TextAnalysisTool.NET forward! They inspired me to simplify the contribution process and they have been making a steady stream of enhancements for a while now. It's time to take things to the next level, and today marks the first public update to TextAnalysisTool.NET in a long time!

     

    The new source for all things TextAnalysisTool is: the TextAnalysisTool.NET home page

    That's where you'll find an overview, download link, release notes, and other resources. The page is owned by the new TextAnalysisTool GitHub organization, so all of us are able to make changes and publish new releases. There's also an issue tracker, so users can report bugs, comment on issues, update TODOs, make suggestions, etc.

    The new 2015-01-07 release can be downloaded from there, and includes the following changes since the 2013-05-07 release:

    2015-01-07 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Added a tooltip to the loaded file indicator in the status bar
    * Fixed a bug where setting a marker used in an active filter causes the
      current selection of lines to be changed
    
    2015-01-07 by David Anson (http://dlaa.me/)
    ----------
    * Improve HTML representation of clipboard text when copying for more
      consistent paste behavior
    
    2015-01-01 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Fixed a bug where TAB characters are omitted in the display
    * Fixed a bug where lines saved to file include an extra white space at the
      start
    
    2014-12-21 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Changed compilation to target .NET Framework 4.0
    
    2014-12-11 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Redesigned the status bar indications to be consistent with Visual Studio and
      added the number of currently selected lines
    
    2014-12-04 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Added the ability to append an existing filters file to the current filters
      list
    
    2014-12-01 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Added recent file/filter menus for easy access to commonly-used files
    * Added a new settings registry key to set the
      maximum number of recent files or filter files allowed in the
      corresponding file menus
    * Fixed bug where pressing SPACE with no matching lines from filters
      crashed the application
    * Fixed a bug where copy-pasting lines from the application to Lync
      resulted in one long line without carriage returns
    
    2014-11-11 by Uriel Cohen (http://github.com/cohen-uriel)
    ----------
    * Added support for selection of background color in the filters
      (different selection of colors than the foreground colors)
    * The background color can be saved and loaded with the filters
    * Filters from previous versions that lack a background color will have the
      default background color
    * Saving foreground color field in filters to 'foreColor' attribute.
      Old 'color' attribute is still being loaded for backward compatibility
      purposes.
    * Changed control alignment in Find dialog and Filter dialog
    
    2014-10-21 by Mike Morante (http://github.com/mike-mo)
    ----------
    * Fix localization issue with the build string generation
    
    2014-04-22 by Mike Morante (http://github.com/mike-mo)
    ----------
    * Line metadata is now visually separate from line text contents
    * Markers can be shown always/never/when in use to have more room for line text
      and the chosen setting persists across sessions
    * Added statusbar panel funnel icon to reflect the current status of the Show
      Only Filtered Lines setting
    
    2014-02-27 by Mike Morante (http://github.com/mike-mo)
    ----------
    * Added zoom controls to quickly increase/decrease the font size
    * Zoom level persists across sessions
    * Added status bar panel to show current zoom level
    

     

    These improvements were all possible thanks to the time and dedication of the new contributors (and organization members):

    • Uriel Cohen
    • Mike Morante

    Please join me in thanking these generous souls for taking time out of their busy schedules to contribute to TextAnalysisTool.NET! They've been a pleasure to work with, and a great source of ideas and suggestions. I've been really pleased with their changes and hope you find the new TextAnalysisTool.NET more useful than ever!

    Tags: Technical TextAnalysisTool Utilities
  • A trip down memory (footprint) lane [Download for the original TextAnalysisTool, circa 2001]
    Monday, September 22nd 2014

    As you might guess from the name, TextAnalysisTool.NET (introductory blog post, related links) was not the first version of the tool. The original implementation was written in C, compiled for x86, slightly less capable, and named simply TextAnalysisTool. I got an email asking for a download link recently, so I dug up a copy and am posting it for anyone who's interested.

    The UI should be very familiar to TextAnalysisTool.NET users:

    The original TextAnalysisTool filtering a simple file

    The behavior is mostly the same as well (though the different hot key for "add filter" trips me up pretty consistently).

    A few notes:

    • The code is over 13 years old
    • So I'm not taking feature requests :)
    • But it runs on vintage operating systems (seriously, this is before Windows XP)
    • And it also runs great on Windows 8.1 (yay backward compatibility!)
    • It supports:
      • Text filters
      • Regular expressions
      • Markers
      • Find
      • Go to
      • Reload
      • Copy/paste
      • Saved configurations
      • Multi-threading
    • But does not support:
      • Colors
      • Rich selection
      • Rich copy
      • Line counts
      • Filter hot keys
      • Plugins
      • Unicode

    Because it uses ASCII-encoding for strings (vs. .NET's Unicode representation), you can reasonably expect loading a text file in TextAnalysisTool to use about half as much memory as it does in TextAnalysisTool.NET. However, as a 32-bit application, TextAnalysisTool is limited to the standard 2GB virtual address space of 32-bit processes on Windows (even on a 64-bit OS). On the other hand, TextAnalysisTool.NET is an architecture-neutral application and can use the full 64-bit virtual address space on a 64-bit OS. There may be rare machine configurations where the physical/virtual memory situation is such that older TextAnalysisTool can load a file newer TextAnalysisTool.NET can't - so if you're stuck, give it a try!
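
    To make the "about half as much memory" estimate concrete, here is a minimal sketch comparing how many bytes the characters of one line occupy in ASCII versus UTF-16 (the in-memory format of .NET strings). The class and method names are hypothetical, and the sketch ignores per-string object overhead, so the 2x ratio is a rough guide rather than a measurement of either tool.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of either tool): compares the size of a
// line's characters in ASCII versus UTF-16.
public static class LineFootprint
{
    // One byte per character for pure-ASCII text.
    public static int AsciiBytes(string line)
    {
        return Encoding.ASCII.GetByteCount(line);
    }

    // Two bytes per character for text in the Basic Multilingual Plane.
    public static int Utf16Bytes(string line)
    {
        return Encoding.Unicode.GetByteCount(line);
    }
}
```

    For an ASCII-only log line, Utf16Bytes returns exactly twice what AsciiBytes does, which is where the factor of two comes from.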

    Aside: If you're really adventurous, you can try using EditBin to set the /LARGEADDRESSAWARE option on TextAnalysisTool.exe to get access to more virtual address space on a 64-bit OS or via /3GB on a 32-bit OS. But be warned that you're well into "undefined behavior" territory because I don't think that switch even existed when I wrote TextAnalysisTool. I've tried it briefly and things seem to work - but this is definitely sketchy. :)

    Writing the original TextAnalysisTool was a lot of fun and contributed significantly to a library of C utility functions I used at the time called ToolBox. It also provided an excellent conceptual foundation upon which to build TextAnalysisTool.NET in addition to good lessons about how to approach the problem space. If I ever get around to writing a third version (TextAnalysisTool.WPF? TextAnalysisTool.Next?), it will take inspiration from both projects - and handle absurdly-large files.

    So if you're curious to try a piece of antique software, click here to download the original TextAnalysisTool.

    But for everything else, you should probably click here to download the newer TextAnalysisTool.NET.

    Tags: Technical TextAnalysisTool Utilities
  • Plug it in, plug it in [Sample code for two TextAnalysisTool.NET plug-ins demonstrates support for custom file types]
    Tuesday, April 29th 2014

    A few days ago, @_yabloki tweeted asking how to write a TextAnalysisTool.NET plug-in. I've answered this question a few times in email, but never blogged it before now.

    To understand the basis of the question, you need to know what TextAnalysisTool.NET is; for that, I refer you to the TextAnalysisTool.NET page for an overview.

    Animated GIF showing basic TextAnalysisTool.NET functionality

    To understand the rest of the question, you need to know what a plug-in is; for that, there's the following paragraph from the documentation:

    TextAnalysisTool.NET's support for plug-ins allows users to add in their own
    code that understands specialized file types.  Every time a file is opened,
    each plug-in is given a chance to take responsibility for parsing that file.
    When a plug-in takes responsibility for parsing a file, it becomes that plug-
    in's job to produce a textual representation of the file for display in the
    usual line display.  If no plug-in supports a particular file, then it gets
    opened using TextAnalysisTool.NET's default parser (which displays the file's
    contents directly).  One example of what a plug-in could do is read a binary
    file format and produce meaningful textual output from it (e.g., if the file is
    compressed or encrypted).  Another plug-in might add support for the .zip
    format and display a list of the files within the archive.  A particularly
    ambitious plug-in might translate text files from one language to another.  The
    possibilities are endless!
    

     

    Armed with an understanding of TextAnalysisTool.NET and its support for plug-ins, we're ready to look at the interface plug-ins must implement:

    namespace TextAnalysisTool.NET.Plugin
    {
        /// <summary>
        /// Interface that all TextAnalysisTool.NET plug-ins must implement
        /// </summary>
        internal interface ITextAnalysisToolPlugin
        {
            /// <summary>
            /// Gets a meaningful string describing the type of file supported by the plug-in
            /// </summary>
            /// <remarks>
            /// Used to populate the "Files of type" combo box in the Open file dialog
            /// </remarks>
            /// <example>
            /// "XML Files"
            /// </example>
            /// <returns>descriptive string</returns>
            string GetFileTypeDescription();
    
            /// <summary>
            /// Gets the file type pattern describing the type(s) of file supported by the plug-in
            /// </summary>
            /// <remarks>
            /// Used to populate the "Files of type" combo box in the Open file dialog
            /// </remarks>
            /// <example>
            /// "*.xml"
            /// </example>
            /// <returns>file type pattern</returns>
            string GetFileTypePattern();
    
            /// <summary>
            /// Indicates whether the plug-in is able to parse the specified file
            /// </summary>
            /// <param name="fileName">full path to the file</param>
            /// <remarks>
            /// Called whenever a file is being opened to give the plug-in a chance to handle it;
            /// ideally the result can be returned based solely on the file name, but it is
            /// acceptable to open, read, and close the file if necessary
            /// </remarks>
            /// <returns>true iff the file is supported</returns>
            bool IsFileTypeSupported(string fileName);
    
            /// <summary>
            /// Returns a TextReader instance that will be used to read the specified file
            /// </summary>
            /// <param name="fileName">full path to the file</param>
            /// <remarks>
            /// The only methods that will be called (and therefore need to be implemented) are
            /// TextReader.ReadLine() and IDisposable.Dispose()
            /// </remarks>
            /// <returns>TextReader instance</returns>
            System.IO.TextReader GetReaderForFile(string fileName);
        }
    }
    

    Disclaimer: I wrote TextAnalysisTool.NET many years ago as a way to learn the (then) newly-released .NET 1.0 Framework. Extensibility frameworks like MEF weren't available yet, so please forgive the omission! :)

     

    As you can see, the plug-in interface is simple, straightforward, automatically integrates into the standard File|Open UI, and leaves a great deal of freedom around implementation and function. Specifically, the TextReader instance returned by GetReaderForFile can do pretty much whatever you want. For example:

    • Simple tweaks to the input (ex: normalizing time stamps)
    • Filtering of the input (ex: to remove irrelevant lines)
    • Complex transformations of the input (ex: format conversions)
    • Completely unrelated data (ex: input from a network socket)

    There's a lot of flexibility, and maybe the open-endedness is daunting? :) To make things concrete, I've packaged two of the samples I came up with during the original plug-in definition.

     

    TATPlugin_SampleData

    Loads files named like 3.lines and renders that many lines of sample text into the display.

    Input (file name):

    3.lines
    

    Output:

    1: The quick brown fox jumps over a lazy dog.
    2: The quick brown fox jumps over a lazy dog.
    3: The quick brown fox jumps over a lazy dog.
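
    For readers who want a feel for the shape of such a plug-in before opening the download, here is a minimal sketch of how the behavior above could be implemented. It is not the code from the ZIP: the SamplePlugins namespace, the LineCountPlugin and GeneratedLinesReader classes, and the description string are all hypothetical. The interface is repeated (without doc comments) only so the sketch compiles on its own.

```csharp
using System;
using System.Globalization;
using System.IO;

namespace TextAnalysisTool.NET.Plugin
{
    // Interface from the post, repeated so this sketch is self-contained;
    // a real plug-in would compile Plugin.cs from the download instead.
    internal interface ITextAnalysisToolPlugin
    {
        string GetFileTypeDescription();
        string GetFileTypePattern();
        bool IsFileTypeSupported(string fileName);
        TextReader GetReaderForFile(string fileName);
    }
}

namespace SamplePlugins // hypothetical namespace, not from the download
{
    // Hypothetical plug-in: a file named "3.lines" yields three sample lines.
    public class LineCountPlugin : TextAnalysisTool.NET.Plugin.ITextAnalysisToolPlugin
    {
        public string GetFileTypeDescription() { return "Sample Line Files"; }
        public string GetFileTypePattern() { return "*.lines"; }

        public bool IsFileTypeSupported(string fileName)
        {
            int count;
            return TryGetLineCount(fileName, out count);
        }

        public TextReader GetReaderForFile(string fileName)
        {
            int count;
            if (!TryGetLineCount(fileName, out count))
            {
                throw new ArgumentException("Unsupported file name.", "fileName");
            }
            return new GeneratedLinesReader(count);
        }

        // Decides based on the file name alone; the file is never opened.
        private static bool TryGetLineCount(string fileName, out int count)
        {
            count = 0;
            return string.Equals(Path.GetExtension(fileName), ".lines", StringComparison.OrdinalIgnoreCase)
                && int.TryParse(Path.GetFileNameWithoutExtension(fileName),
                                NumberStyles.None, CultureInfo.InvariantCulture, out count);
        }
    }

    // Only ReadLine and Dispose are called by the host, so ReadLine is the
    // only member overridden here.
    public class GeneratedLinesReader : TextReader
    {
        private readonly int _total;
        private int _next = 1;

        public GeneratedLinesReader(int total) { _total = total; }

        public override string ReadLine()
        {
            if (_next > _total) { return null; } // null signals end of input
            return string.Format(CultureInfo.InvariantCulture,
                "{0}: The quick brown fox jumps over a lazy dog.", _next++);
        }
    }
}
```

    The design choice worth noting is that the reader never touches the file's contents: everything it displays is derived from the file name.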
    

     

    TATPlugin_XMLFormatter

    Loads well-formed XML and pretty-prints it for easier reading.

    Input:

    <root><element><nested>value</nested></element><element><shallow><deep>value</deep></shallow></element></root>
    

    Output:

    <root>
      <element>
        <nested>value</nested>
      </element>
      <element>
        <shallow>
          <deep>value</deep>
        </shallow>
      </element>
    </root>
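
    One way to get this kind of output, sketched under assumptions: the helper below uses XDocument from System.Xml.Linq (which needs .NET 3.5 or later, so it is likely not what the sample in the download does) and takes the XML as a string to keep the example self-contained. The XmlPrettyPrinter name is hypothetical.

```csharp
using System.IO;
using System.Xml.Linq;

// Hypothetical helper, not the code from the download: re-serializes
// well-formed XML with indentation and exposes the result as a TextReader.
public static class XmlPrettyPrinter
{
    public static TextReader CreateReader(string xml)
    {
        // Parse discards insignificant whitespace by default, and ToString()
        // writes the tree back out indented, without an XML declaration.
        XDocument document = XDocument.Parse(xml);
        return new StringReader(document.ToString());
    }
}
```

    A plug-in built this way would return the reader from GetReaderForFile (loading the document with XDocument.Load(fileName) rather than from a string); malformed XML makes Parse throw, so IsFileTypeSupported is a natural place to decide whether to claim the file.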
    

     

    [Click here to download the source code and supporting files for the sample TextAnalysisTool.NET plug-ins]

    The download ZIP also includes Plugin.cs (the file defining the above interface), a few sample data files, and some trivial Build.cmd scripts to compile everything from a Visual Studio Developer Command Prompt (or similar environment where csc.exe and MSBuild.exe are available).

    Note: When experimenting with the samples, remember that TextAnalysisTool.NET loads its plug-ins from the current directory at startup. So put a copy of TextAnalysisTool.NET (and its .config file) alongside the DLL outputs in the root of the samples directory and remember to restart it if you change one of the samples. To check that plug-ins are loaded successfully, use the Help|Installed plug-ins menu item.

    Aside: Plug-ins are generally UI-less, but they don't have to be - take a look at what Tomer did with the WPPFormatter plug-in for an example.

    Tags: Technical TextAnalysisTool Utilities
  • 64 bits ought to be enough for everybody [TextAnalysisTool.NET update for .NET 2.0 and 64-bit enables the analysis of larger files!]
    Monday, May 20th 2013

    TextAnalysisTool.NET is a free program designed to excel at viewing, searching, and navigating large files quickly and efficiently.

    I wrote the first version of TextAnalysisTool back in 2000 using C++ and Win32. In 2003, I rewrote it using the new .NET 1.0 Framework - and upgraded to .NET 1.1 later that year. There were a variety of improvements between then and 2006, the date of the last update to TextAnalysisTool.NET. Along the way, I've heard from a lot of people who use this tool to simplify their daily workflow! In the past year, I've started getting requests for 64-bit support from folks working with extremely large files that don't fit in the 2GB virtual address space limits of a normal 32-bit process on Windows. Although .NET 1.1 didn't support 64-bit processes, .NET 2.0 does, and I've decided it's finally time to take the plunge. :)

    Animated GIF showing basic TextAnalysisTool.NET functionality

     

    With this release, TextAnalysisTool.NET has been compiled using the .NET 2.0 toolset and the AnyCPU option which automatically matches a process to the architecture of its host operating system. On 32-bit OSes, TextAnalysisTool.NET will continue to run as a 32-bit process, but on 64-bit OSes, it will run as a 64-bit process and have access to a significantly larger address space. This makes it possible to work with larger log files without the risk of crashing into the 2GB memory limit (which can end up being as low as 1.7 GB in practice)!

    Other than a few exceedingly minor string updates, I have made no changes to the way TextAnalysisTool.NET behaves - so the new version should feel just like the previous one. The framework update means .NET 1.1 is no longer a supported platform and .NET 2.0 is now natively supported. The included .config file allows the same executable to run under .NET 4.0 as-is (for example on Windows 8 without the optional ".NET Framework 3.5" feature installed).

    If you've ever run out of memory using TextAnalysisTool.NET, please give this new version a try! And if not, go ahead and continue using the previous version without worrying that you're missing out on anything. :)

     

    Click here to download the latest version of TextAnalysisTool.NET

    Click here to visit the TextAnalysisTool.NET web page for more information

     

    Many thanks to everyone for all the great feedback - I love getting messages from people around the world who are using TextAnalysisTool.NET to make their lives easier!

     

    Aside: As a matter of technical interest, details on the one bug I found with 64-bit TextAnalysisTool.NET: the following code had worked fine for the last decade (message.WParam is an IntPtr via Form.WndProc for WM_MOUSEWHEEL):
    int wheelDelta = ((int)message.WParam)>>16; // HIWORD(WPARAM)
    However, when that assignment ran under .NET 2.0 on a 64-bit OS, it quickly threw OverflowException! I was surprised, but it turns out this is documented behavior. Because that line interoperates with the Windows API, I couldn't change the types involved - but I could tweak the code to avoid the exception by avoiding the problematic explicit conversion:
    int wheelDelta = ((int)((long)message.WParam))>>16; // HIWORD(WPARAM)
    Yep, the proverbial one-line fix saves the day!
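
    To see why the extra cast helps, here is a minimal sketch of the same arithmetic using plain long values, so it behaves identically in 32-bit and 64-bit processes. The MouseWheel class is hypothetical, and the example value assumes the upper 32 bits of WPARAM are zero: a backward scroll of one notch (delta -120, high word 0xFF88) then arrives as 0x00000000FF880000, which is larger than int.MaxValue.

```csharp
using System;

// Hypothetical helper that mirrors the arithmetic of the one-line fix.
public static class MouseWheel
{
    // Keeps the low 32 bits (unchecked narrowing), then shifts the signed
    // high word down, like the fixed line does after its cast to long.
    public static int DeltaFromWParam(long wParam)
    {
        return unchecked((int)wParam) >> 16;
    }

    // An overflow-checked narrowing to int, like the conversion that threw,
    // can only succeed when this returns true.
    public static bool FitsInInt32(long wParam)
    {
        return wParam >= int.MinValue && wParam <= int.MaxValue;
    }
}
```

    With these definitions, DeltaFromWParam(0xFF880000L) evaluates to -120 while FitsInInt32(0xFF880000L) is false, which matches the symptom: the value is perfectly meaningful, but it does not survive an overflow-checked conversion to int.
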
    Tags: Technical TextAnalysisTool Utilities
  • TextAnalysisTool.NET to Windows 8: "Y U no run me under .NET 4?" [How to avoid the "please install .NET 3.5" dialog when running older .NET applications on the Windows 8 Developer Preview]
    Friday, September 23rd 2011

    I wrote TextAnalysisTool.NET a number of years ago to streamline the task of analyzing large log files by creating an interactive experience that combines searching, filtering, and tagging and allows the user to quickly identify interesting portions of a large log file. Although .NET 2.0 was out at the time, I targeted .NET 1.1 because it was more widely available, coming pre-installed on Windows Server 2003 (the "latest and greatest" OS then). In the years since, I've heard from folks around the world running TextAnalysisTool.NET on subsequent Windows operating systems and .NET Framework versions. Because Windows and the .NET Framework do a great job maintaining backwards compatibility, the same TextAnalysisTool.NET binary has continued to work as-is for the near-decade since its initial release.

    TextAnalysisTool.NET demonstration

     

    But the story changes with Windows 8! Although Windows 8 has the new .NET 4.5 pre-installed (including .NET 4.0 upon which it's based), it does not include .NET 3.5 (and therefore .NET 2.0). While I can imagine some very sensible reasons for the Windows team to take this approach, it's inconvenient for existing applications because .NET 4 in this scenario does not automatically run applications targeting an earlier framework version. What is cool is that the public Windows 8 Developer Preview detects when an older .NET application is run and automatically prompts the user to install .NET 3.5:

    Windows 8 .NET 3.5 install prompt

    That's pretty slick and is a really nice way to bridge the gap. However, it's still kind of annoying for users as it may not always be practical for them to perform a multi-megabyte download the first time they try to run an older .NET program. So it would be nice if there were an easy way for older .NET applications to opt into running under the .NET 4 framework that's already present on Windows 8...

    And there is! :) One aspect of .NET's "side-by-side" support involves using .config files to specify which .NET versions an application is known to work with. (For more information, please refer to the MSDN article How to: Use an Application Configuration File to Target a .NET Framework Version.) Consequently, improving the Windows 8 experience for TextAnalysisTool.NET should be as easy as creating a suitable TextAnalysisTool.NET.exe.config file in the same folder as TextAnalysisTool.NET.exe.

     

    Specifically, the following should do the trick:

    <?xml version="1.0"?>
    <configuration>
      <startup>
        <supportedRuntime version="v1.1.4322"/>
        <supportedRuntime version="v2.0.50727"/>
        <supportedRuntime version="v4.0"/>
      </startup>
    </configuration>

    And it does! :) With that TextAnalysisTool.NET.exe.config file in place, TextAnalysisTool.NET runs on a clean install of the Windows 8 Developer Preview as-is and without prompting the user to install .NET 3.5. I've updated the download ZIP to include this file so new users will automatically benefit; existing users should drop TextAnalysisTool.NET.exe.config in the right place, and they'll be set as well!

    Aside: Although this trick will work in many cases, it isn't guaranteed to work. In particular, if there has been a breaking change in .NET 4, then attempting to run a pre-.NET 4 application in this manner might fail. Therefore, it's prudent to do some verification when trying a change like this!
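
    As a starting point for that kind of verification, here is a minimal sketch of a diagnostic an application could log or display to confirm which runtime the .config file actually selected. The RuntimeInfo class is hypothetical and is not part of TextAnalysisTool.NET; Environment.Version and IntPtr.Size are standard .NET members.

```csharp
using System;

// Hypothetical diagnostic helper, not part of TextAnalysisTool.NET.
public static class RuntimeInfo
{
    // Reports the version of the runtime hosting the process and the
    // process's pointer width.
    public static string Describe()
    {
        return string.Format("CLR {0}, {1}-bit process", Environment.Version, IntPtr.Size * 8);
    }
}
```

    On the .NET Framework, a version beginning with 4.0 would indicate the application was loaded by the .NET 4 runtime rather than an earlier one.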

     

    [Click here to download a ZIP file containing TextAnalysisTool.NET, the relevant .config file, its documentation, and a ReadMe.]

     

    TextAnalysisTool.NET has proven to be extremely popular with support engineers and it's always nice to hear from new users. I hope today's post extends the usefulness of TextAnalysisTool.NET by making the Windows 8 experience as seamless as people have come to expect!

    Tags: Technical TextAnalysisTool Utilities
  • If they build it, I will come (and link to it) [WPPFormatter plug-in now available for TextAnalysisTool.NET]
    Wednesday, July 28th 2010

    I went public with TextAnalysisTool.NET a few years back - it's a handy tool for interactive log file analysis that's popular with people in many parts of the company. The introductory post has more background and detail, but this snippet from the README gives an overview:

    The Problem: For those times when you have to analyze a large amount of textual data, picking out the relevant line(s) of interest can be quite difficult. Standard text editors usually provide a generic "find" function, but the limitations of that simple approach quickly become apparent (e.g., when it is necessary to compare two or more widely separated lines). Some more sophisticated editors do better by allowing you to "bookmark" lines of interest; this can be a big help, but is often not enough.

    The Solution: TextAnalysisTool.NET - a program designed from the start to excel at viewing, searching, and navigating large files quickly and efficiently. TextAnalysisTool.NET provides a view of the file that you can easily manipulate (through the use of various filters) to display exactly the information you need - as you need it.

    And here's an animated GIF showing TextAnalysisTool.NET working with some MSBuild output:

    TextAnalysisTool.NET demonstration

     

    I don't talk about it in the introductory post, but one of the things TextAnalysisTool.NET supports is a flexible plug-in architecture for handling custom file formats. Here's what the included documentation says:

    TextAnalysisTool.NET's support for plug-ins allows users to add in their own code that understands specialized file types. Every time a file is opened, each plug-in is given a chance to take responsibility for parsing that file. When a plug-in takes responsibility for parsing a file, it becomes that plug-in's job to produce a textual representation of the file for display in the usual line display. If no plug-in supports a particular file, then it gets opened using TextAnalysisTool.NET's default parser (which displays the file's contents directly). One example of what a plug-in could do is read a binary file format and produce meaningful textual output from it (e.g., if the file is compressed or encrypted). Another plug-in might add support for the .zip format and display a list of the files within the archive. A particularly ambitious plug-in might translate text files from one language to another. The possibilities are endless!

     

    Over the years, I've been contacted by various people wanting to use the plug-in architecture to add support for specialized file formats. One of those people is Tomer Rotstein who recently released a WPPFormatter plug-in. (If the acronym "WPP" isn't familiar to you, you might start by reading the Windows software trace preprocessor entry on Wikipedia - it's the file format used by the event tracing infrastructure in Windows.) But Tomer has gone above and beyond what I ever had in mind - as you can see from the following screen shot of WPPFormatter's "open file" dialog:

    WPPFormatter demonstration

    I encourage people to read Tomer's post to get a full sense of what WPPFormatter does. There's a download link at the end of that post, so please try it out if it seems useful!

     

    And for those of you who aren't already TextAnalysisTool.NET users, please:

    [Click here to download the latest version of TextAnalysisTool.NET.]

     

    Aside: Tomer's accomplishment is even more notable when you realize that TextAnalysisTool.NET was the first .NET application I ever wrote and that it targets .NET 1.1 - from a time before there were generics, extensibility frameworks, and all the other goodness we've come to take for granted! Truth be told, the plug-in model for TextAnalysisTool.NET is downright wacky in some ways, so I congratulate Tomer on succeeding in spite of my goofy design. :)
    Tags: Technical TextAnalysisTool Utilities
  • Powerful log file analysis for everyone [Releasing TextAnalysisTool.NET!]
    Friday, June 22nd 2007

    A number of years ago, the product team I was on spent a lot of time analyzing large log files. These log files contained thousands of lines of output tracing what the code was doing, what its current state was, and gobs of other diagnostic information. Typically, we were only interested in a handful of lines - but had no idea which ones at first. Often one would start by searching for a generic error message, get some information from that, search for some more specific information, obtain more context, and continue on in that manner until the problem was identified. It was usually the case that interesting lines were spread across the entire file and could only really be understood when viewed together - but gathering them all could be inconvenient. Different people had different tricks and tools to make different aspects of the search more efficient, but nothing really addressed the end-to-end scenario and I decided I'd try to come up with something better.

    TextAnalysisTool was first released to coworkers in July of 2000 as a native C++ application written from scratch. It went through a few revisions over the next year and a half and served me and others well during that time. Later, as the .NET Framework became popular, I decided it would be a good learning exercise to rewrite TextAnalysisTool to run on .NET as a way to learn the Framework and make some architectural improvements to the application. TextAnalysisTool.NET was released in February of 2003 as a fully managed .NET 1.0 C# application with the same functionality as the C++ application it replaced. TextAnalysisTool.NET has gone through a few revisions since then and has slowly made its way across parts of the company. (It's always neat to get an email from someone in a group I had no idea was using TextAnalysisTool.NET!) TextAnalysisTool.NET is popular enough among its users that I started getting requests to make it available outside the company so that customers could use it to help with investigations.

    The effort of getting something posted to Microsoft.com seemed overwhelming at the time, so TextAnalysisTool.NET stayed internal until now. With the latest request, I realized my blog would be a great way to help internal groups and customers by making TextAnalysisTool.NET available to the public!

    TextAnalysisTool.NET Demonstration

    You can download the latest version of TextAnalysisTool.NET by clicking here (or on the image above).

    In the above demonstration of identifying the errors and warnings from sample build output, note how the use of regular expression text filters and selective hiding of surrounding content make it easy to zoom in on the interesting parts of the file - and then zoom out to get context.

    Additional information can be found in the TextAnalysisTool.NET.txt file that's included in the ZIP download (or from within the application via Help | Documentation). The first section of that file is a tutorial and the second section gives a more detailed overview of TextAnalysisTool.NET (excerpted below). The download also includes a ReadMe.txt with release notes and a few other things worth reading.

    The Problem: For those times when you have to analyze a large amount of textual data, picking out the relevant line(s) of interest can be quite difficult. Standard text editors usually provide a generic "find" function, but the limitations of that simple approach quickly become apparent (e.g., when it is necessary to compare two or more widely separated lines). Some more sophisticated editors do better by allowing you to "bookmark" lines of interest; this can be a big help, but is often not enough.

    The Solution: TextAnalysisTool.NET - a program designed from the start to excel at viewing, searching, and navigating large files quickly and efficiently. TextAnalysisTool.NET provides a view of the file that you can easily manipulate (through the use of various filters) to display exactly the information you need - as you need it.

    Filters: Before displaying the lines of a file, TextAnalysisTool.NET passes the lines of that file through a set of user-defined filters, dimming or hiding all lines that do not satisfy any of the filters. Filters can select only the lines that contain a sub-string, those that have been marked with a particular marker type, or those that match a regular expression. A color can be associated with each filter so lines matching a particular filter stand out and so lines matching different filters can be easily distinguished. In addition to the normal "including" filters that isolate lines of text you DO want to see, there are also "excluding" filters that can be used to suppress lines you do NOT want to see. Excluding filters are configured just like including filters but are processed afterward and remove all matching lines from the set. Excluding filters allow you to easily refine your search even further.
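
    The including/excluding semantics described above can be sketched in a few lines. This is a hypothetical model, not TextAnalysisTool.NET's actual code: the LineFilter and FilterEngine names are invented, marker filters and colors are left out, substring matching is assumed to be case-sensitive, non-matching lines are hidden rather than dimmed, and at least one including filter is assumed to be present.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical model of a text filter (sub-string or regular expression).
public class LineFilter
{
    public readonly string Pattern;
    public readonly bool IsRegex;
    public readonly bool IsExcluding;

    public LineFilter(string pattern, bool isRegex, bool isExcluding)
    {
        Pattern = pattern;
        IsRegex = isRegex;
        IsExcluding = isExcluding;
    }

    public bool Matches(string line)
    {
        return IsRegex
            ? Regex.IsMatch(line, Pattern)
            : line.IndexOf(Pattern, StringComparison.Ordinal) >= 0;
    }
}

public static class FilterEngine
{
    // A line stays visible when at least one including filter matches it
    // and no excluding filter does.
    public static List<string> VisibleLines(IEnumerable<string> lines, IList<LineFilter> filters)
    {
        List<string> visible = new List<string>();
        foreach (string line in lines)
        {
            bool included = false;
            bool excluded = false;
            foreach (LineFilter filter in filters)
            {
                if (!filter.Matches(line)) { continue; }
                if (filter.IsExcluding) { excluded = true; } else { included = true; }
            }
            if (included && !excluded) { visible.Add(line); }
        }
        return visible;
    }
}
```

    Because exclusion wins regardless of filter order, adding an excluding filter can only ever narrow the set of visible lines, which is what makes it useful for refining a search.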

    Markers: Markers are another way that TextAnalysisTool.NET makes it easy to navigate a file; you can mark any line with one or more of eight different marker types. Once lines have been marked, you can quickly navigate between similarly marked lines - or add a "marked by" filter to view only those lines.

    Find: TextAnalysisTool.NET also provides a flexible "find" function that allows you to search for text anywhere within a file. This text can be a literal string or a regular expression, so it's easy to find a specific line. If you decide to turn a find string into a filter, the history feature of both dialogs makes it easy.

    Summary: TextAnalysisTool.NET was written with speed and ease of use in mind throughout. It saves you time by allowing you to save and load filter sets; it lets you import text by opening a file, dragging-and-dropping a file or text from another application, or by pasting text from the clipboard; and it allows you to share the results of your filters by copying lines to the clipboard or by saving the current lines to a file. TextAnalysisTool.NET supports files encoded with ANSI, UTF-8, Unicode, and big-endian Unicode and is designed to handle large files efficiently.

    I maintain a TODO list with a variety of user requests, but I thought I'd see what kind of feedback I got from releasing TextAnalysisTool.NET to the public before I decide where to go with the next release. I welcome suggestions - and problem reports - so please share them with me if you've got any!

    I hope you find TextAnalysisTool.NET as useful as I have!

    Tags: Technical TextAnalysisTool Utilities