The blog of dlaa.me

Posts tagged "Utilities"

Absolute coordinates corrupt absolutely [MouseButtonClicker gets support for absolute coordinates and virtual machine/remote desktop scenarios]

I wrote and shared MouseButtonClicker almost 12 years ago. It's a simple Windows utility to automatically click the mouse button for you. You can read about why that's interesting and how the program works in my blog post, "MouseButtonClicker clicks the mouse so you don't have to!" [Releasing binaries and source for a nifty mouse utility].

I've used MouseButtonClicker at work and at home pretty much every day since I wrote it. It works perfectly well for normal scenarios, and also for virtual machine guest scenarios, because mouse input is handled by the host operating system and clicks are generated as user input to the remote desktop app. That means it's possible to wave the mouse into and out of a virtual machine window and get the desired automatic-clicking behavior everywhere.

However, for a number of months now, I haven't been able to use MouseButtonClicker in one specific computing environment. The details aren't important, but you can imagine a scenario like this one where the host operating system is locked down and doesn't permit the user (me) to execute third-party code. Clearly, it's not possible to run MouseButtonClicker on the host OS in this case, but shouldn't it work fine within a virtual machine guest OS?

Actually, no. It turns out that raw mouse input messages for the VM guest are provided to that OS in absolute coordinates rather than relative coordinates. I had previously disallowed this scenario because absolute coordinates are also how tablet input devices like those made by Wacom present their input, and you do not want automatic mouse button clicking when using a pen or digitizer.

But desperate times call for desperate measures and I decided to reverse my decision and add support for absolute mouse input based on this new understanding. The approach I took was to remember the previous absolute position and subtract it from the current absolute position to create a relative change - then reuse all the rest of the existing code. I used my Wacom tablet to generate absolute coordinates for development purposes, though it sadly violates the specification and reports values in the range [0, screen width/height]. This mostly doesn't matter, except that the anti-jitter bounding box I describe in the original blog post is tied to the units of mouse movement.
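
To illustrate, here's a minimal sketch of that conversion (JavaScript for readability; the real implementation is native Win32 code handling raw input messages, and the names below are illustrative, not the program's):

// Convert absolute positions into relative deltas by remembering the previous position
let previousPosition = null;
function onAbsoluteInput(x, y) {
  let dx = 0;
  let dy = 0;
  if (previousPosition !== null) {
    dx = x - previousPosition.x; // current minus previous = relative change
    dy = y - previousPosition.y;
  }
  previousPosition = { x: x, y: y };
  handleRelativeMove(dx, dy); // hand off to the existing relative-coordinate logic
}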

Aside: The Wacom tablet also seems to generate a lot of spurious [0, 0] input messages (which are ignored), but I didn't spend time tracking this down because it's not a real use case.

In the relative coordinate world, ±2 pixels is very reasonable for jitter detection. However, in the absolute coordinate world (assuming an implementation that complies with the specification) 2 units out of 65,536 is insignificant. For reference, on a 4K display, each screen pixel is about 16 of these units. One could choose to scale the absolute input values according to the current width and height of the screen, but that runs into challenges when multiple screens (at different resolutions!) are present. Instead, I decided to scale all absolute input values down to a 4K maximum, thereby simulating a 4K (by 4K) screen for absolute coordinate scenarios. It's not perfect, but it's quick and it has proven accurate enough for the purposes of avoiding extra clicks due to mouse jitter.
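
Here's a sketch of that scaling (again, JavaScript for illustration; the constants and names are mine):

// Spec-compliant absolute coordinates span [0, 65535]; map them onto a
// simulated 4K axis so the existing +/-2-unit jitter box stays meaningful
const ABSOLUTE_RANGE = 65536;
const SIMULATED_4K = 4096;
function scaleAbsolute(value) {
  // 65536 / 4096 = 16, which matches "each screen pixel is about 16 of these units"
  return Math.floor((value * SIMULATED_4K) / ABSOLUTE_RANGE);
}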

I also took the opportunity to publish the source code on GitHub and convert from a Visual Studio solution over to a command-line build based on the Windows SDK and build tools (free for anyone to download, see the project README for more). Visual Studio changes its solution/project formats so frequently that I don't recall ever having come back to a project and been able to load it in the current version of VS without having to migrate it and deal with weird issues. Instead, having done this conversion, everything is simpler, more explicit, and free for all. I additionally opted into new security flags in the compiler and linker so any 1337 haxxors trying to root my machine via phony mouse events/hardware are going to have a slightly harder time of it.

Aside: I set the compiler/linker flags to optimize for size (including an undocumented linker flag to actually exclude debug info), but the new executables are roughly double the size from before. Maybe it's the new anti-hacking protections? Maybe it's Maybelline?

With the addition of absolute coordinate support, MouseButtonClicker now works seamlessly in both environments: on the host operating system or within a virtual machine guest window. That said, there are some details to be aware of. For example, dragging the mouse out of a remote desktop window looks to the relevant OS as though the mouse was dragged to the edge of the screen and then stopped moving. That means a click should be generated, and you might be surprised to see windows getting activated when their non-client area is clicked on. If you're really unlucky, you'll drag the mouse out of the upper-right corner and a window will disappear as its close button is clicked (darn you, Fitts!). It should be possible to add heuristics to prevent this sort of behavior, but it's also pretty easy to avoid once you know about it. (Kind of like how you quickly learn to recognize inert parts of the screen where the mouse pointer can safely rest.)

The latest version of the code (with diffs to the previously published version) is available on the MouseButtonClicker GitHub project page. 32- and 64-bit binaries are available for download on the Releases page. Pointers to a few similar tools for other operating systems/environments can be found at the bottom of the README.

If you've never experienced automatic mouse button clicking, I encourage you to try it out! There's a bit of an adjustment period to work through (and it's probably not going to appeal to everyone), but auto-clicking (a.k.a. dwell-clicking, a.k.a. hover-clicking) can be quite nice once you get the hang of it.

A picture is worth a thousand words [A small script to update contact photos on iOS]

I'm always looking for new ways to develop code. My latest adventure was writing a script to update contact photos on iOS for a prettier experience in the Messages app conversation list. The script is called update-ios-contact-images.js and it runs in Scriptable, a great app for interacting with the iOS platform from JavaScript. The idea is:

There are various conditions where Apple iOS can't (or won't) synchronize Contact photos between iPhone/iPad devices. If you're in this situation and want it to "just work", you can configure each device manually. Or you can run this script to do that for you.

update-ios-contact-images.js takes a list of email addresses and optional image links and sets the photo for matching contacts in your address book. If an image link is provided, it's used as-is; if not, the Gravatar for that email address is used instead.

You can find more in the update-ios-contact-images.js repository on GitHub, but the code is short enough that I've included it below.

Notes

  • As I said on Twitter, this was the first meaningful programming project I did completely on iPad (and iPhone). Research, prototyping, coding, debugging, documentation, and posting to GitHub were all done on an iOS device (using a Bluetooth keyboard at times). The overall workflow has some rough edges, but many of the pieces are there today to do real-world development tasks.
  • I'm accustomed to using JavaScript Promises directly, but took this opportunity to try out async and await. The latter are definitely easier to use - and probably easier to understand (though the syntax error JavaScriptCore gives for using await outside an async function is not obvious to me). However, the lack of support for "parallelism" by async/await means you still need to know about Promises and be comfortable using helpers like Promise.all (so I wonder how much of a leaky abstraction this ends up being). The sketch after this list shows the difference in miniature.
    • Yes, I know JavaScript is technically single-threaded in this context; that's why I put the word "parallelism" in quotes above. :)
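
Here's that difference (a minimal sketch; updateContact is a hypothetical helper that returns a Promise):

// Sequential: each update waits for the previous one to finish
for (const account of accounts) {
  await updateContact(account);
}

// "Parallel": start every update, then wait for all of them together
await Promise.all(accounts.map(account => updateContact(account)));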

Code

// update-ios-contact-images
// A Scriptable (https://scriptable.app/) script to update contact photos on iOS
// Takes a list of email accounts and image URLs and assigns each image to the corresponding contact
// https://github.com/DavidAnson/update-ios-contact-images

// List of email accounts and images to update
const accounts = [
  {
    email: "test1@example.com"
    // No "image" property; uses Gravatar
  },
  {
    email: "test2@example.com",
    image: "https://example.com/images/test2.jpg"
  }
];

// MD5 hash for Gravatar (see https://github.com/blueimp/JavaScript-MD5 and https://cdnjs.com/libraries/blueimp-md5)
eval(await (new Request("https://cdnjs.cloudflare.com/ajax/libs/blueimp-md5/2.10.0/js/md5.min.js")).loadString());

// Load all address book contacts
const contacts = await Contact.all(await ContactsContainer.all());

// For all accounts...
await Promise.all(accounts.map(account => {
  // Normalize email address
  const emailLower = account.email.trim().toLowerCase();
  // For all contacts with that email...
  return Promise.all(contacts.
    filter(contact => contact.emailAddresses.some(address => address.value.toLowerCase() === emailLower)).
    map(async contact => {
      // Use specified image or fallback to Gravatar (see https://en.gravatar.com/site/implement/images/)
      const url = account.image || `https://www.gravatar.com/avatar/${md5(emailLower)}`;
      // Load image from web
      contact.image = await (new Request(url).loadImage());
      // Update contact
      Contact.update(contact);
      console.log(`Updated contact image for "${emailLower}" to "${url}"`);
    }));
}));

// Save changes
Contact.persistChanges();
console.log("Saved changes");

Convert *all* the things [PowerShell script to convert RAW and HEIC photos to JPEG]

Like most people, I take lots of photos. Like many people, I save them in the highest-quality format (often RAW). Like some people, I edit those pictures on a desktop computer.

Support for RAW images has gotten better over the years, but there are still many tools and programs that do not support these bespoke formats. So it's handy to have a quick and easy way to convert such photos into a widely-supported format like JPEG. There are many tools to do so, but it's hard to beat a command-line script for simplicity and ease of use.

I didn't know of one that met my criteria, so I wrote a PowerShell script:

ConvertTo-Jpeg - A PowerShell script that converts RAW (and other) image files to the widely-supported JPEG format

Notes:

  • This script uses the Windows.Graphics.Imaging API to decode and encode. That API supports a variety of file formats, and when new formats are added to the list, they are automatically recognized by the script. Because the underlying implementation is maintained by the Windows team, it is fast and secure.
    • As it happens, support for a new format showed up in the days since I wrote this script: Windows 10's April 2018 Update added support for HEIC/HEIF images such as those created by iPhones running iOS 11.
  • The Windows.Graphics.Imaging API is intended for use by Universal Windows Platform (UWP) applications, but I am using it from PowerShell. This is unfortunately harder than it should be, but allowed me to release a single script file which anybody can read and audit.
    • Transparency is not a goal for everyone, but it's important to me - especially in today's environment where malware is so prevalent. I don't trust random code on the Internet, so I prefer to use - and create - open implementations when possible.
  • The choice of PowerShell had some drawbacks. For one, it is not a language I work with often, so I spent more time looking things up than I normally do. For another, it's clear that interoperating with UWP APIs is not a core scenario for PowerShell. In particular, calling asynchronous methods is tricky, and I did a lot of searching before I found a solution I liked: Using WinRT's IAsyncOperation in PowerShell.
  • There are some obvious improvements that could be made, but I deliberately started simple and will add features if/when the need arises.

Binary Log OBjects, gotta download 'em all! [A simple tool to download blobs from an Azure container]

The latest in a series of "I didn't want to write a thing, but couldn't find another thing that already did exactly what I wanted, which is probably because I'm too picky, but whatever" projects, azure-blob-container-download (a.k.a. abcd) is a simple, command-line tool to download all the blobs in an Azure storage container. Here's how it's described in the README:

A simple, cross-platform tool to bulk-download blobs from an Azure storage container.

Though limited in scope, it does a specific set of things the official tools don't.

The motivation for this project was the same as with my previous post about getting an HTTPS certificate: I've migrated my website from a virtual machine to an Azure Web App. And while it's easy to enable logging for a Web App and get hourly log files in the W3C Extended Log File Format, it wasn't obvious to me how to parse those logs offline to measure traffic, referrers, etc. (Although that's not something I've bothered with up to now, it's an ability I'd like to have.) What I wanted was a trustworthy, cross-platform tool to download all those log files to a local machine - but the options I investigated each seemed to be missing something.

So I wrote a simple Node.js CLI and gave it a few extra features to make my life easier. The code is fairly compact and straightforward (and the dependencies minimal), so it's easy to audit. The complete options for downloading and filtering are:

Usage: abcd [options]

Options:
  --account           Storage account (or set AZURE_STORAGE_ACCOUNT)  [string]
  --key               Storage access key (or set AZURE_STORAGE_ACCESS_KEY)  [string]
  --containerPattern  Regular expression filter for container names  [string]
  --blobPattern       Regular expression filter for blob names  [string]
  --startDate         Starting date for blobs  [string]
  --endDate           Ending date for blobs  [string]
  --snapshots         True to include blob snapshots  [boolean]
  --version           Show version number  [boolean]
  --help              Show help  [boolean]

Download blobs from an Azure container.
https://github.com/DavidAnson/azure-blob-container-download

Azure Web Apps create a new log file every hour, so they add up quickly; abcd's date filtering options make it easy to perform incremental downloads. The default directory structure (based on / separators) is collapsed during download, so all files end up in the same directory (named by container) and ordered by date. The tool limits itself to one download at a time, so things proceed at a steady, moderate pace. Once blobs have finished downloading, you're free to do with them as you please. :)
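
For example, a typical incremental download of one month's logs might look like this (account name, key, and dates are placeholders):

abcd --account myaccount --key STORAGE_ACCESS_KEY --blobPattern "\.log$" --startDate 2016-07-01 --endDate 2016-08-01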

Find out more on the GitHub project page for azure-blob-container-download.

This tool made from 100% recycled electrons [DHCPLite is a small, simple, configuration-free DHCP server for Windows]

Earlier this week, I posted the code for a tool I wrote many years ago:

In 2001, I wrote DHCPLite to unblock development scenarios between Windows and prototype hardware we were developing on. I came up with the simplest DHCP implementation possible and took all the shortcuts I could - but it was enough to get the job done! Since then, I've heard from other teams using DHCPLite for scenarios of their own. And recently, I was asked by some IoT devs to share DHCPLite with that community. So I dug up the code, cleaned it up a bit, got it compiling with the latest toolset, and am sharing the result here.

If that sounds interesting, please visit the DHCPLite project page on GitHub.

Trivia
  • Other than a couple of assumptions that changed in the last 15 years (ex: interface order), the original code worked pretty much as-is.
  • Now that the CRT has sensible string functions, I was able to replace my custom strncpyz helper with strncpy_s and _TRUNCATE.
  • Now that the IP Helper API is broadly available, I was able to statically link to it instead of loading dynamically via classes LibraryLoader and AnsiString.
  • Now that support for the STL is commonplace, I was able to replace the custom GrowableThingCollection implementation with std::vector.
  • I used the "one true brace style" (editor's note: hyperbole!) from Code Complete, but it never really caught on and the code now honors Visual Studio defaults.

It's been a while since I wrote native code and revisiting DHCPLite was a fond trip down memory lane. I remain a firm believer in the benefits of a garbage-collected environment like JavaScript or C#/.NET for productivity and correctness - but there's also something very satisfying about writing native code and getting all the tricky little details right.

Or at least thinking you have... :)

Delayed Reaction [My experience converting a jQuery/Knockout.js application to use the React library]

It's important to stay up-to-date with technology trends and popular frameworks. That was part of the reason I wrote this blog using Node.js and it's part of the reason I recently converted a project to use the React library. That project was PassWeb, a simple, secure cloud-based password manager. I wrote PassWeb almost two years ago and use it nearly every day. If you're interested in how it works, please read the introductory blog post about PassWeb. For the purposes of this post, the thing to know is that PassWeb is built on the popular jQuery and Knockout.js frameworks.

To be clear, both frameworks are perfectly good - but switching was a great opportunity to learn about React. :)

Conversion

The original architecture was pretty much what you'd expect: application logic lives in a JavaScript file and the user interface lives in an HTML file. My goal when converting to React was to make as few changes to the logic as possible in order to minimize the risk of introducing behavioral bugs. So I worked in stages.

With the bulk of the migration done, all that remained was to identify and fix the handful of bugs that got introduced along the way.

Details

While JSX isn't required to use React, it's a natural fit and I chose JSX so I could get the full React experience. Putting JSX in the browser means using a transpiler to convert the embedded HTML to JavaScript. Babel provides excellent support for this via the React preset and was easy to work with. Because I was now running code through a transpiler, I also enabled the ES2015 Preset which supports newer features of the JavaScript language like let, const, and lambda expressions. I only scratched the surface of ES2015, but it was nice to be able to do so for "free".
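
For a sense of what the transpiler does, here's a trivial (hypothetical) JSX fragment along with roughly what Babel's React preset turns it into:

// JSX input:
var greeting = <div className="greeting">Hello, {name}!</div>;

// Approximate transpiled output:
var greeting = React.createElement(
  "div",
  { className: "greeting" },
  "Hello, ",
  name,
  "!"
);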

One thing I noticed as I migrated more and more code was that much of what I was writing was boilerplate to deal with the propagation of state to and from (observable) properties. I captured this repetitive code in three helper methods and doing so significantly simplified components. (Projects like ReactLink formalize this pattern within the React ecosystem.)
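
To give a flavor of the pattern (PassWeb's actual helpers differ; this is a hypothetical sketch in the style of ReactLink), a helper like the following reduces two-way binding to a one-liner in render:

// Create the { value, requestChange } pair ReactLink expects,
// linking an input element to a key of the component's state
function linkState(component, key) {
  return {
    value: component.state[key],
    requestChange: function (newValue) {
      var update = {};
      update[key] = newValue;
      component.setState(update);
    }
  };
}

// Usage inside a component's render method:
// <input valueLink={linkState(this, "userName")} />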

Findings

Something I was curious about was how performance would differ after converting to React. For the most part, things were fast enough before that there was no need to optimize. Except for one scenario: filtering the list of items interactively. So I'd already tuned the Knockout implementation for better performance by toggling the visibility (CSS display:none) of unfiltered items instead of removing and re-adding them to/from the DOM.

When I converted to React, I used the simplest implementation and - unsurprisingly - this scenario performed worse. The first thing I did was implement the shouldComponentUpdate function on the component corresponding to each list item (as recommended by the Advanced Performance section of the docs). React's built-in performance tools are very useful and quickly showed the need for this optimization (as well as confirming the benefits). Two helpful posts that discuss the topic further are Optimizing React Performance using keys, component life cycle, and performance tools and Performance Engineering with React.
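
In its simplest form, the optimization is a quick reference check that skips re-rendering any item whose data hasn't changed (a sketch, assuming item objects are replaced rather than mutated when edited; names are illustrative):

var Entry = React.createClass({
  shouldComponentUpdate: function (nextProps) {
    // Re-render only when this item's data actually changed
    return nextProps.item !== this.props.item;
  },
  render: function () {
    return <li>{this.props.item.name}</li>;
  }
});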

Implementing shouldComponentUpdate was a good start, but I had the same basic problem that adding and removing hundreds of elements just wasn't snappy. So I made the same visibility optimization, introducing another component to act as a thin wrapper around the existing one and deal exclusively with visibility. After that, the overall performance of the filter scenario was improved to approximate parity. (Actually, React was still a little slower for the 10,000 item case, but fared better in other areas, and I'm comfortable declaring performance roughly equivalent between the two implementations.)
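
A sketch of what such a wrapper might look like (again illustrative, not PassWeb's actual code):

var ShowHide = React.createClass({
  render: function () {
    // Keep children mounted in the DOM; just toggle their CSS visibility
    var style = { display: this.props.visible ? "" : "none" };
    return <div style={style}>{this.props.children}</div>;
  }
});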

Other considerations are complexity and size. Two frameworks have been replaced by one, so that's a pretty clear win on the complexity side. Size is a little murky, though. The minified size of the React framework is a little smaller than the combined sizes of jQuery and Knockout. However, the size of the new JSX file is notably larger than the templated HTML it replaces (recall that the code for logic stayed basically the same). And compiling JSX tends to expand the size of the code. But fortunately, Babel lets you minify scripts and that's enough to offset most of the growth. In the end, the React version of PassWeb is slightly smaller than the jQuery/Knockout version - but not enough to be the sole reason to convert.

Conclusion

Now that the dust has settled, would I do it all over again? Definitely! :)

Although there weren't dramatic victories in performance or size, I like the modular approach React encourages and feel it may lead to simpler code. I also like that React combines UI logic and presentation better and allowed me to completely gut the HTML file (which now contains only head and script tags). I also see value in unifying an application's state into one place (formalized by libraries like Redux), though I deliberately didn't explore that here. Most importantly, this was a good learning experience and I really enjoyed getting to know React.

I'll definitely consider React for my next project - maybe even finding an excuse to explore React Native...

Out of hibernation [A new home and a bunch of updates for TextAnalysisTool.NET]

TextAnalysisTool.NET is one of the first side projects I did at Microsoft, and one of the most popular. (Click here for relevant blog posts by tag.) Many people inside and outside the company have written me with questions, feature requests, or sometimes just to say "thank you". It's always great to hear from users, and they've provided a long list of suggestions and ideas for ways to make TextAnalysisTool.NET better.

By virtue of changing teams and roles various times over the years, I don't find myself using TextAnalysisTool.NET as much as I once did. My time and interests are spread more thinly, and I haven't been updating the tool as aggressively. (Understatement of the year?)

Various coworkers have asked for access to the code, but nothing much came of that - until recently, when a small group showed up with the interest, expertise, and motivation to drive TextAnalysisTool.NET forward! They inspired me to simplify the contribution process and they have been making a steady stream of enhancements for a while now. It's time to take things to the next level, and today marks the first public update to TextAnalysisTool.NET in a long time!


The new source for all things TextAnalysisTool is the TextAnalysisTool.NET home page

That's where you'll find an overview, download link, release notes, and other resources. The page is owned by the new TextAnalysisTool GitHub organization, so all of us are able to make changes and publish new releases. There's also an issue tracker, so users can report bugs, comment on issues, update TODOs, make suggestions, etc.

The new 2015-01-07 release can be downloaded from there, and includes the following changes since the 2013-05-07 release:

2015-01-07 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added a tooltip to the loaded file indicator in the status bar
* Fixed a bug where setting a marker used in an active filter causes the
  current selection of lines to be changed

2015-01-07 by David Anson (http://dlaa.me/)
----------
* Improve HTML representation of clipboard text when copying for more
  consistent paste behavior

2015-01-01 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Fixed a bug where TAB characters are omitted in the display
* Fixed a bug where lines saved to file include an extra white space at the
  start

2014-12-21 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Changed compilation to target .NET Framework 4.0

2014-12-11 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Redesigned the status bar indications to be consistent with Visual Studio and
  added the number of currently selected lines

2014-12-04 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added the ability to append an existing filters file to the current filters
  list

2014-12-01 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added recent file/filter menus for easy access to commonly-used files
* Added a new settings registry key to set the
  maximum number of recent files or filter files allowed in the
  corresponding file menus
* Fixed bug where pressing SPACE with no matching lines from filters
  crashed the application
* Fixed a bug where copy-pasting lines from the application to Lync
  resulted in one long line without carriage returns

2014-11-11 by Uriel Cohen (http://github.com/cohen-uriel)
----------
* Added support for selection of background color in the filters
  (different selection of colors than the foreground colors)
* The background color can be saved and loaded with the filters
* Filters from previous versions that lack a background color will have the
  default background color
* Saving foreground color field in filters to 'foreColor' attribute.
  Old 'color' attribute is still being loaded for backward compatibility
  purposes.
* Changed control alignment in Find dialog and Filter dialog

2014-10-21 by Mike Morante (http://github.com/mike-mo)
----------
* Fix localization issue with the build string generation

2014-04-22 by Mike Morante (http://github.com/mike-mo)
----------
* Line metadata is now visually separate from line text contents
* Markers can be shown always/never/when in use to have more room for line text
  and the chosen setting persists across sessions
* Added statusbar panel funnel icon to reflect the current status of the Show
  Only Filtered Lines setting

2014-02-27 by Mike Morante (http://github.com/mike-mo)
----------
* Added zoom controls to quickly increase/decrease the font size
* Zoom level persists across sessions
* Added status bar panel to show current zoom level


These improvements were all possible thanks to the time and dedication of the new contributors (and organization members): Uriel Cohen and Mike Morante.

Please join me in thanking these generous souls for taking time out of their busy schedule to contribute to TextAnalysisTool.NET! They've been a pleasure to work with, and a great source of ideas and suggestions. I've been really pleased with their changes and hope you find the new TextAnalysisTool.NET more useful than ever!

Casting a Spell✔ [A simple app that makes it easy to spell-check with a browser]

I pay attention to spelling. Maybe it's because I'm not a very good speller. Maybe it's because I have perfectionist tendencies. Maybe it's just a personality flaw.

Whatever the reason, I'm always looking for ways to avoid mistakes.

Some time ago, I wrote a code analysis rule for .NET/FxCop. More recently I shared two command-line programs to extract strings and comments from C#/JavaScript and followed up with a similar extractor for HTML/XML.

Those tools work well, but I also wanted something with a GUI for a more natural fit with UI-centric workflows. I prototyped a simple WPF app that worked okay, but it wasn't as ubiquitously available as I wanted. Surprisingly often, I'd find myself on a machine without the latest version of the tool. (Classic first world problem, I know...) So I decided to go with a web app instead.

The key observation was that modern browsers already integrate with the host operating system's spell-checker via the spellcheck HTML attribute. By leveraging that, my app would automatically get a comprehensive dictionary, support for multiple languages, native-code performance, and support for the user's custom dictionary. #winning!
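
Enabling that integration can be as simple as setting a single property (a minimal sketch):

// Opt a text area into the browser/OS spell-checker
var editor = document.createElement("textarea");
editor.spellcheck = true;
document.body.appendChild(editor);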

Aside: The (very related) forceSpellCheck API isn't supported by any browser I've tried. Fortunately, it's not needed for Firefox, its absence can be coded around for Chrome, and there's a simple manual workaround for Internet Explorer. Click the "Help / About" link in the app for more information.


Inspired by web browsers' native support for spell-checking, I've created Spell✔ (a.k.a. SpellV), a simple app that makes it easy to spell-check with a browser. Click the link to try it out now - it's offline-enabled, so you can use it again later even without a network connection!

To import content, Spell✔ supports pasting text from the clipboard, drag-dropping a file, or browsing the folder structure. For a customized experience, you can switch among multiple views of the data, including:

  • Original text (duh...)
  • Unique words, sorted alphabetically and displayed compactly for easy scanning
  • HTML/XML content (including text, comments, and attributes)
  • JSON content (string values only)
  • JavaScript code comments and string literals

Read more about how Spell✔ works, how it was built, or check out the code on the GitHub page for SpellV!

A trip down memory (footprint) lane [Download for the original TextAnalysisTool, circa 2001]

As you might guess from the name, TextAnalysisTool.NET (introductory blog post, related links) was not the first version of the tool. The original implementation was written in C, compiled for x86, slightly less capable, and named simply TextAnalysisTool. I got an email asking for a download link recently, so I dug up a copy and am posting it for anyone who's interested.

The UI should be very familiar to TextAnalysisTool.NET users:

The original TextAnalysisTool filtering a simple file

The behavior is mostly the same as well (though the different hot key for "add filter" trips me up pretty consistently).

A few notes:

  • The code is over 13 years old
  • So I'm not taking feature requests :)
  • But it runs on vintage operating systems (seriously, this is before Windows XP)
  • And it also runs great on Windows 8.1 (yay backward compatibility!)
  • It supports:
    • Text filters
    • Regular expressions
    • Markers
    • Find
    • Go to
    • Reload
    • Copy/paste
    • Saved configurations
    • Multi-threading
  • But does not support:
    • Colors
    • Rich selection
    • Rich copy
    • Line counts
    • Filter hot keys
    • Plugins
    • Unicode

Because it uses ASCII encoding for strings (vs. .NET's Unicode representation), you can reasonably expect loading a text file in TextAnalysisTool to use about half as much memory as it does in TextAnalysisTool.NET. However, as a 32-bit application, TextAnalysisTool is limited to the standard 2GB virtual address space of 32-bit processes on Windows (even on a 64-bit OS). On the other hand, TextAnalysisTool.NET is an architecture-neutral application and can use the full 64-bit virtual address space on a 64-bit OS. There may be rare machine configurations where the physical/virtual memory situation is such that the older TextAnalysisTool can load a file the newer TextAnalysisTool.NET can't - so if you're stuck, give it a try!

Aside: If you're really adventurous, you can try using EditBin to set the /LARGEADDRESSAWARE option on TextAnalysisTool.exe to get access to more virtual address space on a 64-bit OS or via /3GB on a 32-bit OS. But be warned that you're well into "undefined behavior" territory because I don't think that switch even existed when I wrote TextAnalysisTool. I've tried it briefly and things seem to work - but this is definitely sketchy. :)

Writing the original TextAnalysisTool was a lot of fun and contributed significantly to a library of C utility functions I used at the time called ToolBox. It also provided an excellent conceptual foundation upon which to build TextAnalysisTool.NET in addition to good lessons about how to approach the problem space. If I ever get around to writing a third version (TextAnalysisTool.WPF? TextAnalysisTool.Next?), it will take inspiration from both projects - and handle absurdly-large files.

So if you're curious to try a piece of antique software, click here to download the original TextAnalysisTool.

But for everything else, you should probably click here to download the newer TextAnalysisTool.NET.

"That's a funny looking warthog", a post about mocking Grunt [gruntMock is a simple mock for testing Grunt.js multi-tasks]

While writing the grunt-check-pages task for Grunt.js, I wanted a way to test the complete lifecycle: to load the task in a test context, run it against various inputs, and validate the output. It didn't seem practical to call into Grunt itself, so I looked around for a mock implementation of Grunt. There were plenty of mocks for use with Grunt, but I didn't find anything that mocked the API itself. So I wrote a very simple one and used it for testing.

That worked well, so I wanted to formalize my gruntMock implementation and post it as an npm package for others to use. Along the way, I added a bunch of additional API support and pulled in domain-based exception handling for a clean, self-contained implementation. As I hoped, updating grunt-check-pages made its tests simpler and more consistent.

Although gruntMock doesn't implement the complete Grunt API, it implements enough of it that I expect most tasks to be able to use it pretty easily. If not, please let me know what's missing! :)


For more context, here's part of the introductory section of README.md:

gruntMock is a simple mock object that simulates the Grunt task runner for multi-tasks and can be easily integrated into a unit testing environment such as Nodeunit. gruntMock invokes tasks the same way Grunt does and exposes (almost) the same set of APIs. After providing input to a task, gruntMock runs and captures its output so tests can verify expected behavior. Task success and failure are unified, so it's easy to write positive and negative tests.

Here's what gruntMock looks like in a simple scenario under Nodeunit:

var gruntMock = require('gruntmock');
var example = require('./example-task.js');

exports.exampleTest = {

  pass: function(test) {
    test.expect(4);
    var mock = gruntMock.create({
      target: 'pass',
      files: [
        { src: ['unused.txt'] }
      ],
      options: { str: 'string', num: 1 }
    });
    mock.invoke(example, function(err) {
      test.ok(!err);
      test.equal(mock.logOk.length, 1);
      test.equal(mock.logOk[0], 'pass');
      test.equal(mock.logError.length, 0);
      test.done();
    });
  },

  fail: function(test) {
    test.expect(2);
    var mock = gruntMock.create({ target: 'fail' });
    mock.invoke(example, function(err) {
      test.ok(err);
      test.equal(err.message, 'fail');
      test.done();
    });
  }
};


For a more in-depth example, have a look at the use of gruntMock by grunt-check-pages. That shows off integration with other mocks (specifically nock, a nice HTTP server mock) as well as the testOutput helper function that's used to validate each test case's output without duplicating code. It also demonstrates how gruntMock's unified handling of success and failure allows for clean, consistent testing of input validation, happy path, and failure scenarios.

To learn more - or experiment with gruntMock - visit gruntMock on npm or gruntMock on GitHub.

Happy mocking!