The blog of dlaa.me

Supporting both sides of the Grunt vs. Gulp debate [check-pages is a Gulp-friendly task to check various aspects of a web page for correctness]

A few months ago, I wrote about grunt-check-pages, a Grunt task to check various aspects of a web page for correctness. I use grunt-check-pages when developing my blog and have found it very handy for preventing mistakes and maintaining consistency.

Two things have changed since then:

  1. I released multiple enhancements to grunt-check-pages that make it more powerful
  2. I extracted its core functionality into the check-pages package which works well with Gulp

 

First, an overview of the improvements; here's the change log for grunt-check-pages:

  • 0.1.0 - Initial release, support for checkLinks and checkXhtml.
  • 0.1.1 - Tweak README for better formatting.
  • 0.1.2 - Support page-only mode (no link or XHTML checks), show response time for requests.
  • 0.1.3 - Support maxResponseTime option, buffer all page responses, add "no-cache" header to requests.
  • 0.1.4 - Support checkCaching and checkCompression options, improve error handling, use gruntMock.
  • 0.1.5 - Support userAgent option, weak entity tags, update nock dependency.
  • 0.2.0 - Support noLocalLinks option, rename disallowRedirect option to noRedirects, switch to ESLint, update superagent and nock dependencies.
  • 0.3.0 - Support queryHashes option for CRC-32/MD5/SHA-1, update superagent dependency.
  • 0.4.0 - Rename onlySameDomainLinks option to onlySameDomain, fix handling of redirected page links, use page order for links, update all dependencies.
  • 0.5.0 - Show location of redirected links with noRedirects option, switch to crc-hash dependency.
  • 0.6.0 - Support summary option, update crc-hash, grunt-eslint, nock dependencies.
  • 0.6.1 - Add badges for automated build and coverage info to README (along with npm, GitHub, and license).
  • 0.6.2 - Switch from superagent to request, update grunt-eslint and nock dependencies.
  • 0.7.0 - Move task implementation into reusable check-pages package.
  • 0.7.1 - Fix misreporting of "Bad link" for redirected links when noRedirects enabled.

There are now more things you can validate and better diagnostics during validation. For information about the various options, visit the grunt-check-pages package in the npm repository.

 

Secondly, I started looking into Gulp as an alternative to Grunt. My blog's Gruntfile.js is the most complicated I have, so I tried converting it to a gulpfile.js. Conveniently, existing packages supported everything I already do (test, LESS, lint) - though not what I use grunt-check-pages for (no surprise).

Clearly, the next step was to create a version of the task for Gulp - but it turns out that's not necessary! Gulp's task structure is simple enough that invoking standard asynchronous helpers is easy to do inline. So all I really needed was to factor out the core functionality into a reusable method.

Here's how that looks:

/**
 * Checks various aspects of a web page for correctness.
 *
 * @param {object} host Specifies the environment.
 * @param {object} options Configures the task.
 * @param {function} done Callback function.
 * @returns {void}
 */
module.exports = function(host, options, done) { ... }

With that in place, it's easy to invoke check-pages - whether from a Gulp task or something else entirely. The host parameter handles log/error messages (pass console for convenience), options configures things in the usual fashion, and the done callback gets called at the end (with an Error parameter if anything went wrong).

Like so:

var gulp = require("gulp");
var checkPages = require("check-pages");

gulp.task("checkDev", [ "start-development-server" ], function(callback) {
  var options = {
    pageUrls: [
      'http://localhost:8080/',
      'http://localhost:8080/blog',
      'http://localhost:8080/about.html'
    ],
    checkLinks: true,
    onlySameDomain: true,
    queryHashes: true,
    noRedirects: true,
    noLocalLinks: true,
    linksToIgnore: [
      'http://localhost:8080/broken.html'
    ],
    checkXhtml: true,
    checkCaching: true,
    checkCompression: true,
    maxResponseTime: 200,
    userAgent: 'custom-user-agent/1.2.3',
    summary: true
  };
  checkPages(console, options, callback);
});

gulp.task("checkProd", function(callback) {
  var options = {
    pageUrls: [
      'http://example.com/',
      'http://example.com/blog',
      'http://example.com/about.html'
    ],
    checkLinks: true,
    maxResponseTime: 500
  };
  checkPages(console, options, callback);
});

As a result, grunt-check-pages has become a thin wrapper over check-pages and there's no duplication between the two packages (though each has a complete set of tests just to be safe). For information about the options above, visit the check-pages package in the npm repository.

 

The combined effect is that I'm able to do a better job validating web site updates and I can use whichever of Grunt or Gulp feels more appropriate for a given scenario. That's good for peace of mind - and a great way to become more familiar with both tools!