The blog of dlaa.me

Respect my securitah! [The check-pages suite now prefers HTTPS and includes a CLI]

There are many best practices to keep in mind when maintaining a web site, so it's helpful to have tools that check for common mistakes. I've previously written about two Node.js packages I created for this purpose, check-pages and grunt-check-pages, both of which can be easily integrated into an automated workflow. I updated them recently and wish to highlight two aspects.

HTTPS

There's a movement underway to make the Internet safer, and one of the best ways is to use the secure HTTPS protocol when browsing the web. Not all sites support HTTPS, but many do, and it's good to link to the secure version of a page when available. The trick is knowing when that's possible - especially for links created long ago or before a site was updated to support HTTPS. That's where the new --preferSecure option comes in: it raises an error whenever a page links to potentially-secure content insecurely. Scanning a site with the --checkLinks/--preferSecure option enabled is now an easy way to identify links that could be updated to provide a safer browsing experience.

Aside: The moarTLS Chrome extension does a similar thing in the browser; check it out!

CLI

check-pages is easy to integrate into an automated workflow, but sometimes it's nice to run one-off tests or experiment interactively with a site's configuration. To that end, I created a simple command-line wrapper that exposes all the check-pages functionality (including --preferSecure) in a way that's easy to use on the platform/shell of your choice. Simply install it via npm, point it at the page(s) of interest, and review the list of possible issues. Here's the output of the --help command:

Usage: check-pages <page URLs> [options]

Checks:
  --checkLinks        Validates each link on a page  [boolean]
  --checkCaching      Validates Cache-Control/ETag  [boolean]
  --checkCompression  Validates Content-Encoding  [boolean]
  --checkXhtml        Validates page structure  [boolean]

checkLinks options:
  --linksToIgnore     List of URLs to ignore  [array]
  --noEmptyFragments  Fails for empty fragments  [boolean]
  --noLocalLinks      Fails for local links  [boolean]
  --noRedirects       Fails for HTTP redirects  [boolean]
  --onlySameDomain    Ignores links to other domains  [boolean]
  --preferSecure      Verifies HTTPS when available  [boolean]
  --queryHashes       Verifies query string file hashes  [boolean]

Options:
  --summary          Summarizes issues after running  [boolean]
  --terse            Results on one line, no progress  [boolean]
  --maxResponseTime  Response timeout (milliseconds)  [number]
  --userAgent        Custom User-Agent header  [string]
  --version          Show version number  [boolean]
  --help             Show help  [boolean]

Checks various aspects of a web page for correctness.
https://github.com/DavidAnson/check-pages-cli