The blog of dlaa.me
Tag: "Web"
  • Just another rando with a polyfill [math-random-polyfill.js is a browser-based polyfill for JavaScript's Math.random() that tries to make it more random]
    Thursday, December 8th 2016

    JavaScript's Math.random() function has a well-deserved reputation for not generating truly random numbers. (Gasp!) Modern browsers offer a solution with the crypto.getRandomValues() function that new code should be using instead. However, most legacy scripts haven't been - and won't be - updated for the new hotness.

    I wanted to improve the behavior of legacy code and looked around for a polyfill of Math.random() that leveraged crypto.getRandomValues() to generate output, but didn't find one. It seemed straightforward to implement, so I created math-random-polyfill.js with the tagline: A browser-based polyfill for JavaScript's Math.random() that tries to make it more random.

    You can learn more about math-random-polyfill.js on its GitHub project page which includes the following:

    The MDN documentation for Math.random() explicitly warns that return values should not be used for cryptographic purposes. Failing to heed that advice can lead to problems, such as those documented in the article TIFU by using Math.random(). However, there are scenarios - especially involving legacy code - that don't lend themselves to easily replacing Math.random() with crypto.getRandomValues(). For those scenarios, math-random-polyfill.js attempts to provide a more random implementation of Math.random() to mitigate some of its disadvantages.

    ...

    math-random-polyfill.js works by intercepting calls to Math.random() and returning the same 0 <= value < 1 based on random data provided by crypto.getRandomValues(). Values returned by Math.random() should be completely unpredictable and evenly distributed - both of which are true of the random bits returned by crypto.getRandomValues(). The polyfill maps those values into floating point numbers by using the random bits to create integers distributed evenly across the range 0 <= value <= Number.MAX_SAFE_INTEGER, then dividing by Number.MAX_SAFE_INTEGER + 1. This maintains the greatest amount of randomness and precision during the transfer from the integer domain to the floating point domain.
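
    The mapping described above can be sketched roughly as follows. This is an illustration, not the actual source of math-random-polyfill.js: the function name secureRandom and its fillRandom parameter are assumptions made so the arithmetic can be exercised with fixed input, and the default relies on a global crypto object being available.

```javascript
// Illustrative sketch only (not the polyfill's real code).
// fillRandom is a hypothetical hook; by default it defers to crypto.getRandomValues().
function secureRandom(fillRandom = (array) => globalThis.crypto.getRandomValues(array)) {
  // Request 64 random bits as two unsigned 32-bit words
  const words = new Uint32Array(2);
  fillRandom(words);
  // Keep 21 bits of the first word and all 32 bits of the second: 53 bits total
  const high = words[0] >>> 11;
  const low = words[1];
  // 0 <= integer <= Number.MAX_SAFE_INTEGER (2^53 - 1)
  const integer = (high * 0x100000000) + low;
  // Dividing by 2^53 yields 0 <= value < 1
  return integer / (Number.MAX_SAFE_INTEGER + 1);
}
```

    Assigning a function like this to Math.random is one way legacy callers could pick up the new behavior without being changed themselves.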

    I've included a set of unit tests meant to detect the kinds of mistakes that would compromise the usefulness of math-random-polyfill.js. The test suite passes on the five (most popular) browsers I tried, which leads me to be cautiously optimistic about the validity and viability of this approach. :)

    Tags: Technical Web
  • Sky-Hole Revisited [Pi-Hole in a cloud VM for easy DNS-based ad-blocking]
    Monday, November 21st 2016

    I wrote about my adventures running a Pi-Hole in the cloud for DNS-based ad-blocking roughly a year ago. In the time since, I've happily used a Sky-Hole for all the devices and traffic at home. When updating my Sky-Hole virtual machine recently, I used a simpler approach than before and wanted to briefly document the new workflow.

    For more context on why someone might want to use a DNS-based ad-blocker, please refer to the original post.

    Installation

    1. Create an Ubuntu Server virtual machine with your cloud provider of choice (such as Azure or AWS)

      Note: Thanks to improvements by the Pi-Hole team, it's now able to run in the smallest virtual machine size

    2. Connect via SSH and update the package database:

      sudo apt-get update

    3. Install Pi-Hole:

      curl -L https://install.pi-hole.net | bash

      Note: Running scripts directly from the internet is risky, so consider using the alternate install instead

    4. Open the dnsmasq configuration file:

      sudo nano /etc/dnsmasq.d/01-pihole.conf

    5. Turn off logging by commenting-out the corresponding line:

      #log-queries

    6. Open the Pi-Hole configuration file:

      sudo nano /etc/pihole/setupVars.conf

    7. Update it to use an invalid address for blocked domains:

      IPv4_address=0.0.0.0

    8. Re-generate the block list:

      sudo /opt/pihole/gravity.sh

    9. Verify the block list looks reasonable:

      cat /etc/pihole/gravity.list

    10. Verify logging is off:

      cat /var/log/pihole.log

    11. Reboot to ensure everything loads successfully:

      sudo reboot

    12. Grant access to the virtual machine's public IP address by opening the relevant network ports (incoming UDP and TCP on port 53)

    Don't forget

    If you use a Pi-Hole regularly, please consider donating to the Pi-Hole project so the maintainers can continue developing and improving it.

    Tags: Miscellaneous Technical Web
  • Free as in ... HTTPS certificates? [Obtaining and configuring a free HTTPS certificate for an Azure Web App with a custom domain]
    Wednesday, May 18th 2016

    Providing secure access to all Internet content - not just that for banking and buying - is quickly becoming the norm. Although setting up a web site has been fairly easy for years, enabling HTTPS for that site was more challenging. The Let's Encrypt project is trying to improve things for everyone - by making certificates free and easier to use, they enable more sites to offer secure access.

    Let's Encrypt is notable for (at least) two achievements. The first is lowering the cost for anyone to obtain a certificate - you can't beat free! The second is simplifying the steps to enable HTTPS on a server. Thus far, Let's Encrypt has focused their efforts on Linux systems, so the process for Windows servers hasn't changed much. Further complicating things, many sites nowadays are hosted by services like Azure or CloudFlare, which makes validating ownership more difficult.

    As someone who is in the process of migrating content from a virtual machine with a custom domain to an Azure Web App, I've been looking for an easy way to make use of Let's Encrypt certificates. A bit of searching turned up some helpful resources:

    Nothing was exactly what I wanted, so I came up with the following approach based on tweaks to the first two articles above. The Let's Encrypt tool runs on Linux, so I use that platform exclusively. Everything can be done in a terminal window, so it's easily scripted. There is no need to open a firewall or use another machine; everything can be done in one place. And by taking advantage of the nifty ability to boot from a Live CD, the technique is easy to apply even if you don't have a Linux box handy.

    1. Boot an Ubuntu 16.04 Live CD
    2. Run "Software & Updates" and enable the "universe" repository
    3. sudo apt install letsencrypt
    4. sudo apt install git
    5. git config --global user.email "user@example.com"
    6. git config --global user.name "User Name"
    7. git clone https://example.scm.azurewebsites.net:443/Example.git
      • Be sure /.well-known/acme-challenge/web.config exists and is configured to allow extension-less files:
        <configuration>
          <system.webServer>
            <staticContent>
              <mimeMap fileExtension="" mimeType="text/plain"/>
            </staticContent>
          </system.webServer>
        </configuration>
        
    8. sudo letsencrypt certonly --manual --domain example.com --domain www.example.com --email user@example.com --agree-tos --text
      • Note: Include the --test-cert option when trying things out
    9. Repeat for each domain:
      1. nano verification-file and paste the provided content
      2. git add verification-file
      3. git commit -m "Add verification file."
      4. git push
      5. Allow Let's Encrypt to verify ownership by fetching the verification file
    10. sudo openssl pkcs12 -export -inkey /etc/letsencrypt/live/example.com/privkey.pem -in /etc/letsencrypt/live/example.com/fullchain.pem -out fullchain.pfx -passout pass:your-password
    11. Follow the steps to Configure a custom domain name in Azure App Service using fullchain.pfx
    12. Enjoy browsing your site securely!
    Tags: Technical Web
  • Respect my securitah! [The check-pages suite now prefers HTTPS and includes a CLI]
    Wednesday, May 11th 2016

    There are many best practices to keep in mind when maintaining a web site, so it's helpful to have tools that check for common mistakes. I've previously written about two Node.js packages I created for this purpose, check-pages and grunt-check-pages, both of which can be easily integrated into an automated workflow. I updated them recently and wish to highlight two aspects.

    HTTPS

    There's a movement underway to make the Internet safer, and one of the best ways is to use the secure HTTPS protocol when browsing the web. Not all sites support HTTPS, but many do, and it's good to link to the secure version of a page when available. The trick is knowing when that's possible - especially for links created long ago or before a site was updated to support HTTPS. That's where the new --preferSecure option comes in: it raises an error whenever a page links to potentially-secure content insecurely. Scanning a site with the --checkLinks/--preferSecure options enabled is now an easy way to identify links that could be updated to provide a safer browsing experience.
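
    As a rough illustration of the idea (not the actual check-pages implementation, which may differ), a checker could derive the HTTPS counterpart of each insecure link and then test whether that address responds. The helper name secureCounterpart is hypothetical:

```javascript
// Hypothetical helper: returns the HTTPS form of an http: link,
// or null when the link is not http: (already secure, mailto:, etc.)
function secureCounterpart(link) {
  const url = new URL(link);
  if (url.protocol !== "http:") {
    return null;
  }
  url.protocol = "https:";
  return url.href;
}
```

    Under this sketch, a link would only be reported if a request for the returned address succeeds, so sites without HTTPS support would be left alone.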

    Aside: The moarTLS Chrome extension does a similar thing in the browser; check it out!

    CLI

    check-pages is easy to integrate into an automated workflow, but sometimes it's nice to run one-off tests or experiment interactively with a site's configuration. To that end, I created a simple command-line wrapper that exposes all the check-pages functionality (including --preferSecure) in a way that's easy to use on the platform/shell of your choice. Simply install it via npm, point it at the page(s) of interest, and review the list of possible issues. Here's the output of the --help command:

    Usage: check-pages <page URLs> [options]
    
    Checks:
      --checkLinks        Validates each link on a page  [boolean]
      --checkCaching      Validates Cache-Control/ETag  [boolean]
      --checkCompression  Validates Content-Encoding  [boolean]
      --checkXhtml        Validates page structure  [boolean]
    
    checkLinks options:
      --linksToIgnore     List of URLs to ignore  [array]
      --noEmptyFragments  Fails for empty fragments  [boolean]
      --noLocalLinks      Fails for local links  [boolean]
      --noRedirects       Fails for HTTP redirects  [boolean]
      --onlySameDomain    Ignores links to other domains  [boolean]
      --preferSecure      Verifies HTTPS when available  [boolean]
      --queryHashes       Verifies query string file hashes  [boolean]
    
    Options:
      --summary          Summarizes issues after running  [boolean]
      --terse            Results on one line, no progress  [boolean]
      --maxResponseTime  Response timeout (milliseconds)  [number]
      --userAgent        Custom User-Agent header  [string]
      --version          Show version number  [boolean]
      --help             Show help  [boolean]
    
    Checks various aspects of a web page for correctness.
    https://github.com/DavidAnson/check-pages-cli
    
    Tags: Node.js Technical Web
  • Delayed Reaction [My experience converting a jQuery/Knockout.js application to use the React library]
    Tuesday, March 22nd 2016

    It's important to stay up-to-date with technology trends and popular frameworks. That was part of the reason I wrote this blog using Node.js and it's part of the reason I recently converted a project to use the React library. That project was PassWeb, a simple, secure cloud-based password manager. I wrote PassWeb almost two years ago and use it nearly every day. If you're interested in how it works, please read the introductory blog post about PassWeb. For the purposes of this post, the thing to know is that PassWeb is built on the popular jQuery and Knockout.js frameworks.

    To be clear, both frameworks are perfectly good - but switching was a great opportunity to learn about React. :)

    Conversion

    The original architecture was pretty much what you'd expect: application logic lives in a JavaScript file and the user interface lives in an HTML file. My goal when converting to React was to make as few changes to the logic as possible in order to minimize the risk of introducing behavioral bugs. So I worked in stages:

    Having performed the bulk of the migration, all that remained was to identify and fix the handful of bugs that got introduced along the way.

    Details

    While JSX isn't required to use React, it's a natural fit and I chose JSX so I could get the full React experience. Putting JSX in the browser means using a transpiler to convert the embedded HTML to JavaScript. Babel provides excellent support for this via the React preset and was easy to work with. Because I was now running code through a transpiler, I also enabled the ES2015 Preset which supports newer features of the JavaScript language like let, const, and lambda expressions. I only scratched the surface of ES2015, but it was nice to be able to do so for "free".

    One thing I noticed as I migrated more and more code was that much of what I was writing was boilerplate to deal with the propagation of state to and from (observable) properties. I captured this repetitive code in three helper methods and doing so significantly simplified components. (Projects like ReactLink formalize this pattern within the React ecosystem.)

    Findings

    Something I was curious about was how performance would differ after converting to React. For the most part, things were fast enough before that there was no need to optimize. Except for one scenario: filtering the list of items interactively. So I'd already tuned the Knockout implementation for better performance by toggling the visibility (CSS display:none) of items that don't match the filter instead of removing and re-adding them to/from the DOM.

    When I converted to React, I used the simplest implementation and - unsurprisingly - this scenario performed worse. The first thing I did was implement the shouldComponentUpdate function on the component corresponding to each list item (as recommended by the Advanced Performance section of the docs). React's built-in performance tools are very useful and quickly showed the need for this optimization (as well as confirming the benefits). Two helpful posts that discuss the topic further are Optimizing React Performance using keys, component life cycle, and performance tools and Performance Engineering with React.
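
    A minimal sketch of that optimization, assuming each list item's props are plain values that can be compared shallowly. The names shallowEqual and listItemBehavior are illustrative and not taken from PassWeb; in a real component the method would live on a class extending React.Component:

```javascript
// Compares two props objects one level deep
function shallowEqual(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return (keysA.length === keysB.length) &&
    keysA.every((key) =>
      Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key]));
}

// Hypothetical list item behavior: skip re-rendering when props are unchanged
const listItemBehavior = {
  shouldComponentUpdate(nextProps) {
    return !shallowEqual(this.props, nextProps);
  }
};
```

    Returning false tells React it can skip rendering that item, which matters when a filter change touches hundreds of items but alters only a few.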

    Implementing shouldComponentUpdate was a good start, but I had the same basic problem that adding and removing hundreds of elements just wasn't snappy. So I made the same visibility optimization, introducing another component to act as a thin wrapper around the existing one and deal exclusively with visibility. After that, the overall performance of the filter scenario was improved to approximate parity. (Actually, React was still a little slower for the 10,000 item case, but fared better in other areas, and I'm comfortable declaring performance roughly equivalent between the two implementations.)

    Other considerations are complexity and size. Two frameworks have been replaced by one, so that's a pretty clear win on the complexity side. Size is a little murky, though. The minified size of the React framework is a little smaller than the combined sizes of jQuery and Knockout. However, the size of the new JSX file is notably larger than the templated HTML it replaces (recall that the code for logic stayed basically the same). And compiling JSX tends to expand the size of the code. But fortunately, Babel lets you minify scripts and that's enough to offset most of the growth. In the end, the React version of PassWeb is slightly smaller than the jQuery/Knockout version - but not enough to be the sole reason to convert.

    Conclusion

    Now that the dust has settled, would I do it all over again? Definitely! :)

    Although there weren't dramatic victories in performance or size, I like the modular approach React encourages and feel it may lead to simpler code. I also like that React combines UI logic and presentation better and allowed me to completely gut the HTML file (which now contains only head and script tags). I also see value in unifying an application's state into one place (formalized by libraries like Redux), though I deliberately didn't explore that here. Most importantly, this was a good learning experience and I really enjoyed getting to know React.

    I'll definitely consider React for my next project - maybe even finding an excuse to explore React Native...

    Tags: Technical Utilities Web
  • It's alive ... photo! [Live Photos via Web Components]
    Wednesday, November 18th 2015

    I'd been meaning to learn more about the Web Components standard and recently found the inspiration to do so in the form of a small project to explore the idea of bringing Apple's "Live Photo" experience to the web:

    Apple introduced Live Photos with iOS 9, a feature that automatically associates a short video with every picture that's taken. I was skeptical at first, wondering how relevant this would be for static content; and it turns out not to be all that compelling for some kinds of photos. But for dynamic scenes or people in motion, the video can add some really interesting context!

    Live Photos on iOS are (naturally) smooth and easy to use. I wondered what it might be like to bring a similar experience to the web. I'd also been looking for a reason to explore Web Components. And so live-photo-web was born!

    To find out more, please visit live-photo-web on GitHub and/or try out the interactive demo!

    Tags: Technical Web
  • Pie in the Sky-Hole [A Pi-Hole in the cloud for ad-blocking via DNS]
    Monday, August 24th 2015

    Inspired by Marco Arment's recent post about blocking advertisements on the web, I decided to explore the same idea. However, while Marco focuses on the annoyance of advertisements, I am interested in the security benefits of removing them. There have been numerous incidents of otherwise respectable websites compromising the security of their users due to the advertisements they include. Searches for "web site hacked 'ad network'" on Google and Bing provide some examples; another is this XSS attack on Troy Hunt's site, which is interesting thanks to the detailed analysis Troy provides. Popular sites of all kinds have been compromised in this way, and one might argue they should be treated as attackers because of the approach used to serve third-party ads.

    Marco's article describes an in-browser solution for ad-blocking, but I prefer something that automatically protects all the machines on my network (at least, while they're using the network; see below). So I set out looking for something that works at the network level and came across Pi-Hole, a DNS-based ad-blocker for the Raspberry Pi. Aside from the fact that I don't own a Pi, this seemed like exactly what I wanted. ;)

    Fortunately, there are no actual dependencies on Pi hardware, so I decided to create my own Pi-Hole on a server in the cloud - thus the name "Sky-Hole". To do so, I opened the Microsoft Azure Portal, created a small virtual machine running Ubuntu Server 15.04, and configured it according to the manual instructions for Pi-Hole (with a few customizations outlined below). Then I updated my wireless router to use Sky-Hole as the DNS server for my home network - and all my devices stopped showing advertisements!

    Directions

    I used a minimal set of steps to configure the Sky-Hole and list them below so they're easy to reproduce. I made a couple of tweaks to the Pi-Hole process along the way and explain them in turn.

    First, create a virtual machine to run everything on (I've used both Microsoft Azure and Amazon Web Services, but any provider should do). Then, install dnsmasq:

    sudo apt-get -y install dnsmasq
    sudo update-rc.d dnsmasq enable
    sudo mv /etc/dnsmasq.conf /etc/dnsmasq.orig
    sudo nano /etc/dnsmasq.conf
    

    Configure dnsmasq.conf as follows (replacing "sky-hole" on the last line with the host name of your virtual machine):

    domain-needed
    bogus-priv
    no-resolv
    server=8.8.8.8
    server=8.8.4.4
    interface=eth0
    listen-address=127.0.0.1
    cache-size=10000
    log-queries
    log-facility=/var/log/pihole.log
    local-ttl=300
    addn-hosts=/etc/pihole/gravity.list
    host-record=sky-hole,127.0.0.1,::1
    

    The addn-hosts option is meant to be optional, but I needed it because /etc/hosts was not updated by gravity.sh. The host-record option was necessary to avoid a "sudo: unable to resolve host" error which showed up whenever I enabled dnsmasq. (Though this may be an artifact of the default virtual machine configuration under Azure.)

    Update 2015-08-30: host-record was similarly necessary on AWS, where the automatically-assigned host name was of the form ip-123-123-123-123.

    Now, download the Pi-Hole script and run it to generate the list of domain names to block:

    sudo curl -o /usr/local/bin/gravity.sh https://raw.githubusercontent.com/jacobsalmela/pi-hole/master/gravity.sh
    sudo chmod 755 /usr/local/bin/gravity.sh
    sudo /usr/local/bin/gravity.sh
    sudo sed -i "s/^[0-9\.]\+\s/0.0.0.0 /g" /etc/pihole/gravity.list
    

    The last line is my own and replaces the virtual machine's IP address with an unusable 0.0.0.0 address when redirecting undesirable sites. Because I'm not running a web server on the Sky-Hole, this seems like a more appropriate way to block unwanted domain names. (Besides, hostname -I in Azure reports the virtual machine's internal address which is on a private network.)
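
    To make the effect of that substitution concrete, here is the same pattern applied to a single line in JavaScript. The sample address and host name are made up, and the function name is hypothetical:

```javascript
// Mirrors the sed expression: replace the leading IP address (digits and dots
// followed by one whitespace character) with 0.0.0.0
function rewriteGravityLine(line) {
  return line.replace(/^[0-9.]+\s/, "0.0.0.0 ");
}

console.log(rewriteGravityLine("10.0.0.4 ads.example.com"));
// prints "0.0.0.0 ads.example.com"
```

    Lines that don't begin with an address are left untouched, so the rewrite is safe to apply to the whole file.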

    Restart dnsmasq to apply the changes:

    sudo service dnsmasq restart
    

    Now, test things locally via ping, dig, nslookup (or similar) to verify that desirable domain names are returned as-is and undesirable ones are blocked by returning the 0.0.0.0 IP. Assuming that's the case, update the virtual machine to accept incoming UDP traffic on port 53 (per the DNS specification) and test again from a different machine. If everything is working as expected, configure your router to use the Sky-Hole's public IP address for DNS resolution. This automatically applies to all devices on the local network and avoids the need to update each one manually.

    Update 2015-08-30: You may also want to enable TCP traffic on port 53 (per RFC 5966).
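
    For the "test again from a different machine" step, a small Node.js script can stand in for dig or nslookup. This is a sketch under assumptions: the function names are hypothetical, and the server address and domain name in the commented example are placeholders to replace with your own:

```javascript
const { Resolver } = require("dns").promises;

// True when every address returned for a name is the unusable 0.0.0.0 placeholder
function isBlocked(addresses) {
  return addresses.length > 0 && addresses.every((address) => address === "0.0.0.0");
}

// Asks one specific DNS server (for example, the Sky-Hole) about a name
async function checkName(serverIp, name) {
  const resolver = new Resolver();
  resolver.setServers([serverIp]);
  const addresses = await resolver.resolve4(name);
  return { name, addresses, blocked: isBlocked(addresses) };
}

// Example with placeholder values:
// checkName("203.0.113.10", "ads.example.com").then(console.log, console.error);
```

    Running it once against a name that should be blocked and once against one that should not gives a quick before/after check when the configuration changes.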

    Congratulations, you're done!

    Notes

    • The nice thing about this approach is that it covers all the machines on your network. However, it can only protect machines when they're connected to that network. Taking a phone or tablet elsewhere or using cellular data exempts a device from this kind of protection.
      • So this may be an argument in favor of per-device ad-blocking - though perhaps as a strategy to be used in addition to (rather than instead of) a network-wide approach.
    • When creating the virtual machine, I used the Basic A1 size which would cost about $34.97 per month on Azure (though I don't plan to leave it running very long).
      • I tried the A0 size first (which would have cost $13.39 per month on Azure), but it ran out of memory building the domain list, seemingly due to this known issue.
    • As I note above, I chose not to configure a local web server on my Sky-Hole. While doing so offers interesting benefits, it didn't seem compelling for the purposes of this experiment and I preferred to keep things simple. Should you choose to, directions are available in the Pi-Hole documentation.
    • If you end up using Pi-Hole like this (or on its own) please consider donating to the author, Jacob Salmela, to help support his work.

    Conclusion

    I've only been running Sky-Hole for a couple of days, but the usability and performance improvements for some sites are quite noticeable. More importantly, it seems to me the browsing experience is necessarily safer by virtue of removing not just a subset of traffic, but the subset which is most likely to contain unwanted content.

    As an experiment and a learning experience, Sky-Hole has been a successful side-project. I hope others find it interesting or thought-provoking and I welcome comments on improving or enhancing the approach!

    Tags: Miscellaneous Technical Web
  • Lint-free documentation [markdownlint is a Node.js style checker and lint tool for Markdown files]
    Tuesday, May 12th 2015

    I'm a strong believer in using static analysis tools to identify problems and catch mistakes. The Node.js/io.js community has some great options for linting JavaScript code (ex: JSHint and ESLint), and I use them regularly. But code isn't the only important asset - documentation can be just as important to a project's success.

    The open-source community has pretty much standardized on Markdown for documentation which is a great choice because it's easy to read, write, and understand. That said, Markdown has a syntax, so there are "right" and "wrong" ways to do things - and not all parsers handle nuances the same way (though the CommonMark effort is trying to standardize). In particular, there are constructs that can lead to missing/broken text in some parsers but which are not obviously wrong in the original Markdown.

    To show what I mean, I created a Gist of common Markdown mistakes. If you're not a Markdown expert, you might learn something by comparing the source and output. :)

    Aside: The Markdown parser used by GitHub is quite good - but many issues are user error and it can't (yet) read your mind.

     

    You shouldn't need to be a Markdown expert to avoid silly mistakes - that's what we have computers for. When I looked around for a Node-based linter, I didn't see anything - but I did find a very nice implementation for Ruby by Mark Harrison. I don't tend to have Ruby available in my development environment, but I had an itch to scratch, so I installed it and added a couple of rules to Mark's tool for the checks I wanted. Mark kindly accepted the corresponding pull requests, and all was well.

    Except that once I'd tasted of the fruit of Markdown linting, I wanted to integrate it into other workflows - many of which are exclusively Node-based. I briefly entertained the idea of creating a Node package to install Ruby then use it to install and run a Ruby gem - but that made my head hurt...

     

    So I prototyped a Node version of markdownlint by porting a few rules over and then ran the idea by Mark. He was supportive (and raised some great points!), so I gradually ported the rest of the rules to JavaScript with the same numbering/naming system to make it easy for people to migrate between the two tools. Mark already had a fantastic test infrastructure and great documentation for rules, so I shamelessly reused both in the Node version. Configuration for JavaScript tools is typically JSON, so the Node version uses a slightly different format than Ruby (though both are simple/obvious). I started with a fully asynchronous API for efficiency, but ended up adding a synchronous version for scenarios where that's more convenient. I strived to achieve functional parity with the Ruby implementation (and continue to do so as Mark makes updates!), but duplicating the CLI was a non-goal (please have a look at the mdl gem if that's what you need).

    If this sounds interesting, please have a look at markdownlint on GitHub. As of this writing, it supports the same set of ~40 rules that the Ruby implementation does - you can read all about them in Mark's fantastic Rules.md. markdownlint exposes a single API which can be called in an asynchronous or synchronous manner and accepts an options object to identify the files/strings to lint and the set of rules to apply. It returns a simple object that lists the items that were checked along with the line numbers for any violations. The documentation shows off all of this and includes examples of calling markdownlint from both gulp and Grunt.

     

    To make sure markdownlint works well, I've integrated it into some of my own projects, including this blog which I wrote specifically to allow authoring in Markdown. That's a nice start, but it doesn't prove markdownlint can handle larger projects with significant documentation written by different people at different times. For that you'd need to integrate with a project like ESLint which has extensive documentation that's entirely Markdown-based.

    So I did. :) Supporting ESLint was one of the motivating factors behind porting markdownlint to Node in the first place: I love the tool and use it in all my projects. The documentation is excellent, but every now and then I'd come across weird or broken text. After submitting a couple of pull requests with fixes, I decided adding a Markdown linter to their test script would be a better way to keep typos out of the documentation. It turns out this was on the team's radar as well, and they - especially project owner Nicholas - were very helpful and accommodating as I introduced markdownlint and tweaked things to satisfy some of the rules.

     

    At this point, maybe I've convinced you markdownlint works for my own purposes and that it works for some other purposes, but it's likely you have special requirements or would like to "try before you buy". (Which seems an ironic thing to say about free software, but there's a cost to everything, so maybe it's not that unreasonable after all.) Well, I have just the thing for you:

    An interactive markdownlint demo that runs in the browser!

    Although browser support was not (is not!) a goal, the relevant code is all JavaScript with just one dependency (that itself offers browser support) and only two methods that need polyfills (trimLeft/trimRight). So it was actually fairly straightforward (with some help from Browserify) to create a standalone, offline-enabled web page that lets anyone use a (modern) browser to experiment with markdownlint and validate arbitrary content. To make it super easy to get started, I made some deliberate mistakes in the sample content for the demo - feel free to fix them for me. :)
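
    Polyfills for those two methods can be very small. The following is a sketch, not the code used by the demo, and the helper names are illustrative:

```javascript
// Standalone helpers that strip whitespace from one end of a string
function trimLeft(text) {
  return text.replace(/^\s+/, "");
}

function trimRight(text) {
  return text.replace(/\s+$/, "");
}

// Install them only where the environment lacks the built-in methods
if (!String.prototype.trimLeft) {
  String.prototype.trimLeft = function() {
    return trimLeft(String(this));
  };
}
if (!String.prototype.trimRight) {
  String.prototype.trimRight = function() {
    return trimRight(String(this));
  };
}
```

    Guarding each assignment keeps the native implementation in place wherever the browser already provides one.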

     

    In summary:

    • Markdown is great
    • It's easy to read and write
    • Sometimes it doesn't do what you think
    • There are tools to help
    • markdownlint is one of them
    • Get it for Ruby or Node
    • Or try it in the browser
    Tags: Node.js Technical Web
  • Supporting both sides of the Grunt vs. Gulp debate [check-pages is a Gulp-friendly task to check various aspects of a web page for correctness]
    Tuesday, February 10th 2015

    A few months ago, I wrote about grunt-check-pages, a Grunt task to check various aspects of a web page for correctness. I use grunt-check-pages when developing my blog and have found it very handy for preventing mistakes and maintaining consistency.

    Two things have changed since then:

    1. I released multiple enhancements to grunt-check-pages that make it more powerful
    2. I extracted its core functionality into the check-pages package which works well with Gulp

     

    First, an overview of the improvements; here's the change log for grunt-check-pages:

    • 0.1.0 - Initial release, support for checkLinks and checkXhtml.
    • 0.1.1 - Tweak README for better formatting.
    • 0.1.2 - Support page-only mode (no link or XHTML checks), show response time for requests.
    • 0.1.3 - Support maxResponseTime option, buffer all page responses, add "no-cache" header to requests.
    • 0.1.4 - Support checkCaching and checkCompression options, improve error handling, use gruntMock.
    • 0.1.5 - Support userAgent option, weak entity tags, update nock dependency.
    • 0.2.0 - Support noLocalLinks option, rename disallowRedirect option to noRedirects, switch to ESLint, update superagent and nock dependencies.
    • 0.3.0 - Support queryHashes option for CRC-32/MD5/SHA-1, update superagent dependency.
    • 0.4.0 - Rename onlySameDomainLinks option to onlySameDomain, fix handling of redirected page links, use page order for links, update all dependencies.
    • 0.5.0 - Show location of redirected links with noRedirects option, switch to crc-hash dependency.
    • 0.6.0 - Support summary option, update crc-hash, grunt-eslint, nock dependencies.
    • 0.6.1 - Add badges for automated build and coverage info to README (along with npm, GitHub, and license).
    • 0.6.2 - Switch from superagent to request, update grunt-eslint and nock dependencies.
    • 0.7.0 - Move task implementation into reusable check-pages package.
    • 0.7.1 - Fix misreporting of "Bad link" for redirected links when noRedirects enabled.

    There are now more things you can validate and better diagnostics during validation. For information about the various options, visit the grunt-check-pages package in the npm repository.

     

    Second, I started looking into Gulp as an alternative to Grunt. My blog's Gruntfile.js is the most complicated I have, so I tried converting it to a gulpfile.js. Conveniently, existing packages supported everything I already do (test, LESS, lint) - though not what I use grunt-check-pages for (no surprise).

    Clearly, the next step was to create a version of the task for Gulp - but it turns out that's not necessary! Gulp's task structure is simple enough that invoking standard asynchronous helpers is easy to do inline. So all I really needed was to factor out the core functionality into a reusable method.

    Here's how that looks:

    /**
     * Checks various aspects of a web page for correctness.
     *
     * @param {object} host Specifies the environment.
     * @param {object} options Configures the task.
     * @param {function} done Callback function.
     * @returns {void}
     */
    module.exports = function(host, options, done) { ... }
    

    With that in place, it's easy to invoke check-pages - whether from a Gulp task or something else entirely. The host parameter handles log/error messages (pass console for convenience), options configures things in the usual fashion, and the done callback gets called at the end (with an Error argument if anything went wrong).

    Like so:

    var gulp = require("gulp");
    var checkPages = require("check-pages");
    
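    // Gulp 3.x signature: task name, prerequisite tasks, then the task function.
    // Calling callback (with an Error on failure) tells Gulp the task has finished.
    gulp.task("checkDev", [ "start-development-server" ], function(callback) {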
      var options = {
        pageUrls: [
          'http://localhost:8080/',
          'http://localhost:8080/blog',
          'http://localhost:8080/about.html'
        ],
        checkLinks: true,
        onlySameDomain: true,
        queryHashes: true,
        noRedirects: true,
        noLocalLinks: true,
        linksToIgnore: [
          'http://localhost:8080/broken.html'
        ],
        checkXhtml: true,
        checkCaching: true,
        checkCompression: true,
        maxResponseTime: 200,
        userAgent: 'custom-user-agent/1.2.3',
        summary: true
      };
      checkPages(console, options, callback);
    });
    
    gulp.task("checkProd", function(callback) {
      var options = {
        pageUrls: [
          'http://example.com/',
          'http://example.com/blog',
          'http://example.com/about.html'
        ],
        checkLinks: true,
        maxResponseTime: 500
      };
      checkPages(console, options, callback);
    });
    

    As a result, grunt-check-pages has become a thin wrapper over check-pages and there's no duplication between the two packages (though each has a complete set of tests just to be safe). For information about the options above, visit the check-pages package in the npm repository.
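
    To make the "thin wrapper" idea concrete, here's a rough sketch of what such a Grunt task could look like. It is not the actual grunt-check-pages source: the registerCheckPagesTask helper is hypothetical, checkPages is passed in as an argument so the wiring can be tried without a network, and it assumes the host only needs console-like log and error functions.

    ```javascript
    "use strict";

    // Hypothetical helper that registers a Grunt multi-task delegating to check-pages.
    function registerCheckPagesTask(grunt, checkPages) {
      grunt.registerMultiTask(
        "checkPages",
        "Checks various aspects of a web page for correctness.",
        function() {
          // Assumption: host exposes log/error functions, just like console.
          var host = {
            log: function(message) {
              grunt.log.writeln(message);
            },
            error: function(message) {
              grunt.log.error(message);
            }
          };
          // Grunt hands out a completion function for asynchronous tasks.
          var done = this.async();
          checkPages(host, this.options(), function(err) {
            // Passing false to Grunt's done function marks the task as failed.
            done(!err);
          });
        }
      );
    }

    // In a Gruntfile.js this might be wired up as:
    //   module.exports = function(grunt) {
    //     registerCheckPagesTask(grunt, require("check-pages"));
    //   };
    ```

    Because all the real work happens behind the (host, options, done) signature, the wrapper has nothing to duplicate - it only translates between Grunt's conventions and the shared function.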

     

    The combined effect is that I'm able to do a better job validating web site updates and I can use whichever of Grunt or Gulp feels more appropriate for a given scenario. That's good for peace of mind - and a great way to become more familiar with both tools!

    Tags: Node.js Technical Web
  • Everything old is new again [crc-hash is a Node.js Crypto Hash implementation for the CRC algorithm]
    Tuesday, January 27th 2015

    Yep, another post about hash functions... True, I could have stopped when I implemented CRC-32 for .NET or when I implemented MD5 for Silverlight. Certainly, sharing the code for four versions of ComputeFileHashes could have been a good laurel upon which to rest.

    But then I started using Node.js, and found one more hash-oriented itch to scratch. :)

    From the project page:

    Node.js's Crypto module implements the Hash class which offers a simple Stream-based interface for creating hash digests of data. The createHash function supports many popular algorithms like SHA and MD5, but does not include older/simpler CRC algorithms like CRC-32. Fortunately, the crc package in npm provides comprehensive CRC support and offers an API that can be conveniently used by a Hash subclass.

    crc-hash is a Crypto Hash wrapper for the crc package that makes it easy for Node.js programs to use the CRC family of hash algorithms via a standard interface.

    With just one (transitive!) dependency, crc-hash is lightweight. Because it exposes a common interface, it's easy to integrate with existing scenarios. Thanks to crc, it offers support for all the popular CRC algorithms. You can learn more on the crc-hash npm page or the crc-hash GitHub page.
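
    As an illustration of what "CRC behind a Hash-style interface" means, the sketch below implements CRC-32 directly and exposes it through update/digest methods shaped like the ones on Node's Hash class. It is not how crc-hash is implemented (that delegates to the crc package), it skips the Stream side of the interface entirely, and the createCrc32Hash name is made up for this example.

    ```javascript
    "use strict";

    // Lookup table for the standard CRC-32 (reflected polynomial 0xEDB88320).
    const CRC_TABLE = new Uint32Array(256);
    for (let n = 0; n < 256; n++) {
      let c = n;
      for (let k = 0; k < 8; k++) {
        c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
      }
      CRC_TABLE[n] = c >>> 0;
    }

    // Hypothetical factory returning an object with Hash-like update/digest methods.
    function createCrc32Hash() {
      let crc = 0xFFFFFFFF;
      return {
        // Feed more data into the running checksum; returns this for chaining.
        update(data, encoding) {
          const buffer = Buffer.isBuffer(data) ? data : Buffer.from(data, encoding || "utf8");
          for (const byte of buffer) {
            crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >>> 8);
          }
          return this;
        },
        // Produce the 4-byte result as a Buffer, or as a string if an encoding is given.
        digest(encoding) {
          const result = Buffer.alloc(4);
          result.writeUInt32BE((crc ^ 0xFFFFFFFF) >>> 0, 0);
          return encoding ? result.toString(encoding) : result;
        }
      };
    }

    // Usage mirrors crypto.createHash(...).update(...).digest("hex"):
    console.log(createCrc32Hash().update("123456789").digest("hex")); // prints "cbf43926"
    ```

    The value cbf43926 is the well-known CRC-32 check value for the ASCII string "123456789", which makes it a handy sanity test for any implementation.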

    Notes:

    • One of the great things about the Node community is the breadth of packages available. In this case, I was able to leverage the comprehensive crc package by alexgorbatchev for all the algorithmic bits.
    • After being indifferent on the topic of badges, I discovered shields.io and its elegance won me over. You can see the five badges I picked near the top of README.md on the npm/GitHub pages above.
    Tags: Node.js Technical Web