The blog of dlaa.me

Posts tagged "Technical"

Sky-Hole Revisited [Pi-Hole in a cloud VM for easy DNS-based ad-blocking]

I wrote about my adventures running a Pi-Hole in the cloud for DNS-based ad-blocking roughly a year ago. In the time since, I've happily used a Sky-Hole for all the devices and traffic at home. When updating my Sky-Hole virtual machine recently, I used a simpler approach than before and wanted to briefly document the new workflow.

For more context on why someone might want to use a DNS-based ad-blocker, please refer to the original post.

Installation

  1. Create an Ubuntu Server virtual machine with your cloud provider of choice (such as Azure or AWS)

    Note: Thanks to improvements by the Pi-Hole team, it's now able to run in the smallest virtual machine size

  2. Connect via SSH and update the package database:

    sudo apt-get update

  3. Install Pi-Hole:

    curl -L https://install.pi-hole.net | bash

    Note: Running scripts directly from the internet is risky, so consider using the alternate install instead

  4. Open the dnsmasq configuration file:

    sudo nano /etc/dnsmasq.d/01-pihole.conf

  5. Turn off logging by commenting-out the corresponding line:

    #log-queries

  6. Open the Pi-Hole configuration file:

    sudo nano /etc/pihole/setupVars.conf

  7. Update it to use an invalid address for blocked domains:

    IPv4_address=0.0.0.0

  8. Re-generate the block list:

    sudo /opt/pihole/gravity.sh

  9. Verify the block list looks reasonable:

    cat /etc/pihole/gravity.list

  10. Verify logging is off:

    cat /var/log/pihole.log

  11. Reboot to ensure everything loads successfully:

    sudo reboot

  12. Grant access to the virtual machine's public IP address by opening the relevant network ports (incoming UDP and TCP on port 53)

Don't forget

If you use a Pi-Hole regularly, please consider donating to the Pi-Hole project so the maintainers can continue developing and improving it.

Binary Log OBjects, gotta download 'em all! [A simple tool to download blobs from an Azure container]

The latest in a series of "I didn't want to write a thing, but couldn't find another thing that already did exactly what I wanted, which is probably because I'm too picky, but whatever" projects, azure-blob-container-download (a.k.a. abcd) is a simple, command-line tool to download all the blobs in an Azure storage container. Here's how it's described in the README:

A simple, cross-platform tool to bulk-download blobs from an Azure storage container.

Though limited in scope, it does a specific set of things vs. the official tools:

The motivation for this project was the same as with my previous post about getting an HTTPS certificate: I've migrated my website from a virtual machine to an Azure Web App. And while it's easy to enable logging for a Web App and get hourly log files in the W3C Extended Log File Format, it wasn't obvious to me how to parse those logs offline to measure traffic, referrers, etc.. (Although that's not something I've bothered with up to now, it's an ability I'd like to have.) What I wanted was a trustworthy, cross-platform tool to download all those log files to a local machine - but the options I investigated each seemed to be missing something.

So I wrote a simple Node.JS CLI and gave it a few extra features to make my life easier. The code is fairly compact and straightforward (and the dependencies minimal), so it's easy to audit. The complete options for downloading and filtering are:

Usage: abcd [options]

Options:
  --account           Storage account (or set AZURE_STORAGE_ACCOUNT)  [string]
  --key               Storage access key (or set AZURE_STORAGE_ACCESS_KEY)  [string]
  --containerPattern  Regular expression filter for container names  [string]
  --blobPattern       Regular expression filter for blob names  [string]
  --startDate         Starting date for blobs  [string]
  --endDate           Ending date for blobs  [string]
  --snapshots         True to include blob snapshots  [boolean]
  --version           Show version number  [boolean]
  --help              Show help  [boolean]

Download blobs from an Azure container.
https://github.com/DavidAnson/azure-blob-container-download

Azure Web Apps create a new log file every hour, so they add up quickly; abcd's date filtering options make it easy to perform incremental downloads. The default directory structure (based on / separators) is collapsed during download, so all files end up in the same directory (named by container) and ordered by date. The tool limits itself to one download at a time, so things proceed at a steady, moderate pace. Once blobs have finished downloading, you're free to do with them as you please. :)

Find out more on the GitHub project page for azure-blob-container-download.

Free as in ... HTTPS certificates? [Obtaining and configuring a free HTTPS certificate for an Azure Web App with a custom domain]

Providing secure access to all Internet content - not just that for banking and buying - is quickly becoming the norm. Although setting up a web site has been fairly easy for years, enabling HTTPS for that site was more challenging. The Let's Encrypt project is trying to improve things for everyone - by making certificates free and easier to use, they enable more sites to offer secure access.

Let's Encrypt is notable for (at least) two achievements. The first is lowering the cost for anyone to obtain a certificate - you can't beat free! The second is simplifying the steps to enable HTTPS on a server. Thus far, Let's Encrypt has focused their efforts on Linux systems, so the process for Windows servers hasn't changed much. Further complicating things, many sites nowadays are hosted by services like Azure or CloudFlare, which makes validating ownership more difficult.

As someone who is in the process of migrating content from a virtual machine with a custom domain to an Azure Web App, I've been looking for an easy way to make use of Let's Encrypt certificates. A bit of searching turned up some helpful resources:

Nothing was exactly what I wanted, so I came up with the following approach based on tweaks to the first two articles above. The Let's Encrypt tool runs on Linux, so I use that platform exclusively. Everything can be done in a terminal window, so it's easily scripted. There is no need to open a firewall or use another machine; everything can be done in one place. And by taking advantage of the nifty ability to boot from a Live CD, the technique is easy to apply even if you don't have a Linux box handy.

  1. Boot an Ubuntu 16.04 Live CD

  2. Run "Software & Updates" and enable the "universe" repository

  3. sudo apt install letsencrypt

  4. sudo apt install git

  5. git config --global user.email "user@example.com"

  6. git config --global user.name "User Name"

  7. git clone https://example.scm.azurewebsites.net:443/Example.git

    • Be sure /.well-known/acme-challenge/web.config exits and is configured to allow extension-less files:

      <configuration>
        <system.webServer>
          <staticContent>
            <mimeMap fileExtension="" mimeType="text/plain"/>
          </staticContent>
        </system.webServer>
      </configuration>
      
  8. sudo letsencrypt certonly --manual --domain example.com --domain www.example.com --email user@example.com --agree-tos --text

    • Note: Include the --test-cert option when trying things out
  9. Repeat for each domain:

    1. nano verification-file and paste the provided content
    2. git add verification-file
    3. git commit -m "Add verification file."
    4. git push
    5. Allow Let's Encrypt to verify ownership by fetching the verification file
  10. sudo openssl pkcs12 -export -inkey /etc/letsencrypt/live/example.com/privkey.pem -in /etc/letsencrypt/live/example.com/fullchain.pem -out fullchain.pfx -passout pass:your-password

  11. Follow the steps to Configure a custom domain name in Azure App Service using fullchain.pfx

  12. Enjoy browsing your site securely!

Respect my securitah! [The check-pages suite now prefers HTTPS and includes a CLI]

There are many best practices to keep in mind when maintaining a web site, so it's helpful to have tools that check for common mistakes. I've previously written about two Node.js packages I created for this purpose, check-pages and grunt-check-pages, both of which can be easily integrated into an automated workflow. I updated them recently and wish to highlight two aspects.

HTTPS

There's a movement underway to make the Internet safer, and one of the best ways is to use the secure HTTPS protocol when browsing the web. Not all sites support HTTPS, but many do, and it's good to link to the secure version of a page when available. The trick is knowing when that's possible - especially for links created long ago or before a site was updated to support HTTPS. That's where the new --preferSecure option comes in: it raises an error whenever a page links to potentially-secure content insecurely. Scanning a site with the --checkLinks/--preferSecure option enabled is now an easy way to identify links that could be updated to provide a safer browsing experience.

Aside: The moarTLS Chrome extension does a similar thing in the browser; check it out!

CLI

check-pages is easy to integrate into an automated workflow, but sometimes it's nice to run one-off tests or experiment interactively with a site's configuration. To that end, I created a simple command-line wrapper that exposes all the check-pages functionality (including --preferSecure) in a way that's easy to use on the platform/shell of your choice. Simply install it via npm, point it at the page(s) of interest, and review the list of possible issues. Here's the output of the --help command:

Usage: check-pages <page URLs> [options]

Checks:
  --checkLinks        Validates each link on a page  [boolean]
  --checkCaching      Validates Cache-Control/ETag  [boolean]
  --checkCompression  Validates Content-Encoding  [boolean]
  --checkXhtml        Validates page structure  [boolean]

checkLinks options:
  --linksToIgnore     List of URLs to ignore  [array]
  --noEmptyFragments  Fails for empty fragments  [boolean]
  --noLocalLinks      Fails for local links  [boolean]
  --noRedirects       Fails for HTTP redirects  [boolean]
  --onlySameDomain    Ignores links to other domains  [boolean]
  --preferSecure      Verifies HTTPS when available  [boolean]
  --queryHashes       Verifies query string file hashes  [boolean]

Options:
  --summary          Summarizes issues after running  [boolean]
  --terse            Results on one line, no progress  [boolean]
  --maxResponseTime  Response timeout (milliseconds)  [number]
  --userAgent        Custom User-Agent header  [string]
  --version          Show version number  [boolean]
  --help             Show help  [boolean]

Checks various aspects of a web page for correctness.
https://github.com/DavidAnson/check-pages-cli

This tool made from 100% recycled electrons [DHCPLite is a small, simple, configuration-free DHCP server for Windows]

Earlier this week, I posted the code for a tool I wrote many years ago:

In 2001, I wrote DHCPLite to unblock development scenarios between Windows and prototype hardware we were developing on. I came up with the simplest DHCP implementation possible and took all the shortcuts I could - but it was enough to get the job done! Since then, I've heard from other teams using DHCPLite for scenarios of their own. And recently, I was asked by some IoT devs to share DHCPLite with that community. So I dug up the code, cleaned it up a bit, got it compiling with the latest toolset, and am sharing the result here.

If that sounds interesting, please visit the DHCPLite project page on GitHub.

Trivia
  • Other than a couple of assumptions that changed in the last 15 years (ex: interface order), the original code worked pretty much as-is.
  • Now that the CRT has sensible string functions, I was able to replace my custom strncpyz helper with strncpy_s and _TRUNCATE.
  • Now that the IP Helper API is broadly available, I was able to statically link to it instead of loading dynamically via classes LibraryLoader and AnsiString.
  • Now that support for the STL is commonplace, I was able to replace the custom GrowableThingCollection implementation with std::vector.
  • I used the "one true brace style" (editors note: hyperbole!) from Code Complete, but it never really caught on and the code now honors Visual Studio defaults.

It's been a while since I wrote native code and revisiting DHCPLite was a fond trip down memory lane. I remain a firm believer in the benefits of a garbage-collected environment like JavaScript or C#/.NET for productivity and correctness - but there's also something very satisfying about writing native code and getting all the tricky little details right.

Or at least thinking you have... :)

Delayed Reaction [My experience converting a jQuery/Knockout.js application to use the React library]

It's important to stay up-to-date with technology trends and popular frameworks. That was part of the reason I wrote this blog using Node.js and it's part of the reason I recently converted a project to use the React library. That project was PassWeb, a simple, secure cloud-based password manager. I wrote PassWeb almost two years ago and use it nearly every day. If you're interested in how it works, please read the introductory blog post about PassWeb. For the purposes of this post, the thing to know is that PassWeb is built on the popular jQuery and Knockout.js frameworks.

To be clear, both frameworks are perfectly good - but switching was a great opportunity to learn about React. :)

Conversion

The original architecture was pretty much what you'd expect: application logic lives in a JavaScript file and the user interface lives in an HTML file. My goal when converting to React was to make as few changes to the logic as possible in order to minimize the risk of introducing behavioral bugs. So I worked in stages:

Having performed the bulk of the migration, all that remained was to identify and fix the handful of bugs that got introduced along the way.

Details

While JSX isn't required to use React, it's a natural fit and I chose JSX so I could get the full React experience. Putting JSX in the browser means using a transpiler to convert the embedded HTML to JavaScript. Babel provides excellent support for this via the React preset and was easy to work with. Because I was now running code through a transpiler, I also enabled the ES2015 Preset which supports newer features of the JavaScript language like let, const, and lambda expressions. I only scratched the surface of ES2015, but it was nice to be able to do so for "free".

One thing I noticed as I migrated more and more code was that much of what I was writing was boilerplate to deal with the propagation of state to and from (observable) properties. I captured this repetitive code in three helper methods and doing so significantly simplified components. (Projects like ReactLink formalize this pattern within the React ecosystem.)

Findings

Something I was curious about was how performance would differ after converting to React. For the most part, things were fast enough before that there was no need to optimize. Except for one scenario: filtering the list of items interactively. So I'd already tuned the Knockout implementation for better performance by toggling the visibility (CSS display:none) of unfiltered items instead of removing and re-adding them to/from the DOM.

When I converted to React, I used the simplest implementation and - unsurprisingly - this scenario performed worse. The first thing I did was implement the shouldComponentUpdate function on the component corresponding to each list item (as recommended by the Advanced Performance section of the docs). React's built-in performance tools are very useful and quickly showed the need for this optimization (as well as confirming the benefits). Two helpful posts that discuss the topic further are Optimizing React Performance using keys, component life cycle, and performance tools and Performance Engineering with React.

Implementing shouldComponentUpdate was a good start, but I had the same basic problem that adding and removing hundreds of elements just wasn't snappy. So I made the same visibility optimization, introducing another component to act as a thin wrapper around the existing one and deal exclusively with visibility. After that, the overall performance of the filter scenario was improved to approximate parity. (Actually, React was still a little slower for the 10,000 item case, but fared better in other areas, and I'm comfortable declaring performance roughly equivalent between the two implementations.)

Other considerations are complexity and size. Two frameworks have been replaced by one, so that's a pretty clear win on the complexity side. Size is a little murky, though. The minified size of the React framework is a little smaller then the combined sizes of jQuery and Knockout. However, the size of the new JSX file is notably larger than the templated HTML it replaces (recall that the code for logic stayed basically the same). And compiling JSX tends to expand the size of the code. But fortunately, Babel lets you minify scripts and that's enough to offset most of the growth. In the end, the React version of PassWeb is slightly smaller than the jQuery/Knockout version - but not enough to be the sole reason to convert.

Conclusion

Now that the dust has settled, would I do it all over again? Definitely! :)

Although there weren't dramatic victories in performance or size, I like the modular approach React encourages and feel it may lead to simpler code. I also like that React combines UI logic and presentation better and allowed me to completely gut the HTML file (which now contains only head and script tags). I also see value in unifying an application's state into one place (formalized by libraries like Redux), though I deliberately didn't explore that here. Most importantly, this was a good learning experience and I really enjoyed getting to know React.

I'll definitely consider React for my next project - maybe even finding an excuse to explore React Native...

Catch common Markdown mistakes as you make them [markdownlint is a Visual Studio Code extension to lint Markdown files]

The lightweight, cross-platform Visual Studio Code editor recently gained support for extensions, third party packages that add or enhance capabilities of the tool. Of particular interest to me are linters, syntax checkers that help avoid mistakes and maintain consistency when working with a language (either code or markup). I've previously written about markdownlint, a Node.js linter for the Markdown markup language. After looking at the VS Code API, it seemed straightforward to create a markdownlint extension for Code. I did so and published markdownlint to the extension gallery where it can be installed via the command ext install markdownlint. What's nice about editor integration for a linter is that feedback is immediate and interactive: mistakes are highlighted as they're made and it's easy to click a link for information about any rule violation.

If linting Markdown is something that interests you, please try the markdownlint extension for VS Code and share your feedback!

It's alive ... photo! [Live Photos via Web Components]

I'd been meaning to learn more about the Web Components standard and recently found the inspiration to do so in the form of a small project to explore the idea of bringing Apple's "Live Photo" experience to the web:

Apple introduced Live Photos with iOS 9, a feature that automatically associates a short video with every picture that's taken. I was skeptical at first, wondering how relevant this would be for static content; and it turns out not to be all that compelling for some kinds of photos. But for dynamic scenes or people in motion, the video can add some really interesting context!

Live Photos on iOS are (naturally) smooth and easy to use. I wondered what it might be like to bring a similar experience to the web. I'd also been looking for a reason to explore Web Components. And so live-photo-web was born!

To find out more, please visit live-photo-web on GitHub and/or try out the interactive demo!

Pie in the Sky-Hole [A Pi-Hole in the cloud for ad-blocking via DNS]

Inspired by Marco Arment's recent post about blocking advertisements on the web, I decided to explore the same idea. However, while Marco focuses on the annoyance of advertisements, I am interested in the security benefits of removing them. There have been numerous incidents of otherwise respectable websites compromising the security of their users due to the advertisements they include. Searches for "web site hacked 'ad network'" on Google and Bing provide some examples; another is this XSS attack on Troy Hunt's site, which is interesting thanks to the detailed analysis Troy provides. Popular sites of all kinds have been compromised in this way, and one might argue they should be treated as attackers because of the approach used to serve third-party ads.

Marco's article describes an in-browser solution for ad-blocking, but I prefer something that automatically protects all the machines on my network (at least, while they're using the network; see below). So I set out looking for something that works at the network level and came across Pi-Hole, a DNS-based ad-blocker for the Raspberry Pi. Aside from the fact that I don't own a Pi, this seemed like exactly what I wanted. ;)

Fortunately, there are no actual dependencies on Pi hardware, so I decided to create my own Pi-Hole on a server in the cloud - thus the name "Sky-Hole". To do so, I opened the Microsoft Azure Portal, created a small virtual machine running Ubuntu Server 15.04, and configured it according to the manual instructions for Pi-Hole (with a few customizations outlined below). Then I updated my wireless router to use Sky-Hole as the DNS server for my home network - and all my devices stopped showing advertisements!

Directions

I used a minimal set of steps to configure the Sky-Hole and list them below so they're easy to reproduce. I made a couple of tweaks to the Pi-Hole process along the way and explain them in turn.

First, create a virtual machine to run everything on (I've used both Microsoft Azure and Amazon Web Services, but any provider should do). Then, install dnsmasq:

sudo apt-get -y install dnsmasq
sudo update-rc.d dnsmasq enable
sudo mv /etc/dnsmasq.conf /etc/dnsmasq.orig
sudo nano /etc/dnsmasq.conf

Configure dnsmasq.conf as follows (replacing "sky-hole" on the last line with the host name of your virtual machine):

domain-needed
bogus-priv
no-resolv
server=8.8.8.8
server=8.8.4.4
interface=eth0
listen-address=127.0.0.1
cache-size=10000
log-queries
log-facility=/var/log/pihole.log
local-ttl=300
addn-hosts=/etc/pihole/gravity.list
host-record=sky-hole,127.0.0.1,::1

The addn-hosts option is meant to be optional, but I needed it because /etc/hosts was not updated by gravity.sh. The host-record option was necessary to avoid a "sudo: unable to resolve host" error which showed up whenever I enabled dnsmasq. (Though this may be an artifact of the default virtual machine configuration under Azure.)

Update 2015-08-30: host-record was similarly necessary on AWS, where the automatically-assigned host name was of the form ip-123-123-123-123.

Now, download the Pi-Hole script and run it to generate the list of domain names to block:

sudo curl -o /usr/local/bin/gravity.sh https://raw.githubusercontent.com/jacobsalmela/pi-hole/master/gravity.sh
sudo chmod 755 /usr/local/bin/gravity.sh
sudo /usr/local/bin/gravity.sh
sudo sed -i "s/^[0-9\.]\+\s/0.0.0.0 /g" /etc/pihole/gravity.list

The last line is my own and replaces the virtual machine's IP address with an unusable 0.0.0.0 address when redirecting undesirable sites. Because I'm not running a web server on the Sky-Hole, this seems like a more appropriate way to block unwanted domain names. (Besides, hostname -I in Azure reports the virtual machine's internal address which is on a private network.)

Restart dnsmasq to apply the changes:

sudo service dnsmasq restart

Now, test things locally via ping, dig, nslookup (or similar) to verify that desirable domain names are returned as-is and undesirable ones are blocked by returning the 0.0.0.0 IP. Assuming that's the case, update the virtual machine to accept incoming UDP traffic on port 53 (per the DNS specification) and test again from a different machine. If everything is working as expected, configure your router to use the Sky-Hole's public IP address for DNS resolution. This automatically applies to all devices on the local network and avoids the need to update each one manually.

Update 2015-08-30: You may also want to enable TCP traffic on port 53 (per RFC 5966).

Congratulations, you're done!

Notes

  • The nice thing about this approach is that it covers all the machines on your network. However, it can only protect machines when they're connected to that network. Taking a phone or tablet elsewhere or using cellular data exempts a device from this kind of protection.
    • So this may be an argument in favor of per-device ad-blocking - though perhaps as a strategy to be used in addition to (rather than instead of) a network-wide approach.
  • When creating the virtual machine, I used the Basic A1 size which would cost about $34.97 per month on Azure (though I don't plan to leave it running very long).
    • I tried the A0 size first (which would have cost $13.39 per month on Azure), but it ran out of memory building the domain list, seemingly due to this known issue.
  • As I note above, I chose not to configure a local web server on my Sky-Hole. While doing so offers interesting benefits, it didn't seem compelling for the purposes of this experiment and I preferred to keep thing simple. Should you choose to, directions are available in the Pi-Hole documentation.
  • If you end up using Pi-Hole like this (or on its own) please consider donating to the author, Jacob Salmela, to help support his work.

Conclusion

I'm only been running Sky-Hole for a couple of days, but the usability and performance improvements for some sites are quite noticeable. More importantly, it seems to me the browsing experience is necessarily safer by virtue of removing not just a subset of traffic, but the subset which is most likely to contain unwanted content.

As an experiment and a learning experience, Sky-Hole has been a successful side-project. I hope others find it interesting or thought-provoking and I welcome comments on improving or enhancing the approach!

Not romantically binding [promise-ring wraps Node.js callbacks with native ES6 Promises]

JavaScript Promises are a powerful way of working with asynchronous code. They make sequencing operations easy and offer a clear, predictable way to handle errors that might occur along the way. Much has been written about the benefits of Promises and I won't try to repeat it here.

What I do hope to do is make Promises a slightly more natural part of the Node.js development experience. In version 0.12.* (as well as in io.js), ES6 Promises are natively available. But the standard set of modules (such as File System) still use their original callback-based design and there's a bit of a disconnect between how you might want to write something and how you're able to. Fortunately, most of the Promise libraries that are already available include wrappers to convert callback-based functions into ones that return a Promise. However, most of those libraries assume you'll be using their custom implementation of Promise (from the "olden days" when that was the only option). And while different Promises/A+ implementations are meant to be interoperable, it seems silly to pull in a second Promise implementation when a perfectly good one is already available.

That's where promise-ring comes in: it's a tiny npm package that provides functions to convert typical callback-based APIs into their Promise-based counterparts using the V8 JavaScript engine's native Promise implementation. Briefly:

promise-ring is small, simple library with no dependencies that eases the use of native JavaScript Promises in projects without a Promise library.

Documentation is available in the README along with runnable samples demonstrating the use of each API. It's all quite simple and exactly what you'd expect. A bonus feature is the wrapAll function which makes it easier to work with modules that expose many different callback-based functions (such as the File System module; see below).

For an example of using promise-ring and Promises to simplify code, here is a typical callback-based snippet to copy a file onto itself:

var fs = require("fs");

// Copy a file onto itself using callbacks
fs.stat(file, function(err) {
  if (err) {
    console.error(err);
  } else {
    fs.readFile(file, encoding, function(errr, content) {
      if (errr) {
        console.error(errr);
      } else {
        fs.writeFile(file, content, encoding, function(errrr) {
          if (errrr) {
            console.error(errrr);
          } else {
            console.log("Copied " + file);
          }
        });
      }
    });
  }
});

And here's the same code converted to use Promises via promise-ring:

var pr = require("promise-ring");
var fsp = pr.wrapAll(require("fs"));

// Copy a file onto itself using Promises
fsp.stat(file)
  .then(function() {
    return fsp.readFile(file, encoding);
  })
  .then(function(content) {
    return fsp.writeFile(file, content, encoding);
  })
  .then(function() {
    console.log("Copied " + file);
  })
  .catch(console.error);

The second implementation is more concise, easier to follow, and DRY-er. That's the power of Promises! :)

Find out more by visiting promise-ring on GitHub or promise-ring in the npm gallery.