The blog of dlaa.me

Pie in the Sky-Hole [A Pi-Hole in the cloud for ad-blocking via DNS]

Inspired by Marco Arment's recent post about blocking advertisements on the web, I decided to explore the same idea. However, while Marco focuses on the annoyance of advertisements, I am interested in the security benefits of removing them. There have been numerous incidents of otherwise respectable websites compromising the security of their users due to the advertisements they include. Searches for "web site hacked 'ad network'" on Google and Bing provide some examples; another is this XSS attack on Troy Hunt's site, which is interesting thanks to the detailed analysis Troy provides. Popular sites of all kinds have been compromised in this way, and one might argue they should be treated as attackers because of the approach used to serve third-party ads.

Marco's article describes an in-browser solution for ad-blocking, but I prefer something that automatically protects all the machines on my network (at least, while they're using the network; see below). So I set out looking for something that works at the network level and came across Pi-Hole, a DNS-based ad-blocker for the Raspberry Pi. Aside from the fact that I don't own a Pi, this seemed like exactly what I wanted. ;)

Fortunately, there are no actual dependencies on Pi hardware, so I decided to create my own Pi-Hole on a server in the cloud - thus the name "Sky-Hole". To do so, I opened the Microsoft Azure Portal, created a small virtual machine running Ubuntu Server 15.04, and configured it according to the manual instructions for Pi-Hole (with a few customizations outlined below). Then I updated my wireless router to use Sky-Hole as the DNS server for my home network - and all my devices stopped showing advertisements!

Directions

I used a minimal set of steps to configure the Sky-Hole and list them below so they're easy to reproduce. I made a couple of tweaks to the Pi-Hole process along the way and explain them in turn.

First, create a virtual machine to run everything on (I've used both Microsoft Azure and Amazon Web Services, but any provider should do). Then, install dnsmasq:

sudo apt-get -y install dnsmasq
sudo update-rc.d dnsmasq enable
sudo mv /etc/dnsmasq.conf /etc/dnsmasq.orig
sudo nano /etc/dnsmasq.conf

Configure dnsmasq.conf as follows (replacing "sky-hole" on the last line with the host name of your virtual machine):

domain-needed
bogus-priv
no-resolv
server=8.8.8.8
server=8.8.4.4
interface=eth0
listen-address=127.0.0.1
cache-size=10000
log-queries
log-facility=/var/log/pihole.log
local-ttl=300
addn-hosts=/etc/pihole/gravity.list
host-record=sky-hole,127.0.0.1,::1

The addn-hosts option is meant to be optional, but I needed it because /etc/hosts was not updated by gravity.sh. The host-record option was necessary to avoid a "sudo: unable to resolve host" error which showed up whenever I enabled dnsmasq. (Though this may be an artifact of the default virtual machine configuration under Azure.)

Update 2015-08-30: host-record was similarly necessary on AWS, where the automatically-assigned host name was of the form ip-123-123-123-123.
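
Since the host-record line has to match the machine's own host name, it can be generated rather than typed by hand. The following is a minimal sketch, assuming that the name reported by uname -n is the one sudo complains about; the variable name host_record_line is just for illustration.

```shell
# Build the host-record line from this machine's own host name.
# uname -n prints the network node host name (the same value `hostname` reports).
host_record_line="host-record=$(uname -n),127.0.0.1,::1"

# Show the line so it can be pasted into /etc/dnsmasq.conf.
echo "$host_record_line"

# Alternatively, append it directly (assumes no host-record line is present yet):
# echo "$host_record_line" | sudo tee -a /etc/dnsmasq.conf
```

On an AWS machine like the one mentioned above, this would print a line beginning with host-record=ip-123-123-123-123.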

Now, download the Pi-Hole script and run it to generate the list of domain names to block:

sudo curl -o /usr/local/bin/gravity.sh https://raw.githubusercontent.com/jacobsalmela/pi-hole/master/gravity.sh
sudo chmod 755 /usr/local/bin/gravity.sh
sudo /usr/local/bin/gravity.sh
sudo sed -i "s/^[0-9\.]\+\s/0.0.0.0 /g" /etc/pihole/gravity.list

The last line is my own and replaces the virtual machine's IP address with an unusable 0.0.0.0 address when redirecting undesirable sites. Because I'm not running a web server on the Sky-Hole, this seems like a more appropriate way to block unwanted domain names. (Besides, hostname -I in Azure reports the virtual machine's internal address which is on a private network.)
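
To see what that sed expression does without touching the real list, it can be tried on a scratch file first. This is only a sketch: the two entries and the 10.0.0.4 address are made-up stand-ins for what gravity.sh writes, and it assumes GNU sed (as shipped with Ubuntu).

```shell
# Create a scratch file with two hypothetical entries in "IP domain" form.
list=$(mktemp)
printf '%s\n' '10.0.0.4 ads.example.com' '10.0.0.4 tracker.example.net' > "$list"

# Same substitution as above: replace the leading address with 0.0.0.0.
sed -i "s/^[0-9\.]\+\s/0.0.0.0 /g" "$list"

# Inspect the result and count how many entries now point at 0.0.0.0.
cat "$list"
first_line=$(head -n 1 "$list")
blocked_count=$(grep -c '^0\.0\.0\.0 ' "$list")
echo "$blocked_count entries rewritten"

# Clean up the scratch file.
rm -f "$list"
```

The same grep -c command can be pointed at /etc/pihole/gravity.list afterwards to confirm that every entry was rewritten.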

Restart dnsmasq to apply the changes:

sudo service dnsmasq restart

Now, test things locally via ping, dig, nslookup (or similar) to verify that desirable domain names are returned as-is and undesirable ones are blocked by returning the 0.0.0.0 IP. Assuming that's the case, update the virtual machine to accept incoming UDP traffic on port 53 (per the DNS specification) and test again from a different machine. If everything is working as expected, configure your router to use the Sky-Hole's public IP address for DNS resolution. This automatically applies to all devices on the local network and avoids the need to update each one manually.
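
The checks described above can be scripted with dig. What follows is a rough sketch rather than part of the Pi-Hole instructions: the function names classify_answer and check_name are my own, and it assumes dig is installed (on Ubuntu it comes from the dnsutils package).

```shell
# classify_answer ANSWER
# Prints "blocked" for the 0.0.0.0 placeholder, "no answer" for an empty
# reply, and "resolved" for anything else.
classify_answer() {
  case "$1" in
    0.0.0.0) echo "blocked" ;;
    "")      echo "no answer" ;;
    *)       echo "resolved" ;;
  esac
}

# check_name SERVER NAME
# Asks SERVER for the A record of NAME and reports how it was answered.
# The last line of dig's short output is used because any CNAME lines
# are printed before the address.
check_name() {
  answer=$(dig +short "@$1" "$2" A | tail -n 1)
  printf '%s -> %s (%s)\n' "$2" "${answer:-none}" "$(classify_answer "$answer")"
}
```

On the Sky-Hole itself, check_name 127.0.0.1 example.com should report "resolved", while a name taken from /etc/pihole/gravity.list should report "blocked". From a different machine, pass the Sky-Hole's public IP address as the first argument instead of 127.0.0.1.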

Update 2015-08-30: You may also want to enable TCP traffic on port 53 (per RFC 5966).

Congratulations, you're done!

Notes

  • The nice thing about this approach is that it covers all the machines on your network. However, it can only protect machines when they're connected to that network. Taking a phone or tablet elsewhere or using cellular data exempts a device from this kind of protection.
    • So this may be an argument in favor of per-device ad-blocking - though perhaps as a strategy to be used in addition to (rather than instead of) a network-wide approach.
  • When creating the virtual machine, I used the Basic A1 size, which would cost about $34.97 per month on Azure (though I don't plan to leave it running very long).
    • I tried the A0 size first (which would have cost $13.39 per month on Azure), but it ran out of memory building the domain list, seemingly due to this known issue.
  • As I note above, I chose not to configure a local web server on my Sky-Hole. While doing so offers interesting benefits, it didn't seem compelling for the purposes of this experiment and I preferred to keep things simple. Should you choose to, directions are available in the Pi-Hole documentation.
  • If you end up using Pi-Hole like this (or on its own) please consider donating to the author, Jacob Salmela, to help support his work.

Conclusion

I've only been running Sky-Hole for a couple of days, but the usability and performance improvements for some sites are quite noticeable. More importantly, it seems to me the browsing experience is necessarily safer by virtue of removing not just a subset of traffic, but the subset which is most likely to contain unwanted content.

As an experiment and a learning experience, Sky-Hole has been a successful side-project. I hope others find it interesting or thought-provoking and I welcome comments on improving or enhancing the approach!