Edward Thomson

Google Analytics for Your Podcast

October 27, 2017  •  5:57 PM

I built a little open source side project: Google Analytics Handler, which reports tracking information to Google Analytics on the server side.

It's for my new podcast, All Things Git, so that we can track the RSS and audio downloads from the podcast's website.

If this was 1997 and I was hosting the website on my own server, I could just crunch my Apache logs for the data. But it's not; it's 2017 so of course I'm hosting the podcast in the cloud. The website is an Azure Web App and the audio downloads are hosted in Azure CDN.

And it's nearly perfect! Performance? Amazing! Cost? Low! But log files? Not so much.

Martin suggested a hosted analytics platform to track RSS requests, and another to track audio downloads. But that's two new bits of analytics to go along with Google Analytics, which we use to track the page visits. Requests being tracked in three distinct places? Ugh.

Thankfully, Google Analytics offers the Measurement Protocol API - which lets you report events like page views manually. So I built an ASP.NET handler to report requests for RSS and audio downloads to Google Analytics.
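To give a feel for what a server-side report looks like, here is a minimal Python sketch of a single pageview hit. It is not the handler's actual code (the handler is an ASP.NET component); it assumes the Universal Analytics (v1) Measurement Protocol, and the tracking ID and the helper name build_pageview_hit are placeholders made up for illustration.

```python
import uuid
from urllib.parse import urlencode

# Collection endpoint for the v1 (Universal Analytics) Measurement Protocol.
COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id, client_id, path, title=None):
    """Return the form-encoded body for one pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder: "UA-XXXXX-Y")
        "cid": client_id,    # anonymous client identifier
        "t": "pageview",     # hit type
        "dp": path,          # document path being reported
    }
    if title is not None:
        params["dt"] = title  # optional document title
    return urlencode(params)


# Example: report a request for the RSS feed.
body = build_pageview_hit("UA-XXXXX-Y", str(uuid.uuid4()), "/rss.xml")
# POSTing `body` to COLLECT_URL (for example with urllib.request.urlopen)
# would record the hit; the network call is left out of this sketch.
```

Because the report is an ordinary HTTP POST, it can be sent from any server-side code path, which is what lets a handler count requests that never run any JavaScript.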

For RSS requests, the handler simply opens the RSS file that exists on disk and returns it to the client before reporting to Google Analytics. This is nice because it's a transparent change - the handler is loaded by the web configuration only in production. It's just a few lines in the Web.config:

<configuration>
  <system.webServer>
    <handlers>
      <add name="RssHandler"
           verb="*"
           path="rss.xml"
           type="GoogleAnalyticsHandler.GoogleAnalyticsHandler, GoogleAnalyticsHandler"
           resourceType="Unspecified" />
    </handlers>
  </system.webServer>
</configuration>

For audio, we don't host the audio directly on the web site, so for the actual podcast itself, the handler redirects to the audio files in Azure CDN. I set it up so that any request in the /episodes/audio folder is redirected straight to the CDN:

<configuration>
  <system.webServer>
    <handlers>
      <add name="AudioHandler"
           verb="*"
           path="/episodes/audio/*"
           type="GoogleAnalyticsHandler.GoogleAnalyticsHandler, GoogleAnalyticsHandler"
           resourceType="Unspecified" />
    </handlers>
  </system.webServer>
  <location path="episodes/audio">
    <appSettings>
      <add key="redirect-root" value="https://mycdn.azureedge.net/episodes/" />
    </appSettings>
  </location>
</configuration>

As soon as I deployed the handler and configuration to production, I was seeing results in the real-time tab of Google Analytics:

Google Analytics

So the Google Analytics Handler makes it very straightforward to add Google Analytics tracking to your media assets like audio and video, and to your non-HTML pages like text and XML.

Changing Titles in Google Authenticator

August 23, 2017  •  5:27 PM

Surely you know that the best practice for securing your accounts is to enable two-factor authentication:

When all that is between you and an attacker getting into your account is a single password, you’re running a risk that is far greater than what you need be taking. A password is one factor – “something you know”. Now if we add something you have such as your mobile phone and the email service verifies your identity when you first log on by sending an SMS to that thing you have, the security position of your email changes fundamentally.

Troy Hunt, 10 email security fundamentals for everyday people

And hopefully you're using an application as your second factor, instead of text messages. Not only may text messages fail to arrive when you travel to foreign countries, but you're also reliant upon your wireless carrier to keep your data secure.

Instead, use TOTP (Time-based One-Time Password) to get a six digit number from a local application. There are many applications that support TOTP, but I keep it old school, and use the Google Authenticator application.
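If you're curious where that six digit number comes from, here is a minimal Python sketch of the TOTP calculation as described in RFC 6238, assuming the common defaults of HMAC-SHA1, a 30 second time step and a base32-encoded secret. The function name totp is my own; this is an illustration, not the code of any particular authenticator app.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, digits=6, period=30):
    """Compute a TOTP code (RFC 6238, HMAC-SHA1) from a base32 secret."""
    # Sites often display the secret in space-separated groups; normalise it
    # and restore the base32 padding that is usually stripped.
    cleaned = secret_b32.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of whole periods since the Unix epoch.
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // period

    # HOTP dynamic truncation (RFC 4226): pick 4 bytes of the MAC, chosen by
    # the low nibble of its last byte, and keep the low 31 bits.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % (10 ** digits)).zfill(digits)
```

Calling totp("XXXX ABCD XXXX ABCD") with the secret a site shows you should produce the same code as your authenticator app for the current 30 second window, provided the site uses these defaults.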

The problem with the Google Authenticator app, though, is that it doesn't let you edit the title of a website (the "issuer") once you've set it up. So you end up with a number that's missing a title, and there's no good way to identify it.

Here, the first entry is obviously for my Microsoft account, but the second entry…? I have no idea what it's for:

Google Authenticator Missing a Title

Thankfully, TOTP is a published standard, so you can actually create - and then scan - your own QR code based on the secret number that you're given when you turn on two-factor authentication:

Facebook 2FA Enablement

The QR code that you scan to set up a new account is generated by constructing a URL with the secret number and some metadata, and then encoding that with a QR generator. The format is:

otpauth://totp/account_name?secret=secret_key&issuer=Website_Title

The account_name - as the name suggests - reflects the name of your account on the website. This is your username or email address, generally. Google Authenticator shows this as the second line of the key.

The secret_key is the secret key that the web site gives you when you enable TOTP. (In the example above, it's XXXX ABCD XXXX ABCD).

Finally, the issuer is the name of the website itself. This is the larger header displayed above your key.

It's such a simple mechanism that you can just create a new URL with those values and then use your favorite QR generating tool to create a QR code for your custom URL. (Remember to URL-encode any of your values!)
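As a small illustration of the URL-encoding step, here is a Python sketch that assembles the URL from the three values. The function name build_otpauth_uri is made up for this example, and leaving the @ in the account name unescaped is my own choice to keep the result readable.

```python
from urllib.parse import quote, urlencode


def build_otpauth_uri(account_name, secret_key, issuer):
    """Build an otpauth://totp/ URL from an account name, secret and issuer."""
    # Secrets are often displayed in groups of four; drop the spaces.
    secret = secret_key.replace(" ", "")
    # Percent-encode the label, leaving "@" readable.
    label = quote(account_name, safe="@")
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"secret": secret, "issuer": issuer}, quote_via=quote)
    return "otpauth://totp/" + label + "?" + query


uri = build_otpauth_uri("ethomson@edwardthomson.com",
                        "XXXX ABCD XXXX ABCD",
                        "My Title")
print(uri)
```

The printed URL is what you would hand to a QR generator.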

If you don't have a QR generator (I didn't) then you can install the very simple qrencode package and generate a QR code into an image file.

Better still, you can specify ANSI as the output type:

% qrencode -t ANSI 'otpauth://totp/ethomson@edwardthomson.com?secret=XXXXABCDXXXXABCD&issuer=My%20Title'

And it will dump a QR code straight to your console:

QR on the Console

Now you just point Google Authenticator at your terminal window, and you can see that it adds a secret with a custom title of "My Title":

Google Authenticator with a Custom Title

Voila!

Upgrading git for CVE-2017-1000117

August 14, 2017  •  12:11 PM

A security vulnerability in Git has been announced: a bug in URL parsing can cause git clone to execute arbitrary commands. These URLs look quite suspicious, so it's unlikely that you'd be convinced through social engineering to clone them yourself. But they can be hidden in repository submodules.

Unless you're a Continuous Integration build agent, I hope that it's quite uncommon that you git clone --recursive a repository that you do not trust. So exposure to this vulnerability should be rare, but as with any security vulnerability that has the possibility of remote code execution, you should upgrade your Git clients immediately.

Git version 2.14.1 is the latest and greatest version of Git, and has been patched. But most people don't actually build from source, so your version of Git is probably provided to you by a distribution. You may have different versions available to you - ones that have had the patches applied by your vendor - so you may not be able to determine if you're vulnerable simply by looking at the version number.

Here are some simple steps to determine whether you're vulnerable and some upgrade instructions if you are.

Are you vulnerable?

You can easily (and safely) check to see if your version of Git is vulnerable to this recent security vulnerability. Run this from a command prompt:

git clone -q ssh://-q/ /tmp/gittest

Note: this will not actually clone any repositories to your system, and it will not execute any dangerous commands.

If you see:

fatal: strange hostname '-q' blocked

Congratulations - you are already running a version of Git that is not vulnerable.

If, instead, you see:

fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Then your version of Git is vulnerable and you should upgrade immediately.
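If you need to run this check across a fleet of machines, the same test is easy to script. Here is a minimal Python sketch, assuming the two messages shown above; the function names looks_patched and run_check are my own, and forcing LC_ALL=C is an assumption to keep Git's messages in English.

```python
import os
import subprocess

# Substring printed by patched versions of Git when they reject the URL.
PATCHED_MARKER = "strange hostname"


def looks_patched(stderr_text):
    """Return True if Git's error output shows the hostname was blocked."""
    return PATCHED_MARKER in stderr_text


def run_check(target="/tmp/gittest"):
    """Run the probe clone from the article and classify the result."""
    env = dict(os.environ, LC_ALL="C")  # keep messages untranslated
    result = subprocess.run(
        ["git", "clone", "-q", "ssh://-q/", target],
        stderr=subprocess.PIPE,
        universal_newlines=True,
        env=env,
    )
    return looks_patched(result.stderr)
```

run_check() returns True on a patched installation; any other output, including the "Could not read from remote repository" message, is treated as not patched.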

Windows

Windows is quite easy to upgrade. Simply grab the newest version of Git for Windows (version 2.14.1) from https://git-for-windows.github.io/.

macOS

Apple ships Git with Xcode but unfortunately, they do not update it regularly, even for security vulnerabilities. As a result, you'll need to upgrade to a version that is provided by a third party. Homebrew is a popular package manager for macOS.

  1. If you have not yet installed Homebrew, you can install it by running:

    /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
    

    at a command prompt.

  2. After that, you can use Homebrew to install git:

    brew install git
    
  3. Add the Homebrew install location (/usr/local/bin) to your PATH.

    echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bash_profile
    
  4. Close all open Terminal sessions, quit Terminal.app, and re-open it.

Linux (Debian, Ubuntu, Red Hat, CentOS)

If you're using the current version of Ubuntu or Debian, then they'll have the latest version ready. If you're on a stable system, like a server, you should be running an LTS release - a "long term support" version - where they backport security patches like this one. So you should simply need to:

  1. Get the latest information about the available software versions from the remote repository:

    Debian, Ubuntu:

    sudo apt-get update
    

    Red Hat, CentOS:

    sudo yum makecache
    
  2. Install the latest version of git:

    Debian, Ubuntu:

    sudo apt-get install git
    

    Red Hat, CentOS:

    sudo yum update git
    

Ensuring that you're patched

Now if you run:

git clone -q ssh://-q/ /tmp/gittest

at a command prompt, then you should see:

fatal: strange hostname '-q' blocked

And now you're patched against the git security vulnerability, CVE-2017-1000117.

(Re)introducing git-dad

June 18, 2017  •  8:18 PM

After I dropped the git-recover script, @MordodeMaru asked me on Twitter if we could have a git dad command to help you out when you're in a jam:

But when I started thinking about my stepdad and his banter with his coworkers, I thought that if a git dad command was really going to help you out, it would help you when you mistyped the git add command. It could bring you a bit of levity… a dad joke!

And what better day than Father's Day to make that happen:

Now when you mistype git add as git dad, it will still add your file to the index, but it will also give you the prize of a dad joke.

All you have to do is grab git-dad and put it in your PATH.

On Dad Jokes and Calculus

I'd love to claim credit for this wonderful addition to the Git ecosystem, but just as I was getting ready to publish this, I did a quick search for "git dad" and I realized that Tim Pettersen had already come up with the idea.

And, honestly, I would like to claim that I just happened to have the same idea. That this was totally independent discovery, like Calculus (and almost as important a contribution to humanity). But the truth is that I probably heard him talking about it. Perhaps it was in his awesome talk at Git Merge about aliases this year. Anyway, I'm sure that somewhere I got the idea from him and it stuck in my head, lying dormant until it was resurrected on Twitter.

But why would we need a second version of git dad? Surely one is enough.

You'll notice that this solution is a bit different than his solution, though. If you have an alias that starts with a bang (!) it will execute a non-Git command. (Normally, a Git alias just invokes another Git command; starting your alias with a ! allows you to invoke any command.)

But, if you have an alias that runs a non-Git command, then Git always executes it from the root of the repository's working directory, no matter where you invoked it. So an alias for !git add will work when you're at the root of the repository's working directory, but if you're inside some folder beneath that, the relative paths you pass no longer point at the files you meant.

Using a script instead of an alias will solve this problem.
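To make the difference concrete, here is a tiny Python sketch of how the same relative path resolves in each case. The repository location and file name are hypothetical, and the helper resolve is only for illustration.

```python
import posixpath


def resolve(working_dir, relative_arg):
    """Return the absolute path a command sees for a relative argument."""
    return posixpath.normpath(posixpath.join(working_dir, relative_arg))


repo_root = "/home/me/project"     # hypothetical repository root
invoked_from = repo_root + "/src"  # where you typed the command

# A "!" alias runs from the repository root...
seen_by_alias = resolve(repo_root, "main.c")
# ...while a git-dad script on your PATH runs where you invoked it.
seen_by_script = resolve(invoked_from, "main.c")

print(seen_by_alias)
print(seen_by_script)
```

The alias ends up looking for main.c at the root of the repository, while the script looks in src, which is where the file actually lives in this example.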

I did like his idea of using icanhazdadjoke.com instead of hardcoding some dad jokes. It's a bit slower than if they were hardcoded, but let's face it, that extra time spent is totally worth it to have a fresh, never-ending supply.

Happy Father's Day!

Introducing git-recover

June 15, 2017  •  2:04 PM

I'm old enough to remember the old Norton UNERASE command: it was part of the old Norton Utilities for MS-DOS. It made clever use of the FAT filesystem to find files that were recently deleted, show them to you and let you undelete them.

git-recover brings that idea to your Git repositories.

Every time you add a version of a file to your git repository - that is to say, every time you run git add - Git will put a copy of that file in its object database. That means that if you accidentally delete a file that you were working on, if you ever ran git add on it, you can probably recover it.

Tell me more

I thought you'd never ask!

(Wait, you didn't? If you really aren't interested in the nitty gritty of how Git manages the index, then I guess you can skip this section. But who isn't interested in that!?!)

Git's index is a "staging area" that will become the next commit. If you recall from my discussion about how Git works, a commit in Git is a snapshot of the entire repository at a single point in time. And the index is also a snapshot: it contains a list of all the files in the repository that will make up the next commit.

You can see this if you look at the index, and Git provides a tool to do just that: git ls-files --stage. When I've just cloned a repository:

% git clone /tmp/foo_repo .
% git ls-files --stage
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0   foo.txt

And when I add a new file to this repository, I can inspect the index again, and will see the new file:

% git add bar.txt
% git ls-files --stage
100644 ce013625030ba8dba906f756967f9e9ca394464a 0   bar.txt
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0   foo.txt

Note that the entry for bar.txt contains the object ID of the file. When you run git add, Git actually adds the file to its object database, and takes the resulting object ID (a SHA-1 hash computed from the file's contents) and places that in the index.
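For the curious, here is a short Python sketch of how a blob's object ID is derived: Git hashes a small header ("blob", the content length, and a NUL byte) followed by the file's contents. It also shows how the ID maps to a loose object's path. The function names are my own, and this assumes a SHA-1 repository.

```python
import hashlib


def git_blob_id(content):
    """Return the object ID Git assigns to a blob with these bytes."""
    # Git hashes "blob <length>\0" followed by the raw contents.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id):
    """Return where a loose object with this ID lives inside the repository."""
    # The first two hex digits name the directory, the rest name the file.
    return ".git/objects/" + object_id[:2] + "/" + object_id[2:]


print(git_blob_id(b"hello world\n"))
print(loose_object_path("ce013625030ba8dba906f756967f9e9ca394464a"))
```

The first value should match what git hash-object prints for a file with the same contents.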

You can see the file on disk - Git has added it to the repository as a loose object:

% ls -Flas .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
4 -r--r--r--  1 ethomson  staff  21 14 Jun 23:58 .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a

So Git has prepared this new file for our commit. But what if we don't commit this file? What if, instead, we git rm it? Or if we make some more changes to bar.txt and add those instead?

% echo "different changes" > bar.txt
% git add bar.txt

Now we've overwritten our original changes to bar.txt:

% git ls-files --stage
100644 4a95512212b2f24397fe2df5a2554935bd0a032a 0   bar.txt
100644 6af0abcdfc7822d5f87315af1bb3367484ee3c0c 0   foo.txt

You can see that the object ID for bar.txt has changed - reflecting our new file. But what's happened to the original file we added? Where is object ce01362?

It's still in our object database:

% ls -Flas .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
4 -r--r--r--  1 ethomson  staff  21 14 Jun 23:58 .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a

But we never committed it, so this object is not pointed to by any commit in the graph. Nor is it in our index anymore. This unreferenced blob is "garbage" and - eventually - Git will garbage collect it.

But until it does, we can recover it!
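One way to find these blobs yourself is git fsck --unreachable, which prints a line such as "unreachable blob <object id>" for each one. Here is a minimal Python sketch that picks the blob IDs out of that output. It is an illustration of the idea and not necessarily how git-recover does it; the function name unreachable_blobs is my own.

```python
def unreachable_blobs(fsck_output):
    """Extract blob object IDs from the output of `git fsck --unreachable`."""
    ids = []
    for line in fsck_output.splitlines():
        parts = line.split()
        # Lines of interest look like: "unreachable blob <object id>"
        # (or "dangling blob <object id>" without --unreachable).
        if (len(parts) == 3
                and parts[0] in ("unreachable", "dangling")
                and parts[1] == "blob"):
            ids.append(parts[2])
    return ids


# Example with output captured from a repository:
sample = (
    "unreachable blob ce013625030ba8dba906f756967f9e9ca394464a\n"
    "unreachable commit 61c2562a7b851b69596f0bcad1d8f54c400be977\n"
)
print(unreachable_blobs(sample))
```

In practice you would feed it the captured output of git fsck, for example via subprocess.run, and then inspect each ID with git show.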

Using git-recover

The simplest way to use git-recover is to use it in interactive mode: just run git recover -i. It will show you the first few lines of each "orphaned" file - those that were once git added to the repository but were never committed - and let you recover them (or not).

% git recover -i
Recoverable orphaned git blobs:

61c2562a7b851b69596f0bcad1d8f54c400be977  (Thu 15 Jun 2017 12:20:22 CEST)
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
> tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
> veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
> commodo consequat. Duis aute irure dolor in reprehenderit in voluptate

Recover this file? [y,n,v,f,q,?]: 

You can also run git-recover without any arguments, and it will show you all the "orphaned" blobs that you can recover. You can then inspect an object to decide if it's something that you're interested in (using git show).

% git recover
Recoverable orphaned git blobs:

61c2562a7b851b69596f0bcad1d8f54c400be977  Thu 15 Jun 2017 12:20:22 CEST

When you find the object that you want to recover, you can run git-recover <objectid> to pull it out of the object database and write it to disk.

You can specify the filename to write with the (optional) -f flag:

% git recover 61c2562 -f greeking.txt
Writing 61c2562: greeking.txt.

Specifying the filename is helpful, because you may have rules set up in your .gitattributes file on a per-file or per-extension basis. Using the -f flag will make sure that these rules are applied.
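To show why the filename matters, here is a Python sketch of pulling a blob out of the object database by hand with git cat-file. Without a filename, the raw contents are printed; with one, the --filters and --path options ask Git to apply the conversions configured for that path. This is a sketch under those assumptions rather than git-recover's own code, and the function names are made up.

```python
import subprocess


def cat_file_command(object_id, filename=None):
    """Build the git command that prints a blob's contents."""
    if filename is None:
        # Raw contents, exactly as stored in the object database.
        return ["git", "cat-file", "-p", object_id]
    # Apply the filters (.gitattributes rules such as line-ending
    # conversion) that are configured for this path.
    return ["git", "cat-file", "--filters", "--path=" + filename, object_id]


def recover(object_id, filename):
    """Write the blob to `filename`, applying that path's filters."""
    with open(filename, "wb") as out:
        subprocess.run(cat_file_command(object_id, filename),
                       stdout=out, check=True)
```

For example, recover("61c2562", "greeking.txt") would write the blob from the listing above to greeking.txt when run inside that repository.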

How to get it

git-recover is a shell script - you can just download it and go.

When you put git-recover in your PATH, then it becomes a proper git command, and you can run git recover (notice the space instead of the dash).

Please open an issue or a pull request if you have problems or improvements.