Edward Thomson

A security vulnerability in Git has been announced: a bug in submodule resolution can cause git clone --recursive to execute arbitrary commands.

What's the problem?

When a Git repository contains a submodule, that submodule's repository structure is stored alongside the parent's, inside the .git folder. This structure is generally stored in a folder with the same name as the submodule; however, the name of this folder is configurable by a file in the parent repository.

Vulnerable versions of git allow the folder name to contain a path that is not necessarily beneath the .git directory. This can allow an attacker to carefully create a parent repository that has another Git repository checked in, as a folder inside that parent repository. Then that repository that's checked in can be added as a submodule to the parent repository. That submodule's location can be set outside of the .git folder, pointing to the checked-in repository inside the parent itself.

When you recursively clone this parent repository, Git will look at the submodule that has been configured, then look for where to store that submodule's repository. It will follow the configuration into the parent repository itself, to the repository that's been checked in as a folder. That repository will be used to check out the submodule… and, unfortunately, any hooks in that checked-in repository will be run.

So the attacker can bundle this repository configuration with a malicious post-checkout hook, and their code will be executed immediately upon your (recursive) clone of the repository.
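To make the idea concrete, here is a rough sketch of the kind of screening a server or a cautious user could apply before trusting a repository. It assumes the configurable folder name is the submodule name in .gitmodules and that the tell-tale sign is a `..` path component in that name; the helper name `suspicious_submodule_names` is hypothetical, and this is a heuristic, not the validation that patched Git itself performs.

```shell
# Print every submodule name in a .gitmodules file that contains a ".."
# path component (hypothetical helper; a heuristic, not Git's own check).
suspicious_submodule_names() {
  # Pull NAME out of each [submodule "NAME"] section header.
  sed -n 's/^[[:space:]]*\[submodule "\(.*\)"][[:space:]]*$/\1/p' "$1" |
  while IFS= read -r name; do
    # Wrap the name in slashes so ".." is matched only as a whole component.
    case "/$name/" in
      */../*) printf '%s\n' "$name" ;;
    esac
  done
}
```

Running `suspicious_submodule_names .gitmodules` in a checkout prints nothing for an ordinary repository and lists any names that try to climb out of their directory.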

Hosting providers

Thankfully, since most of us rely on a hosting provider to store our code, we can stop this vulnerability by simply blocking the repositories there. Visual Studio Team Services is actively blocking any repository that tries to set up a git submodule outside of the .git directory. I'm told that GitLab and GitHub are, too, and presumably other hosting providers are blocking these malicious repositories as well.

Upgrade your client

Blocking these repositories on the hosting providers shuts down an important attack vector, and I hope that it's unlikely that you git clone --recursive a repository that you don't trust. Despite that, you should still upgrade your client.

Git version 2.17.1 is the latest and greatest version of Git, and has been patched. But most people don't actually build from source, so your version of Git is probably provided to you by a distribution. You may have different versions available to you - ones that have had the patches applied by your vendor - so you may not be able to determine if you're vulnerable simply by looking at the version number.

Here are some simple steps to determine whether you're vulnerable and some upgrade instructions if you are.

Are you vulnerable?

You can easily (and safely) check to see if your version of Git is vulnerable to this recent security vulnerability. Run this from a temporary directory:

git init test && \
  cd test && \
  git update-index --add --cacheinfo 120000,e69de29bb2d1d6434b8b29ae775ad8c2e48c5391,.gitmodules

Note: this will not actually clone any repositories to your system, and it will not execute any dangerous commands.

If you see:

error: Invalid path '.gitmodules'
fatal: git update-index: --cacheinfo cannot add .gitmodules

Congratulations - you are already running a version of Git that is not vulnerable.

If, instead, you see nothing, then your version of Git is vulnerable and you should upgrade immediately.
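If you need to run this check on several machines, it can be wrapped in a small function that cleans up after itself. This is a sketch only: `check_git_patched` is a hypothetical name, and it infers the verdict purely from whether the update-index command above succeeds.

```shell
# Run the check in a throwaway directory and print a one-word verdict.
# check_git_patched is a hypothetical helper name.
check_git_patched() {
  tmp=$(mktemp -d) || return 2
  if ! git init -q "$tmp/test" 2>/dev/null; then
    rm -rf "$tmp"
    echo "unknown"   # git is missing or could not create a repository
    return 2
  fi
  # A patched Git refuses to add .gitmodules as a symlink (mode 120000).
  if git -C "$tmp/test" update-index --add \
       --cacheinfo 120000,e69de29bb2d1d6434b8b29ae775ad8c2e48c5391,.gitmodules \
       2>/dev/null; then
    verdict="vulnerable"
  else
    verdict="patched"
  fi
  rm -rf "$tmp"
  echo "$verdict"
}
```

Calling `check_git_patched` prints `patched`, `vulnerable`, or `unknown` (when Git could not be run at all).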

Windows

Windows is quite easy to upgrade. Simply grab the newest version of Git for Windows (version 2.17.1) from https://gitforwindows.org/.

macOS

Apple ships Git with Xcode but unfortunately, they do not update it regularly, even for security vulnerabilities. As a result, you'll need to upgrade to the version that is included by a 3rd party. Homebrew is the preferred package manager for macOS.

  1. If you have not yet installed Homebrew, you can install it by running:

    /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
    

    at a command prompt.

  2. After that, you can use Homebrew to install git:

    brew install git
    
  3. Add the Homebrew install location (/usr/local) to your PATH.

    echo 'export PATH="/usr/local/bin:$PATH"' >> ~/.bashrc
    
  4. Close all open Terminal sessions, quit Terminal.app, and re-open it.

Linux (Debian, Ubuntu, Red Hat, CentOS)

If you're using the current version of Ubuntu or Debian, then they'll have the latest version ready. If you're on a stable system, like a server, you should be running an LTS release - a "long term support" version - where they backport security patches like this one. So you should simply need to:

  1. Get the latest information about the available software versions from the remote repository:

    Debian, Ubuntu:

    sudo apt-get update
    

    Red Hat, CentOS:

    sudo yum check-update
    
  2. Install the latest version of git:

    Debian, Ubuntu:

    sudo apt-get install git
    

    Red Hat, CentOS:

    sudo yum update git
    

Ensuring that you're patched

Now if you run:

git init test && \
  cd test && \
  git update-index --add --cacheinfo 120000,e69de29bb2d1d6434b8b29ae775ad8c2e48c5391,.gitmodules

at a command prompt, then you should see:

error: Invalid path '.gitmodules'
fatal: git update-index: --cacheinfo cannot add .gitmodules

And now you're patched against the Git security vulnerabilities, CVE-2018-11233 and CVE-2018-11235.

Thanks to Junio Hamano, Jeff King, Johannes Schindelin and the rest of the Git security community for their work to keep our source code safe and secure.

If you're interested in security vulnerabilities in Git, please join me at NDC Oslo, where I'll talk you through the details of this security issue and others.

tl;dr: If you just want the instructions for configuration, they're here.

I spend a lot of time writing cross-platform software, which means a lot of time writing code on Windows or testing my code there. So the Windows Subsystem for Linux has been a lifesaver for me, since it lets me run Linux applications — in fact, a whole Debian distribution — on my Windows machine (without needing to run a virtual machine).

I was talking to someone about this last week at the Build 2018 conference, and they mentioned that they liked WSL but they really wished that they had a GUI credential manager — like the Git Credential Manager — on the Linux side.

They were surprised when I told them that they could! 🤯

If you're not familiar with the Git Credential Manager, it allows you to authenticate to a remote Git server easily, even if you have a complex authentication pattern like Azure Active Directory or two-factor authentication. Git Credential Manager integrates into the authentication flow for services like Visual Studio Team Services, Bitbucket and GitHub and, once you're authenticated to your hosting provider, requests a new authentication token and stores it securely in the Windows Credential Manager. After the first time, you can use git to talk to your hosting provider without needing to re-authenticate; it will just use the token in the Windows Credential Manager.

This gets set up for you automatically when you install Git for Windows but you can also configure it to work with Windows Subsystem for Linux.

Git Credential Manager on Windows Subsystem for Linux

You can set it up by running [1]:

git config --global credential.helper "/mnt/c/Program\ Files/Git/mingw64/libexec/git-core/git-credential-manager.exe"

Now any git operation you perform within Windows Subsystem for Linux will use the credential manager. If you already have credentials cached for a host, it will simply read them out of the credential manager. Otherwise, you'll get the same nice UI dialog experience, even if you're in a Linux console.

This support relies on the fact that Windows Subsystem for Linux and Windows itself can interoperate and you can invoke Windows applications from WSL. [2]

  1. This is the default path for a Git for Windows installation; you may need to tweak this if you're using Cygwin or mingw.

  2. Note, however, that you do need to update to the Windows 10 April 2018 update; prior versions had a problem with sharing stdin/stdout when the Windows application was a .NET application instead of Win32. 

Introducing ntlmclient

May 6, 2018  •  11:55 PM

I’d like to announce ntlmclient, a new open source library that I built. Usually I'd be announcing it proudly and encouraging you to use my code — but this time, I’d ask you to please not use it.

See, this new library is a library that performs NTLM2 authentication. And, to be honest, I’d like to ask you to not perform NTLM2 authentication at all. But — if you really must use NTLM2 — then I suppose that this new library will do the job.

I intend to add this to libgit2, the Git library that backs clients like GitKraken and gmaster. Because regrettably, we really must use NTLM2. Many people still use NTLM2 with their on-premises Team Foundation Server instances, and we’d like all the tools that use libgit2 to be able to talk to their Git repositories hosted in TFS.

At the moment, libgit2 can already speak NTLM2 on Windows clients; using this library will enable Unix platforms to speak NTLM2 as well.

A bit of background

My first experience with NTLM was way back in 2006, when I was working at Teamprise. We were building cross-platform tools to talk to Microsoft Team Foundation Server; we had a plug-in for the Eclipse IDE, a standalone GUI tool, and a command-line client for Windows, Mac, Linux and a bunch of legacy Unix platforms.

(Today these tools live on as Microsoft Team Explorer Everywhere.)

We faced a lot of challenges reimplementing Microsoft’s tools — and one of those early features that we needed to implement was the NTLM2 authentication protocol. Since we were building a plug-in for the Eclipse IDE, we built our entire client suite in Java. And — regrettably — our HTTP stack didn’t support NTLM2 at the time, only the older LM and NTLM protocols.

But LM and NTLM are truly ancient algorithms, so modern systems disable them both, in favor of the slightly less ancient NTLM2 algorithm. So at Teamprise, we were forced to learn about, and ultimately implement, NTLM2 ourselves.

That was over a decade ago, and it certainly hasn’t gotten any better with age.

How NTLM2 works

Many people still have their Windows servers (and some of the applications on them) configured to use NTLM2. That’s because it’s not without its advantages: it’s the simplest way to enable "single sign-on". When you sign in to your local computer, it hashes your password and stores that hash in memory. This is the same hash that the server (or your Active Directory server) has stored. Later, when you communicate with a server that wants you to authenticate with NTLM2, you encrypt a shared random value that the server gives you using that hash. Then you send that encrypted value to the server; it will encrypt the same value with its hash and, if the two results match, that proves you entered the same password without actually having to transmit the password itself, or even keep it in memory in plaintext.
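The shape of that exchange can be sketched in a few lines of shell. To be clear about the assumptions: this is a toy, not NTLM2. SHA-256 (via `sha256sum`) stands in for the real primitives, the challenge is a fixed string rather than a random value, and the helper names `digest`, `password_hash` and `response` are all made up for illustration.

```shell
# Toy challenge-response exchange. SHA-256 stands in for NTLM2's real
# primitives, so this shows the shape of the protocol, not the protocol.
digest() { printf '%s' "$1" | sha256sum | cut -d' ' -f1; }

# Both sides derive the same hash from the password...
password_hash() { digest "$1"; }
# ...and mix that hash with the server's challenge to form a response.
response() { digest "$1:$2"; }

challenge="5f2c9a41"                           # random per login in practice
server_hash=$(password_hash "correct horse")   # what the server has stored
client_hash=$(password_hash "correct horse")   # what sign-in left in memory

# The client sends only its response; the server recomputes and compares.
if [ "$(response "$client_hash" "$challenge")" = "$(response "$server_hash" "$challenge")" ]; then
  echo "authenticated"
else
  echo "rejected"
fi
```

The password itself never crosses the wire, only a value derived from the hash and the challenge, which is the property the paragraph above describes.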

This is a clever way to allow you to authenticate to a remote server without having to type your password. But there’s a better way.

Alternatives

Kerberos also enables single sign-on, but instead of relying on ciphers like RC4 and HMAC-MD5, Kerberos is built on modern ciphers. Microsoft Active Directory is built around Kerberos, so it’s obviously well-supported on Windows, but Kerberos is also an industry standard. There are great implementations available including MIT’s and Heimdal.

However, the reality is that Kerberos requires some additional configuration on Windows servers. And this configuration is absolutely worth it on a production machine. If you want to support single sign-on, you should probably be using Kerberos in production. But if you’re just spinning up a test server, it’s sometimes worth it to just use NTLM2. And the reality is that NTLM2 over an encrypted connection like TLS is still a reasonable solution.

Fundamentally, a lot of people still use it.

So I created a new NTLM2 client library. It’s basically a port of Team Explorer Everywhere’s NTLM2 code that’s been used in production for over a decade — but it’s a port to C, with minimal dependencies. It only requires a cryptography library for the underlying cipher support. On macOS, ntlmclient will use Common Crypto, the system’s cryptography libraries. On Linux, ntlmclient uses either OpenSSL or mbedTLS, whichever library you have on your system.

So, please, don’t use NTLM2. If you need single sign-on support, you're probably best off using Kerberos. And if you don't need single sign-on support, just use Basic authentication over TLS.

But if you do need to support NTLM2 — like if you need to talk to an on-premises Team Foundation Server that wasn’t configured with an SPN for Kerberos — then I hope my new ntlmclient library helps.

Git with Unity

March 22, 2018  •  11:54 PM
This is a follow-up to my post earlier this week introducing the correct .gitattributes settings for line endings; it dives a little bit more deeply into some of the configuration that you might be interested in if you're getting started building games in Unity.

Around the VSTS team, we've got a lot of gamers. A couple of people on the team used to develop commercial games, or even Xbox itself. Some people build games in their spare time, as hobby projects. And of course some don't want to develop games, but still love to play them.

And me? I'm actually none of those. I get excited every time there's a new Mario platformer, but otherwise, I don't really play games. But I wanted to try my hand at hacking on them, so I decided to grab Unity and give it a spin.

So far I haven't actually built anything useful, but I did start to understand how Unity fits in with Git.

.gitignore

The .gitignore file is a metadata file that controls how Git operates on your repository. Files listed in .gitignore will be — like the name implies — ignored. They won't show up in git status and they won't be added to the repository.

It's important to make sure that you .gitignore your build output directories, any cache data and temporary files or directories that your tools make. And Unity actually has a lot of these, including Build directories and a cache directory, Library.

Thankfully, you don't have to know what all these directories are. You can instead go to the gitignore repository, which contains crowd-sourced best practices that you can just drop into your repository. There's a .gitignore file customized for each type of project you might encounter. So you can just grab the Unity .gitignore, drop it into your repository, and go.

Easier still: if you use a Git hosting provider like GitHub or Visual Studio Team Services, you can select one of these .gitignore files when you create the repository. We have the same crowd-sourced .gitignore files ready to go, so that when you create a repository they're there for you to get started with.

Unity .gitignore in VSTS

.gitattributes

The other metadata file that Git uses is called .gitattributes. You might be familiar with .gitattributes because you use it to control how Git handles line endings for your files.

You do use .gitattributes to configure your line endings, don't you?

But .gitattributes is more than just line endings - you can also configure how files are merged when two people change the same file in two different branches.

This means that you can set up Unity's "Smart Merge" functionality. By default, Git is totally unaware of the type of content that you're checking in. If a file is changed in two different branches, it will try to merge the file just by looking at the lines, without understanding them.

But Unity includes a semantic merge tool that understands the actual contents of the scene files, so it can help deal with merging them. You just need to configure .gitattributes to use it.

You can add these lines to your .gitattributes:

*.anim merge=unityyamlmerge eol=lf
*.asset merge=unityyamlmerge eol=lf
*.controller merge=unityyamlmerge eol=lf
*.mat merge=unityyamlmerge eol=lf
*.meta merge=unityyamlmerge eol=lf
*.physicsMaterial merge=unityyamlmerge eol=lf
*.physicsMaterial2D merge=unityyamlmerge eol=lf
*.prefab merge=unityyamlmerge eol=lf
*.unity merge=unityyamlmerge eol=lf
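Note that `merge=unityyamlmerge` only names a merge driver; Git still needs to be told what command that name runs, and that lives in Git configuration rather than in .gitattributes. A minimal sketch, with two assumptions called out: the path to UnityYAMLMerge is a placeholder that depends on your Unity installation, and the `merge -p` command line is adapted from the mergetool invocation in Unity's Smart Merge documentation, so check it against your Unity version. The function name `configure_unity_merge` is hypothetical.

```shell
# Register a merge driver named "unityyamlmerge" in the current repository.
# The tool path passed in is a placeholder; use your own Unity install's path.
configure_unity_merge() {
  tool=$1
  git config merge.unityyamlmerge.name "Unity SmartMerge"
  # %O = common ancestor, %B = their version, %A = our version. Git expects
  # the driver to leave its result in %A, so %A is also the output file.
  git config merge.unityyamlmerge.driver "'$tool' merge -p %O %B %A %A"
}
```

You would call it once per clone, for example `configure_unity_merge "/path/to/UnityYAMLMerge"`, since this setting is local configuration and is not shared through the repository.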

Git LFS

One of the great things about Git is that it's a distributed version control system. That means that you get an entire copy of the repository from the server. That means not just all of the files in the current version of the branch that you're interested in, but all the branches, and all the history that you've ever checked in.

This means that you can work completely disconnected from your server: you can run git log or git blame to analyze the changes that have been made, even if you're on an airplane [1].

But it's problematic when you're checking in large files. If you have assets like images, audio or movies, Git starts to choke. And it's not even the size of the assets themselves as much as the history that's problematic.

If you have a 100K PNG, then that's not so bad. The problem is that you've changed that 100K PNG a dozen times. Now you've got 1.2 MB in history that you have to download every time you run git clone. And that's just one file. So it adds up very quickly.

Git LFS helps here: it's the Large File Storage extension to Git.

Instead of storing every copy of these assets in the repository directly, Git LFS stores this data in a separate location, the large file storage area. In the repository, it just checks in a little stub file, the "git-lfs pointer file", that lets Git LFS know where it can get the data when it needs it.

So when you clone the repository, you don't download all those assets, just the tiny (128 byte) git-lfs pointer files. When git needs the files, to write them to your working directory, Git LFS will download them from the server and put them on disk. It's a nice hybrid system between a totally distributed version control system, and a centralized system.
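A pointer file is just a few lines of text: a spec version, the SHA-256 of the real content, and its size in bytes. As an illustration, this sketch prints the pointer text for a given file. Git LFS generates these for you, so `lfs_pointer` is a hypothetical helper meant only to show what ends up in the repository; it assumes `sha256sum` is available.

```shell
# Print the Git LFS pointer text for a file: spec version, SHA-256 of the
# content, and size in bytes. lfs_pointer is a hypothetical helper name.
lfs_pointer() {
  oid=$(sha256sum < "$1" | cut -d' ' -f1)
  size=$(wc -c < "$1" | tr -d ' ')
  printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}
```

Whatever the size of the asset, the pointer stays a small, fixed-size stub, which is why cloning stays fast.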

You can download git-lfs - or, if you use Git for Windows, it's already included. It's easy to set up: you just add some more lines to your .gitattributes file to make sure that your textures and artwork are handled by LFS:

*.jpg filter=lfs diff=lfs merge=lfs -text
*.gif filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text

*.wav filter=lfs diff=lfs merge=lfs -text
*.ogg filter=lfs diff=lfs merge=lfs -text
*.mp3 filter=lfs diff=lfs merge=lfs -text

*.mp4 filter=lfs diff=lfs merge=lfs -text
*.mov filter=lfs diff=lfs merge=lfs -text

*.fbx filter=lfs diff=lfs merge=lfs -text
*.blend filter=lfs diff=lfs merge=lfs -text
*.obj filter=lfs diff=lfs merge=lfs -text

Locking

Git LFS 2.0 introduces the ability to put advisory locks on files. This is critical if you're working with multiple artists. Otherwise, two people might start working on the same image. When they go to merge their branches, they'll realize that they've both done this work, and they have a merge conflict.

Unfortunately, there's no "smart merge" for images. They'll have to figure out how to resolve this manually - probably losing one person's work.

The new locking functionality does require additional support on the server. Both GitHub and Visual Studio Team Services offer locking, so if you're using one of those services, you can just run:

git lfs lock file.png

to lock a file. When you've finished editing, and want to unlock it, you can run:

git lfs unlock file.png

I'm excited to continue playing with Unity for building games. But to be completely honest, I'm even more excited to use a totally new tool with Git. There's a lot of new functionality here, and I'm looking forward to learning how the Git community can help make Unity developers even more productive with version control.

  1. I'm old enough to remember when airplanes didn't have Wifi. Back in those bad old days, the version control nerds used to talk about "working on an airplane" meaning "working without being able to talk to your version control server". 

Git for Windows: Line Endings

March 20, 2018  •  5:04 PM

If you’re on a team of Windows developers - or more importantly, on a cross-platform development team - one of the things that comes up constantly is line endings. Your line ending settings can be the difference between development productivity and constant frustration.

The key to dealing with line endings is to make sure your configuration is committed to the repository, using .gitattributes. For most people, this is as simple as creating a file named .gitattributes at the root of your repository that contains one line:

* text=auto

With this set, Windows users will have text files converted from Windows style line endings (\r\n) to Unix style line endings (\n) when they’re added to the repository.

If you're bored already, you can probably stop reading right now. For most developers - in most repositories - this is all you need to know.

Why not core.autocrlf?

Originally, Git for Windows introduced a different approach for line endings that you may have seen: core.autocrlf. This is a similar approach to the attributes mechanism: the idea is that a Windows user will set a Git configuration option core.autocrlf=true and their line endings will be converted to Unix style line endings when they add files to the repository.

The difference between these two options is subtle, but critical: the .gitattributes is set in the repository, so it's shared with everybody. But core.autocrlf is set in the local Git configuration. That means that everybody has to remember to set it, and set it identically.

The first, best option you have to get this right is when you’re installing Git for Windows:

Git for Windows Installer: core.autocrlf

You probably want the first option, but you’d be forgiven if you didn’t know that the first time you ran the installer.

The problem with core.autocrlf is that if some people have it set to true and some don’t, you’ll get a mix of line endings in your repository. And that’s not good - because this setting doesn’t just tell Git what you want it to do with files going in to your repository. It also tells Git what you’ve already done, and what the line endings look like on the files that are already checked in.

This is why one of the most common symptoms of a line ending configuration problem is seeing “phantom changes”: running git status tells you that you’ve changed a file, but running git diff doesn’t show you any changes. How can that be? Line endings.

Phantom Changes

Imagine that some file is checked in to your repository with Windows-style line endings. For some reason, somebody hadn't set core.autocrlf=true when they added the file. You, on the other hand, being a diligent Git for Windows user, did set that option.

When you run git status, git will look at that file to decide whether you've made any changes to it. When it compares what's on disk to what's in your repository, it will convert the line endings on-disk from Windows-style to Unix-style, as it expects them to be in the repository. Since the existing file in the repository had Windows-style line endings, and you expect them to be Unix style, git will determine that the file is different. (It is, byte for byte, different.)
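If you suspect this is happening, you can ask Git what line endings it has actually recorded. The sketch below wraps `git ls-files --eol` (available in reasonably recent versions of Git); the function name `show_line_endings` is hypothetical.

```shell
# Show the line endings Git has recorded in the index (i/) and sees in the
# working tree (w/) for each tracked file, plus any text/eol attributes.
show_line_endings() {
  git -C "${1:-.}" ls-files --eol
}
```

A line that starts with `i/crlf` is a file that was checked in with Windows-style line endings, which makes it a likely source of phantom changes.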

By using .gitattributes, you ensure that these settings exist at the repository level, instead of leaving it up to individual users to configure them correctly. This means there’s no opportunity for misconfiguration by an individual user.

Of course, the best time to set this up is at the very moment you create your repository, before you add any files. Doing it after the fact means that you may still have some files added with the wrong configuration.

Over time, these files will be updated as you edit them. You can try to renormalize files, updating the line endings, but doing so will cause annoying merge conflicts for anybody who created a branch before the renormalization.
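If you do decide to renormalize, a sketch of the staging step looks like this. It assumes Git 2.16 or later, which added the `--renormalize` option to git add, and that your .gitattributes is already in place; the function name `renormalize_line_endings` is hypothetical.

```shell
# Re-apply the current .gitattributes rules to every tracked file and stage
# the result (requires Git 2.16 or later for --renormalize).
renormalize_line_endings() {
  git -C "${1:-.}" add --renormalize .
}
```

After running it, review the staged changes with git status and commit them as a single, clearly labeled commit so that teammates know where the line ending churn came from.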

What About Binaries?

Generally speaking, git is pretty good at detecting whether a file is a binary or not. If it decides that a file is a binary, then it will refuse to convert line endings. But it's still good practice to configure git not to convert line endings for your binary files.

You can remove the text attribute from files that you don't want to have line ending conversions. For example, if you have PNGs in your repository, your .gitattributes might look like this:

* text=auto
*.png -text
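To confirm that the rules do what you expect, you can ask Git how the text attribute resolves for a particular path. This sketch wraps `git check-attr`; the function name `show_text_attr` is hypothetical, and it needs to run inside the repository that holds the .gitattributes file.

```shell
# Report how the "text" attribute resolves for each path given, using the
# .gitattributes rules in the current repository.
show_text_attr() {
  git check-attr text -- "$@"
}
```

With the two-line .gitattributes above, `show_text_attr logo.png` reports the attribute as `unset` (no conversion), while a path such as notes.txt reports `auto`.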

Of course, there are more advanced settings in your .gitattributes that can be applied. These are especially useful in particular development scenarios. We'll dive deeper into some of those - like using Unity - in the next blog post.