Edward Thomson

Git for Windows: Line Endings

March 20, 2018  •  5:04 PM

If you’re on a team of Windows developers - or more importantly, on a cross-platform development team - one of the things that comes up constantly is line endings. Your line ending settings can be the difference between development productivity and constant frustration.

The key to dealing with line endings is to make sure your configuration is committed to the repository, using .gitattributes. For most people, this is as simple as creating a file named .gitattributes at the root of your repository that contains one line:

* text=auto

With this set, Windows users will have text files converted from Windows style line endings (\r\n) to Unix style line endings (\n) when they’re added to the repository.

If you're bored already, you can probably stop reading right now. For most developers - in most repositories - this is all you need to know.

Why not core.autocrlf?

Originally, Git for Windows introduced a different approach for line endings that you may have seen: core.autocrlf. This is a similar approach to the attributes mechanism: the idea is that a Windows user will set a Git configuration option core.autocrlf=true and their line endings will be converted to Unix style line endings when they add files to the repository.

The difference between these two options is subtle, but critical: the .gitattributes is set in the repository, so it's shared with everybody. But core.autocrlf is set in the local Git configuration. That means that everybody has to remember to set it, and set it identically.
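To see what a particular machine is actually using, you can ask Git for the value and the file that defines it. This is only a minimal sketch: it assumes Git 2.8 or later (for --show-origin) and works in a throwaway repository so that it doesn't touch your real configuration.

```shell
#!/bin/sh
set -eu

# Throwaway repository, so the real configuration is left alone.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Set the option for this repository only (it lands in .git/config).
git config core.autocrlf true

# Print the effective value along with the file that defines it.
origin=$(git config --show-origin --get core.autocrlf)
echo "$origin"

# The setting is not a tracked file, so it never travels with a clone or a push.
tracked=$(git ls-files)
```

The value comes from a configuration file rather than from anything in the repository's history, which is why each person on the team has to set it themselves.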

The first, best option you have to get this right is when you’re installing Git for Windows:

Git for Windows Installer: core.autocrlf

You probably want the first option, but you’d be forgiven if you didn’t know that the first time you ran the installer.

The problem with core.autocrlf is that if some people have it set to true and some don’t, you’ll get a mix of line endings in your repository. And that’s not good - because this setting doesn’t just tell Git what you want it to do with files going into your repository. It also tells Git what you’ve already done, and what the line endings look like on the files that are already checked in.

This is why one of the most common symptoms of a line ending configuration problem is seeing “phantom changes”: running git status tells you that you’ve changed a file, but running git diff doesn’t show you any changes. How can that be? Line endings.

Phantom Changes

Imagine that some file is checked in to your repository with Windows-style line endings. For some reason, somebody hadn't set core.autocrlf=true when they added the file. You, on the other hand, being a diligent Git for Windows user, did set that option.

When you run git status, git will look at that file to decide whether you've made any changes to it. When it compares what's on disk to what's in your repository, it will convert the line endings on disk from Windows-style to Unix-style, just as it would when adding the file. Since the existing file in the repository had Windows-style line endings, and you expect them to be Unix-style, git will determine that the file is different. (It is, byte for byte, different.)
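If you suspect this is happening, one way to check is to ask Git which line endings are actually stored for each file. The sketch below assumes Git 2.8 or later (for git ls-files --eol); the file names and the commit identity are made up for the example.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Simulate a contributor who never enabled line ending conversion.
git config core.autocrlf false
printf 'one\r\ntwo\r\n' > windows.txt
printf 'one\ntwo\n' > unix.txt
git add windows.txt unix.txt
git -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "Add files without line ending conversion"

# "i/" is the line ending style stored in the index (the repository side),
# "w/" is the style of the file in the working directory.
eol_report=$(git ls-files --eol)
echo "$eol_report"
```

Any file reported as i/crlf was checked in with Windows-style line endings, which makes it a candidate for showing up as a phantom change.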

By using .gitattributes, you ensure that these settings exist at the repository level, instead of leaving it up to individual users to know how to configure them correctly. This means there’s no opportunity for misconfiguration by an individual user.

Of course, the best time to set this up is at the very moment you create your repository, before you add any files. Doing it after the fact means that you may still have some files added with the wrong configuration.

Over time, these files will be updated as you edit them. You can try to renormalize files, updating the line endings, but doing so will cause annoying merge conflicts for anybody who created a branch before the renormalization.
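If you do decide to renormalize, a minimal sketch looks like the following. It assumes Git 2.16 or later, which added git add --renormalize; the file name and commit identity are hypothetical.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git config core.autocrlf false

# Helper so commits work without a configured identity (made-up identity).
commit() {
    git -c user.name=Example -c user.email=example@example.invalid \
        commit -q -m "$1"
}

# A file that was committed with Windows-style line endings.
printf 'one\r\ntwo\r\n' > notes.txt
git add notes.txt
commit "Add notes with CRLF line endings"

# Introduce the attributes, then re-run line ending conversion on tracked files.
echo '* text=auto' > .gitattributes
git add .gitattributes
git add --renormalize .
commit "Normalize line endings"

# The copy in the repository should now use Unix-style line endings.
after=$(git ls-files --eol notes.txt)
echo "$after"
```

Doing this in a single, clearly labeled commit at least makes it obvious where any later merge conflicts come from.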

What About Binaries?

Generally speaking, git is pretty good at detecting whether a file is a binary or not. If it decides that a file is a binary, then it will refuse to convert line endings. But it's still good practice to configure git not to convert line endings for your binary files.

You can remove the text attribute from files that you don't want to have line ending conversions. For example, if you have PNGs in your repository, your .gitattributes might look like this:

* text=auto
*.png -text
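To confirm that the attributes apply the way you expect, you can ask Git directly with git check-attr. This is a small sketch in a throwaway repository; logo.png and main.c are hypothetical paths and don't need to exist.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# The same two rules as above.
printf '%s\n' '* text=auto' '*.png -text' > .gitattributes

# check-attr reports the attribute value that applies to a given path.
png_attr=$(git check-attr text -- logo.png)
src_attr=$(git check-attr text -- main.c)
echo "$png_attr"   # logo.png: text: unset
echo "$src_attr"   # main.c: text: auto
```

Here "unset" means that line ending conversion is turned off for the file, while "auto" leaves the text-or-binary decision to Git.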

Of course, there are more advanced settings in your .gitattributes that can be applied. These are especially useful in particular development scenarios. We'll dive deeper into some of those - like using Unity - in the next blog post.

Git Security: Further Reading

February 21, 2018  •  7:31 PM

In my talk at Git Merge today, I mentioned that there were some other difficulties handling security issues in Git. Here's some more information:

Talking about Git at Barcelona .NET Core: 6 March 2018

February 21, 2018  •  7:31 PM

I'm excited to be speaking at the Barcelona .NET Core group on Tuesday, 6 March, 2018. I'll be talking about How Microsoft "Got Git", adopting the Git version control system in Visual Studio and VSTS, and "Scripting Git", how to build .NET applications that interact with Git repositories using LibGit2Sharp.

Please join me! Especially if you're already in town for the Git Merge conference happening on the 7th and 8th of March. And if you're on the fence about coming to Git Merge, I can offer you 15% off your ticket, just because you seem like a nice person. When you register, just use the code:

GITCOMMUNITY15

I hope to see you there!

Merge vs Rebase: Do They Produce the Same Result?

December 21, 2017  •  10:29 PM

I get asked quite a lot whether I recommend a merge-based workflow, or one where people rebase onto master. But to be quite honest, I couldn't possibly care less. Your workflow is your workflow, after all; it's up to your team to work in the way that's most productive for you. For some teams that's merging, for some teams that's rebasing… In the end, the code gets integrated and the end result is the same either way, whether you merge or rebase it, right?

Right?

If you're a rebase fan, you've probably run into cases where you get conflicts during a rebase that you wouldn't get during a merge. But that's not very interesting… is there a case where merge and rebase both finish and produce a result, but a different tree?

Is git-merge guaranteed to produce the same results as git-rebase?

No!

It's actually not a guarantee; in fact, you can create two branches that merge differently than they rebase.

You can follow along with this GitHub repository.

Two branches

Imagine that you have two branches, one is master, and the other is the unimaginatively named branch branch. They're both based off a common ancestor 0d7088f. Further, imagine that your branch has two commits based off that common ancestor:

Ancestor 0d7088f    branch 3f3ca4f    branch 09d3ac4
One                 One               One
Two                 2                 Two
Three               Three             Three
Four                Four              Four
Five                Five              Five
Six                 Six               Six
Seven               Seven             7
Eight               Eight             Eight

Finally, imagine that your master branch has a single commit based off the common ancestor:

Ancestor 0d7088f    master f2e864b
One                 One
Two                 2
Three               Three
Four                Four
Five                Five
Six                 Six
Seven               Seven
Eight               Eight

What happens when you try to merge or rebase these?

Merge

When Git merges two branches, it only looks at the tip commit in each branch, and compares them to their common ancestor. It does not look at any intermediate commits. In the above example, when we merge branch into master, the algorithm looks at the changes made in branch by comparing commit 09d3ac4 to the common ancestor commit 0d7088f. It also looks at the changes made in master by comparing commit f2e864b to the common ancestor commit.

The merge algorithm takes each line[1] in the common ancestor and compares it to the file in branch and the file in master. If the line is unchanged in both branches, then there's no problem - that line is brought into the merge result. In this example, line 1 is unchanged in both branches, so line 1 of the merge result will be One.

If a line is changed in only one branch, then that change is brought forward into the merge result. In this example, line 7 is changed only in branch. So in the resulting merge, line 7 will have the contents from branch, which is the digit 7. Also, line 2 is changed only in master, so in the merge result it will be the digit 2.

Merge Result
One
2
Three
Four
Five
Six
7
Eight

Remember that merge only looks at the tip commits, so comparing the common ancestor to branch, line two appears unchanged, since the ancestor and tip are identical.

Rebase

Rebase works a bit differently - instead of doing a three-way merge between the tip commits on each branch, it tries to replay the commits on one branch onto another. In the above example, if we want to rebase branch onto master, then Git will create a patch for each commit on branch and apply those patches onto master.[2]

When you rebase, Git will switch you to the master branch, checking out f2e864b. Then Git will apply the differences between the common ancestor and the first commit on branch. In this example, the patch between the common ancestor and the branch changes line two from Two to 2. But that's already the value of the file in master. So there's nothing to do, and the patch for 3f3ca4f applies cleanly.

Then a patch for the second commit on the branch is applied: it changes line two back to the text representation, and changes line seven to a digit. So the rebase result is:

Rebase Result
One
Two
Three
Four
Five
Six
7
Eight

So rebase preserves the changes in the branch while merge preserves the changes in master.

Conclusion

Generally these sorts of changes will cause a conflict instead of different results. It was key that in branch we changed the contents of line 2 back to the contents in the common ancestor. That allowed the merge engine to consider that the line in branch was unchanged.

Merge Result    Rebase Result
One             One
2               Two
Three           Three
Four            Four
Five            Five
Six             Six
7               7
Eight           Eight
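If you'd like to reproduce this yourself, the following sketch rebuilds the same history in a throwaway repository and runs both operations. The file name file.txt, the merged and rebased branch names, and the commit identity are assumptions made for this example, and the commit IDs it produces will not match the ones shown above.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name Example
git config user.email example@example.invalid

# Write the eight-line file with the given contents for lines two and
# seven, then commit it with the given message.
commit_file() {
    printf '%s\n' One "$1" Three Four Five Six "$2" Eight > file.txt
    git add file.txt
    git commit -q -m "$3"
}

commit_file Two Seven "Common ancestor"
git branch branch

commit_file 2 Seven "master: change line two"

git checkout -q branch
commit_file 2 Seven "branch: change line two"
commit_file Two 7 "branch: restore line two, change line seven"

# Merge: a three-way merge of the two tips against the common ancestor.
git checkout -q -b merged master
git merge -q --no-edit branch
merge_result=$(cat file.txt)

# Rebase: replay the commits from branch on top of master.
git checkout -q -b rebased branch
git rebase -q master
rebase_result=$(cat file.txt)

echo "merge:  $(echo "$merge_result" | tr '\n' ' ')"
echo "rebase: $(echo "$rebase_result" | tr '\n' ' ')"
```

The two printed lines should differ only in the second word: the merge keeps 2 from master, while the rebase ends up with Two from branch.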

So… is this a problem?

It might seem concerning that this comes up when there was an apparent revert of your changes. Logically, both the branch and the master branches changed line two, but then branch changed it back. So although this seems rather contrived, it's not that unlikely.

But whether you prefer a merge workflow or a rebase workflow, you should be careful with your integration and follow good development practices:

  1. Code review, ideally using pull requests, so that your team members have visibility into changes before they're integrated into master.

  2. Continuous integration builds and tests, as part of your integration workflow. Ideally, with build policies to ensure that builds succeed and tests pass.

So make sure to do proper code reviews, which keep this an interesting difference instead of an actual problem in your workflow.

  1. Strictly speaking, the merge engine doesn't actually look at lines, it looks at groups of lines, or "hunks". But it's easier to reason about individual lines for this example. 

  2. By default, rebase will create and then apply patches, but when invoked as git rebase --merge, it will cherry-pick the changes instead. This uses the merge engine instead of patch application, but in this example, the results are the same. 

Creating Mac Disk Images (DMG) with VSTS Build Agents

December 15, 2017  •  9:02 PM

I like to pretend that for most of my career, I've only worked on Unix platforms. But that's not exactly true; in fact, I've spent most of my time building cross-platform applications. But I pretend because writing cross-platform applications is hard. And writing cross-platform UI applications is especially hard. So it's exciting to be able to use Electron to easily create cross-platform UI.

I've started working on an Electron app, and even though it's not done yet, and definitely not ready to release, one of the things that I've been thinking about is packaging it for distribution.

Most Electron apps are bundled as ZIP archives, and maybe tarballs on Linux. And that's fine, but a really cool user experience is to provide a disk image for macOS. Instead of just a zipped-up .app folder, this lets you add some branding, and lets you add a link to the Applications folder so that users can just drag and drop:

Example DMG Background

One of the problems with creating a DMG, though, is that you need a Mac to do it. You can't easily create DMGs on a Linux box or a Windows VM. You need a proper Mac, and historically, it's been hard to find Macs in the cloud for CI hosts.

This was especially problematic if you needed to build any native code. You'd need Windows, Linux and Mac hosts in order to target all your platforms and, until recently, no CI system offered hosted build environments for all the platforms.

But now, Visual Studio Team Services offers hosted Windows, Linux and Mac systems for continuous integration build hosts. And they're free to get started with.

This means that you can use the macOS build hosts to package up your .app into a nicely branded disk image.

Create a background image

One of the great things about using a DMG is that you can create a custom background for the disk image. You can add your logo and provide some simple instructions that people should drag and drop your app into the Applications folder.

Example DMG background

If you're using Photoshop, make sure you set it up as a high DPI image. Here, I want a window size of 500x300, so I'm creating an image that is 1000x600 at 144dpi. If you do this, make sure you use the Save File dialog and select a PNG; the quick export functionality will save as 72dpi.

Create a DMG Generator Repository

Create a new Git repository that contains the necessary bits to generate a DMG with all the artwork. It should include:

  1. The background image that you created for your app.

  2. The create-dmg program written by Andrey Tarantsov.

    You can check this in directly, but I set this up as a submodule; the build step in VSTS can fetch submodules, even if they're hosted in other services. So you can have your proprietary software hosted in VSTS and it can reference Open Source submodules hosted in GitHub.

  3. A shell script to execute create-dmg with all the appropriate arguments. The exact icon placement dimensions will depend on the size of your background and its layout. I named mine build.sh; an example shell script for App.app (using the example DMG background above) might look like:

    #!/bin/sh
       
    create-dmg/create-dmg \
        --volname "App" \
        --background "background.png" \
        --window-pos 200 120 \
        --window-size 500 320 \
        --icon-size 80 \
        --icon "App.app" 125 175 \
        --hide-extension "App.app" \
        --app-drop-link 375 175 \
        "App.dmg" \
        "App.app"
    

    (Remember to set your shell script executable.)
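Marking the script as executable can be sketched as follows. The example works in a throwaway repository with a stand-in build.sh, and assumes Git 2.9 or later for git add --chmod.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

# Stand-in for the real build.sh shown above.
printf '#!/bin/sh\necho "building"\n' > build.sh

# Set the executable bit on disk (this is enough on macOS and Linux)...
chmod +x build.sh

# ...and record it in the index explicitly, which also works on Windows,
# where the file system does not keep track of the executable bit.
git add --chmod=+x build.sh

# Mode 100755 in the index means the file is checked out as executable.
mode=$(git ls-files -s build.sh | cut -d' ' -f1)
echo "$mode"
```

Recording the mode in Git matters here because the hosted build agent only sees what is committed, not the permissions on your local disk.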

Set up the CI/CD pipeline

It's easy to set up a build pipeline for a hosted Mac build agent. In your Visual Studio Team Services account, navigate to the Build page, and select New Definition.

New Build Definition

VSTS offers helpful build templates to get you started, but we're just running a shell script, so select empty process.

  1. In the "Process" tab, give it a name like Generate DMG, and set the agent queue to Hosted macOS Preview.

    Hosted macOS Preview

  2. In the "Get sources" tab, select where you've hosted the Git repository that contains your build script. Mine is - of course - hosted in a Git repository in Visual Studio Team Services, but the VSTS build process can build repositories that are hosted anywhere.

    Git Repository Hosting Providers

    If you added create-dmg as a submodule, then make sure to show "Advanced settings" and check the "Checkout submodules" option.

    Checkout Submodules

  3. Click the + button on the build phase to add a new build task to Phase 1. In the task selection list, scroll or search for the "Command Line" task and click "Add".

    Add a Command Line task

    Select the new task, and give it a display name like "Run Generator". In the "Tool" option, enter "/bin/sh", since we wrote a shell script. Finally, in "Arguments", enter the name of your shell script, "build.sh".

    Generator Configuration

  4. Click the + button on the build phase to add another build task. In the task selection list, scroll or search for the "Publish Build Artifacts" task and click "Add".

    Add a Publish Build Artifacts task

    This task will add the resultant DMG as a build artifact that we can download or deploy.

    In the "Path to publish" option, enter the name of the DMG that your build script generated. In my example, this is "App.dmg".

    Finally, in the "Artifact name" option, enter the name you want to save this artifact as; this is what you'll download or deploy. I recommend using the name of your DMG here.

    Publish Details

And that's all - it's just a few simple steps to create a DMG.

Build Steps

Queue Your Build

Once you've configured your build, you can select "Save & queue" at the top of the page to save your build definition and queue the build to create a DMG.

Save and Queue

When you do, you'll get a notification that your build has been queued. You can select the build number to follow a link to that build.

Build Queued

And once the build completes successfully, you'll be able to navigate to the "Artifacts" tab and download your DMG.

Artifacts

Obviously you could - and should - also deploy this to a download site automatically, creating a deployment pipeline in VSTS.

Example

I created a build script that will download the latest version of Visual Studio Code, unzip it, and then bundle it up as a branded DMG. You can adapt it to your needs.

https://github.com/ethomson/vscode_dmg

When you build this repository in the Visual Studio Team Services build process, you'll end up with a DMG that includes a branded Visual Studio Code experience:

Visual Studio Code DMG