Edward Thomson

Talking about Git at Barcelona .NET Core: 6 March 2018

February 21, 2018  •  7:31 PM

I'm excited to be speaking at the Barcelona .NET Core group on Tuesday, 6 March, 2018. I'll be talking about How Microsoft "Got Git", adopting the Git version control system in Visual Studio and VSTS, and "Scripting Git", how to build .NET applications that interact with Git repositories using LibGit2Sharp.

Please join me! Especially if you're already in town for the Git Merge conference happening on the 7th and 8th of March. And if you're on the fence about coming to Git Merge, I can offer you 15% off your ticket, just because you seem like a nice person. When you register, just use the code:

GITCOMMUNITY15

I hope to see you there!

Merge vs Rebase: Do They Produce the Same Result?

December 21, 2017  •  10:29 PM

I get asked quite a lot whether I recommend a merge-based workflow, or one where people rebase onto master. But to be quite honest, I couldn't possibly care less. Your workflow is your workflow, after all; it's up to your team to work in the way that's most productive for you. For some teams that's merging, for some teams that's rebasing… In the end, the code gets integrated and the end result is the same either way, whether you merge or rebase it, right?

Right?

If you're a rebase fan, you've probably run into cases where you get conflicts during a rebase that you wouldn't get during a merge. But that's not very interesting… is there a case where merge and rebase both finish and produce a result, but a different tree?

Is git-merge guaranteed to produce the same results as git-rebase?

No!

It's actually not a guarantee; in fact, you can create two branches that merge differently than they rebase. If you want to think about this on your own first, stop here; the details follow. 🤔

You can follow along with this GitHub repository.

Two branches

Imagine that you have two branches, one is master, and the other is the unimaginatively named branch branch. They're both based off a common ancestor 0d7088f. Further, imagine that your branch has two commits based off that common ancestor:

Ancestor 0d7088f    branch 3f3ca4f    branch 09d3ac4
One                 One               One
Two                 2                 Two
Three               Three             Three
Four                Four              Four
Five                Five              Five
Six                 Six               Six
Seven               Seven             7
Eight               Eight             Eight

Finally, imagine that your master branch has a single commit based off the common ancestor:

Ancestor 0d7088f    master f2e864b
One                 One
Two                 2
Three               Three
Four                Four
Five                Five
Six                 Six
Seven               Seven
Eight               Eight

What happens when you try to merge or rebase these?

Merge

When Git merges two branches, it only looks at the tip commit in each branch, and compares them to their common ancestor. It does not look at any intermediate commits. In the above example, when we merge branch into master, the algorithm looks at the changes made in branch by comparing commit 09d3ac4 to the common ancestor commit 0d7088f. It also looks at the changes made in master by comparing commit f2e864b to the common ancestor commit.

The merge algorithm takes each line [1] in the common ancestor and compares it to the same line in branch and in master. If the line is unchanged in both branches, then there's no problem - that line is brought into the merge result. In this example, line 1 is unchanged in both branches, so line 1 of the merge result will be One.

If a line is changed in only one branch, then that change is brought forward into the merge result. In this example, line 7 is changed only in branch. So in the resulting merge, line 7 will have the contents from branch, which is the digit 7. Also, line 2 is changed only in master, so in the merge result it will be the digit 2.

Merge Result
One
2
Three
Four
Five
Six
7
Eight

Remember that merge only looks at the tip commits, so comparing the common ancestor to branch, line two appears unchanged, since that line is identical in the ancestor and the tip.
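
The line-by-line rule above can be sketched in a few lines of Python. This is a simplified model, not Git's merge engine: it assumes all three files have the same number of lines and ignores hunks, insertions and deletions. The function name three_way_merge and the variable names are made up for illustration.

```python
# Simplified line-by-line three-way merge (illustrative only).
# File contents are taken from the example tables in this post.
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 0d7088f
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]      # 09d3ac4
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # f2e864b


def three_way_merge(base, ours, theirs):
    """Merge two tips against their common ancestor, one line at a time."""
    merged = []
    for base_line, our_line, their_line in zip(base, ours, theirs):
        if our_line == their_line:
            merged.append(our_line)      # both sides agree
        elif our_line == base_line:
            merged.append(their_line)    # only "theirs" changed this line
        elif their_line == base_line:
            merged.append(our_line)      # only "ours" changed this line
        else:
            raise ValueError(f"conflict: {our_line!r} vs {their_line!r}")
    return merged


# Intermediate commit 3f3ca4f never appears here: only the tips are compared.
merge_result = three_way_merge(ANCESTOR, MASTER_TIP, BRANCH_TIP)
print(merge_result)
```

Because only the tips are passed in, the sketch has no way to know that branch ever touched line two, which is why master's 2 survives.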

Rebase

Rebase works a bit differently - instead of doing a three-way merge between the tip commits on each branch, it tries to replay the commits on one branch onto another. In the above example, if we want to rebase branch onto master, then Git will create a patch for each commit on branch and apply those patches onto master. [2]

When you rebase, Git will switch you to the master branch, checking out f2e864b. Then Git will apply the differences between the common ancestor and the first commit on branch. In this example, the patch between the common ancestor and 3f3ca4f changes line two from Two to 2. But that's already the value of the file in master. So there's nothing to do, and the patch for 3f3ca4f applies cleanly.

Then a patch for the second commit on the branch is applied: it changes line two back to the text representation, and changes line seven to a digit. So the rebase result is:

Rebase Result
One
Two
Three
Four
Five
Six
7
Eight

So rebase preserves the changes in the branch while merge preserves the changes in master.
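
The replay described above can be modelled in Python. This is a rough sketch rather than what git rebase does internally: each commit is reduced to a list of single-line changes, and a change whose result is already present is treated as applying cleanly. The helper names (make_patch, apply_patch, rebase) are hypothetical.

```python
# Simplified model of replaying a branch's commits onto master (illustrative only).
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 0d7088f
BRANCH_1 = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]      # 3f3ca4f
BRANCH_2 = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]        # 09d3ac4
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # f2e864b


def make_patch(before, after):
    """Record every changed line as (index, old_text, new_text)."""
    return [(i, old, new)
            for i, (old, new) in enumerate(zip(before, after))
            if old != new]


def apply_patch(lines, patch):
    """Apply single-line changes; a change that is already present is a no-op."""
    result = list(lines)
    for index, old, new in patch:
        if result[index] == new:
            continue                     # already applied, nothing to do
        if result[index] != old:
            raise ValueError(f"conflict on line {index + 1}")
        result[index] = new
    return result


def rebase(onto, history):
    """Replay each step of `history` (ancestor first, then each commit) onto `onto`."""
    result = list(onto)
    for before, after in zip(history, history[1:]):
        result = apply_patch(result, make_patch(before, after))
    return result


rebase_result = rebase(MASTER_TIP, [ANCESTOR, BRANCH_1, BRANCH_2])
print(rebase_result)
```

Here the intermediate commit matters: the second patch explicitly turns 2 back into Two, so the branch's version of line two wins.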

Conclusion

Generally these sorts of changes will cause a conflict instead of different results. It was key that in branch we changed the contents of line 2 back to the contents in the common ancestor. That allowed the merge engine to consider that the line in branch was unchanged.

Merge Result    Rebase Result
One             One
2               Two
Three           Three
Four            Four
Five            Five
Six             Six
7               7
Eight           Eight

So… is this a problem?

It might seem concerning that this comes up when there was an apparent revert of your changes. Logically, both branch and master changed line two, but then branch changed it back. So although this seems rather contrived, it's not that unlikely.

But whether you prefer a merge workflow or a rebase workflow, you should be careful with your integration and follow good development practices:

  1. Code review, ideally using pull requests, so that your team members have visibility into changes before they're integrated into master.

  2. Continuous integration builds and tests, as part of your integration workflow. Ideally, with build policies to ensure that builds succeed and tests pass.

So make sure to do proper code reviews, which keep this an interesting difference instead of an actual problem in your workflow.

  [1] Strictly speaking, the merge engine doesn't actually look at lines, it looks at groups of lines, or "hunks". But it's easier to reason about individual lines for this example.

  [2] By default, rebase will create and then apply patches, but when invoked as git rebase --merge it will cherry-pick the changes instead. This uses the merge engine instead of patch application, but in this example, the results are the same.

Creating Mac Disk Images (DMG) with VSTS Build Agents

December 15, 2017  •  9:02 PM

I like to pretend that for most of my career, I've only worked on Unix platforms. But that's not exactly true; in fact, I've spent most of my time building cross-platform applications. But I pretend because writing cross-platform applications is hard. And writing cross-platform UI applications is especially hard. So it's exciting to be able to use Electron to easily create cross-platform UI.

I've started working on an Electron app, and even though it's not done yet, and definitely not ready to release, one of the things that I've been thinking about is packaging it for distribution.

Most Electron apps are bundled as ZIP archives, and maybe tarballs on Linux. And that's fine, but a really cool user experience is to provide a disk image for macOS. Instead of just a zipped-up .app folder, this lets you add some branding, and lets you add a link to the Applications folder so that you can just drag and drop:

Example DMG Background

One of the problems with creating a DMG, though, is that you need a Mac to do it. You can't easily create DMGs on a Linux box or a Windows VM. You need a proper Mac, and historically, it's been hard to find Macs in the cloud for CI hosts.

This was especially problematic if you needed to build any native code. You'd need Windows, Linux and Mac hosts in order to target all your platforms and, until recently, no CI system offered hosted build environments for all the platforms.

But now, Visual Studio Team Services offers hosted Windows, Linux and Mac systems for continuous integration build hosts. And they're free to get started with.

This means that you can use the macOS build hosts to package up your .app into a nicely branded disk image.

Create a background image

One of the great things about using a DMG is that you can create a custom background for the disk image. You can add your logo and provide some simple instructions that people should drag and drop your app into the Applications folder.

Example DMG background

If you're using Photoshop, make sure you set it up as a high DPI image. Here, I want a window size of 500x300, so I'm creating an image that is 1000x600 at 144dpi. If you do this, make sure you use the Save File dialog and select a PNG; the quick export functionality will save as 72dpi.
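
The arithmetic behind those numbers, as a small Python sketch: assuming 72dpi is the 1x baseline, 144dpi is a 2x scale, so each window dimension doubles. The function name canvas_size is just for illustration.

```python
BASE_DPI = 72  # assumed 1x baseline resolution


def canvas_size(window_width, window_height, dpi=144):
    """Return the pixel size of a background image for the given window size."""
    scale = dpi / BASE_DPI               # 144 / 72 = 2.0
    return round(window_width * scale), round(window_height * scale)


canvas = canvas_size(500, 300)
print(canvas)  # (1000, 600)
```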

Create a DMG Generator Repository

Create a Git repository that contains the necessary bits to generate a DMG with all the artwork. You'll need to create a new repository that includes:

  1. The background image that you created for your app.

  2. The create-dmg program written by Andrey Tarantsov.

    You can check this in directly, but I set this up as a submodule; the build step in VSTS can fetch submodules, even if they're hosted in other services. So you can have your proprietary software hosted in VSTS and it can reference Open Source submodules hosted in GitHub.

  3. A shell script to execute create-dmg with all the appropriate arguments. The exact icon placement dimensions will depend on the size of your background and its layout. I named mine build.sh; an example shell script for App.app (using example DMG background, above) might look like:

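    #!/bin/sh
    # Package App.app into App.dmg using the create-dmg submodule.
    # The icon and drop-link positions below match the example background.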
       
    create-dmg/create-dmg \
        --volname "App" \
        --background "background.png" \
        --window-pos 200 120 \
        --window-size 500 320 \
        --icon-size 80 \
        --icon "App.app" 125 175 \
        --hide-extension "App.app" \
        --app-drop-link 375 175 \
        "App.dmg" \
        "App.app"
    

    (Remember to make your shell script executable, for example with chmod +x build.sh.)

Set up the CI/CD pipeline

It's easy to set up a build pipeline for a hosted Mac build agent. In your Visual Studio Team Services account, navigate to the Build page, and select New Definition.

New Build Definition

VSTS offers helpful build templates to get you started, but we're just running a shell script, so select empty process.

  1. In the "Process" tab, give it a name like Generate DMG, and set the agent queue to Hosted macOS Preview.

    Hosted macOS Preview

  2. In the "Get sources" tab, select where you've hosted the Git repository that contains your build script. Mine is - of course - hosted in a Git repository in Visual Studio Team Services, but the VSTS build process can build repositories that are hosted anywhere.

    Git Repository Hosting Providers

    If you added create-dmg as a submodule, then make sure to show "Advanced settings" and check the "Checkout submodules" option.

    Checkout Submodules

  3. Click the + button on the build phase to add a new build task to Phase 1. In the task selection list, scroll or search for the "Command Line" task and click "Add".

    Add a Command Line task

    Select the new task, and give it a display name like "Run Generator". In the "Tool" option, enter "/bin/sh", since we wrote a shell script. Finally, in "Arguments", enter the name of your shell script, "build.sh".

    Generator Configuration

  4. Click the + button on the build phase to add another build task. In the task selection list, scroll or search for the "Publish Build Artifacts" task and click "Add".

    Add a Publish Build Artifacts task

    This task will add the resultant DMG as a build artifact that we can download or deploy.

    In the "Path to publish" option, enter the name of the DMG that your build script generated. In my example, this is "App.dmg".

    Finally, in the "Artifact name" option, enter the name you want to save this artifact as; this is what you'll download or deploy. I recommend using the name of your DMG here.

    Publish Details

And that's all - it's just a few simple steps to create a DMG.

Build Steps

Queue Your Build

Once you've configured your build, you can select "Save & queue" at the top of the page to save your build definition and queue the build to create a DMG.

Save and Queue

When you do, you'll get a notification that your build has been queued. You can select the build number to follow a link to that build.

Build Queued

And once the build completes successfully, you'll be able to navigate to the "Artifacts" tab and download your DMG.

Artifacts

Obviously you could - and should - also deploy this to a download site automatically, creating a deployment pipeline in VSTS.

Example

I created a build script that will download the latest version of Visual Studio Code, unzip it, and then bundle it up as a branded DMG. You can adapt it to your needs.

https://github.com/ethomson/vscode_dmg

When you build this repository in the Visual Studio Team Services build process, you'll end up with a DMG that includes a branded Visual Studio Code experience:

Visual Studio Code DMG

git-open supports Visual Studio Team Services

December 6, 2017  •  6:25 PM

I'm really excited that git-open 2.0 was just released. This newest version of git-open includes some changes that I contributed to add support for Visual Studio Team Services and Team Foundation Server.

git-open already had great support for GitHub, so I was already using it for my open source projects. It also supports Bitbucket and GitLab. But I wanted to use it for my day job, where all my repositories are hosted in VSTS.

What is git-open?

git-open is a nifty tool that will let you run a command on the command-line, git open, to open up a web browser to your Git repository hosting provider. This means that I can run a single command, right from within my Git repository, and it will open up a browser and navigate to my project in Visual Studio Team Services.

git-open

This is amazing because as much as I try to hang out in the command-line, that's not always useful for large projects. If I'm hacking on my (comparatively) little open source project, libgit2, it's easy enough to use ctags or to grep over the whole codebase. But for a big project that's not so easy.

What I really want is to use the great Roslyn-powered code search functionality in VSTS. So now I can open up a new browser pointing at VSTS without having to click around to get a new browser window open.

It will even look at the current branch that you're on and navigate there in the browser interface. This is perfect if you're starting your branch workflow by creating a new branch from a VSTS work item.

Create Branch from Work Item

When I start my branch workflow this way, I've always got a branch on the server that I fetch to my client. And then I can push my changes back and eventually open a pull request. And git open supports this workflow beautifully: it will look at the branch that I have checked out locally and try to navigate me there in the web browser. It can navigate to branches in Git repositories hosted in several hosting providers, including VSTS and TFS.
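
To give a feel for what navigating to a branch involves, here is a minimal Python sketch that turns a remote URL and a branch name into a browser URL. It is not git-open's actual implementation (git-open is a shell script and handles many more cases): it only handles HTTPS remotes, the function name branch_url and the example.visualstudio.com account are hypothetical, and the VSTS ?version=GB query format is an assumption about how branch links look. In practice the inputs would come from git remote get-url origin and git symbolic-ref --short HEAD.

```python
from urllib.parse import quote, urlparse


def branch_url(remote_url, branch):
    """Build a browser URL for `branch` from an HTTPS remote URL (simplified)."""
    base = remote_url[:-4] if remote_url.endswith(".git") else remote_url
    host = urlparse(base).hostname or ""
    if host == "github.com":
        # GitHub shows branches under /tree/<branch>
        return f"{base}/tree/{quote(branch)}"
    if host.endswith(".visualstudio.com"):
        # Assumed VSTS format: the branch is passed as ?version=GB<branch>
        return f"{base}?version=GB{quote(branch, safe='')}"
    raise ValueError(f"unsupported hosting provider: {host}")


github = branch_url("https://github.com/libgit2/libgit2.git", "master")
vsts = branch_url("https://example.visualstudio.com/Project/_git/Repo", "users/me/fix")
print(github)
print(vsts)
```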

But it's not just branches - it can also navigate to the work item hub. This gives you a complete end-to-end branch workflow: create a new branch from a work item in VSTS, pull it down, and then you can use git open --issue to navigate back to the VSTS work items in your browser.

git open --issue

Getting Started

To get started with git-open, all you have to do is download the latest release, unzip it and then put the git-open shell script somewhere in your path.

git has an extension model where all the commands you run - say git pull - are actually separate commands. They're not all just built in to the git command itself. When you run git pull, git goes looking for an executable in your path named git-pull.

So if you want to add your own Git commands, you can just put an executable named git-something into your path. When you run git something, Git will find your git-something executable.
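
The lookup can be approximated in Python: given a subcommand name, search a path for an executable called git-<name>. This is a simplified model for illustration (Git also consults its own exec directory), and find_git_subcommand is a hypothetical helper, not part of Git.

```python
import os
import shutil
import tempfile


def find_git_subcommand(name, search_path):
    """Return the path of an executable named git-<name> on search_path, or None."""
    return shutil.which(f"git-{name}", path=search_path)


with tempfile.TemporaryDirectory() as bin_dir:
    # Create a tiny executable script named git-something in a scratch directory.
    script = os.path.join(bin_dir, "git-something")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\necho 'hello from git-something'\n")
    os.chmod(script, 0o755)

    found = find_git_subcommand("something", bin_dir)
    missing = find_git_subcommand("nothing", bin_dir)
    found_name = os.path.basename(found) if found else None

print(found_name, missing)
```

If that scratch directory were on your real path, running git something would execute the script.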

Tip
I have a directory in my home directory where I put useful shell scripts that I make sure is included in my path. On Unix systems this is ~/bin. On Windows machines, this is %HOMEPATH%/bin.

Pro-Tip
I also manage all these scripts - along with my dotfiles - in a Git repository. So when I start working on a new machine, I can just git clone my dotfiles repository.

Google Analytics for Your Podcast

October 27, 2017  •  5:57 PM

I built a little open source side project: Google Analytics Handler, which reports tracking information to Google Analytics on the server side.

It's for my new podcast, All Things Git, so that we can track the RSS and audio downloads from the podcast's website.

If this was 1997 and I was hosting the website on my own server, I could just crunch my Apache logs for the data. But it's not; it's 2017 so of course I'm hosting the podcast in the cloud. The website is an Azure Web App and the audio downloads are hosted in Azure CDN.

And it's nearly perfect! Performance? Amazing! Cost? Low! But log files? Not so much.

Martin suggested a hosted analytics platform to track RSS requests, and another to track audio downloads. But that's two new bits of analytics to go along with Google Analytics, which we use to track the page visits. Requests being tracked in three distinct places? Ugh.

Thankfully, Google Analytics offers the Measurement Protocol API - which lets you report events like page views manually. So I built an ASP.NET handler to report requests on RSS and audio downloads to Google Analytics.
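
As a rough illustration of what such a report contains, this Python sketch builds the form-encoded body of a pageview hit. It assumes the Universal Analytics Measurement Protocol parameters (v, tid, cid, t, dh, dp) and the /collect endpoint; the tracking ID, client ID and host are placeholders, and this is not the handler's actual code.

```python
from urllib.parse import urlencode

# Assumed Universal Analytics collection endpoint; the payload would be POSTed here.
COLLECT_URL = "https://www.google-analytics.com/collect"


def pageview_payload(tracking_id, client_id, host, path):
    """Build the form-encoded body for a single pageview hit."""
    return urlencode({
        "v": "1",            # protocol version
        "tid": tracking_id,  # tracking / property ID
        "cid": client_id,    # anonymous client ID
        "t": "pageview",     # hit type
        "dh": host,          # document host name
        "dp": path,          # document path
    })


payload = pageview_payload("UA-XXXXX-Y", "555", "example.com", "/rss.xml")
print(payload)
```

The sketch only builds the payload and sends nothing, so it is safe to run without touching a real analytics property.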

For RSS requests, the handler simply opens the RSS file that exists on disk and returns it to the client before reporting to Google Analytics. This is nice because it's a transparent change - the handler is loaded by the web configuration only in production. It's just a few lines in the Web.config:

<configuration>
  <system.webServer>
    <handlers>
      <add name="RssHandler"
           verb="*"
           path="rss.xml"
           type="GoogleAnalyticsHandler.GoogleAnalyticsHandler, GoogleAnalyticsHandler"
           resourceType="Unspecified" />
    </handlers>
  </system.webServer>
</configuration>

For audio, we don't host the audio directly on the web site, so for the actual podcast itself, the handler redirects to the audio files in Azure CDN. I set it up so that any request in the /episodes/audio folder is redirected straight to the CDN:

<configuration>
  <system.webServer>
    <handlers>
      <add name="AudioHandler"
           verb="*"
           path="/episodes/audio/*"
           type="GoogleAnalyticsHandler.GoogleAnalyticsHandler, GoogleAnalyticsHandler"
           resourceType="Unspecified" />
    </handlers>
  </system.webServer>
  <location path="episodes/audio">
    <appSettings>
      <add key="redirect-root" value="https://mycdn.azureedge.net/episodes/" />
    </appSettings>
  </location>
</configuration>

As soon as I deployed the handler and configuration to production, I was seeing results in the real-time tab of Google Analytics:

Google Analytics

So the Google Analytics Handler makes it very straightforward to add Google Analytics tracking to your media assets like audio and video, and to your non-HTML pages like text and XML.