How to Use Git to Track Changes in Translation Files
There’s been a lot of buzz about managing translations through version control. As localization becomes more integrated into your projects, it’s important that members of your development and localization teams stay synchronized. In this post, we’ll discuss best practices when using Git to manage your own projects.
What is Git?
Git is a version control system (VCS) designed by Linus Torvalds, the creator of Linux. Git tracks changes to source code for multiple users, allowing teams to easily share the same codebase.
Version control is an extensive topic, but we’ll focus on three key components: repositories, commits, and branches.
Repositories are directories that hold your code files. Repositories also double as workspaces. You can create a new repository on your local machine, or you can copy (clone) a repository from a remote machine and work on it locally.
Commits are changes to source code that are applied to a repository. Commits make it easy to group similar changes for organizing, managing, and auditing tasks.
Branches are independent lines of development within the codebase. They let you create new working copies of code without disrupting the original copy. You can easily switch between branches, and any changes made to a branch can be merged into another branch. By default, a new Git repository starts with a single branch, traditionally named master, that forms the trunk for other branches.
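As a rough sketch of how these three pieces fit together, the commands below create a throwaway repository, record one commit, and then add a second commit on a separate branch. The file names, the branch name french, and the demo identity are placeholders invented for illustration.

```shell
repo=$(mktemp -d)                          # throwaway directory for the demo
cd "$repo"
git init -q                                # create a new, empty repository
git symbolic-ref HEAD refs/heads/master    # call the first branch master, whatever the local default is
git config user.name "Demo User"           # placeholder identity, stored only in this repository
git config user.email "demo@example.com"

echo 'greeting = Hello' > en.properties    # hypothetical source translation file
git add en.properties
git commit -q -m "Add English source strings"

git checkout -q -b french                  # new branch starting at the current commit
echo 'greeting = Bonjour' > fr.properties
git add fr.properties
git commit -q -m "Add French translation"

git checkout -q master                     # master does not contain the commit made on french
git log --oneline --all                    # list the commits on every branch
```

After the final checkout, fr.properties disappears from the working directory because it only exists on the french branch.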
There are many other components to Git, including staging, tagging, and pushing to and pulling from remote repositories, as well as hosting features built around it such as forks and pull requests. For a more comprehensive overview of Git, see Pro Git, an open source book covering a wide range of Git’s features.
Hosting Git Repositories Online
You can host a Git repository on any computer, but there are websites that provide online repository hosting. GitHub, Bitbucket, GitLab, Gitorious, and Kiln all provide hosting for public and private repositories.
Git Best Practices
When working with multiple developers, it’s vital to set a version control strategy that everyone understands and agrees with. This section discusses Git best practices that ensure consistency and provide a reliable trail of changes.
Stay in Sync
A code repository is only effective if it’s kept up-to-date with frequent commits. Commits should consist of small, related changes rather than large, varied changes. For instance, if you fix a typo, you can apply the change in a single commit. However, if you fix a typo and a bug, you should make two commits: one for the typo and one for the bugfix. Small, related commits make it easy to not only track changes but to roll back changes if necessary. It also helps other developers understand exactly what changed while you were working on the file.
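The typo-and-bug scenario might look like the following sketch. Both fixes are made in the working directory at the same time but staged and committed separately, so the bug fix can later be rolled back without touching the typo fix. The file names and contents are made up for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"

echo 'welcome = Welcom back' > en.properties   # contains a typo
echo 'total = count + 1' > report.py           # contains an off-by-one bug
git add .
git commit -q -m "Initial import"

# Fix both problems in the working directory...
echo 'welcome = Welcome back' > en.properties
echo 'total = count' > report.py

# ...but record them as two separate commits.
git add en.properties
git commit -q -m "Fix typo in welcome string"
git add report.py
git commit -q -m "Fix off-by-one in report total"

# Undo only the bug fix; the typo fix stays in place.
git revert --no-edit HEAD
git log --oneline
```

Because git add is given one file at a time, each commit contains exactly one logical change, and git revert can undo one without affecting the other.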
Branch Out
Branching does more than just organize code changes: it lets developers modify the codebase without impacting another user’s workspace. A branch is essentially a snapshot of the base branch at a particular commit. New commits build off of that initial commit, allowing the branch to extend independently of other branches including the one it’s based on.
Imagine your organization is preparing a new software release. While the code is being finalized, one of your developers starts working on a new feature for the next release. Not wanting to lose work, he commits his changes before leaving for the night. Without a good branching policy in place, his commit ends up introducing incompatible features into master and the automated build fails.
If a new branch had been created specifically for that feature, the developer could have committed his changes to a different line of development than the one used for the release. The next section covers a commonly used branching technique. There are several well-tested methods, but one of the more popular is the Gitflow Workflow.
The Gitflow Workflow
Originally proposed by Vincent Driessen, the Gitflow Workflow establishes two main branches: master and develop. The master branch is limited to production-ready code, while develop contains ongoing development changes. Other branches (features, bugfixes, and releases) are based off of these two core branches.
Example of a repository using the Gitflow Workflow (Atlassian)
When development starts on a new feature, develop branches off into a new feature branch. Once that feature is complete, it gets merged back into the develop branch. As the product nears release, develop branches into release, which is limited to preparing the code for the upcoming release. In the meantime, changes can still be committed to the other branches. When the release is ready, it’s merged back into develop as well as master.
Besides develop, the only other branch that derives directly from master is hotfix. When a hotfix is released, hotfix is merged back into master and develop in order to incorporate the changes into future releases. This results in a flow where changes are logically separated, but still combined into the main branches upon completion.
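The flow described above can be walked through with plain Git commands. This is only a sketch in a throwaway repository: the branch names feature/login, release/1.1, and hotfix/1.1.1, the file names, and the version numbers are all invented for illustration, and the --no-ff flag is used here so that every merge leaves a visible merge commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial release"

git checkout -q -b develop                 # ongoing development lives here

# A feature branches off develop and is merged back when complete.
git checkout -q -b feature/login develop
echo 'login.title = Sign in' > en.properties
git add en.properties
git commit -q -m "Add login strings"
git checkout -q develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login

# A release branches off develop, then goes to master and back into develop.
git checkout -q -b release/1.1 develop
echo '1.1' > VERSION
git add VERSION
git commit -q -m "Bump version to 1.1"
git checkout -q master
git merge -q --no-ff -m "Release 1.1" release/1.1
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.1 into develop" release/1.1

# A hotfix branches off master and is merged into both main branches.
git checkout -q -b hotfix/1.1.1 master
echo '1.1.1' > VERSION
git add VERSION
git commit -q -m "Bump version to 1.1.1"
git checkout -q master
git merge -q --no-ff -m "Hotfix 1.1.1" hotfix/1.1.1
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.1.1 into develop" hotfix/1.1.1

git log --oneline --graph --all            # view the resulting branch structure
```

At the end, both master and develop contain the feature, the release, and the hotfix, while each change remains traceable to the branch it came from.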
Combine Often
Merging changes is critical to ensuring consistency across branches. Once a feature is finalized, its changes need to join the changes submitted by other developers in other branches. This can be accomplished using two methods: merging and rebasing.
Merging vs. Rebasing
Merging and rebasing both solve the problem of synchronizing two branches, but they go about it in very different ways. For instance, imagine you’re working with a repository that has two branches: master and feature. You’ve been assigned to work on new code and commit your changes to feature. In the meantime, other developers are committing changes to master. Once you’ve finished coding, you need to update master with the changes made to feature.
Merging resolves this by creating a single new merge commit that ties the histories of the two branches together. If you merge master into feature to catch up, the merge commit lands on feature and brings in every change made to master since the branch point. This fully preserves both branches, but if master is active and you merge from it often, feature accumulates extraneous merge commits that clutter its history.
Merging the master branch into a feature branch (Atlassian)
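A minimal sketch of the merge approach, using a throwaway repository in which master and feature have each gained one commit since they diverged. The file names are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b feature                 # your work happens here
echo 'new feature' > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q master                     # meanwhile, someone else commits to master
echo 'bug fix' > fix.txt
git add fix.txt
git commit -q -m "Fix bug on master"

# Bring feature up to date: one merge commit with two parents.
git checkout -q feature
git merge -q -m "Merge master into feature" master
git log --oneline --graph

# feature now contains everything in master, so master can simply fast-forward.
git checkout -q master
git merge -q feature
```

The graph printed by git log shows the two lines of history joining at the merge commit.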
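Rebasing takes a different approach: it moves feature so that it starts from the tip of master. Git rewinds to the commit in master that feature is based off of, then reapplies each commit from feature on top of the latest commit in master, creating a new commit for each original one. Afterwards, master can be fast-forwarded to include feature. While this avoids merge commits and produces a linear history, it has the effect of essentially rewriting the branch’s history.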
Rebasing the same feature branch onto master (Atlassian)
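The same starting situation handled with a rebase might look like this sketch. As before, the repository is a throwaway and the file names are placeholders; the variable old_tip exists only so the demo can show that the rebased commit receives a new hash.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"

git checkout -q -b feature
echo 'new feature' > feature.txt
git add feature.txt
git commit -q -m "Add feature"
old_tip=$(git rev-parse feature)           # remember the commit before rebasing

git checkout -q master
echo 'bug fix' > fix.txt
git add fix.txt
git commit -q -m "Fix bug on master"

# Replay feature's commits on top of the current tip of master.
git checkout -q feature
git rebase -q master
echo "before: $old_tip"
echo "after:  $(git rev-parse feature)"    # a different hash: the commit was rewritten

# master itself has not changed yet; fast-forward it to pick up feature.
git checkout -q master
git merge -q --ff-only feature
git log --oneline --graph                  # a straight line, no merge commits
```

The rebased commit has the same content and message as the original, but a different parent and therefore a different hash, which is what "rewriting history" means in practice.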
Git’s rebase command lets you determine exactly how the branches are integrated. For instance, interactive rebasing (git rebase -i) lists each feature commit and lets you edit, reorder, squash, or drop it before it’s reapplied on top of master. For more details, see Merging vs. Rebasing in the Atlassian Git Tutorial.
Synchronizing Git With Transifex
There are several ways to manage code with Git while still managing localizations with Transifex.
Using the Transifex Client with Git
You can simply use the Transifex Client to push source changes to Transifex. The benefit of this approach is that it ensures you have access to the latest translations and can push resource updates to the Transifex project.
The drawback is that you’ll need to enforce a policy for pushing resource updates. Features that are still in development can change completely, and having your localizers work on text that might not appear in the final product is a waste of time and money. Developers will also need to remember to include localization updates when committing their changes.
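One way such a policy could be enforced is with a small guard that checks the current branch before anything is sent to Transifex. The function name may_push_sources and the list of allowed branches are assumptions made up for this sketch, not part of the Transifex Client.

```shell
# Hypothetical policy: only long-lived and release branches may push source strings.
may_push_sources() {
  # symbolic-ref fails on a detached HEAD, in which case the push is refused.
  branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 1
  case "$branch" in
    master|develop|release/*) return 0 ;;  # stable enough to send to translators
    *) return 1 ;;                         # feature work may still change completely
  esac
}

# Demo in a throwaway repository: switch HEAD between branch names and ask the guard.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
may_push_sources && echo "develop: allowed"
git symbolic-ref HEAD refs/heads/feature/login
may_push_sources || echo "feature/login: blocked"
```

In a real checkout, the guard would sit in front of the client call, for example may_push_sources && tx push -s, where tx push -s is the Transifex Client command that uploads source files.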
Using txgh
Txgh is an open source Sinatra server created by Strava. It uses webhooks to automatically trigger updates in GitHub and Transifex. If a developer commits a change to a source translation file in GitHub, txgh automatically updates the file in Transifex. Additionally, if a translator makes a change in Transifex that brings a file to 100% translated, txgh creates and pushes a new commit to GitHub. Txgh removes the added step of having to run the Transifex Client each time a translation is added.
There’s No One-Size-Fits-All Solution
Different organizations will prefer different workflows based on their needs and culture. Gitflow is far from the only Git workflow in popular use, and it may not be ideal for your development cycle. You can find a comparison of Git workflows in Atlassian’s Git Tutorial. You can find additional resources for using Git through GitHub’s help site. If you want to learn more about Git, Code School provides an interactive online course.