Git best practices for Magento

Git is the most popular version control tool today, and using it in combination with Magento certainly makes sense. Here's an overview of some best practices for using git together with Magento. It is not a definitive guide - more a collection of good ideas.

Hosting repositories

When you're starting with Git, the first choice you need to make is where you're going to store your repositories. GitHub and BitBucket are most popular for this, not only because they are simple to use, but also because of their management layer (like tickets and pull requests).

Alternatives like Assembla and Codebase go much deeper with project management. You can also host your own Git repositories (that's what we do a lot) and optionally run GitLab or GitBucket to create your own private GitHub-like UI.

Do not place your Magento files in a public repository. Place them in a private repository that only you (and perhaps co-workers) are able to access.

Structuring your repository layout

When you're starting with git, you might be tempted to dump the entire Magento site into a git repository. This is a bad idea though: it would include, for instance, the file app/etc/local.xml, which is unique for every environment. Also, such a repository has no room for files that should not be in the webroot - like database dumps.

Instead, create a folder for your Magento project and make the Magento application (so the webroot) a subfolder of this repository folder. Here's a suggestion for the folder layout:

magento/
sql/
scripts/
.gitignore
.git/

The .gitignore file (discussed below) will define what to exclude from these folders. The .git folder mentioned above will be created by Git itself, but the layout shows that this folder should not reside within the magento folder, which is the webroot for the shop. The sql folder and scripts folder are optional. We tend to dump the databases in the sql folder so we can set up a new copy of the production site quickly, and the scripts folder contains all scripts to do that magic.
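The layout above can be created with a few commands. This is only a minimal sketch: the project folder name myshop is a made-up example, and the Magento files themselves still need to be copied into the magento folder afterwards.

```shell
#!/bin/sh
set -eu

# Hypothetical project folder; replace with your own project name.
PROJECT_DIR="myshop"

# magento/ is the webroot; sql/ and scripts/ are the optional extras.
mkdir -p "$PROJECT_DIR/magento" "$PROJECT_DIR/sql" "$PROJECT_DIR/scripts"

# Create the repository in the project folder, one level above the webroot.
git init -q "$PROJECT_DIR"

# Start with an empty .gitignore next to the .git folder.
touch "$PROJECT_DIR/.gitignore"

# Show the result, including the hidden .git folder and .gitignore file.
ls -A "$PROJECT_DIR"
```

Because git init is run on the project folder and not on the webroot, the .git folder never ends up in a place that the web server exposes.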

Including app/etc/local.xml or not?

In general, the app/etc/local.xml file should not be included in your git repository. However, because you do want to maintain a copy of that file, it might be a good idea to create a folder outside of your webroot that contains a copy of this most important Magento configuration file. For instance, your git repository could include a folder backups which contains the local.xml file of each environment you're working in. If the shit hits the fan and the file magento/app/etc/local.xml is deleted somehow, you can always restore it easily from there:

magento/
backups/
backups/local.xml.production
backups/local.xml.staging
backups/local.xml.developer-jisse

Including images or not?

Git offers you version management, but it also serves as a convenient backup tool. When you use git as a backup tool, including the images certainly makes sense, so the .gitignore will not list these images as excluded. However, in some cases, the images end up being huge. I've personally seen Magento sites with images totalling 20 GB or more. In that case, it might be an option to exclude the images from git anyway, and create a cronjob backup instead.

The ideal .gitignore file

Above, the .gitignore file was mentioned as the way to include or exclude certain files. When using git in practice, having the .gitignore file properly set up is vital. There are a lot of discussions around what the ideal .gitignore file would be. In our opinion, there is no perfect .gitignore file - there are many. Here is an example that you should study - do not use it right away:

.gitignore
magento/.htaccess
magento/app/etc/local.xml
magento/cron.php
magento/cron.sh
magento/downloader
magento/errors/
magento/includes
magento/index.php
magento/index.php.sample
magento/install.php
magento/LICENSE*
magento/media
magento/php.ini.sample
magento/RELEASE_NOTES.txt
magento/robots.txt
magento/var

If you study this file, you might come across a couple of files that are actually not needed in your Magento at all: index.php.sample, all the license files, php.ini.sample. My advice is to remove those files if you don't need them. If you want to use them as reference for something, add them to your documentation - not to your production site.

There are also some files that might be considered part of your production environment: the .htaccess file might be vital for running Magento on production. However, that does not mean it should be included in git: perhaps it includes redirect rules specific to the production site, which do not work if this .htaccess file is used in the testing environment. All files that are generic should remain in git; all files that are not should either be excluded or be part of your backups folder (see above).

magento/.htaccess

Another good example of an exclude that is open for debate is the media folder: Should it be in git or not? If it is included in git, make sure to exclude the cache folder:

magento/media/catalog/product/cache

Finally, the var folder might be excluded as a whole, but that exclude will also cover export/import files. If you want to include those in git, but want to exclude variable data, you can use the following:

magento/var/cache
magento/var/session
magento/var/log
magento/var/locks
magento/var/report
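Whether a set of excludes does what you expect can be checked with git check-ignore. The sketch below builds a scratch repository (the folder name gitignore-demo and the two sample files are made up for the demonstration), writes the selective var excludes from above and asks git which of the two files it would ignore.

```shell
#!/bin/sh
set -eu

# Scratch repository for the demonstration; the name is hypothetical.
DEMO="gitignore-demo"
rm -rf "$DEMO"
mkdir -p "$DEMO/magento/var/cache" "$DEMO/magento/var/export"
git init -q "$DEMO"

# The same selective var excludes as listed above.
cat > "$DEMO/.gitignore" <<'EOF'
magento/var/cache
magento/var/session
magento/var/log
magento/var/locks
magento/var/report
EOF

# Two made-up sample files: one cache file, one export file.
touch "$DEMO/magento/var/cache/sample-cache-file"
touch "$DEMO/magento/var/export/orders.csv"

# git check-ignore exits with 0 for an ignored path and 1 otherwise.
for path in magento/var/cache/sample-cache-file magento/var/export/orders.csv; do
    if git -C "$DEMO" check-ignore -q "$path"; then
        echo "ignored: $path"
    else
        echo "kept:    $path"
    fi
done
```

The same check can be run in a real project before the first commit, so that cache or session data does not end up in the history by accident.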

Users and commit messages

Git serves as a backup tool. It allows you to revert changes, but also to trace which commit introduced a certain bug. To track things down properly, it is vital to use git in the way it was meant to work.

First of all, every real-life user in git should be mapped to a specific git user. For instance, if you have a GitHub repository with company code, make sure your developers are each committing changes to that repository using their own personal GitHub account.

Make sure each git commit has a meaningful description. A commit message like "changed something" or "fixed bug" says nothing useful about a specific commit. However, "fixed a PHP notice in custom observer" is much more meaningful. More and more ideas are being shared among git users as to what best practices should be used for commit messages. Make sure to dive into that. The more specific the git commits get, the more useful git overall will be.
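As a small illustration of both points, the sketch below sets a personal identity in a scratch repository and makes one commit with a specific message. The developer name, the email address, the folder name commit-demo and the file Observer.php are all made up for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository for the demonstration; the name is hypothetical.
DEMO="commit-demo"
rm -rf "$DEMO"
git init -q "$DEMO"

# Each developer commits under a personal identity (made-up values here).
git -C "$DEMO" config user.name "Jane Developer"
git -C "$DEMO" config user.email "jane@example.com"

# A made-up change to commit.
echo '<?php // observer code' > "$DEMO/Observer.php"
git -C "$DEMO" add Observer.php

# A specific message instead of "fixed bug".
git -C "$DEMO" commit -q -m "Fixed a PHP notice in custom observer"

# Author and subject are what you will later see when tracing a bug.
git -C "$DEMO" log -1 --format='%an <%ae>: %s'
```

With an identity per developer and a message per change, a command like git log or git blame tells you who changed what and why.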

Working with branches

Magento is difficult and mistakes are easily made. As a bare minimum, make sure to set up a production environment and a testing environment. In a simple setup with git, you will simply have one branch - master - which is run on both sites. The testing site is used to make changes, run upgrade scripts and see if all is working. Once you're ready, you commit the changes to git, so that the only thing you need to do on the production environment is import these changes using git pull.
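The simple setup can be tried out locally. The sketch below is only a simulation under assumptions: the folders origin.git, testing and production inside pull-demo stand in for the central repository and the two sites, and index.txt stands in for the shop files.

```shell
#!/bin/sh
set -eu

# Everything lives in a scratch folder; all names are hypothetical.
ROOT="$(pwd)/pull-demo"
rm -rf "$ROOT"
mkdir -p "$ROOT"

# Helper: run git inside one of the demo folders with a demo identity.
g() {
    dir="$1"
    shift
    git -C "$ROOT/$dir" -c user.name="Demo" -c user.email="demo@example.com" "$@"
}

# The central repository, standing in for GitHub or your own server.
git init -q --bare "$ROOT/origin.git"

# The testing site: make the first commit on master and publish it.
git init -q "$ROOT/testing"
g testing symbolic-ref HEAD refs/heads/master
g testing remote add origin "$ROOT/origin.git"
echo 'v1' > "$ROOT/testing/index.txt"
g testing add index.txt
g testing commit -q -m "Initial shop state"
g testing push -q origin master

# The production site is a clone of the same master branch.
git clone -q -b master "$ROOT/origin.git" "$ROOT/production"

# A change is made and verified on testing, then committed and pushed.
echo 'v2' > "$ROOT/testing/index.txt"
g testing commit -q -am "Changed index text after testing"
g testing push -q origin master

# On production, the only step left is to pull.
g production pull -q --ff-only origin master
cat "$ROOT/production/index.txt"
```

The --ff-only option is a cautious choice here: the pull refuses to continue if somebody changed files directly on production, which is exactly the situation this workflow tries to avoid.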

More advanced setups will have a development environment, a place where things can be totally messed up. Also, besides a testing site, you might have a staging environment as well. The various environments can be reflected by matching git branches: the master branch might still be used for production and staging, while the testing branch is used to merge changes coming from the dev branch (or perhaps multiple development branches) before they go into the master branch.
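One possible way to promote a change through such branches is sketched below in a scratch repository. The branch names dev, testing and master follow the text; the folder name branch-demo and the files are made up, and your own flow may well use different merge rules.

```shell
#!/bin/sh
set -eu

# Scratch repository for the demonstration; the name is hypothetical.
DEMO="$(pwd)/branch-demo"
rm -rf "$DEMO"
git init -q "$DEMO"

# Helper: run git in the demo repository with a demo identity.
g() {
    git -C "$DEMO" -c user.name="Demo" -c user.email="demo@example.com" "$@"
}

# Start on master, the branch that production runs on.
g symbolic-ref HEAD refs/heads/master
echo 'stable' > "$DEMO/shop.txt"
g add shop.txt
g commit -q -m "Initial production state"

# Create the testing branch, then do the actual work on dev.
g branch testing
g checkout -q -b dev
echo 'new feature' > "$DEMO/feature.txt"
g add feature.txt
g commit -q -m "Added new feature on dev"

# Promote the change: first dev into testing ...
g checkout -q testing
g merge -q dev

# ... and, once verified on the testing site, testing into master.
g checkout -q master
g merge -q testing

# The history of master now contains the change from dev.
g log --format='%s' master
```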

There are many different variations possible. In general, the more environments or branches you have, the safer it will be to make changes, because there's a lot of verification and testing being done before a change actually ends up on the live site. However, things also might become difficult while maintaining sites.

Keeping all environments up-to-date

The production site is rarely touched. For instance, when a Magento module is upgraded, it is best to install this module in dev, merge the dev branch to master and then pull the latest changes of master into the live site. After flushing the cache, the Magento module might run its installation or update scripts. If you want to make sure this is done in a controlled manner, disable automatic updates in your app/etc/local.xml and run the updates manually using magerun.

When you want to update the staging site (or development or test), it is good to make sure to run things with the latest copy of the production database. To import this database quickly, you might prepare a script that dumps the production database to a SQL file and adds this file to git, plus another script in all other environments that imports this SQL file again. Each environment will have certain changes that are unique - like Base URLs, Google Analytics settings, debug flags - so importing a second SQL file with those specific changes seems like a good idea also.

A repository per element

A more exotic approach to maintaining various sites through git is splitting up the Magento application itself: separate core from third party modules, separate the theme, etcetera. If you have multiple sites that are very similar, it might be wise to create a common repository of the Magento core (app/code/core, lib) so that upgrading Magento itself becomes a matter of patching one main repo, while per site you only need to pull all changes and run the upgrade scripts.

That same scenario could be used also for each extension. If each Magento extension would be placed in its own repo, you can use that repo to keep the preferred version of that extension, while quickly rolling out upgrades to each Magento site using modman. For extensions that don't have a repository, you can simply create your own private repository instead. The same could also be used for your Magento theme, or a common base theme you have created.

Using many repositories for maintaining a single site this way sounds complex but has various benefits: You can make sure that one change is rolled out in multiple environments quickly (with still the option to create branches per environment). Also, you can add-in changes to third party extensions more easily because you control them yourself using git. Say that there's a third party extension that has a bug, and this bug has been fixed by you. However the extension developer is sloppy and keeps postponing a new version. Using git, you can more easily keep track of the original extension while maintaining your own fix.
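One common way to keep track of the original extension while maintaining your own fix is a separate branch for untouched vendor releases. This is a sketch of that technique under assumptions, not necessarily how every team does it: the branch name vendor, the folder name extension-demo and the two files are made up for the example.

```shell
#!/bin/sh
set -eu

# Scratch repository for one extension; the name is hypothetical.
DEMO="$(pwd)/extension-demo"
rm -rf "$DEMO"
git init -q "$DEMO"

# Helper: run git in the demo repository with a demo identity.
g() {
    git -C "$DEMO" -c user.name="Demo" -c user.email="demo@example.com" "$@"
}

# The vendor branch only ever holds untouched releases of the extension.
g symbolic-ref HEAD refs/heads/vendor
echo 'version 1.0' > "$DEMO/VERSION"
echo 'buggy code' > "$DEMO/Observer.php"
g add -A
g commit -q -m "Imported vendor release 1.0"

# The master branch carries your own fix on top of the vendor code.
g checkout -q -b master
echo 'fixed code' > "$DEMO/Observer.php"
g commit -q -am "Fixed a PHP notice in vendor observer"

# The vendor ships 1.1, still without the fix: import it on the vendor branch.
g checkout -q vendor
echo 'version 1.1' > "$DEMO/VERSION"
g commit -q -am "Imported vendor release 1.1"

# Merging vendor into master combines the new release with your fix.
g checkout -q master
g merge -q -m "Merged vendor release 1.1" vendor

cat "$DEMO/VERSION" "$DEMO/Observer.php"
```

Running git diff vendor master at any time shows exactly which changes are yours, which is handy when the extension developer finally ships a fix of their own.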

Using additional tools

Running git offers many cool options and it should make your life of managing Magento easier. When dealing with Magento and git, there are three tools that you should not miss: first of all, magerun is a CLI tool that should be preferred over using the Magento Admin Panel. Second and third, modman and composer allow you to install and update extensions in an easier fashion. Getting to know these tools will improve your developer skills.

Worth reading: