Eggmatters.com

Welcome to eggmatters.com! This is my "internet journal" or "web log." Don't get me wrong, I like the term blog. It kind of sounds like "blargh," which kind of sounds like someone vomiting. Which, if you think about it, is basically what most blogs are. A massive regurgitation on the internet. This blog is no different. I started this because I didn't take the hint from Twitter or Facebook that said "Nobody wants to read your long-winded blog entries." That shouldn't mean that I don't get to write them. There's gotta be some poor sap out there who would read something of mine and say to themselves, "Why yes, I did receive a modicum of entertainment from this."

At any rate, read on and enjoy and for god's sake interact! Leave me comments, send me a message with ideas, suggestions, or constructive criticism veiled with sappy compliments. Because I don't take negative feedback very well. I will be adding features, expanding functionality and slaying haters, so stay tuned, check in and have a good time. Thanks!

Matthew Eggers (eggmatters)

How to Create a Git Based Automated Deploy System (for Noobs)

Whether you are using a hosted service, your own stack, or a cloud provider, managing project deploys is one of the most perilous aspects of web development. Every developer has some horror story of deploys gone wrong. I worked at a place once where we had a 3 page checklist for our deploy process. Screw something up, and one more item gets added to the list.

That is not the way to do it. The greater the process, the higher the risk (I said that. Just there, I said it).

Git is ideally suited for managing not only development changes, but deploy processes as well. The following is a step by step process showing you how to set up your remote production environment to accept a git repository and run a deploy script with just a few keystrokes. This is somewhat "bare bones" so, let me know what you come up with. (Bare bones? Sheesh, thing reads like War and Peace. No worries though. Most of the content here is step-by-step explanations of what I'm doing. The actual work is not really that bad.)

What you'll need
  • SSH access to your remote machine
  • Git installed on your development and production environments
  • Swiss Army Knife (just in case)
That's it!

Set up a git repo on your remote machine

SSH into your remote machine. Navigate to your home directory. This will be where we will place our bare repository and staging repo.

From there, run:

mkdir your-repo-name.git
cd your-repo-name.git
git --bare init

You've just created a directory, "your-repo-name.git" and initialized a bare git repository there. The "--bare" option tells git that this is a, well, 'bare' repository. This is important to note.

A bare repository contains git's record of your changes (the commit objects and the refs that point to them), but no working tree. There are no checked-out files to edit. The bare repo's job is to manage changes, not content. This allows git to be a truly distributed version control system.
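
If you want to see the difference for yourself, here is a small sketch that makes a throwaway bare repo in a temporary directory and asks git whether it is bare. The temp directory and the variable names are mine, purely for illustration.

```shell
#!/bin/bash
# Throwaway demo: the temp directory and variable names are made up.
demo_dir=$(mktemp -d)
git init --quiet --bare "$demo_dir/your-repo-name.git"

# Ask git whether the repo is bare.
is_bare=$(git -C "$demo_dir/your-repo-name.git" rev-parse --is-bare-repository)
echo "bare? $is_bare"

# A bare repo keeps git's bookkeeping at the top level; there is no .git subdirectory.
ls "$demo_dir/your-repo-name.git"
```

The listing should show things like "objects", "refs" and "config" sitting directly in the directory, and none of your project files.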

Note: If you have root access to your server, you may want to consider creating a "git" user and a "development" group. That would be an ideal place to put your repos, but not necessary.

Configure the git repo on your remote machine

Now, while still in the bare repository you will want to modify it to allow you to successfully push and pull things remotely. If you are working on your own, and don't expect others to be working with you, you can skip these steps:


git config core.sharedrepository 1
git config receive.denyNonFastforwards true

The "git config" command, if you don't know, writes whatever you tell it to as a configuration option to the repo's "config" file. In a normal repo that file lives in the .git directory; in a bare repo like this one it sits right at the top level.

core.sharedrepository is a built-in git configuration option. Setting it to "1" (or true, or group) does the following, per the git-config man page: "When group (or true), the repository is made shareable between several users in a group (making sure all the files and objects are group-writable)."

receive.denyNonFastforwards makes the repository refuse any push that is not a fast-forward, meaning a push that would throw away commits already on the remote (a forced push after rewriting history, for example). Even if you're working alone, this is a good option to set.
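
Here is a rough sketch of what that setting does in practice. It builds two throwaway repos in a temp directory, pushes two commits, rewrites history locally, and then tries to force the rewritten branch onto the "remote". All paths, file names and the demo identity are assumptions of mine, and a local path stands in for a real ssh remote.

```shell
#!/bin/bash
# Throwaway repos in a temp directory; every name here is illustrative.
tmp=$(mktemp -d)
git init --quiet --bare "$tmp/origin.git"
git -C "$tmp/origin.git" config receive.denyNonFastforwards true

# A working repo with two commits, pushed to the bare repo.
git init --quiet "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo one > file.txt && git add file.txt && git commit --quiet -m "first"
echo two > file.txt && git commit --quiet -am "second"
git push --quiet "$tmp/origin.git" master

# Rewrite history locally, then try to force it onto the remote.
git reset --quiet --hard HEAD~1
echo three > file.txt && git commit --quiet -am "replacement"
if git push --quiet --force "$tmp/origin.git" master 2>/dev/null; then
  force_push="accepted"
else
  force_push="rejected"
fi
echo "forced push was $force_push"
```

With the option set, even --force can't clobber what is already on the remote, which is the whole point.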

Now, for collaborative purposes, you may want to allow other users to check out/in changes. If you've set up a git user, or others have access as the user where you've placed this repo, then this is not necessary.

Our bare repository is pretty bare. We now want to change the permissions on the "objects" directory. (In a normal repo this lives at .git/objects; in a bare repo it sits at the top level.) This is where checked in changes go. (Actually, they are stored as objects named by their hash, but we don't need to sweat that right now.)

Run the following if other users will be checking in changes:


find objects -type d -exec chmod 02770 {} \;

Pat yourself on the back if you know exactly what this command does. If not, don't worry. I will explain this one. Linux find is, in my opinion, probably the most versatile and important command in all of computation. I am going to devote several posts to explore find and distill its complexity for users. But for now, I will explain what each piece of this command does:

  • find - Linux command. It searches for things.
  • objects - the directory to start in. Git keeps a directory called "objects" in every repo, and we wish to begin there.
  • -type d - we only want to find directories (means, 'of type: directory')
  • -exec - when we find a directory in objects, we want to execute the following command on it:
  • chmod 02770 - we want to change the file permissions of the directory. (see file permissions note below.)
  • {} - this is a placeholder used by -exec. find replaces it with the path of each thing it finds (in our case, each directory). So if there's a directory in objects called "bar", {} would resolve to 'objects/bar'.
  • \; - syntactical requirement of -exec that marks the end of the command. It says, "When you're done, move on to the next one."

A note on the 02770 permissions:

This is an octal mode. Each digit sets one group of permissions on what is found. The meanings are:

  • 0 - You don't have to provide a leading zero. It is the usual way of telling a compiler or interpreter (a running program), "Hey, this number is an octal number."
  • 2 - This is the setgid bit. On a directory it says: anything created in here inherits the directory's group, rather than the primary group of whoever created it. Clear as mud? See references below on permissions. It will put an 's' in place of the group's 'x' in the 'ls -l' output of the directory.
  • 7 - This is the user permissions digit. 7 means 'wide open.' Give the user read, write and execute access.
  • 7 - This is the group permissions digit. 7 is the same as user: read, write, execute.
  • 0 - This is for everybody else. They get Nothing! No repo for you!
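
To try the find command somewhere harmless first, here is a sketch that builds a fake "objects" tree in a temp directory, runs the same command on it, and prints the resulting modes. The directory names are made up, and "stat -c" is the GNU (Linux) form of stat, so this is an assumption about your platform.

```shell
#!/bin/bash
# Fake "objects" tree in a temp directory; the names are made up.
demo=$(mktemp -d)
mkdir -p "$demo/objects/ab" "$demo/objects/info"
touch "$demo/objects/ab/somefile"

# Same command as in the article, pointed at the fake tree.
find "$demo/objects" -type d -exec chmod 02770 {} \;

# Print the octal mode of each directory (GNU stat syntax, as on Linux).
find "$demo/objects" -type d -exec stat -c '%a %n' {} \;

# Grab one directory's mode on its own; the plain file was skipped by -type d.
dir_mode=$(stat -c '%a' "$demo/objects/ab")
echo "directory mode: $dir_mode"
```

Every directory should report 2770, and 'ls -ld' on any of them should show the 's' in the group column.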

So you've created a bare repository! Let's clothe it.

Populate the bare repo

Ok, close your remote session and navigate to the git repository you wish to push. If you've yet to create a git repo, just run "git init" in the directory where your stuff is.

In your git repo run the command:


git remote add origin ssh://your_user@your_domain:/fully_qualified_path_to_your_repo_directory/your_bare_repository.git

This is somewhat self explanatory, but a few notes: remote tells git you are managing its list of connections to repositories on other computers; add tells git you are adding one, labeled "origin", that points at the connection string.

This is handy, as origin is now shorthand for the connection URI. Go ahead and look at your config file. You will see the lines:


cat .git/config
. . .
[remote "origin"]
	url = ssh://your_user@your_domain:/home/your_user/your_repository.git
	fetch = +refs/heads/*:refs/remotes/origin/*

Where your_user, your_domain, etc. are actually your username, paths etc.

The fetch = +refs/heads/*:refs/remotes/origin/* line is important, as it tells git where to put what it pulls down from the remote: each branch on origin is stored under refs/remotes/origin/ in your local repository. The changes on origin are kept separate from your own branches, so you can tell them apart and git can track changes appropriately.
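
If you'd rather ask git than read the config file by eye, something like the following should work. It is only a sketch: the repo is a throwaway in a temp directory, and the user, host and path in the URL are placeholders (example.com is not a real server of mine).

```shell
#!/bin/bash
# Throwaway repo; the user, host, and path in the URL are placeholders.
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"
git remote add origin ssh://your_user@example.com/home/your_user/your_repository.git

# Nothing is contacted yet; "remote add" only writes to .git/config.
remote_url=$(git config --get remote.origin.url)
remote_fetch=$(git config --get remote.origin.fetch)
echo "url:   $remote_url"
echo "fetch: $remote_fetch"

# List the configured remotes with their URLs.
git remote -v
```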

Ok, now it's time to send our code base, or rather changes, to the remote repository (origin). Run:


git push -u origin master

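Yay! You've now pushed your code to an origin. The -u option is important for the initial push. For git 1.7 and above, it sets origin's master as the "upstream" of your local master branch, so git knows which remote branch to compare against and a plain git pull or git push will use it. Your git config should now have changes like the following: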


[branch "master"]
	remote = origin
	merge = refs/heads/master

If not, you will have to either upgrade git or try running: git push --set-upstream origin master.
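
A quick way to check the tracking setup without opening the config file is to ask git for the branch's upstream. The sketch below does the whole dance with throwaway repos, using a local path in place of the ssh:// URL; the directory names and demo identity are assumptions for illustration.

```shell
#!/bin/bash
# Local stand-ins for the remote and your project; names are illustrative.
tmp=$(mktemp -d)
git init --quiet --bare "$tmp/your-repo-name.git"
git init --quiet "$tmp/project"
cd "$tmp/project"
git symbolic-ref HEAD refs/heads/master   # pin the branch name for the demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo hello > newfile.txt
git add newfile.txt
git commit --quiet -m "Just added newfile"

# A local path stands in for the ssh:// URL used in the article.
git remote add origin "$tmp/your-repo-name.git"
git push --quiet -u origin master

# Ask git which remote branch master now tracks.
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}')
echo "master tracks: $upstream"
```

If the last line says origin/master, the -u did its job.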

Set up remote to copy files to production

So now, you can edit your work. Here is a common workflow:


git add newfile.txt
git commit -m "Just added newfile" -a
git push origin HEAD

You've added a file that you've created, committed it, and then with one command pushed your changes to the remote. Now we can ssh into the remote and set up our file copy.

Set up a copy repo on your remote machine

On the remote machine, from somewhere sane, run the following command. It will clone your bare repository into a working git project, with files:


git clone -l /path/to/your/repo/your_repo_name.git copy_codebase_name

Where "copy_codebase_name" is something lucid that you've decided to name it. This will be a working copy of your git repo that you will use to push your working files into production.

You now have a "staging" area where you can push files. Also, if you need to make emergency changes, you can make them here and they still will be in your normal git workflow. Ever had an "on fire" production issue where one of your co-workers (never you) made a change to a production file directly and forgot to commit it back to your code repo? I have. There ought to be a law. You can make the changes here, commit and push them, and then run the copy script (coming up) without all of that garbage.
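
To make the emergency-fix idea concrete, here is a sketch of the round trip using throwaway repos in a temp directory: a bare repo, a "dev" repo that seeds it, the staging clone, and a hotfix committed in staging and pushed back. Every path, file name and the demo identity is hypothetical.

```shell
#!/bin/bash
# Local stand-ins for the bare repo and a dev machine; names are illustrative.
tmp=$(mktemp -d)
git init --quiet --bare "$tmp/your_repo_name.git"
# Pin the bare repo's default branch so the clone below checks out master.
git -C "$tmp/your_repo_name.git" symbolic-ref HEAD refs/heads/master
git init --quiet "$tmp/dev"
cd "$tmp/dev"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > index.php
git add index.php
git commit --quiet -m "initial"
git push --quiet "$tmp/your_repo_name.git" master

# The staging clone, made the same way as in the article.
git clone --quiet -l "$tmp/your_repo_name.git" "$tmp/copy_codebase_name"
cd "$tmp/copy_codebase_name"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Emergency fix made in staging, committed, and pushed back to the bare repo.
echo "v1 hotfix" > index.php
git commit --quiet -am "emergency fix"
git push --quiet origin master

# The fix is now part of the shared history, not just a stray edit.
last_subject=$(git -C "$tmp/your_repo_name.git" log -n1 --format=%s master)
echo "latest commit in bare repo: $last_subject"
```

Anyone who pulls from the bare repo afterwards gets the hotfix along with everything else.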

Write and run the copy script that will push your files to production

This one is pretty easy. It is a script however, and sometimes a shell script can be like a riding lawnmower without a rider. Make absolutely sure you've backed up all of the files this script will touch!!!

Before you do anything, make sure you backup your production code:


tar -czvf prod_bak.tgz /path/to/production/code
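
It's worth checking that a backup actually contains what you think it does before you rely on it. A minimal sketch, using a made-up production directory in a temp location:

```shell
#!/bin/bash
# Stand-in production directory in a temp location; names are made up.
tmp=$(mktemp -d)
mkdir -p "$tmp/prod/www"
echo "<?php echo 'hi';" > "$tmp/prod/www/index.php"

# -c creates the archive, -z gzips it, -f names the archive file.
# -C switches directory first so the archive stores relative paths.
tar -czf "$tmp/prod_bak.tgz" -C "$tmp" prod

# -t lists the contents without extracting anything.
listing=$(tar -tzf "$tmp/prod_bak.tgz")
echo "$listing"
```

Note the flags: -c creates and -x extracts. Mixing them up is an easy way to end up with no backup at all.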

Copy the following script into the directory where your staging repo is.

"Staging repo" will now refer to the repo you've cloned, not the bare repository we made at the beginning of this article.

diffdeploy.sh (please name it something better.)


#!/bin/bash

ROOT_PATH=/path/to/your/staging/repo/directory
PROD_PATH=/path/to/where/your/production/code/is

# Run everything from the staging repo; bail out if it isn't there.
cd "$ROOT_PATH" || exit 1

# Record the commit we are on before and after the pull.
git rev-parse HEAD > "$ROOT_PATH/comdiffs"
echo "first rev $(cat "$ROOT_PATH/comdiffs")"
git pull
git rev-parse HEAD >> "$ROOT_PATH/comdiffs"
echo "last rev $(tail -n1 "$ROOT_PATH/comdiffs")"

# List the files that changed between the two commits.
git diff --name-only "$(head -n1 "$ROOT_PATH/comdiffs")" "$(tail -n1 "$ROOT_PATH/comdiffs")" > "$ROOT_PATH/cpfiles"

# Copy each changed file into production.
while read -r cpfile; do
  echo "copying $cpfile"
  cp "./$cpfile" "$PROD_PATH/$cpfile"
done < "$ROOT_PATH/cpfiles"
rm "$ROOT_PATH/comdiffs" "$ROOT_PATH/cpfiles"
echo "finished deploy"

Ok, shell gurus out there. There are probably hundreds of different ways to do this. Got something better? I am definitely willing to see. I'll amend this script with better ideas.

For the non shell gurus, here is a brief explanation:

  • Set a shell variable that points to the path where our staging repo is. Say our repo is /home/eggmatters/myStagingRepo/. Then you'll put: ROOT_PATH=/home/eggmatters/myStagingRepo
  • Set a shell variable that points to your production code. See the notes below.
  • cd into the repo directory, since we want to run the rest of the script from there.
  • Pull the SHA1 hash from our latest commit and write it to a temporary file: comdiffs.
  • git pull - update the staging repo.
  • Pull the SHA1 hash of the latest commit now, and write it to the end of the comdiffs file.
  • Write out another file called cpfiles, which is a list of all the filenames that have changed between the two commits.
  • Read all the lines in cpfiles and copy each file to production.
  • Remove the temporary files.
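
The diff step is the heart of the script, so here it is in isolation. This sketch makes a throwaway repo with two files, changes only one of them in a second commit, and asks git which files differ between the two commits. The file names and demo identity are made up.

```shell
#!/bin/bash
# Throwaway repo with two commits; file names are illustrative.
demo=$(mktemp -d)
git init --quiet "$demo"
cd "$demo"
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir www
echo "v1" > www/index.php
echo "v1" > www/about.php
git add www
git commit --quiet -m "first"
first_rev=$(git rev-parse HEAD)

# Change only one of the two files.
echo "v2" > www/index.php
git commit --quiet -am "second"
last_rev=$(git rev-parse HEAD)

# Only the file that differs between the two commits is listed.
changed=$(git diff --name-only "$first_rev" "$last_rev")
echo "$changed"
```

Only www/index.php comes back, and the path is relative to the top of the repo. That is why the script runs from the repo directory.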

There you have it. So, the purpose of this script is to copy only the files that have changed between merges. That being said, it can be easy to screw up. If you run git pull by hand in the staging repo, the script will find nothing new the next time it runs: the "first" and "last" revisions will be identical, the diff will be empty, and the changes you pulled will never get copied to production. Also, this is making the assumption that the directory tree of your codebase exactly matches that in production. Like the following example:

top_level_prod_directory / top_level_repo_directory
|
|- classes
 \ 
  |- dbclasses
  |- domains
  |- controllers
|- www
 \
  |- img
  |- js
  |- index.php
  |- games

The copied files will be pathed relative to the top level directories above. If your production structure is different from your codebase, then you will need to copy files manually, modify the script to perform adequate replacements (sed), or use rsync (see below).
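
Two related gaps in the script above are worth knowing about: cp will fail when a changed file lives in a directory that doesn't exist yet in production, and files deleted in git are never removed from production. One possible way around both, and this is only a sketch rather than a drop-in replacement, is to read git diff --name-status instead of --name-only. The repos, paths and file names below are all throwaway stand-ins.

```shell
#!/bin/bash
# Throwaway staging repo and "production" directory; all names are made up.
tmp=$(mktemp -d)
PROD_PATH="$tmp/prod"
mkdir -p "$PROD_PATH"
git init --quiet "$tmp/staging"
cd "$tmp/staging"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "old" > old.txt
git add old.txt
git commit --quiet -m "first"
first_rev=$(git rev-parse HEAD)
cp old.txt "$PROD_PATH/old.txt"   # production starts out matching "first"

# Second commit: delete one file, add another in a brand new directory.
git rm --quiet old.txt
mkdir -p www/js
echo "new" > www/js/app.js
git add www
git commit --quiet -m "second"
last_rev=$(git rev-parse HEAD)

# --name-status prefixes each path with A (added), M (modified) or D (deleted).
# --no-renames reports a rename as a plain delete plus an add.
git diff --name-status --no-renames "$first_rev" "$last_rev" |
while IFS=$'\t' read -r status path; do
  if [ "$status" = "D" ]; then
    echo "removing $path"
    rm -f "$PROD_PATH/$path"
  else
    echo "copying $path"
    mkdir -p "$PROD_PATH/$(dirname "$path")"
    cp "./$path" "$PROD_PATH/$path"
  fi
done
```

Since this version deletes things in production, the "back it up first" advice applies double.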

Why not rsync?

Rsync delivers as promised, and is probably the better choice. I've had issues with it in the past, and they were tough ones precisely because rsync does exactly what you tell it to: things got synced that shouldn't have, or vice versa. If you're comfortable with it, and use it currently, I think it's a better choice than this script. The script above, though, doesn't do a full write of everything, only what shows up in the diffs.

Conclusion & Resources

This was a looooong article. I didn't think it would be this long, but I wanted to provide explanations for some of the higher level stuff. I've always gotten myself into trouble by just scanning an article and copy-pasting the code without reading the explanations. I want this to be useful for budding developers who may be stuck using web-based or GUI FTP clients, and are just starting out with version control.

The following are resources and articles I consulted to verify and research content in this article:

  • http://qugstart.com This article appears to be the "standard bearer" for setting up a git repo. The steps for creating, configuring and populating the bare repo came from here.
  • http://www.zzee.com A comprehensive overview of Linux file permissions (namely the setgid setting)
  • http://permissions-calculator.org A great tool for determining Linux permission strings. Good reference as well.
  • http://www.softpanorama.org A great article on the mechanics of find -exec.
  • http://www.saintsjd.com Explains the differences between "bare" and "working" git repositories.
  • http://matharvard.ca Here is how you can set up git ssh to read your rsa key.
  • Last but not least: the git manual. Best damn man pages ever written. Thanks Linus!