Using EBS Snapshots with Fog

01/08/2011

Hosting our servers on Amazon gives us access to a range of features that make managing our server environments much easier.

One of the features we use is EBS (Elastic Block Store), which you can think of as a hard drive that exists on the network. While there has been a lot of doom and gloom surrounding EBS devices (http://aws.amazon.com/message/65648/), they can give you some real advantages if you are careful about how you use them.

One of those advantages is the ability to create snapshots and then create new EBS volumes from those snapshots. Snapshots are quick to take, are saved in S3 (Amazon's Simple Storage Service), and protect the data for long-term durability. Using these techniques you can quickly copy data from one environment to another, for example from production to staging. And while snapshots are *not* a backup solution in themselves, you can snapshot a volume, create a new volume from the snapshot, attach it to a machine, and take a backup from that volume. This lets you back up data from a specific point in time without affecting the running of the main server (depending on your applications and technologies this may not be appropriate, and you might need extra steps, such as freezing the filesystem or locking the database, to ensure you get a consistent backup).
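
As a taste of that snapshot-to-volume workflow, here is a minimal sketch. It assumes a @fog connection like the one we set up below, and the snapshot id, instance id and device name are placeholders rather than real values:

# Create a new volume from an existing snapshot and attach it to a server
# 'snap-12345678', 'i-12345678' and '/dev/sdf' are placeholder values
snap = @fog.snapshots.get('snap-12345678')

# The new volume must live in the same availability zone as the target server
volume = @fog.volumes.create(
  :snapshot_id       => snap.id,
  :availability_zone => 'eu-west-1a',
  :size              => snap.volume_size
)

# Wait until the volume is available, then attach it to the backup machine
volume.wait_for { ready? }
@fog.attach_volume('i-12345678', volume.id, '/dev/sdf')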

Let's take a look at how snapshots can be taken. Since we're mostly a Ruby shop on the server side, we'll use Ruby and the fog library (http://fog.io), which handles the interaction with the Amazon API for us.

The steps are:

  • Connect to the region in AWS where we want to snapshot the volumes
  • Get a list of the volumes to snapshot
  • Create the snapshots for each volume
  • Tag the snapshots with an id and the type of snapshot

First off we connect to AWS and get a list of our volumes:


# Load up fog and an internal library to handle loading in AWS keys
require 'fog'
require 'access_control'

# Load in AWS IAM user credentials
fog_creds = MM::AccessControl.new('production', 'snapshot')

region = 'eu-west-1'
# Make a connection to AWS
@fog = Fog::Compute.new(
  :provider              => 'AWS',
  :region                => region,
  :aws_access_key_id     => fog_creds.access,
  :aws_secret_access_key => fog_creds.secret
)

# Grab an unfiltered list of all of the volumes
volumes  = @fog.volumes.all
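
MM::AccessControl is an internal library of ours, so as an aside, here is a hypothetical stand-in that reads the same credentials from environment variables (just an illustration, not part of our real setup):

# Hypothetical replacement for MM::AccessControl: any object that
# responds to #access and #secret will do for the connection above
fog_creds = Struct.new(:access, :secret).new(
  ENV['AWS_ACCESS_KEY_ID'],
  ENV['AWS_SECRET_ACCESS_KEY']
)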


Now that we've got the volumes, we can run through them all, creating a snapshot of each. A snapshot needs a few bits of information:

  • Volume id to take the snapshot from
  • Description of the snapshot

At the same time we will assign some tags. Tags in Amazon are handy for identifying resources or attaching other bits of information to them.

So, using the volume list we fetched above, we could use the following sort of code to take snapshots of the attached volumes:


# Set up a time stamp for naming the snapshots
t = Time.now
stamp = t.strftime("%Y%m%d.%H%M")
day = t.strftime("%Y%m%d")

volumes.each do |vol|
  # Skip volumes with no attachment. Each attached volume will have a server_id
  next if vol.server_id.nil?

  description = "#{vol.server_id}.#{stamp}"

  # Create a new snapshot transaction. It needs a description and a volume id to snapshot
  snapshot = @fog.snapshots.new
  snapshot.description = description
  snapshot.volume_id = vol.id

  # Now actually take the snapshot
  snapshot.save

  # To tag the snapshot we need the snapshot id, so reload the snapshot info to get it
  snapshot.reload

  # To tag something you need a key, value and resource id of what you want to tag
  # In this example we will demonstrate tagging just by creating a tag called "SnapshotDate" and applying the current YYMMDD to it.
  # This tag is slightly redundant as amazon gives you the time the snapshot was started at, but shows an example of how to tag
  @fog.tags.create(:resource_id => snapshot.id, :key => "SnapshotDate", :value => day)

  # To add more tags simply make another tagging call
  @fog.tags.create(:resource_id => snapshot.id, :key => "TakenBy", :value => 'snapshot user')

  # Tag the snapshot type as well; the cleanup script later relies on this
  @fog.tags.create(:resource_id => snapshot.id, :key => "SnapshotType", :value => "hourly")
end
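
One thing worth knowing: snapshot.save only starts the snapshot, which then completes asynchronously in the background. If a later step depended on a finished snapshot, you could block on it with fog's wait_for (a sketch, using the model's ready? check):

# Wait for the snapshot to reach the 'completed' state before carrying on
snapshot.wait_for { ready? }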


Awesome, we can now take snapshots. Our next step is to clean this code up, put it into a script, and run it on a regular basis. Once we've got a service creating snapshots for us every hour, we'll quickly notice that our snapshot count is growing fast, so we need to clean up old snapshots.

We might decide that we only want to keep the last 8 hours of snapshots, and that anything older isn't required. To clean up the snapshots we'd want to:

  • Connect to AWS
  • Get a list of all the snapshots
  • Check the time the snapshot was created at
  • Delete it if it is over 8 hours old

# Assuming we have connected to AWS and have a @fog connection object
# The snapshot script above tagged each snapshot with SnapshotType = hourly

# Grab the hourly tags by applying a filter to the tags search
tags = @fog.tags.all(:key => "SnapshotType", :value => "hourly")


Now we can load the snapshot information for each of the tags, check the time, and delete if required:


# Time limit in hours
time_limit = 8

# Current time
t = Time.now

tags.each do |tag|
  # The resource_id of the tag equates to a snapshot so we can load the snapshot information
  snapshot = @fog.snapshots.get(tag.resource_id)

  # Using the current time and the time the snapshot was created, we can work out the age of the snapshot
  age = t.to_i - snapshot.created_at.to_i
  hours = age / 60 / 60
 
  # Finally delete the snapshot if it is too old. Deleting a snapshot requires the snapshot id
  if hours > time_limit
    @fog.delete_snapshot(snapshot.id)
  end
end


This example code should then remove all of your old hourly snapshots. Again we'd want to clean this up, put it into a script, and have it running on a daily basis.

In a future post I’ll talk about how to use these snapshots to create new EBS volumes in the same, or a different, AWS account.