Cyrus Stoller

Deploying a Rails app to a VPS using Puppet and Capistrano

Many web development tutorials end with deployment to Heroku. While Heroku is convenient, I prefer deploying web applications to a VPS (e.g. Digital Ocean or Linode) whenever possible for additional control and to avoid vendor lock-in. You can host on Heroku for free, but you’ll start paying premium prices when you need additional dynos to handle traffic or when you exceed the 10,000 row limit in the free postgres database.

Deploying to a VPS gives you flexibility to easily move websites to new providers if they offer more competitive services (e.g. price, security, storage, or bandwidth). This is like driving a stick-shift. Once you’ve mastered it, you appreciate the benefits. It’s not for everyone, but as applications grow, you will appreciate having additional control.

In this post, I will explain how to use one set of open-source tools to get a new server up and running. There are tons of tools that allow you to accomplish the same thing and many people have strong preferences. This post is intended to be less about the merits of each tool and more about understanding what each tool is doing.

When I was figuring this stuff out, I wanted to see all of the code someone used to deploy a live application to a VPS. I will go over how I deploy a Ruby on Rails app (RevTilt) using puppet manifests (gardenbed) and capistrano recipes.

My confusion

When I was first figuring this out, I was confused about why I should use both Puppet and Capistrano. Their capabilities overlap: Puppet is built for provisioning servers but can also deploy code, and Capistrano is built for deploying code but can also provision servers. Personally, I find Puppet better for provisioning servers and Capistrano better for orchestrating deployment. Puppet agents pull new code from their master periodically, while Capistrano pushes code directly to your servers.

To me, it’s more intuitive to ‘push’ code to my servers. When I deploy, I want my new code to be live right away. In particular if I need to push a hotfix, I’d like it to be up as soon as possible.

Get an instance

While I have only tested this on Ubuntu 14.04 LTS, this should work on many flavors of Linux.

Select a hostname for your VPS and create a new instance with at least 512 MB of RAM. To minimize latency, I’d recommend selecting the region closest to the majority of your users. I’d also recommend explicitly setting the FQDN for your server.

# Edit /etc/hostname so that it has the hostname you want
# Add the following line to the end of /etc/hosts to set the FQDN
162.243.47.109 alpha.revtilt.com revtilt
# Use your own IP address
# FQDN: alpha.revtilt.com
# Hostname: revtilt
# To test that everything is set properly
$ hostname
revtilt
$ hostname --fqdn
alpha.revtilt.com

To use your FQDN, you should also set up an A record with your DNS provider that points to your VPS.
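
Here is a minimal sketch of how the /etc/hosts edit above could be made repeatable. The function name add_hosts_entry is hypothetical (it is not part of gardenbed), and the example values are the ones used in this post. It appends the line only if the FQDN is not already present, so running it twice is harmless.

```shell
# Append "IP FQDN HOSTNAME" to a hosts file unless the FQDN is already listed.
# add_hosts_entry is an illustrative name, not something gardenbed provides.
add_hosts_entry() {
  hosts_file="$1"
  ip="$2"
  fqdn="$3"
  short_name="$4"

  # -F matches the FQDN literally, -w requires it to appear as a whole word
  if grep -qwF "$fqdn" "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s %s\n' "$ip" "$fqdn" "$short_name" >> "$hosts_file"
}
```

On a real server you would call it from a root shell with /etc/hosts as the first argument, for example: add_hosts_entry /etc/hosts 162.243.47.109 alpha.revtilt.com revtilt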

The subsequent steps to set up the server will be automated using gardenbed (a collection of puppet manifests) and capistrano.

Securing your server

Here are a couple good tutorials on how to use SSH.

Unfortunately, bad guys will frequently try to hack into your server. They are hoping that you will have a weak password, so they can commandeer your server (potentially to send spam, mine bitcoin, steal your information, or disrupt your website). Here are a couple things you can do to make your server a less attractive target.

Disable SSH authentication for root and only permit authentication via SSH keys for other users. Hackers will typically scan through IP blocks known to be operated by VPS providers and try logging in as root with common passwords. If they are able to guess correctly, they will have full access to your server and can lock you, the legitimate owner, out.
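
To double check the SSH settings described above, a small helper can read the server's sshd_config. This is only a sketch: the function name sshd_setting_is is made up for illustration, and it ignores Match blocks and included files. PermitRootLogin and PasswordAuthentication are the OpenSSH directives that control root logins and password logins.

```shell
# Succeed if the first uncommented occurrence of a directive has the expected value.
# sshd uses the first value it finds for each keyword, and keywords are
# case-insensitive, so the comparison on the keyword is lowercased.
sshd_setting_is() {
  config="$1"
  key="$2"
  expected="$3"

  actual=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$config")
  [ "$actual" = "$expected" ]
}
```

For example, assuming the usual config location: sshd_setting_is /etc/ssh/sshd_config PermitRootLogin no && echo "root login over SSH is disabled"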

You should also set up a firewall to limit and block unwanted traffic. And, to prevent dictionary attacks on your server, you should also install fail2ban. Here’s a tutorial on how to do all of this by Linode.
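
As a rough illustration of what a firewall setup can look like, the sketch below prints a set of ufw commands for review rather than running them. Using ufw is my assumption here (the tutorial mentioned above may take a different approach), and firewall_rules is a hypothetical helper name. The ports you pass in are the only inbound TCP ports that stay open.

```shell
# Print ufw commands that deny all inbound traffic except the given TCP ports.
# Nothing is executed here; the output is meant to be read before it is applied.
firewall_rules() {
  echo "ufw default deny incoming"
  echo "ufw default allow outgoing"
  for port in "$@"; do
    echo "ufw allow ${port}/tcp"
  done
  # --force skips the interactive confirmation prompt
  echo "ufw --force enable"
}
```

Running firewall_rules 22 80 443 prints the rules for SSH, HTTP, and HTTPS. Once you are happy with them, the output could be piped to sudo sh. Make sure the SSH port is in the list before enabling the firewall, or you will lock yourself out.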

Perform the rest of the steps as the non-root user you have just created.

Installing ruby using rbenv

Next, we need to install ruby, so we can run our Ruby on Rails application. Right now, my preferred method is to use rbenv. The only dependency for installing rbenv is git, which we will also need for deployment using capistrano.

On Ubuntu:

$ [sudo] apt-get install git

On Fedora:

$ [sudo] yum install git

Once you have git installed, follow the rbenv installation instructions.

Install nginx and postgresql

For our web application, nginx will process HTTP requests from the internet and serve static assets or pass requests through to unicorn. We will use postgresql as our backend database.

For other applications, I use apache and mysql. Switching these tools is pretty straightforward.

S3 backups (optional)

I set up automatic database backups to S3 using s3cmd. Here are the cron jobs I run every day.

First, I instruct the server to back up the contents of the postgresql database. Here’s a useful wiki article on how to do this. There’s no need to examine the details too carefully because gardenbed will implement this for you.

Next, I define an s3_backup_command, which copies the /var/db_backups directory to S3.
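
To give a feel for what the database backup step involves, here is a minimal sketch. It is not the script gardenbed installs: backup_path and backup_database are hypothetical names, the dated file naming is my own choice, and the database name and directory are the ones mentioned in this post. It uses pg_dump's custom format (-Fc), which is compressed by default.

```shell
# Build a dated dump path, e.g. /var/db_backups/revtilt_production-2014-07-01.dump
# The second argument (the date) is optional and defaults to today.
backup_path() {
  db="$1"
  day="${2:-$(date +%F)}"
  dir="${BACKUP_DIR:-/var/db_backups}"
  printf '%s/%s-%s.dump\n' "$dir" "$db" "$day"
}

# Dump one database to today's path.
# Must run as a user that is allowed to read the database.
backup_database() {
  db="$1"
  pg_dump -Fc -f "$(backup_path "$db")" "$db"
}
```

Calling backup_database revtilt_production would write one dump file per day into the backup directory.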

# s3_backup_command
s3cmd sync -r --no-encrypt --delete-removed \
  /var/db_backups/ s3://vps_database_backups/{FQDN}

Then, I instruct cron to run this command every day at 4:30am.

$ crontab -e

30 4 * * * s3_backup_command
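
The s3_backup_command above can be wrapped in a small script so that {FQDN} is filled in automatically. The sketch below is one possible shape, not the version gardenbed generates. The FQDN, BUCKET, BACKUP_DIR, and DRY_RUN variables are names I am introducing for illustration, with defaults taken from this post.

```shell
# Sync the local backup directory to s3://BUCKET/FQDN.
# With DRY_RUN=1 the s3cmd command is printed instead of executed.
s3_backup_command() {
  fqdn="${FQDN:-$(hostname --fqdn)}"
  bucket="${BUCKET:-vps_database_backups}"
  backup_dir="${BACKUP_DIR:-/var/db_backups}"
  dest="s3://${bucket}/${fqdn}"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "s3cmd sync -r --no-encrypt --delete-removed ${backup_dir}/ ${dest}"
  else
    s3cmd sync -r --no-encrypt --delete-removed "${backup_dir}/" "${dest}"
  fi
}

# Crontab fields for reference: minute hour day-of-month month day-of-week
# "30 4 * * *" therefore means 04:30 every day.
```

Cron runs jobs with a minimal PATH, so if this lives in a script file, the crontab entry may need the absolute path to that script.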

From this point on, I am assuming that you have done nothing except reserve a server from your VPS provider and set up the hostname/FQDN. With Digital Ocean, this is all done when you create a new droplet.

Automate using puppet

Now that we know what we want installed, we can automate this using puppet.

First, clone the gardenbed repository to your local machine.

$ git clone git@github.com:cyrusstoller/gardenbed.git

Next, copy the common.yaml.example and change the data as necessary.

$ cp hiera/common.yaml.example hiera/common.yaml

For a production system, you can delete the ‘vagrant’ postgres user. You should change your postgresql password. There are instructions on how to do that in the comments of the common.yaml file.

Next, you should change the names of the databases that will be created to suit the application you will be deploying. In my case, revtilt_production is the database that will be used by RevTilt. This needs to match the contents of the database.yml file in the rails application you’re planning to deploy. You should also change the postgres_password, which is the password for the postgresql superuser account postgres. With this password, a user has free rein over everything stored in your postgresql database.

Add the versions of ruby that you want installed with rbenv. Most ruby on rails applications should run on 2.1.2, but if you have a specific version, you can specify that here.

If you would like your database to be backed up to Amazon S3 on a daily basis, add your credentials.

And lastly, be sure to replace my SSH keys with your public SSH keys. GitHub makes this really easy. For me, I would go to:

https://github.com/cyrusstoller.keys

You can add as many SSH keys as you like to the common.yaml. Anyone with an SSH key associated with a user in the deployers group will be able to login with sudo privileges.
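
If you prefer the command line to the browser, the public keys can be fetched with curl. The two helpers below are hypothetical names of my own. The filter is a loose sanity check that keeps only lines that look like OpenSSH public keys (RSA, Ed25519, or ECDSA) before you paste them into common.yaml.

```shell
# Build the URL where GitHub publishes a user's public SSH keys.
github_keys_url() {
  printf 'https://github.com/%s.keys\n' "$1"
}

# Read lines on stdin and keep only those shaped like OpenSSH public keys.
filter_public_keys() {
  grep -E '^(ssh-(rsa|ed25519)|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/=]+'
}
```

Usage, substituting your own username: curl -fsSL "$(github_keys_url cyrusstoller)" | filter_public_keys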

Uploading your common.yaml

Now that you have specified your server configurations, you need to upload your common.yaml to your server before you can apply the puppet manifests.

Install puppet

You will only be able to apply these instructions once (before you disable SSH for root).

# This installs all of the dependencies for the puppet modules
$ scripts/install_modules.sh
# This creates a directory to put the puppet files on the server
$ ssh root@{FQDN} 'mkdir -p /tmp/puppet/hiera'
# This transfers your common.yaml to the server
$ scp hiera/common.yaml root@{FQDN}:/tmp/puppet/hiera
# This sends the puppet files to the server
$ deploy/rsync.sh root@{FQDN}:/tmp/puppet
# This installs the appropriate version of puppet on the server
$ ssh root@{FQDN} 'sudo /tmp/puppet/scripts/upgrade_debian_based_puppet.sh'
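
Since these steps are always run together, they could be collected into one function that takes the FQDN as an argument. This is a sketch only: run and bootstrap_puppet are hypothetical names, and DRY_RUN is a variable I am introducing so the commands can be previewed first. The commands themselves are the ones listed above and must be run from the root of the gardenbed checkout.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Execute the five bootstrap steps for one host, stopping at the first failure.
bootstrap_puppet() {
  fqdn="$1"
  run scripts/install_modules.sh &&
  run ssh "root@${fqdn}" 'mkdir -p /tmp/puppet/hiera' &&
  run scp hiera/common.yaml "root@${fqdn}:/tmp/puppet/hiera" &&
  run deploy/rsync.sh "root@${fqdn}:/tmp/puppet" &&
  run ssh "root@${fqdn}" 'sudo /tmp/puppet/scripts/upgrade_debian_based_puppet.sh'
}
```

Setting DRY_RUN=1 before calling bootstrap_puppet alpha.revtilt.com prints the five commands without touching the server.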

Provisioning the server

Now that you have puppet installed, you need to run the puppet manifests.

$ ssh root@{FQDN} 'sudo /tmp/puppet/deploy/puppet_apply_with_args.sh'

At this point, you will no longer be able to SSH in as root. From now on, you will need to SSH in as one of the users in the deployers group. In my case, that is deployer.

# This copies your common.yaml to the home directory of the deployer user
$ scp hiera/common.yaml deployer@{FQDN}:~/common.yaml
$ ssh deployer@{FQDN}

You should now be logged in as the deployer on your server. You can check that everything is installed by running the following commands.

$ psql --version
$ ruby -v
$ service nginx status
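
The same check can be done in one pass with a small helper that lists any expected command that is not on the PATH. The name missing_commands is hypothetical, and which commands to check is up to you. An empty result means everything was found.

```shell
# Print the name of every given command that cannot be found on the PATH.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}
```

For example, missing_commands psql ruby nginx git prints nothing if all four are installed. Note that this only confirms the programs exist, not that the nginx or postgresql services are running.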

If you’ve never used puppet before, this will feel pretty mysterious. Here are some good resources to help you understand how puppet works:

Running updates

Updating the provisioning of your server is simple. All you need to do is update your common.yaml on your server. You can SSH into your server and edit it with nano or vim or you can edit it on your local machine and then copy it back to the server.

$ scp hiera/common.yaml deployer@{FQDN}:~/common.yaml

Once you have the updated details on the server, run the following bash script.

$ deploy/update.sh deployer@{FQDN}

So far, I’ve only done this to install new versions of ruby and postgresql.

Now that your server is provisioned it’s time to deploy your rails application. At this point you could also deploy other ruby based applications (e.g. Sinatra or Grape).

Deploy using capistrano

If you haven’t deployed with capistrano before, be sure to check out the documentation. Most of my workflow follows what is described, except for how to handle the database.yml file in my rails project.

I’ll go over what I do at a high level, but I’m planning on writing a post fully dedicated to this. In the meantime, you can look at the code in the RevTilt project. You’ll want to pay most attention to the following files and directories:

Capfile - as generated, using rbenv
config/deploy - substitute your FQDN for alpha.revtilt.com
lib/capistrano - I’ll explain the types of tasks that are defined

lib/capistrano/
├── tasks
│   ├── env.cap # for setting environment variables
│   ├── helper.rb # helper to copy template files to the server
│   ├── nginx.cap # setting up nginx configs and start/stop nginx
│   ├── setup.cap # tasks for first time setup
│   ├── unicorn.cap # tasks for configuring unicorn as a system service
│   └── uptime.cap # getting uptime stats for each server
└── templates
    ├── database.yml.erb.example # create a `database.yml.erb` to use
    ├── env.example # create a `env` to use
    ├── nginx.conf.erb # nginx config - setting up unicorn via sockets
    ├── unicorn.rb.erb # unicorn config for zero downtime deploy
    ├── unicorn_init.sh.erb # unicorn service config
    └── unicorn_log_rotate.conf.erb # logrotation config

Instead of adding config/database.yml to my .gitignore, I separately copy the sensitive details in lib/capistrano/templates/database.yml.erb to my server when I run:

$ cap production deploy:setup

I find this approach easier because it allows users to clone the project from GitHub and immediately start development.

Once everything is set up, I push my code to GitHub and then run:

$ cap production deploy

This will deploy the new code with zero downtime. If you are running a server with limited RAM, it’s possible that unicorn will not restart properly because the server runs out of memory while recompiling the assets. If you deploy your code and it appears to be successful, but you don’t see the new code when you refresh your browser, run the following:

$ cap production unicorn:restart

Conclusion

If you have any questions, please send me an email or submit an issue on GitHub for either of these projects (RevTilt or gardenbed).

I like using Puppet to provision my servers and Capistrano to deploy code. I feel I was able to write more reusable code and less code overall. It works for me, but you may find a different combination works better for you. Happy hacking.
