Setting up Rails with Redis, Resque and Resque-Scheduler on Dotcloud
I learned a ton over the past week getting feedtopic.com's somewhat unique setup hosted on Dotcloud. There are a few small caveats that you really have to pay attention to.
Before I get started, however, I need to give big ups to the Dotcloud team. They are a fellow Y Combinator 2010 batch company, with amazing skill and genius, as well as incredible support ethics and all around amazing personalities. I consider these guys my friends more than just tech support and company founders.
Now let's get down to what our backend is like.
Rails 3
Part of feedtopic.com is built on the Rails 3 framework. This deals with our database, complicated machine learning / natural language processing, and custom algorithms we built on top.
Redis
We have decided to use Redis to deal with our queueing system. We never need to store crazily marshalled objects, and json strings work great.
Resque and Resque-Scheduler
Naturally, with Redis we use the resque gem to handle jobs in the Redis queue, as well as resque-scheduler to basically do our cron for us. It's slick because you can load up a front end on a subpath with Rack::URLMap and see how your queue and schedules are doing from a nice clean interface. We originally used Redis To Go, so our configuration needed to be adapted when we set up the Dotcloud Redis.
Postgresql
I want to mention we also chose postgres as our database. We really don't need any fancy nosql setups or anything at this time.
Dotcloud
We chose Dotcloud for the job of hosting everything. They are currently in "beta" mode, which means everything isn't as beautifully polished as it could be yet, but the support they provide, like REAL live support, is better than any docs you can get. Because there are a few things left to polish up, I decided to write this blog post with a few things we figured out along the way.
Heroku (I still love Heroku though)
Another honest reason for choosing Dotcloud is that running this on Heroku in its minimum "development" state would cost at least $72 a month. The reason is that we need to run two workers on Heroku, and each one is about $36 a month. That's a lot for two small workers (one for resque, and one for resque-scheduler). It's also a bit hacked up on Heroku, because you need to run two apps, one for the resque job and one for the resque-scheduler job. Heroku only allows you to rake jobs:work, and that can only be mapped to one rake task. Thus you need to push a whole new app just to run the scheduler. It works OK if you connect both to the same Redis server, but the Redis server will again cost money (they use Redis To Go). With Dotcloud, it's set up nicely and you can even set up your own Redis. I definitely think Heroku will put something in place for this as well. You could even get away with it if you use the cron add-on, but with a little less flexibility.
I also don't want to knock Heroku. These guys provide amazing service as well. We've been using them successfully with Fanvibe.com for a year and a half now. It's this special redis/resque setup that really prompted us to go with Dotcloud. Fanvibe.com not only powers our web property, it also powers the backend of our iPhone app and the API behind all of the NBA's (National Basketball Association) properties (iPhone app, Android app, iPad app, nba.com), so you can imagine how awesome Heroku can be. Our backend doesn't just serve up pages. It cranks through live stats in real time (sub-second) and uses heuristic algorithms to create prediction questions on the fly, as well as closing them out, answering them, and awarding people when the predictions are over. It also notifies people about news, stats, live scores, and what friends are watching and saying. Anyway, I'll save Fanvibe talk for another post. In a nutshell, I spend very minimal time ever worrying about Fanvibe on Heroku, so this in no way means Heroku isn't a great service. It really is. The team over there is also very responsive and they are great guys. I have no doubt they'll polish out a solution for resque/resque-scheduler soon enough.
Rails 3 setup
If you go through the usual Dotcloud documentation, it's pretty straightforward. When you first "deploy" your Ruby app, you'll no doubt have the SystemTimer gem in your Gemfile. Why? Because the resque gem docs tell you to put it in. SystemTimer apparently fixes a crucial bug in Ruby 1.8. I've heard that bug no longer exists on Ruby 1.9. SystemTimer won't work on Ruby 1.9 without this fix being merged in, so you have two options:
Deploy on Dotcloud configured for Ruby 1.8, which is called "ree" in the config parameter, i.e. -c '{"ruby-version": "ree"}'
Deploy without the SystemTimer gem and just let the default Ruby 1.9 deal with it (this is what I ended up doing).
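If you want to keep one Gemfile that works under either Ruby version, a conditional is a possible middle ground. This is only a sketch, not something we ran on Dotcloud: the variable name needs_system_timer is made up for illustration, and it assumes the fragment is pasted into an existing Gemfile.

```ruby
# Gemfile fragment (add to an existing Gemfile).
# Only pull in SystemTimer on Ruby 1.8, where the resque docs recommend it.
needs_system_timer = Gem::Version.new(RUBY_VERSION) < Gem::Version.new("1.9")
gem "SystemTimer" if needs_system_timer
```

Comparing with Gem::Version instead of plain strings avoids surprises with version numbers that have two-digit parts.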
Another big problem was that you'll need an nginx config fix on Dotcloud for some virtualhost problem. For this part I will need the Dotcloud guys to step in. I'm pretty sure they'll update the docs about this soon. But if you're getting 404s on pages that you know work locally or on something like Heroku, ping them about the nginx config file and the vhost stuff. More specifically, Sam over at Dotcloud put this together for me.
Postgresql Setup
I have our database set up on Dotcloud, just because it's easy. One thing to note is that the password Dotcloud provides is pretty crazy, so feel free to wrap the password in double quotes, otherwise I believe it screws with the YAML. Here's ours:
production:
  adapter: postgresql
  encoding: unicode
  database: feedtopic
  pool: 5
  username: x
  password: "some|password"
  port: 1200
  host: db.feedtopicrules.dotcloud.com
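To see why the quotes matter, here is a small sketch of how Ruby's YAML parser reads the same value with and without them. The password below is made up for illustration, and this shows only one way an unquoted value can go wrong (a space followed by # starts a comment); other special characters can fail differently.

```ruby
require "yaml"

# Hypothetical password containing characters that YAML treats specially.
quoted   = YAML.safe_load('password: "s3cr|t #1"')
unquoted = YAML.safe_load('password: s3cr|t #1')

puts quoted["password"]    # the full string survives inside double quotes
puts unquoted["password"]  # everything from " #" onward is dropped as a comment
```

The unquoted version parses without any error, so the first sign of trouble would be a failed database login rather than a YAML complaint.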
Redis Setup
The documentation is spot on here. Note that I had previously set up a Redis To Go hosted Redis, so these are more caveats on how to adapt that to your own Redis setup on Dotcloud. To connect it with your Rails app and your resque workers, you'll need to know a few things.
ENV["REDIS_URL"] doesn't really work on Dotcloud yet, so avoid that. I would use a config/redis.yml file and load that in. We had used an environment variable set in our development.rb / production.rb files per the instructions of Redis To Go.
The password, again, is a bit crazy. If you previously used Redis To Go, you'll see that they parse the URI, and that won't work with the Dotcloud super safe password.
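As a rough sketch of the config/redis.yml approach: keep host, port and password as separate keys so nothing has to be parsed out of a URI. The file layout, the host, port and password values, and the helper name redis_options are all assumptions for illustration, and the code is written for a current Ruby.

```ruby
require "yaml"

# Hypothetical contents of config/redis.yml; every value is a placeholder.
REDIS_YML = <<~YAML
  production:
    host: redis.example.dotcloud.com
    port: 6379
    password: "some|p@ss:word"
YAML

# Turn one environment's section into an options hash with the
# :host, :port and :password keys that Redis.new accepts.
def redis_options(yaml_text, env)
  section = YAML.safe_load(yaml_text).fetch(env)
  {
    host: section.fetch("host"),
    port: Integer(section.fetch("port")),
    password: section["password"]
  }
end

options = redis_options(REDIS_YML, "production")

# In a Rails initializer (needs the redis and resque gems) you would read the
# real file and hand the connection to Resque, roughly:
#   text = File.read(Rails.root.join("config", "redis.yml"))
#   Resque.redis = Redis.new(redis_options(text, Rails.env))
```

Because the password never passes through a URI parser, characters like | and : in it stop being a problem.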
Resque and Resque Scheduler Configuration
I'm going to group both of these here because they're set up in related ways. This part was a bit trickier, but we figured out a nice way of doing it. I'm not going to go into how to get a resque worker / scheduler working here. I'm going to assume you have it all working locally already.
supervisord.conf: This file is required at the root level of your app. The problem here for a Rails 3 app and the resque gems is that you need two different supervisord.conf files for two different workers: a pure resque worker and a resque-scheduler worker. You can't put them both in the same file, but you also don't want to refactor the entire structure of the app to work in different directories. So we came up with a cool solution:
Each ruby-worker deployment has a unique $HOSTNAME variable. It's basically whatever namespace.name you decided on when deploying.
Create two files, supervisord.conf_namespace.resque and supervisord.conf_namespace.scheduler, for example.
Make sure you also set RAILS_ENV in the environment section of the config file.
Create a post-install hook file, "postinstall", that creates a link to the correct supervisord config file based on the hostname. This will basically ensure the correct config file is used on a certain host! FREAKIN SWEET
This is the resque worker's supervisord config file. Notice we also need to set the Rails environment to production. Without it, for some reason, it kept running in the wrong environment for me.
[program:resque]
command = rake resque:work
environment = QUEUE=*,RAILS_ENV=production
directory = /home/dotcloud/current/
This is the resque-scheduler's supervisord config file. You can ignore the environment line there. The queue isn't used by the scheduler, I just still have it sitting there.
[program:scheduler]
command = rake resque:scheduler
environment = QUEUE=*
directory = /home/dotcloud/current/
And lastly, our postinstall file. This is basically making a link. What a sweet hack. Remember to chmod +x it.
#!/bin/sh
ln -s supervisord.conf_$(hostname) supervisord.conf
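The same idea can be sketched in Ruby with a guard for a missing config file, so a typo in a deployment name fails loudly instead of leaving a dangling link. This is not what we deployed. The method name link_supervisord_conf is made up, and whether Dotcloud will run a Ruby script as the postinstall hook is an assumption I have not checked.

```ruby
require "fileutils"

# Link supervisord.conf to the host-specific config file inside app_dir.
# Raises if there is no config for this host instead of creating a dangling link.
def link_supervisord_conf(app_dir, hostname)
  source = "supervisord.conf_#{hostname}"
  unless File.exist?(File.join(app_dir, source))
    raise ArgumentError, "no #{source} in #{app_dir}"
  end
  # Relative link target, like running `ln -s` from inside the app directory.
  # ln_sf also replaces a link left over from an earlier run.
  FileUtils.ln_sf(source, File.join(app_dir, "supervisord.conf"))
  source
end

# In a real hook you would call it with the app directory and the machine's
# hostname, for example:
#   require "socket"
#   link_supervisord_conf(Dir.pwd, Socket.gethostname)
```

Passing the directory and hostname in as arguments keeps the method easy to try out locally before pushing.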
We also added a require in our Rakefile to include resque/tasks:
require File.expand_path('../config/application', __FILE__)
require 'rake'
require 'resque/tasks'
Feedtopic::Application.load_tasks
That's it
Wow, ok that was a lot. But it really seemed like a lot more when we were working on it. The above were just the final conclusions we came to. I'm pretty sure I've documented everything, but there is a possibility I left a few things out. Again, these are very specific to our setup on Dotcloud, so what you have may vary. Dotcloud guys might send me a few clarifications and I'll update as that comes in.
If you guys want any more information about any of these topics, feel free to ask me in the comments or ping the Dotcloud team.
Photo: So I like posting photos I've taken with every post I make, regardless of how much sense it makes. This was one I took of Issa when we visited Lake Tahoe.















