Posts Tagged ‘rails’

Getting started with data_fabric

by David Czarnecki, January 29th, 2010 at 11:46am - No Comments »
Posted in: Bending Rails, Uncategorized

The data_fabric gem “provides flexible database connection switching for ActiveRecord”. If you’re not concerned with database sharding, you might want to skip this blog post. Or not. Either way, I’m not going to be offended.

I have a requirement that certain data in an application that I’m developing will probably have to be sharded because, if you’ll excuse my English, there will be a “shit ton” of data. This only affects one model out of the few models I have in the application. I don’t have a requirement that the data will be replicated (which is another feature supported in data_fabric), so I’m not going into that here. In any event, here is a rundown of how I got started developing and testing with data_fabric.

- Configure the data_fabric gem in your config/environment.rb file.


config.gem 'data_fabric'

- In your model(s), decide which column, or what other rule, determines how the data is going to be sharded.


data_fabric :replicated => false, :shard_by => :initial_code

In this case, initial_code is a method that looks at a piece of the model’s data and gives me the correct shard.
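
To make the idea of a shard key concrete, here is a minimal sketch of what such a method could look like. It is not the method from my application: the name initial_code_for and the bucket names are hypothetical, and it simply buckets a code by its first character.

```ruby
# Hypothetical sketch: derive a shard key from the first character of a code.
# The bucket names ('a_to_m', 'n_to_z', 'other') are illustrative assumptions.
def initial_code_for(code)
  first = code.to_s.strip.downcase[0, 1]
  case first
  when 'a'..'m' then 'a_to_m'
  when 'n'..'z' then 'n_to_z'
  else 'other' # digits, punctuation, or an empty code
  end
end

puts initial_code_for('GH5-PROMO')
puts initial_code_for('xmas2010')
puts initial_code_for('42OFF')
```

Whatever rule you pick, it needs to return the same value for the same record every time, since that value decides which database the record lives in.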

- Set up the database shards in your config/database.yml file. I actually set up only one shard for the development and testing environments to make things easier. I’m just including the one for the test environment here. You can read on the data_fabric site about the naming convention for sharded database connections.


test:
  adapter: mysql
  encoding: utf8
  reconnect: false
  database: myapp_test
  pool: 5
  username: root
  password:

# This is the database shard
initial_code_testenv_test:
  adapter: mysql
  encoding: utf8
  reconnect: false
  database: myapp_test_testenv
  pool: 5
  username: root
  password:
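
The shard connection name in the database.yml example above appears to be built from three parts: the shard_by key, the shard value and the environment. The helper below is a sketch of that pattern only. The name shard_connection_name is made up, and the pattern is inferred from this one example, so treat the data_fabric README as the authority (replicated setups, for instance, are not covered here).

```ruby
# Sketch: rebuild the connection name used in the database.yml example above.
# The "<shard_by>_<shard value>_<environment>" pattern is inferred from that
# one example; check the data_fabric README for the authoritative rules.
def shard_connection_name(shard_by, shard_value, env)
  [shard_by, shard_value, env].join('_')
end

puts shard_connection_name(:initial_code, 'testenv', 'test')
# prints "initial_code_testenv_test"
```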

- In config/initializers/my_app_model.rb, I actually stub out the initial_code method to return a single value for the development and test environments. This is merely a convenience so I don’t have to include every single database shard for development and testing.


require 'mocha'

if 'development'.eql?(RAILS_ENV)
  PromotionCode.stubs(:initial_code).returns('devenv')
end

if 'test'.eql?(RAILS_ENV)
  PromotionCode.stubs(:initial_code).returns('testenv')
end

- I copied part of the Rakefile from the data_fabric gem to actually be able to migrate the database for the sharded database connections. This was definitely missing from the data_fabric README.


require 'fileutils'
include FileUtils::Verbose

namespace :db do
  task :migrate do
    require 'erb'
    require 'yaml'
    require 'logger'
    require 'active_record'
    # Load every connection defined in database.yml (ERB is evaluated first)
    reference = YAML::load(ERB.new(IO.read("config/database.yml")).result)
    env = RAILS_ENV = ENV['RAILS_ENV'] || 'development'
    ActiveRecord::Base.logger = Logger.new(STDOUT)
    ActiveRecord::Base.logger.level = Logger::WARN
    ActiveRecord::Base.configurations = reference.dup
    old_config = reference[env]
    reference.each_key do |name|
      # Only touch connections whose name contains the current environment
      next unless name.include? env
      next if name.include? 'slave' # Replicated databases should not be touched directly

      puts "Migrating #{name}"
      # Point the environment's configuration at this connection, then migrate it
      ActiveRecord::Base.clear_active_connections!
      ActiveRecord::Base.configurations[env] = reference[name]
      ActiveRecord::Base.establish_connection RAILS_ENV
      ActiveRecord::Migration.verbose = ENV["VERBOSE"] ? ENV["VERBOSE"] == "true" : true
      ActiveRecord::Migrator.migrate("db/migrate/", ENV["VERSION"] ? ENV["VERSION"].to_i : nil)
    end
  end
end
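
The part of that task worth understanding is how it picks which connections to migrate. Here is a standalone sketch of just that filter, run against an inline YAML string rather than config/database.yml. The helper name connections_to_migrate and the sample connection names are made up for illustration.

```ruby
require 'yaml'

# Sketch of the name filter used by the migrate task, run against an inline
# YAML string instead of config/database.yml. The connection names below are
# made up for illustration.
SAMPLE_CONFIG = <<-YAML
development:
  database: myapp_development
test:
  database: myapp_test
initial_code_testenv_test:
  database: myapp_test_testenv
initial_code_testenv_test_slave:
  database: myapp_test_testenv_replica
YAML

# Keep names that contain the environment and are not replicas.
def connections_to_migrate(configs, env)
  configs.keys.select do |name|
    name.include?(env) && !name.include?('slave')
  end
end

configs = YAML.load(SAMPLE_CONFIG)
puts connections_to_migrate(configs, 'test').inspect
```

Note that the match is a plain substring check, so with the test environment any connection whose name merely contains “test” gets migrated too. Name your connections with that in mind.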

- In my test classes that use the sharded model, I have setup and teardown methods that activate and deactivate the shard.


def setup
  DataFabric.activate_shard(:initial_code => 'testenv')
end

def teardown
  MyAppModel.delete_all
  DataFabric.deactivate_shard(:initial_code => 'testenv')
end

I did find that I needed to delete all the objects in the database for the sharded model. I’m still digging into why that’s the case. My ActiveRecord_fu isn’t that strong I guess.

All in all, sharding is relatively easy with data_fabric. Pimping, however, “ain’t easy.” But that’s for another blog post I guess.

Scaling Ruby and Rails Part 1

by David Czarnecki, January 4th, 2010 at 07:34pm - 4 Comments »
Posted in: Bending Rails, Engineering

I wish scaling applications and systems these days consisted solely of “Just Add Scaling!”. But you know what? It doesn’t. I also forget where I read it, but the quote went something like, “Programming languages don’t scale, architectures scale.” Scaling is driven by proper iterative design, implementation and testing.

In a series of blog posts I want to cover how we have approached scaling out various parts of our Ruby and Rails infrastructure here at Agora Games using real-world examples on very high-traffic sites such as the Guitar Hero and Call of Duty community sites.

Here I’ll cover the “Deep Dive”. I originally came from BigCo. and there we used a concept called the “Deep Dive”, which involved taking a specific requirement in combination with an approach or technology and following a thread of execution that would take you through the entire technology stack, or a “deep dive” through the system. At the end you would either prove or disprove the technology or approach. But it was done in the context of a real set of requirements.

The following is the e-mail (project/feature names changed to protect the innocent … the important concept here is the Deep Dive, not the project/features) I sent around to our engineering team in September of last year after doing a Deep Dive on a queue system.

_________________

From: David Czarnecki

To: Engineering

Clearly Defined Requirement(s)

Ultimately, to do a deep dive correctly, you need clearly defined requirements to evaluate your technology or approach against. In the case of PROJECT X, with the use of a queue, we had the following:

- Set up queue
- Decide event(s)
- Send to queue
- Aggregate from/to queue
- Put into message creation
- Send back to the app

Narrowing the Field

I spent a day looking at various queue packages in Ruby and other languages to understand:

- Features – What features do we get out of the package?
- API – How easy is it to set up/create/interact with the queue from actual code?
- Aliveness – Is this an ongoing effort or was it thrown on RubyForge and ultimately abandoned?
- Community – Where is this package being used? How many developers or contributors commit to the project?
- Language – Are we expanding our technology stack by introducing a queue written in one language with an interface in another language?

Pork, aka The Other Other Requirements

And don’t forget about the other “unspoken” requirements.

- Ease of setup
- Speed
- Failsafe
- Scaling

At the end of the day, whichever package is picked, you want some guarantee that the package you’ve chosen is “good” or at least “good enough”. But what if the Guarantee Fairy’s a crazy glue sniffer? Next thing you know there’s change missing from your dresser and your daughter’s knocked up. I’ve seen it a hundred times. Although you’ve got a set of requirements that define how you’re going to use a technology or approach operationally, there are still requirements that need to be addressed, even if there isn’t anything formally specified.

Let’s Get Ready To Rumble

I chose Sparrow and Rabbit/AMQP since these passed the “ease of setup” requirements with flying colors.

http://code.google.com/p/sparrow/
- Pure Ruby
http://hopper.squarespace.com/blog/2008/7/22/simple-amqp-library-for-ruby.html
- Erlang Queue Server/Ruby interface to Queue

Next up it was time to prove out the feasibility of the two technologies looking at the “soft” requirements in the context of the “hard” requirements. This meant setting up the two systems to:

- Set up queue
- Send event(s) to queue
- Aggregate from/to queue

The other “hard” requirements would be addressed based on the outcome of this initial sanity check.

2 Queues Enter, 1 Queue Leaves … Wait, what?

Although I wanted to use this to prove out “FEATURE X”, I also wanted to address its use in “FEATURE Y”. “FEATURE Y” involves converting a song file into an MP3. So I set up a test to evaluate the two systems, which was:


1k, 8k, 16k, 32k, 64k messages do
  25.times do
    10000 messages do
      publish message to queue
      read message from queue
    end
  end
end

In other words, publish 10000 messages to the queue (in one process) and read those messages from the queue (in another process), noting how long it took to publish and read. Do this 25 times to get a min/max/average time for each of the different message sizes.
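
For anyone who wants to try the same kind of measurement, here is a rough Ruby sketch of that loop. It is not the harness I used: Ruby’s built-in Queue stands in for the queue server (an assumption for illustration), the publisher and reader run in a single process, and the helper names time_round, summarize and run_benchmark are made up.

```ruby
require 'benchmark'

# Sketch of the benchmark loop described above. A Ruby ::Queue stands in for
# the real queue server (an assumption for illustration), so publisher and
# reader run in one process here rather than two.
def time_round(message_size, message_count)
  queue = Queue.new
  payload = 'x' * message_size
  publish = Benchmark.realtime { message_count.times { queue.push(payload) } }
  read    = Benchmark.realtime { message_count.times { queue.pop } }
  { :publish => publish, :read => read }
end

# Reduce a list of timings (in seconds) to min/max/average.
def summarize(samples)
  { :min => samples.min, :max => samples.max, :avg => samples.inject(:+) / samples.size }
end

def run_benchmark(sizes, rounds, message_count)
  sizes.each_with_object({}) do |size, results|
    timings = Array.new(rounds) { time_round(size, message_count) }
    results[size] = {
      :publish => summarize(timings.map { |t| t[:publish] }),
      :read    => summarize(timings.map { |t| t[:read] })
    }
  end
end

sizes = [1, 8, 16, 32, 64].map { |k| k * 1024 }
# Counts are scaled down from 25 rounds of 10000 messages to keep the sketch quick.
run_benchmark(sizes, 5, 1_000).each do |size, stats|
  puts "#{size / 1024}k publish avg=#{stats[:publish][:avg]} read avg=#{stats[:read][:avg]}"
end
```

To measure a real queue server, swap the Queue calls in time_round for that server’s client calls and run the reader in a separate process.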

I have attached the spreadsheet of the results, which show that the larger the message, the longer it takes to publish and read from the queue. However, they also show that Sparrow could handle the 64k messages while Rabbit/AMQP could not. Sparrow got slower at processing those 10000 64k items from the queue, but it never failed the way Rabbit/AMQP did. Ultimately, the deep dive was not about fixing a broken AMQP adapter.

The Devil is in the Details

One benefit of using Sparrow is that persistence is built into the server. If you take down Sparrow and there are messages on the queue, it will write those out to an SQLite3 database. Ultimately this led me to look at the size of the field it was using for queue data, which would need to be patched from its current 255 characters.

Conclusion

So, I’ve now got a queue server that I feel comfortable setting up and using and that can probably handle the load of data we’re going to throw at it come launch. The queue server/queues were integrated into PROJECT X in the context of the “FEATURE X” to prove its feasibility in addressing that feature in a future sprint.

And one more thing …

There are tests for the various bits that make up “FEATURE X”. I’m most happy with the integration test which fires up a Sparrow server, fires up a foo, creates a bar, runs the aggregator, and checks to see that a baz was created for the account (oh and then cleaning up the queue server and the subscriber). 14 LOC, but there’s a lot of code that it exercises behind the scenes. And yes, it passes :)

_________________

So, there you go. Hopefully you have enough information to do your own Deep Dive.

Ultimately for FEATURE X and FEATURE Y, Sparrow more than met our needs. Advances and changes to AMQP and its associated libraries have been made which I’m sure make it a more than viable candidate. At the time, however, with only a day to get the system working and prove out the Deep Dive, it just didn’t meet our needs. Again, the point of this blog post is to talk about the Deep Dive in the larger context of its use in Scaling Ruby and Rails.

RailsConf Wrap Up

by Jason LaPorte, May 29th, 2009 at 01:42pm - No Comments »
Posted in: Engineering, Infrastructure

Well, we’re back from Vegas! And have been, for a couple weeks… I’ve been meaning to put up some follow-up resources for my talk (PWN Your Infrastructure: Behind Call of Duty: World at War), but there was just so much work to do when I got back… such is the life of a system administrator!

That said, I’ve got some free moments, so I’m putting up some reference materials.


Write if read returns nil

by Ola Mork, May 13th, 2009 at 10:14am - No Comments »
Posted in: Bending Rails, Engineering

Usually we use standard caching methods on our site (primarily fragment caching to avoid DB queries).

Occasionally we need to do something fancier. These instances usually come up when we’re splitting one query into two because Rails doesn’t support :force_index or :adapter_specific_find_options on ActiveRecord::Base.find. We understand this motivation but really hate ActiveRecord::Base.find_by_sql or ActiveRecord::Base.connection.execute. These are not rational hatreds.

So when we get into a situation where we’re going to be caching manually it’s usually in the controller and we almost always end up with a pattern of:

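@object = Rails.cache.read('really/complicated/and/stinky/key')
if @object.nil?
  @object = what_should_my_object_be?
  # Store the freshly computed value so the next read is a cache hit
  Rails.cache.write('really/complicated/and/stinky/key', @object)
end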

That’s fine in a contrived example but we were doing this in about 10 different places and it looked like a good candidate for drying up.

Here’s the solution we use:

module ActiveSupport
  module Cache
    class Store
      def read_and_write_if_nil(key, options = {})
        object = read(key)
        if object.nil?
          object = yield
          write(key, object, options)
        end
        object
      end
    end
  end
end

And the production example looks like this:

account_ids = Rails.cache.read_and_write_if_nil("member_ids_for_clan_#{@clan.id}", :expires_in => 5.minutes) do
  @clan.members.find(:all, :order => 'groupies DESC', :select => 'accounts.id').collect(&:id)
end
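
If you want to see the behaviour without a Rails app, here is a small sketch of the same helper on an in-memory store. TinyStore is a hypothetical stand-in for ActiveSupport::Cache::Store, written only for this example, and it ignores options such as :expires_in.

```ruby
# Sketch: the same helper on a tiny in-memory store, so it runs without Rails.
# TinyStore is a hypothetical stand-in for ActiveSupport::Cache::Store and
# ignores options such as :expires_in.
class TinyStore
  def initialize
    @data = {}
  end

  def read(key)
    @data[key]
  end

  def write(key, value, options = {})
    @data[key] = value
  end

  # Return the cached value, or run the block and cache its result on a miss.
  def read_and_write_if_nil(key, options = {})
    object = read(key)
    if object.nil?
      object = yield
      write(key, object, options)
    end
    object
  end
end

cache = TinyStore.new
calls = 0
2.times do
  cache.read_and_write_if_nil('member_ids') { calls += 1; [3, 1, 2] }
end
puts calls # the block runs only on the first, cache-miss call
```

It’s also worth knowing that Rails.cache.fetch, when given a block, covers much the same ground, so check whether it fits before monkey patching the cache store.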

Porting Legacy Applications to Modern Systems

by Jason LaPorte, January 12th, 2009 at 01:58pm - 2 Comments »
Posted in: Bending Rails, Engineering

(Or, an adventure in pain!)

Let me preface this little article by saying I did not write the app I am now porting to our new network. I know nothing about its specific intricacies, and furthermore, I know nothing specifically about gettext, file_column, or attachment_fu. So, I dove into this project with a sort of wanton abandon that is fairly characteristic of much of the work I do. Any obviously stupid mistakes below thus really happened.

Let me also preface this with the fact that this Rails app is old. Version 1.2.3 old. So, as always, YMMV.

Finally, I really wish attachment_fu had proper documentation.
