February 2012

Performance Optimization for MyISAM Compressed Tables

Database performance optimization is a significant part of the ongoing service we provide for our clients. We recently found that a client had a query they performed regularly that was drastically slowing the system as a whole, and we investigated to see if we could help them resolve the issue.

The symptoms were confusing: though the server configuration was sized appropriately for the available memory, the server was per-allocating approximately 20GB of RAM immediately on a query, even when the query required nowhere near that allocation, leading to excessive swapping and performance hits.

We explored various options: Were there SET SESSION statements being executed improperly before the query? Were buffer_size parameters set correctly? Were there other variables we hadn’t considered?

As we investigated, we discovered something interesting: The MySQL Server (mysqld) sets a default memory allocation for MyISAM compressed tables, and that default setting is unnecessarily large (18446744073709547520). As a result, it maps into memory all of the MyISAM compressed tables. In our case, with 20GB tables, mysqld was suddenly allocating an unreasonable amount of memory to the process.

This default memory allocation variable was introduced in MySQL 5.1.43, and you can read more about it here.

We immediately tuned this down to a more manageable size for the actual queries being run, and are seeing the expected performance improvements in our client’s queries.

Our Pro Bono Program - bringing database excellence to non-profit partners

 

PalominoDB works with a wide range of clients on database management, support and engineering projects. Occasionally, clients find that they haven’t used the full allotment of hours they’ve purchased in a given month, and when that happens, we give them the option to donate those unused hours to our Pro Bono Program. We are pleased to be able to offer our non-profit clients the same level of innovation, service and technical expertise that startups and enterprise-level companies rely on us to provide.

Our latest project is one quite near to our hearts - worldwide literacy. Our friends at Worldreader have the ambitious goal of making digital books available to everyone in the developing world, enabling millions of people to improve their lives. 

As you might imagine, managing the digital assets, the physical e-readers and the program participants involves a lot of data. Worldreader measures success by looking at how people read, what people read, when people read and how those change over time. Managing that data, and synthesizing it into actionable project metrics are core functions that take a great deal of staff and volunteer time. 

PalominoDB helps Worldreader streamline project reporting in order to track and promote engagement with digital texts of all kinds - books, newspapers, magazine and online-only content. The pilot programs in Ghana and Kenya have been hugely successful, and as the Worldreader team heads to Uganda, PalominoDB has renewed our commitment, working with the team at Worldreader over the next year to develop content management tools, purchase and inventory management systems, and provide general database support.

None of this would be possible without our for-profit clients who donate unused hours, and our staff, who contribute time and attention to these projects. Philip Stehlik, Chief Technology Officer at Taulia, explains why the PalominoDB Pro Bono Program appealed to them.  “Taulia works to connect businesses and to improve access to capital for small and medium businesses through Dynamic Discounting. We are excited that donating our unused hours with PalominoDB allowed us to support an entirely new kind of supply chain - getting electronic books to children in the developing world."

If your non-profit would like to work with PalominoDB’s Pro Bono Program, please get in touch. And if you’re a PalominoDB client (or you think you might want to be!) and want to donate hours to our Pro Bono Program, we’d love to hear from you too. And finally, if you’re an author or publisher who wants to donate books to people hungry for great stories, please get in touch with our friends at Worldreader - they’d love to help more people fall in love with your books.

 

 

 

 

Continuous Deployment Architecture For Ruby on Rails

 

We have seen agile development become more popular in recent years thanks in part to the evolution of continuous integration environments like Ruby on Rails.  These development frameworks leverage quick testing and the ability to easily deploy over clusters. As deployments happen more often, we look for ways to minimize the disruption to users. Cluster architecture already contains the components we need, with multiple servers for each role, allowing us to update a subset of the system while the rest serve the business.

We recently worked with the DevOps team at one of our clients to develop an architecture that allows them to run rolling changes through the cluster using the Ruby on Rails framework. There are two principal components of change: application code on the application servers and database structure on the database servers. The application servers can easily be targeted specifically through deploy.rb. The database side of things, however, is a bit more complicated.

In order to have zero downtime, half the application servers are taken offline and updated, then the pair of master/slave standby databases have the DDLs applied to them. Traffic is then switched to these servers.

HAProxy is put in between the application and the MySQL databaseservers to act as a router for database traffic (see http://palominodb.com/blog/2011/12/01/using-haproxy-mysql-failovers for specifics), providing the ability to flip the active and standby roles of the two server pairs (as well as providing a High Availability solution). Since there are slave relationships, we needed to be able to pause replication through the deployment application (Capistrano).  We were able to accomplish this by adding a file, deployrake.util, to the Rake subsystem under lib/tasks:

namespace :deployutil do

 desc 'Checks the replication status of the primary database'

 task :replication_test => :environment do

   # find the state of replication

    mysql_res = ActiveRecord::Base.connection.execute("SHOW SLAVE STATUS")

   mysql_res.each_hash do |row|

     if row['Slave_IO_Running'] == "Yes" and row['Slave_SQL_Running']

== "Yes" and row['Seconds_Behind_Master'].to_s == "0"

       puts "ReplicationGood"

     elsif row['Seconds_Behind_Master'].blank?

       puts "ReplicationBroken"

     else

       puts "ReplicationBehind_" + row['Seconds_Behind_Master'].to_s

     end

   end

   mysql_res.free

 end

 

 task :start_slave => :environment do

   ActiveRecord::Base.connection.execute("START SLAVE")

 end

 task :stop_slave => :environment do

   ActiveRecord::Base.connection.execute("STOP SLAVE")

 end

 

We can then create tasks in deploy.rb to call these

 desc "shows the current passive slave replication status"

 task :get_slave_replication_status, :roles => :cron do

   dbrails_env = fetch(:rails_env) + '_passiveslave'

   # find the state of replication

   set :slave_replication_status, capture("cd #{latest_release} ;

RAILS_ENV=#{dbrails_env} rake deployutil:replication_test").chomp

 end

 

 desc "stop passive slave replication"

 task :stop_passiveslave_repl, :roles => :cron do

   dbrails_env = fetch(:rails_env) + '_passiveslave'

   run "cd #{latest_release} ; RAILS_ENV=#{dbrails_env} rake

deployutil:stop_slave"

 end

 etc....

We also want to be able to limit changes to specific databases so that the changes won't go into the bin logs and propagate when the slaves are turned back on.  See http://palominodb.com/blog/2011/11/21/rails-and-database-session-variables for details on how to do this through an extension of ActiveRecord.  A word of caution here: setting sql_log_bin=0 to skip logging these changes will invalidate using the binlogs for point in time recovery.

You will need a full backup after the change.

Syndicate content
Website by Digital Loom