Insights from PgCon 2013

PgCon 2013 was attended by 256 people from across the globe. Attendees had the opportunity to enjoy tutorials, talks and an excellent unconference (this last deserves a special mention).

I gave a talk on full text search using Sphinx and Postgres (the slides are available, and all of the talks were recorded). The quality of the talks was generally quite good, but I don't want to repeat what you will find in other posts.

The unconference ran quite late into the evening. You can find its schedule, as well as minutes of some of the sessions that happened (and some that didn't), here.

There was a special emphasis on the pluggable storage feature, although most agree that it will be a very difficult feature to implement in upcoming versions. A related topic was Foreign Data Wrapper enhancements.

The pluggable storage discussion continued afterwards. The main reason everybody agrees on this feature is that a storage API would allow companies to contribute code instead of forking off into other projects.

There was also a long hallway discussion about migrations using pg_upgrade.

The replication features discussed were bi-directional and logical replication.

The full text search unconference discussion was pretty interesting. Oleg Bartunov and Alexander showed really interesting upcoming work on optimizing GIN indexes. According to their benchmarks, Postgres could improve performance significantly.

There were a lot of discussions I missed, due to the wide number of tracks and "hall spots". But the majority of attendees I heard from agreed that the unconference was quite exciting and offered the chance to surface many new ideas.





Supporting Feminism in Technology - Part 1


I've been contemplating the topic of feminism and misogyny in the technology field a lot of late. This blog post is the culmination of a significant amount of thought and reflection on the topic of women in technology. At Palomino, I focus a lot on the values around bringing underserved populations into technology. Women, people from working class or impoverished backgrounds, people who are gay, lesbian, or transgender, and people of Latino and African American backgrounds are traditionally highly underrepresented in the US technological workforce. Palomino, even being a woman-owned business, is not exempt from this issue. One of my goals is to build up not just Palomino's DBA and engineer population to reflect higher percentages of these populations, but to support more people in the entire community in having these opportunities. I'd like to focus on gender in this conversation, though I have much to say on the topics of race and class as well. In fact, they are all interrelated.

To dig into the topic, one of the first questions people ask is why this is a big deal. And at the top level, without much time put into consideration, I can see why someone might feel this way. After all, if a DBA is good, why does their gender, race or background matter? And, if you are simply considering the output of an individual or organization, this is pretty true. But there is more. My hope was that people who focus on the importance of open source software and free access to technology would understand the importance of building larger populations of women engineers and administrators, but that has proven not to be true.

The fact is, that on a macro level, one of the largest ways to get more people from underserved populations into jobs such as database administration, infrastructure architecture and software engineering is to provide them with mentors and role models who have already broken through the barriers to make it.  And, we do exist.  Palomino and Blue Gecko were both built by women.  Oracle's Ace and Ace Director list has about 12 (of 390) women on it.  I have been meeting more people of color and women working on senior teams of clients.  I do see potential role models and mentors out there.  But, don't get too excited.  There are still plenty of  opportunities for improving how we build welcoming workplaces for the talented and diverse engineers already out there making their way in the field.  

There's also the selfish part of the equation. More often than not, when I find women and people of color in the wild with successful records as engineers and operators, they tend to be extremely good at their jobs. They tend to be excellent communicators, good with clients, detailed with project planning and highly technically competent. Is this because of more innate talent? No. It is because the amount of willpower, inner strength, self-confidence and chutzpah required for these people to succeed is much higher than for the predominant demographic of engineers.

1. Mindfulness in Language and Communication - I find this is particularly true in remote workforces, such as Palomino, where the entire culture is often built around word choice and expression of ideas.  There are obvious cases, such as how often people start email threads with "Gentlemen".  Then, there is simply the propagation of cultures around masculinity, or "brogramming".  This is more delicate.  After all, there are plenty of women, myself included, who enjoy conversations around traditionally masculine pursuits and endeavors.  I lean not towards an exclusion of topics, but an inclusion of, and mindfulness towards, those who might feel left out.  Do you ask women about their favorite sports teams?  Do you keep an eye out for folks who might retreat from certain topics and adjust your conversations accordingly?  Shutting off social conversation is not generally helpful, but as leaders in organizations, it is our responsibility to help guide conversations to be as inclusive and supportive as possible for all staff.  And of course, any traditionally sexist, racist or classist conversations need to be privately nipped in the bud immediately as a matter of course.  Creating space for conversations outside of traditionally masculine ones to occur is also critical.  Ask people who are not from the dominant race/class/gender in your organization about their weekends and pastimes.  Don't assume a woman is interested in knitting, but give her a chance to express what she likes.  She might surprise you and your team with the diverse range of interests that comes up.

2. Examine the Gendered Roles and Behaviors - Go to most tech sites and look at their team pages.  I'm willing to bet that if you are looking at client facing positions that require emotional intelligence and empathy, you will find more women than in the technical fields.  Palomino is no exception.  Our project and account management teams are all female.  Our office manager is male, however.  Ultimately, I don't recommend the policing of the gender of individual roles, but I do believe it's important to examine key expectations and behaviors around staff.  For instance, it is common practice to assume engineers and administrators do not have the emotional/social capacity to interact with users/clients.  So, organizations put account managers or project managers in between, who are often female and thus considered more socially and emotionally adept.  Rarely is it considered a priority to encourage the technical staff to step up, improve their soft skills such as empathy and to interact directly with the client base.  Instead, we build a culture of mothering, which is harmful to all parties involved.

Additionally, do we value the roles that are more empathetic, client facing and emotionally intelligent?  People always discuss how hard it is to retain and find good DBAs, and their salaries, power and "catering to" reflect this in the organization.  While a good PM may not be as hard to find, they are still just as valuable to an organization.  Do you take these roles for granted, or do you also make them feel as important, valued and encouraged as your more technical staff?  Do you let mediation fall to these same people, or do you encourage all staff to develop their skills in negotiation and conflict resolution?  

3. The Devil is in the Details - At the recent Percona Live conference, T-shirts were given to all attendees.  When asked if there were women's sizes, the organizers stated they were unisex.  Unisex is not actually unisex.  It is men's, and not designed for women's bodies.  These details, while not large individually, add up to a feeling of being an add-on, just as much as a lack of kosher meals, or wheelchair ramps far from the main entrance, can make one feel like an afterthought.  Take the extra step to define and socialize your diversity policy and your code of conduct.  O'Reilly has a great code of conduct.  Note that defining the code of conduct or the diversity policy is not enough.  You need to talk to people about these things and engage them.  When you are discussing policies around employees, or evaluating a new client, think about how this fits into your policies.  When you are planning a company offsite, organizing a conference or writing a blog post, think about these policies.  Who will be involved or affected by your choices?  What can you do to make them feel more included?  Take the time to really think about this.

4. Recruiting - This is a challenging one, and something I've had to consider for quite some time.  At Palomino, I'd say perhaps 1 out of 20 applicants are women via our natural model of letting people come to us through word of mouth.  That is obviously a horrible ratio.  Too often, people just say "well, if women don't interview how can we hire them?".  That's a cop out.  Most hiring managers know that you don't get A players from a passive recruiting strategy.  This is just as true for getting women to interview for technical positions.  You need to spend time going to events such as the ADA Initiative Unconference and the Women Powering Technology Summit, and to sponsor, speak and get involved.  There are numerous meetups, from Girls Who Code in NYC to Girls in Tech in Las Vegas and Women in Tech in SF.  Additionally, you should be going through LinkedIn to find women and contacting them.  Even if they are not interested, by building a network that includes more and more women, you are improving the odds that you will find the right women for your organization.  Get out there and speak at meetups, start some introductory courses for women coming out of college and continue to build that network.  There is no reason to stay at a 5% rate of interviews, but you have to work at it!

This is part 1 in 2 parts.  I'd like to focus next on some ways in which dialogue around the conversations can go wrong, and how to discuss and respond to conversations around feminism and misogyny in a constructive manner.  I do look forward to feedback and conversations around the topic, and I thank you for your time in reading and considering this.

Benchmarking Postgres on AWS 4,000 PIOPs EBS instances


Disk I/O is frequently the performance bottleneck with relational databases. With AWS recently releasing 4,000 PIOPs EBS volumes, I wanted to do some benchmarking with pgbench and PostgreSQL 9.2. Prior to this release, the maximum available I/O capacity was 2,000 IOPs per volume. EBS IOPs are read and written in 16KB chunks, with their performance limited by both the I/O capacity of the EBS volumes and the network bandwidth between an EC2 instance and the EBS network. My goal isn't to provide a PostgreSQL tuning guide, an EC2 tuning guide, or a database deathmatch complete with graphs; I'll just be displaying what kind of performance is available out-of-the-box without substantive tuning. In other words, this is an exploratory benchmark, not a comparative benchmark. I would have liked to compare the performance of 4,000 PIOPs EBS volumes with 2,000 PIOPs volumes, but I ran out of time, so that will have to wait for a following post.



I conducted my testing in AWS' São Paulo region. One benefit of testing in sa-east-1 is that spot prices for larger instances are (anecdotally) more stable than in us-east. Unfortunately, sa-east-1 doesn't have any cluster compute (CC) instances available. CC instances have twice the bandwidth to the EBS network than non-CC EC2 instances. That additional bandwidth allows you to construct larger software RAID volumes. My cocktail napkin calculations show that it should be possible to reach 50,000 PIOPs on an EBS-backed CC instance without much of a problem.
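As a sanity check on that cocktail napkin math, assume roughly 10 Gbit/s of EBS-facing bandwidth on a cluster compute instance and 16KB per EBS I/O (both assumptions for illustration, not published AWS figures):

```shell
# 10 Gbit/s expressed in bytes/second, divided by the 16KB EBS I/O size
bandwidth_bytes=$(( 10 * 1000 * 1000 * 1000 / 8 ))
echo $(( bandwidth_bytes / 16384 ))   # theoretical IOPS ceiling, about 76,000
```

Allowing for protocol overhead and imperfect striping across volumes, 50,000 PIOPs is a plausible fraction of that theoretical ceiling.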

EC2 instances

I tested with three EC2 instances: an m1.large from which to run pgbench, an m2.2xlarge with four EBS volumes, and an m1.xlarge with one EBS volume. All EBS volumes are 400GB with 4,000 provisioned IOPs. The m1.large instance was an on-demand instance; the other instances  — the pgbench target database servers — were all spot instances with a maximum bid of $0.05. (In one case our first spot instance was terminated, and we had to rebuild it). Some brief testing showed that having an external machine driving the benchmark was critical for the best results.

Operating System

All EC2 instances are running Ubuntu 12.10. A custom sysctl.conf tuned the Sys V shared memory as well as set swappiness to zero and memory overcommit to two.

kernel.shmmax = 13355443200
kernel.shmall = 13355443200
vm.swappiness = 0
vm.overcommit_memory = 2


The following packages were installed via apt-get:

  • htop
  • xfsprogs
  • debian-keyring
  • mdadm
  • postgresql-9.2
  • postgresql-contrib-9.2

In order to install the postgresql packages a pgdb.list file containing

deb squeeze-pgdg main

was placed in /etc/apt/sources.list.d and the following commands were run:

gpg --keyserver --recv-keys ACCC4CF8
gpg --armor --export ACCC4CF8 | apt-key add -
apt-get update

RAID and Filesystems

For the one volume instance, I simply created an XFS file system and mounted it on /mnt/benchmark.

mkdir /mnt/benchmark
mkfs.xfs /dev/svdf 
mount -t xfs /dev/svdf /mnt/benchmark
echo "/dev/svdf    /mnt/benchmark    xfs    defaults    1 2" >> /etc/fstab

For the four volume instance it was only slightly more involved. mkfs.xfs analyzes the underlying disk objects and determines the appropriate values for stride and width. Below are the commands for assembling a four volume mdadm software RAID array that is mounted on boot (assuming you've attached the EBS volumes to your EC2 instance). Running dpkg-reconfigure rebuilds the initrd image.

mkdir /mnt/benchmark
mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/svdf /dev/svdg /dev/svdh /dev/svdi
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
mkfs.xfs /dev/md0
echo "/dev/md0    /mnt/benchmark    xfs    defaults    1 2" >> /etc/fstab
dpkg-reconfigure mdadm


pgbench

pgbench is a utility included in the postgresql-contrib-9.2 package. It approximates the TPC-B benchmark and can be looked at as a database stress test whose output is measured in transactions per second. It involves a significant amount of disk I/O with transactions that run for relatively short amounts of time. vacuumdb was run before each pgbench iteration. For each database server, pgbench was run mimicking 16, 32, 48, 64, 80, and 96 clients. At each of those client values, pgbench iterated ten times in steps of 100 from 100 to 1,000 transactions per client. It's important to realize that pgbench's stress test is not typical of a web application workload; most consumer-facing web applications could achieve much higher rates than those mentioned here. The only pgbench results against AWS/EBS volumes that I'm-aware-of/is-quickly-googleable are from early 2012 and, at their best, achieve rates 50% slower than the lowest rates found here. I drove the benchmark using a very small, very unfancy bash script. An example of the pgbench command line would be:

pgbench -h $DBHOST -j4 -r -Mextended -n -c48 -t600 -U$DBUSER
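The unfancy driver script itself isn't shown in this post; a minimal sketch of what such a loop might look like follows (DBHOST and DBUSER are placeholders, and by default it only echoes the commands rather than running them):

```shell
#!/bin/bash
# Hypothetical reconstruction of the benchmark driver loop.
# DBHOST/DBUSER are placeholders; RUN=echo makes this a dry run by default.
DBHOST=${DBHOST:-10.0.0.1}
DBUSER=${DBUSER:-postgres}
RUN=${RUN-echo}   # set RUN= (empty) to actually execute the commands

run_matrix() {
  for c in 16 32 48 64 80 96; do        # client counts used in the post
    for t in $(seq 100 100 1000); do    # 100..1,000 transactions per client
      $RUN vacuumdb -h "$DBHOST" -U "$DBUSER" --all
      $RUN pgbench -h "$DBHOST" -j4 -r -Mextended -n -c"$c" -t"$t" -U"$DBUSER"
    done
  done
}

run_matrix
```

Invoking it with RUN set to empty (`RUN= ./driver.sh`) would actually run vacuumdb and pgbench against the target database server.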

m1.xlarge with single 4,000 PIOPs volume

The maximum transaction volume for this instance came when running below 48 concurrent clients and under 500 transactions per client. While transaction throughput never dropped precipitously at any point, loads outside that range exhibited varying performance. Even at its worst, though, this instance handled between 600-700 transactions/second.

m2.2xlarge with four 4,000 PIOPs volumes

I was impressed; at no point did the benchmark stress this instance — the tps rate was between 1700-1900 in most situations with peaks up to 2200 transactions per second. If I was asked to blindly size a "big" PostgreSQL database server running on AWS this is probably where I would start. It's not so large that you have operational issues like worrying about MTBFs for ten volume RAID arrays or trying to snapshot 4TB of disk space, but it is large enough to absorb a substantial amount of traffic.

Graphs and Tabular Data

single-4K-volume tps

The spread of transactions/second irrespective of number of clients.

Box plot of transactions per second. Single 4K volume

Data grouped by number of concurrent clients, with each bar representing a step of 100 transactions per client, ranging from 100 to 1,000.

Bar graph of transactions per second grouped by concurrent clients. Single 4K volume

Progression of tps by individual level of concurrency. The x-axis tick marks measure single pgbench runs from 100 transactions per client to 1,000 transactions per client.

Six subgraphs of transactions per second by each level of concurrency. Single 4K volume

Raw tabular data


four-4,000-PIOPs-volumes tps

Again, a box plot of the data with a y-axis of transactions/second.

Box plot of transactions per second. Four 4,000 PIOPs volumes

Grouped by number of concurrent clients between 100 and 1,000 transactions per client.

Bar graph of transactions per second grouped by concurrent clients. Four 4,000 PIOPs volumes

TPS by number of concurrent clients. The x-axis ticks mark pgbench runs progressing from 100 transactions per client to 1,000 transactions per client.

Six subgraphs of transactions per second by each level of concurrency. Four 4,000 PIOPs volumes

Tabular data m2.2xlarge with four 4,000 PIOPs EBS volumes


PalominoDB at an industry event near you!

Find the Palomino Team at an event near you in 2013!

New York’s Effective MySQL Meetup Group, March 12, 2013

As New York’s only active MySQL meetup group, NY Effective MySQL states its purpose is to share practical education for MySQL DBAs, Developers and Architects.  At their next meeting on March 12 at 6:30 PM, Laine Campbell, CEO & Principal of PalominoDB, will be the evening’s presenter and her topic will be "RDS Pitfalls. Ways it's going to screw you. (And not in the nice way)".  Speaking from her own experience, Laine will explain the Amazon RDS offering, its patterns and anti-patterns, and its gotchas and idiosyncrasies.  To learn more about the NY group and Laine’s presentation, please click here.

NYC* Tech Day - Wednesday, March 20, 2013

Join NYC* Tech Day and take a deep dive into Apache Cassandra™, the massively scalable NoSQL database! This two-track event will feature over 14 interactive sessions, delivered by Apache Cassandra experts. Come see our CTO Jay Edwards at the Meet the Experts area. Or just drop by our table and talk to Jay and PDB staff. For more info click here!

Percona Live MySQL Conference in Santa Clara April 22-25, 2013

PalominoDB will once again host a booth at this year’s Percona Live event.  With 110 sessions and over 90 speakers, Percona Live promises to be a fantastic event that you won't want to miss! This year several of Palomino’s own will be presenters. Read on....

In order of their appearance:

On the first day of the conference, April 22nd, Rene Cannao will kick off a full day tutorial beginning at 9:30 AM with part 1 of Ramp-Tutorial for MySQL Cluster - Scaling with Continuous Availability.  After a lunch break, Rene will continue with Part 2 beginning at 1:30.  Rene, a Senior Operational DBA at Palomino, will guide attendees through a hands-on experience in the installation, configuration management and tuning of MySQL Cluster.  To see the agenda of topics being offered during this exceptional offering, please click here.

Also on April 22nd from 9:30 AM - 12:30 PM, Jay Edwards and Ben Black will be presenting an in-depth tutorial: MySQL Patterns in Amazon - Make the Cloud Work For You.  Jay and Ben will show you how to build your MySQL environment in the cloud, how to maintain it, how to grow it, and how to deal with failure.  You may want to get there early to be sure you get a seat!  Want more info on this hot topic?  Check out more on this topic.

Meet our European Team lead, Vladimir Fedorkov, on April 23rd at 2:20 PM, when his topic will be MySQL Query Anti-Patterns That Can Be Moved to Sphinx.  Vlad will be discussing how to handle query bottlenecks that can result from increases in dataset and traffic.  Click here to find out more.

Also on the 23rd, at 4:50 PM, Ben Black will be back to speak on MySQL Administration in Amazon RDS.  This should be a great session for attendees new to this tool, as Ben will cover common tasks in RDS and gotchas for DBAs who are new to RDS.  Check out more on this topic.

On April 24th at 1PM Mark Filipi will present Maximizing SQL Reviews and Tuning with pt-query-digest.  pt-query-digest is one of the more valuable components of the Percona Toolkit, and as one of our operational DBAs, Mark will be approaching his topic with an eye to real world experiences.  Read more about it by following this link.

Also on the 24th at 1:50 PM, Ben Black and David Turner will tag-team the topic Online Schema Changes for Maximizing Uptime.  Together they will  cover common operations implemented in production and how you can minimize downtime and customer impact.  Here’s a link for more info on this.

PGCon May 2013

One of our Operational Database Administrators, Emanuel Calvo, will be a presenter at the PGCon 2013 PostgreSQL Conference on May 23, 2013 at 1 PM.  His topic will be Sphinx and Postgres - Full Text Search extension.  Emanuel will be discussing how to integrate these two tools for optimum performance and reliability. Want to learn more about Emanuel and the conference? Click here.

Velocity June 2013 Web Performance and Operation Conference

Velocity 2013 will be held in Santa Clara from June 18 through the 20th. Organizers tout this event  as “the best place on the planet for web ops and performance professionals like you to learn from your peers, exchange ideas with experts, and share best practices and lessons learned.” Palomino’s Laine Campbell and Jay Edwards will both be presenters on the conference’s opening day, June 18th.

Laine Campbell, PalominoDB’s CEO, will be offering Using Amazon Web Services for MySQL at Scale on Tuesday, June 18 at 1:30 PM.  The session promises “a deep dive into the AWS offerings for running MySQL at scale.”  Laine will guide attendees through various options for running MySQL at high volumes at Amazon Web Services. More on Laine’s presentation is available through this link.

Later that same day at 3:30 PM, Jay Edwards, Palomino’s CTO, will present Managing PostgreSQL with Ansible in EC2.  He will discuss how and why Ansible is a next-generation configuration management system that also includes deployment and ad hoc task support.  Find out more on Jay’s topic and presentation here.

Palomino DB is Hiring!

PalominoDB is looking for good people for the following positions: mid-level and senior MySQL DBAs in the US, Asia/Pacific, and Europe; DevOps engineers in the US, Asia/Pacific and Europe; and BI architects and engineers.  Technologies are MySQL, Postgres, Cassandra, Couchbase, HBase/Hadoop, Amazon Web Services, Chef, Puppet, Ansible and most monitoring/trending solutions. We support open-source software, have a non-profit program and work virtually so you can work from anywhere.  If you know someone who might be interested, please do pass this along to them. Contact us today!

Couchbase Smart Client Failure Scenarios


The Couchbase Java smart client communicates directly with the cluster to maintain awareness of data location. To do so, it gathers information about the cluster architecture from a manually maintained configuration listing all the nodes. The smart client configuration is done within the Java code and does not have a pre-designated file, while the Moxi configuration is generally installed at /opt/moxi/etc/moxi-cluster.cfg.

Assuming the smart client is on a separate server from the affected node, there are two situations where communication between the client and a specific node might be interrupted.

In the first scenario, a node may fail. If so, the rest of the cluster will detect that from standard heartbeat checks, which are built in to Couchbase, and map its data to the replica nodes. The smart client is informed of the remappings and should be able to find all identified data again. There are known bugs with some client versions (e.g. 1.0.3) -- if you experience timeouts with the client, be sure you’re using the latest build. We also recommend that you use autofailover and that you test your email alerts. You must manually rebalance after recovery; this does not happen on its own.

In the second and more common scenario, a network or DNS outage has occurred. If a node is unreachable by one or more clients, yet the rest of the cluster can still talk to that node, there is no built-in mechanism for the cluster to remap data from that node to other nodes.

Additionally, there is no built-in mechanism for the smart client to reroute traffic so you will experience timeouts in this situation.  When the network issue resolves the client should stop presenting errors.

Consider scripting a heartbeat check to run on your app servers that uses the Couchbase CLI to trigger your failover procedures.
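A hedged sketch of such a heartbeat check follows. The node names, credentials, and couchbase-cli path are all placeholders, and the REST-port probe stands in for whatever health check fits your environment:

```shell
#!/bin/sh
# Probe a Couchbase node's REST port from this app server; if it is
# unreachable, ask the cluster (via another node) to fail it over.
CLUSTER=${CLUSTER:-cb-master:8091}     # a reachable node to send commands to
ADMIN=${ADMIN:-Administrator}          # placeholder credentials
PASS=${PASS:-password}
CLI=/opt/couchbase/bin/couchbase-cli   # adjust to your install location

check_node() {
    node=$1
    if curl -s -o /dev/null --max-time 5 "http://$node/pools"; then
        echo "OK: $node reachable"
    else
        echo "CRITICAL: $node unreachable, requesting failover"
        if [ -x "$CLI" ]; then
            "$CLI" failover -c "$CLUSTER" -u "$ADMIN" -p "$PASS" \
                --server-failover="$node"
        fi
        # remember: rebalance manually once the node has recovered
    fi
}

check_node "${NODE:-cb-node2:8091}"
```

Run it from cron on each app server, pointing NODE at the node that server talks to. Since the client cannot reroute traffic on its own, this only shortens the window of timeouts; a manual rebalance is still required after recovery.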

Nagios Check Calculated on MySQL Server Variables


Recently we needed to make a change to one of our MySQL monitoring tools for a client, so I thought it would be a good opportunity to highlight the tool and discuss some of the changes I made.

You can access the tool on Github from our public repository here.

Before this change, if you were using either the "varcomp" or "lastrun-varcomp" modes, the check would only return a WARNING if your criteria for comparison were exceeded. In the new version, both WARNING and CRITICAL states can be returned to Nagios.

Here is an example: let's say you want to alert on maximum connections, but the number of maximum connections differs between hosts. Instead of writing distinct per-host checks, you can use this check to do a simple calculation and alert on the result of that calculation. In this case you want to send a warning when your connections exceed 75% of max_connections and a critical alert when they reach 80%. The entry in your Nagios config file might look something like this:

-H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ --mode varcomp --expression "Threads_connected / max_connections * 100" --comparison_warning="> '$ARG3$'" --comparison_critical="> '$ARG4$'" --shortname percent_max_connections

The expression flag uses the names returned from any of "FULL PROCESSLIST", "ENGINE INNODB STATUS", "GLOBAL VARIABLES", "GLOBAL STATUS", or "SLAVE STATUS". The comparison_warning and comparison_critical flags are evaluated in Perl, so ensure each is a valid Perl expression (in this case you could use either > or gt). You'll definitely want to test your commands with a few different use cases to ensure you have good syntax.

When a varcomp or lastrun-varcomp check is run, the results are kept in a local cache file against which you can make comparisons. So, to give a contrived example, if you want to ensure that the number of open table definitions didn't increase too much between samples, you could do something like this:

-H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ --mode lastrun-varcomp --expression "current{Open_table_definitions} - lastrun{Open_table_definitions}" --comparison_warning="> 10" --comparison_critical="> 20" --shortname increase_in_open_table_defs

The current{} and lastrun{} notation refers to which sample time you want. You can see more details on all the features of the script in the README on the plugin page. We welcome any comments and suggestions.

Put Opsview Hosts Into Downtime via the Shell

Recently a client of ours who uses Opsview to manage their resources needed to place some of their hosts into downtime in conjunction with some other cron-scheduled tasks. To implement that functionality, I created this simple script, which should work with most installations of Opsview or, with a few modifications, can be adapted to other, similar REST interfaces. To use it, modify the five variables at the top of the script as necessary. The URL and username are what come with the default installation of Opsview. Modify CURL if it's in a different place on your system. Then invoke it, for example:

-p Pa5sw0rd -h host_name_in_opsview -c create -t 2

where host_name_in_opsview is the hostname as defined in Opsview, not necessarily the same as its actual hostname.
#!/bin/bash
# create or delete downtime for a single host using opsview curl rest api
# modify these five variables to match your installation
OPSVIEW_HOSTNAME=opsview.example.com
USERNAME=apiuser
URL=/rest/downtime
CURL=/usr/bin/curl
hours_of_downtime=2

usage()
{
    echo "Usage: $0 -p <opsview apiuser password> -h <host> -c (create|delete) [-t <hours_of_downtime>]"
    exit 1
}

while getopts p:h:t:c: opt
do
    case $opt in 
      p) password=$OPTARG;;
      h) host=$OPTARG;;
      t) hours_of_downtime=$OPTARG;;
      c) command=$OPTARG;;
      \?) usage;;
    esac
done

if [ "x$password" = "x" ] || [ "x$host" = "x" ] || [ "x$command" = "x" ]
then
    usage
fi

# log in and extract the 40-character session token
token_response=`$CURL -s -H 'Content-Type: application/json' https://$OPSVIEW_HOSTNAME/rest/login -d "{\"username\":\"$USERNAME\",\"password\":\"$password\"}"`
token=`echo $token_response | cut -d: -f 2 | tr -d '"{}'`
if [ ${#token} -ne 40 ]
then
    echo "$0: Invalid apiuser login. Unable to $command downtime."
    exit 1
fi

if [ "$command" = "create" ]
then
    # create downtime - POST
    starttime=`date +"%Y/%m/%d %H:%M:%S"`
    endtime=`date +"%Y/%m/%d %H:%M:%S" -d "$hours_of_downtime hours"`
    comment="$0 api call"
    data="{\"starttime\":\"$starttime\",\"endtime\":\"$endtime\",\"comment\":\"$comment\"}"
    result=`$CURL -s -H "Content-Type: application/json" -H "X-Opsview-Username: $USERNAME" -H "X-Opsview-Token: $token" "https://$OPSVIEW_HOSTNAME$URL?host=$host" -d "$data"`
else
    # delete downtime - DELETE
    result=`$CURL -s -H "Content-Type: application/json" -H "X-Opsview-Username: $USERNAME" -H "X-Opsview-Token: $token" -X DELETE "https://$OPSVIEW_HOSTNAME$URL?host=$host"`
fi

echo "$result" | grep $host > /dev/null
host_in_output=$?
if [ "$host_in_output" -ne "0" ]
then
  echo "Unable to $command downtime for $host.  Result of call:"
  echo $result
  exit 1
fi

Benchmarking NDB vs Galera

Inspired by the benchmark in this post, we decided to run some NDB vs Galera benchmarks for ourselves.

We confirmed that NDB does not perform well using m1.large instances. In fact, it’s totally unacceptable -  no setup should ever have a minimum latency of 220ms - so m1.large instances are not an option. Apparently the instances get CPU bound, but CPU utilization never goes above ~50%. Maybe top/vmstat can’t be trusted in this virtualized environment?

So, why not use m1.xlarge instances? This sounds like a better plan!

As in the original post, our dataset is 15 tables of 2M rows each, created with:

./sysbench --test=tests/db/oltp.lua --oltp-tables-count=15 --oltp-table-size=2000000 --mysql-table-engine=ndbcluster --mysql-user=user --mysql-host=host1 prepare

Benchmark against NDB was executed with:

for i in 8 16 32 64 128 256


./sysbench --report-interval=30 --test=tests/db/oltp.lua --oltp-tables-count=15 --oltp-table-size=2000000 --rand-init=on --oltp-read-only=off --rand-type=uniform --max-requests=0 --mysql-user=user --mysql-port=3306  --mysql-host=host1,host2 --mysql-table-engine=ndbcluster --max-time=600 --num-threads=$i run > ndb_2_nodes_$i.txt


After we shut down NDB, we started Galera and recreated the tables, but found that running sysbench was failing. A suggestion from Hingo was to use --oltp-auto-inc=off, which worked.

Our benchmark against Galera was executed with:

for i in 8 16 32 64 128 256


./sysbench --report-interval=30 --test=tests/db/oltp.lua --oltp-tables-count=15 --oltp-table-size=2000000 --rand-init=on --oltp-read-only=off --rand-type=uniform --max-requests=0 --mysql-user=user --mysql-port=3306  --mysql-host=host1,host2 --mysql-table-engine=ndbcluster --max-time=600 --num-threads=$i --oltp-auto-inc=off run > galera_2_nodes_$i.txt


Below are the graphs of average throughput at the end of 10 minutes, and 95% response time.





Galera clearly performs better than NDB with 2 instances!

But things become very interesting when we graph the reports generated every 10 seconds.






Surprising, right? What is that?

Here we see that even if the workload fits completely in the buffer pool, the high number of TPS causes aggressive flushing.

We assume the benchmark in the Galera blog post was CPU bound, while in our benchmark the behavior is I/O bound.

We then added 2 more nodes (m1.xlarge instances), kept the dataset at 15 tables x 2M rows, and re-ran the benchmark with NDB and Galera. Performance on Galera gets stuck, due to I/O. In fact, with Galera we found that performance on 4 nodes was worse than with 2 nodes; we assume this is because the whole cluster goes at the speed of the slowest node.

Performance on NDB keeps growing as new nodes are added, so we added another 2 nodes for just NDB (6 nodes total).





The graphs show that NDB scales better than Galera, which is not what we expected to find.

It is perhaps unfair to say that NDB scales better than Galera; rather, NDB's checkpointing causes less stress on I/O than InnoDB's, so the bottleneck is in InnoDB and not in Galera itself. To be more precise, the bottleneck is slow I/O.

The following graph shows the performance with 512 threads and 4 nodes (NDB and Galera) or 6 nodes (NDB only). Data was collected every 30 seconds.

"When the Nerds Go Marching In"

Palomino was honored to serve as part of the team of technologists on President Obama's re-election campaign. Atlantic Magazine ran a fascinating piece about Narwhal, the sophisticated data architecture that enabled the campaign to track voters, volunteers and online trends.

Palomino CEO Laine Campbell joined the team in Chicago for the final days of the campaign, ensuring maximum uptime and performance on the MySQL databases. Afterwards, President Obama thanked her for Palomino's contributions.
