Blog

Turning off the Query Cache works in Percona Server 5.1, before MySQL Server 5.5

In a previous blog post I wrote that it is not possible to completely turn off the Query Cache in MySQL before 5.5.

It is possible, however, to completely turn off the Query Cache with Percona Server 5.1.

MySQL bug #38551 reports that the change is incompatible with MySQL 5.1. In this case the incompatibility relates only to how the server behaves: if the Query Cache is disabled at startup, it cannot be enabled at runtime.

Thanks to Kenny Gryp from Percona for pointing this out and for the clarifications provided.

You can't turn Query Cache Off before MySQL 5.5

 

... or at least, not completely!

It is common to see (and hear of) the replication SQL thread almost always in the state "invalidating query cache entries (table)".

The classic tip is "set global query_cache_size=0".

While this sometimes works, at other times it does not: the query cache is "disabled", yet you still see the SQL thread in the state "invalidating query cache entries (table)".
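For reference, here is the classic runtime "fix" in full, plus a quick way to watch the slave SQL thread state afterwards (a minimal sketch; run it on the slave and look for the state string in the output):

SET GLOBAL query_cache_size = 0;   -- throws away the cache and frees its memory
SET GLOBAL query_cache_type = OFF; -- stops caching new results
-- Now watch the replication SQL thread; on pre-5.5 builds you may still
-- see it in state "invalidating query cache entries (table)":
SHOW PROCESSLIST;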

That is related to a known bug, fixed in MySQL 5.5:

http://bugs.mysql.com/bug.php?id=38551

In short, in all MySQL versions before 5.5, the query cache mutex is acquired even if query_cache_size=0 and query_cache_type=OFF: always!

That is, even if the query cache is not enabled, the (inherently slow) mutex protecting the (nonexistent) query cache is acquired for every binlog event.

The only way to avoid acquiring the query cache mutex in MySQL pre-5.5 is to compile MySQL without the query cache. There is no other workaround!
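For completeness, this is roughly what that build looks like with the 5.1 autotools source distribution (a sketch; I believe the switch is --without-query-cache, but verify it against ./configure --help for your version):

# Compile the query cache out entirely (switch name as I recall it from
# the 5.1 configure help; check your version before relying on it)
./configure --without-query-cache
make && make install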

For MySQL 5.5, to completely disable the query cache (and thus avoid acquiring the query cache mutex), you must set query_cache_type=OFF at startup (not at runtime).

That also means that if query_cache_type=OFF at startup, you need to restart MySQL if you later want to enable the query cache (you can't enable it at runtime).
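On 5.5, that startup setting is just a couple of lines in my.cnf (a minimal sketch; the option names are the documented ones, the file location varies by platform):

[mysqld]
# Disable the query cache at startup; on 5.5 the query cache mutex is
# then never acquired, and the cache cannot be re-enabled at runtime
query_cache_type = OFF
query_cache_size = 0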

The query cache mutex becomes a very hot spot, especially if you are using RBR. In fact, with RBR there is a binlog event for each modified row, so the mutex is acquired for every row.

 

References:

http://bugs.mysql.com/bug.php?id=38551

http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_query_cache_type

http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_query_cache_type

PalominoDB's Proactive Support

I'm rather passionate about what we do here at PalominoDB.  When I tell people what makes us different from the competition, I often discuss the proactive work we do.  We are not a reactive company.  We know that proactive reviews are what keep a database up and running smoothly, and we definitely want to prevent those late-night pages that everybody loves.  So what do we do to make this happen?

We start with daily health checks.  These are aided by scripts, but include the primary DBA reviewing the last four days of core Cacti graphs (or those of a similar trending tool, as we require clients to have one) for anomalous behavior.  "Core" means the key metrics that reveal workload shifts: CPU, I/O wait, load average, swapping, SQL query type counters, and InnoDB row activity.  We verify that backup logs are error free, and we make sure nothing unusual has shown up in the MySQL error logs.  Finally, we review the Nagios alerts of the last day, making sure criticals are followed up on, that nothing has been acknowledged and forgotten, and that alerts are enabled.

Once a week, the primary DBA reviews all Cacti graphs, not just the core ones.  They review the dailies, ensure that all items that have come up are being acted on, and verify that tickets are not stalling.

Once a month, we do SQL reviews of systems, unless we have been asked to do them more frequently based on release schedules.  Using mk-query-digest with general logs (with microsecond patches) or tcpdump captures, we put together a list of the most frequently executed and the slowest queries, and provide our clients with a list of recommended changes: indexes, datatypes, data model changes, caching suggestions and query rewrites.  Once this is provided, we follow up regularly to ensure that crucial changes are implemented in a timely manner.

Then we have activities that we do quarterly.  Some clients prefer these be done more frequently and that is fine by us!  These include:

  • Recovery tests, if regular refreshes of other environments don't already test this.
  • Ensuring tools are at up-to-date versions.
  • Ensuring runbooks and documentation are up to date.
  • Capacity reviews of existing workloads and hardware.
  • Security audits, to ensure that changes since the last audit do not violate security policy.
  • Monitoring and trending audits, to ensure that appropriate checks are in place and that all needed graphs are being collected.

This is where our proactive service starts.  The end is unlimited, depending only on our clients' needs.  Any opportunity to solve an issue before it is noticed is an opportunity that we relish and that our clients appreciate.  Of course, the real test of our mettle is the number of pages waking us up in the middle of the night.  If we have to wear our asbestos suits to work, then we are not doing our job!

Replication and EC2 gotcha

 

Tonight a client restarted some instances on EC2 and they got different IPs.

This broke replication, because the slaves were trying to connect to the masters using IPs that no longer existed.

The fix is easy: just CHANGE MASTER TO using the new IP (or the hostname).

 

There is a small gotcha, though, which you probably already know but which is easy to forget.

 

From http://dev.mysql.com/doc/refman/5.1/en/change-master-to.html:

If you specify the MASTER_HOST or MASTER_PORT option, the slave assumes that the master server is different from before (even if the option value is the same as its current value.) In this case, the old values for the master binary log file name and position are considered no longer applicable, so if you do not specify MASTER_LOG_FILE and MASTER_LOG_POS in the statement, MASTER_LOG_FILE='' and MASTER_LOG_POS=4 are silently appended to it.

 

That is: CHANGE MASTER TO MASTER_HOST='new_ip_address' is NOT enough. In fact, it will break replication, because it will restart replication from the first binlog.

You need to specify all the parameters required to set up replication, such as master_user, master_password, master_log_file and master_log_pos.

 

To get the replication position (where replication stopped), you can either check the output of SHOW SLAVE STATUS, looking for Master_Log_File and Read_Master_Log_Pos, or check the error log.
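Putting it all together, here is a sketch of the full statement (the host, credentials and coordinates below are placeholders; substitute the values you read from SHOW SLAVE STATUS or the error log):

STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='new_ip_address',        -- the master's new IP (or hostname)
  MASTER_USER='repl_user',             -- placeholder replication account
  MASTER_PASSWORD='repl_password',     -- placeholder
  MASTER_LOG_FILE='mysql-bin.000123',  -- Master_Log_File from SHOW SLAVE STATUS
  MASTER_LOG_POS=98765;                -- Read_Master_Log_Pos from SHOW SLAVE STATUS
START SLAVE;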

 

 

A side note for the curious: there was no downtime, as the master/slave pair is really a master/master pair. The "slave" failing to connect to its "master" was the write master trying to connect to the standby master.

Liveblogging: A dozen databases in 45 minutes

 

Actually, the slide says "12 databases in 25 minutes (and 20 minutes of theory)", by Eric Redmond (@inviite).

Complex data:  A lot of data isn't really complex; it's just modeled in a complex way.

"Complexity is a symptom of confusion, not a cause." - Jeff Hawkins

 

NoSQL

  Linear Scalability

  Ability to be Distributed

  Low Latency

 

SQL 

  Not NoSQL

 

ACID (transaction-based)

  Atomic - Transactions are "all or nothing"

  Consistent - system data will have integrity

  Isolated - Transactions can't see each other

  Durable - changes aren't lost

 

BASE (request based)

  Basically Available

  Soft state

  Eventual consistency

Redmond says: ACID is negative (avoid negative things); BASE is mostly positive: things will be good, not great.

CAP theorem - it's a fact :D

Consistent, Available, Partition-tolerant web services.  "It is impossible to reliably provide atomic, consistent data when there are partitions in the network.  It is feasible, however, to achieve any two of the three properties."

Note that "consistent" here is not the same as "consistent" in ACID; it's more like atomicity.

Strong consistency: when an update completes, subsequent access returns the new result.  [to my mind this is actually durability]

Weak consistency - eventual consistency

"Correct consistency" is the most important part.  DNS, for example, is eventually consistent.

Common patterns:

Replication - copying data amongst nodes in a distributed database.  Lazy (optimistic) replication; gossip (nodes communicate to stay in sync).  CouchDB has an amazing ability to do this; Mongo is also good, but not as good.

- master/slave (Mongo)

- master/master (Riak, Couch)

  - vector clocks (keep track of write order per client)

  - MVCC (MySQL)

N/R/W

  N - Nodes to write to (per bucket)

  R - Nodes read from before success

  W - Nodes written to before success

Amazon Dynamo does this (Cassandra and Riak do too) - it supports both CP and AP in one db (in terms of the CAP theorem)

Consistent Hashing

Balance your servers so that when you hash your keys, if a server goes down or is added, you don't have to rebalance ALL the nodes, just some percentage of them.

MapReduce

 

Relational Models:

"Nothing beats relational databases for raw queryability."  The tradeoff -- you have to structure your data and tell the system how it is structured.

PostgreSQL (full featured) - http://bitbucket.org/ged/ruby-pg, http://github.com/Casecommons/pg_search, http://github.com/tenderlove/texticle

MySQL (lighter) - http://github.com/oldmoe/mysqlplus, http://github.com/brianmario/mysql2

Drizzle (lightest) - http://www.drizzle.org

 

Bigtable/Columnar Style

 

What makes it columnar?  Well, a primary key is really a row key, and then you have column families, which are columns stored together.  (Each column's values are stored together, as opposed to the whole row being stored together.)  You can also set an expiry for a column family, after which the data expires (which is why it's great for Google).

HBase - http://hbase.apache.org - Google's BigTable implementation, which was born of Hadoop (the Java MapReduce engine).  If you want to use HBase in production, use Thrift (http://thrift.apache.org), which Cassandra also uses.  This is CP, but configurable to AP.  Does sequential reads and column versioning, with a strong but flexible columnar schema.

Cassandra - a hybrid.  Node architecture like Dynamo, data structure like BigTable with column families - http://cassandra.apache.org - Good for hundreds of nodes in the same data center; if there are more than that, or different data centers, use HBase (that's what Digg and Facebook are running into).  In Cassandra you set up your schemas with an XML file, not with DDL.  Benefits: sequential reads of ordered keys, and it also has versioning.  It's AP, configurable to CP.

 

Document Datastores:

MongoDB (AP focused - master/slave)

http://www.mongodb.org - created to be huge (huMONGous).  Made to be partitioned and distributed, with a need for ad hoc queries.  Wasn't built to be durable.

 

CouchDB

Not originally made to be distributed; it was meant to be very durable.  AP focused (master/master)

http://couchdb.apache.org

http://tilgovi.github.com/couchdb-lounge (clustering)

MapReduce in Mongo is an ad hoc query, comfortable for relational db ppl.  In CouchDB, you make views and then request data from those views.  

Riak - the most "architecturally cool" database out there.  It's a Dynamo implementation that is purely REST based.  Calling it a key-value store isn't descriptive enough: it has map-reduce built in, plus metadata and links you can walk.  You can store ANYTHING in Riak, not just text.  Example: fetching a JPG file from the web and storing it as the value for the key "firefox.jpg".  Neat demo.

Riak has a ring, eventual consistency, and can pull nodes in and take nodes out without having to invalidate all the IDs.  It has quorum consistency, which blows Eric's mind, but we didn't have time.

 

Key/value stores

memcached - don't use it

Kyoto Cabinet - don't use it

Redis - use it - http://redis.io - it can handle lists and hashes, and can intersect the values of 2 keys (such as person and pet, to find out who owns which pet).

 

Graph datastores - you walk the graph instead of querying or doing mapreduce.

Neo4j

FlockDB - distributed; "unless you're Twitter, you don't need to use it".  It's not really possible to distribute a graph database: you can't walk it and do node traversals, you can just walk edges (you can do friends, but not friends of friends, etc.).

Slides are available at https://github.com/coderoshi/holy-grail-dbs

 

 

Open Source Bridge, day 1 keynote

Hacker Lounge - open 22 hours a day (closed between 6 and 8 for cleaning). There is an Android app, and the session notes link on each page takes you to the wiki. #osb11 is the official hashtag. Friday is an unconference (I won't be here on Friday). What is open source citizenship? Nominate people for open source citizenship awards by the registration desk.

Now the keynote, "Hacking for Freedom: Join the Doocracy" by Peter Fein. Hacker, hacktivist: names are important, names have power. Peter uses the word "hack" to mean doing something clever with technology. "If we cannot see, we cannot act." Do-ocracy: you. just. do.

Telecomix (http://www.telecomix.org/) built mirrors to blocked sites and helped folks tweet. They are the yin to Anonymous' yang: they build and facilitate communication rather than destroy and take down sites. They also go lower tech when they need to, setting up fax and reverse fax lines, for example.

The last 6-8 months have shown how dramatic the Internet can be for freedom fighting. Egypt pulled fiber optic cable to censor the 'net; in the Western world we have censorship too, but it's more subtle. It's not just WikiLeaks: there is a PROTECT-IP act going through the US Congress that would allow Homeland Security to seize domain names under the guise of copyright issues. "If we can't communicate, we can't organize. If we can't organize, we can't resist. If we can't resist, we become subjects [not citizens]." We are seeing a sort of info war of nation-states vs. the Internet. "The Internet didn't cause the protests in the Middle East [North Africa, really], but it did facilitate them." For me, though, Napster and Limewire did not cause piracy, they just facilitated piracy, and both companies no longer exist.....

Peter introduces the concept of a "disorganization" and different tactics to organize volunteers. Radical openness: Anonymous' chat rooms can be found by searching Google. Adhocracy: put a team together to do a certain task. At the end of the day, what matters is what you get done, and that is the core principle of the do-ocracy. He points to Critical Mass as a disorganization: there's no hierarchy, no centralized leadership, people just go out and do. He points to Burning Man as another one, and Open Source Bridge as yet another. People show up, they set up stuff, they ask what they can do to help.

Then he plays a clip from Lawrence Lessig speaking at OSCON 2002 about copyright: "If you don't do something now... this freedom will be taken away..... If you can't fight for your freedom, you don't deserve it."

Overall, the keynote was good, but for me it was not particularly informational, inspirational, or entertaining. I enjoyed it, and it reminded me of a lot of stuff that has been happening, but nothing I'd forgotten. Maybe that's because I already subscribe to the do-ocracy way? I can see how it might spark a non-doer to go out and do, but that was not the effect it had on me. An interesting start to a conference that has less in the way of MySQL content and more of the "soft" topics, which really are important to me.

PalominoDB Nagios Plugin for MySQL

This post has sample configurations, and an update, to the Nagios plugin called mysql_health_check.pl that PalominoDB has developed.  Presentation slides, the presentation video, the whitepaper and the download link to the actual Nagios check are always up-to-date at http://palominodb.com/about-us/projects.

There has been exciting news for the Nagios plugin PalominoDB developed for MySQL -- we now have --critical and --warning options for all modes (for --varcomp mode we previously had only a critical option).

 

Here are some sample configurations (which are also seen in the slides):

To check rate of temporary tables:

define command {
 command_name    check_mysql_tmp_tables
 command_line    $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass 
--cache-dir=/var/lib/nagios/mysql_cache
--max-cache-age=300 --mode=varcomp 
--expression="Created_tmp_tables/Uptime_since_flush_status" 
--warning=">(8/60)" --critical=">(10/60)"
}
define service {
      use                      generic-service
      host_name                __HOSTNAME__
      service_description      MySQL Temp Tables
      check_command            check_mysql_tmp_tables
}
To check long-running queries (and an example of using parameters, setting two different values: one for master servers and one for slave servers):
define command {
 command_name    check_mysql_locked_queries
 command_line    $USER1$/mysql_health_check.pl
--hostname $HOSTADDRESS$ --user myuser --password mypass 
--cache-dir=/var/lib/nagios/mysql_cache
--max-cache-age=300 
--mode=locked-query  
--warning=$ARG1$ --critical=$ARG2$
 }
define service {
      use                      generic-service
      host_name                __HOSTNAME__
      service_description      MySQL Master Locked Queries
      check_command            check_mysql_locked_queries!60!120
}

 


define service {
      use                      generic-service
      host_name                __HOSTNAME__
      service_description      MySQL Slave Locked Queries
      check_command            check_mysql_locked_queries!300!600
}
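Incidentally, before wiring a new check into Nagios it can help to run the plugin by hand and sanity-check the thresholds. A sketch reusing only the options shown above (the hostname and credentials are placeholders):

./mysql_health_check.pl --hostname db1.example.com --user myuser --password mypass \
    --cache-dir=/var/lib/nagios/mysql_cache --max-cache-age=300 \
    --mode=varcomp --expression="Created_tmp_tables/Uptime_since_flush_status" \
    --warning=">(8/60)" --critical=">(10/60)"

The exit code then follows the usual Nagios plugin convention: 0 for OK, 1 for WARNING, 2 for CRITICAL.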

If there are questions, comments or feedback, please let us know!

Why use PalominoDB?

 [edited June 8th to correct typo; the figure given is per YEAR, not per month]

Why PalominoDB:

Efficiency in Time and Cost - 

The average salary for a full-time DBA is $80-120K per year, not counting payroll and benefit costs.  You can retain PalominoDB for your senior needs at only $37,200 a year, based on a 20-hour retainer.  This leaves the balance for you to invest in a staff of flexible, competent systems administrators who can handle day-to-day tasks.  A retainer with PalominoDB includes access to our senior and principal personnel, engineers, and systems administrators.  Whatever your need, PalominoDB can fulfill it.  Because you have access to an entire team, you are never left short-handed on holidays or during flu season... and we will never leave you for another company.
 
Team integration -

PalominoDB isn’t just a remote database services company.  From day one, we jump in and work as hard as we can to become part of your team.  We don’t sit in the wings, waiting for you to give us work or for your systems to page us.  We plug into your team’s IRC room and get into your monitoring systems.  We provide constant communication through the use of instant messaging and email.  We attend your team meetings or we set up meetings with you - and we ensure that they are efficient meetings.  Our company integrates into your systems rather than forcing you to integrate with ours.  We are confident you will come to see us as a part of your full-time workforce rather than a group of consultants.

Proactive service -

We review systems daily.  We look for anomalies in workloads.  We check error logs, and we ensure successful backups. We research the benefits of upgraded binaries, new parameters and technologies, and how they can be applied for our clients.  If you aren’t actively generating work for us, we use those hours for proactive work - SQL Reviews, Capacity Reviews and Backup/Recovery tests.  We understand how busy small businesses can be, and we make sure to keep tickets moving.  We do not require babysitting; it is our job to free up your time so you can focus on the rest of your infrastructure.  We can even provide help and recommendations on the rest of your environments.

Extensive Experience -

The professionals at PalominoDB have been managing production systems since before the dot-com era.  We've seen explosive growth, and we've scaled companies through years, not months, of development.  Our experience is not just with particular technologies, but with operational process, change management, incident and problem management, documentation, and configuration management.  We see the big picture and use our breadth of experience to work within it.  And, bottom line, we follow our mantra of three key principles: keep production safe, keep production simple, and document and share everything.

SkySQL Reference Architecture and What's New in MySQL 5.5 videos

If you missed Percona Live yesterday, you missed an awesome 1-day conference.  However, to console you, here are a few videos from Open Database Camp:

SkySQL Reference Architecture - by Kaj Arnö of SkySQL

What's New in MySQL 5.5 - by Giuseppe Maxia of Continuent - slides

Enjoy, and look forward to more information from Percona Live next week!  If you are in Europe, get ready for Percona Live in London in October; exact dates are forthcoming.

Beginner's Guide to MariaDB and mysqltuner 2.0 videos now online!

I am currently on my way to Percona Live in NYC.  For those who cannot be there, I have posted some new videos from Open DB Camp, held in Sardinia, Italy earlier this month.

It will not be the same as being there in person, but there is a lot of information in the video of A Beginner's Guide to MariaDB by Colin Charles of Monty Program.  I have also posted the video for the mysqltuner 2.0 talk that I presented (PDF slides are here).

Enjoy, and if you are attending Percona Live, I will see you there and much MySQL goodness will be had!
