MySQL bug #38551 reports that the change is incompatible with MySQL 5.1. In this case the incompatibility is only in how the server behaves: if the Query Cache is disabled at startup, it cannot be enabled at runtime.
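For the curious, the behavior looks like this from a client session (a sketch; the exact error text can vary between versions):

```sql
-- Server was started with query_cache_type = 0 (disabled at startup)
SET GLOBAL query_cache_type = ON;
-- ERROR (HY000): Query cache is disabled; restart the server with
-- query_cache_type=1 to enable it
```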
Thanks to Kenny Gryp from Percona for pointing this out and for the clarifications provided.
I'm rather passionate about what we do here at PalominoDB. When I tell people about what makes us different from the competition, I often discuss the proactive work we do. We are not a reactive company. We know that proactive reviews are what keep a database up and running smoothly, and we definitely want to prevent those late night pages that everybody loves. So what do we do to make this happen?
We start with daily health checks. These are aided by scripts, but include the Primary DBA reviewing the last 4 days of core cacti graphs (or a similar trending tool, as we require clients to have one) for anomalous behavior. "Core" means the key metrics that reveal workload shifts: CPU, IOWait, Load Average, Swapping, SQL Query Type Counters and InnoDB Row Activity. We verify backup logs are error free, and we make sure nothing unusual has shown up in the MySQL error logs. Finally, we review the Nagios alerts of the last day, making sure criticals are followed up on, that nothing has been acknowledged and forgotten, and that alerts are enabled.
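The error-log part of a daily check can be partly scripted. Here is a minimal Python sketch of the idea (a hypothetical helper, not our actual tooling): pull the [ERROR] lines out of a MySQL error log so nothing unusual slips by.

```python
import re
from pathlib import Path

# MySQL error-log lines are tagged [Note], [Warning] or [ERROR];
# we only want the last category surfaced in a daily check.
ERROR_RE = re.compile(r"\[ERROR\]")

def fresh_errors(log_path):
    """Return the [ERROR] lines found in the given MySQL error log."""
    lines = Path(log_path).read_text().splitlines()
    return [line for line in lines if ERROR_RE.search(line)]
```

In practice you would feed this only the lines written since the last check, and page or ticket on anything it returns.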
Once a week, the primary DBA then reviews all cacti graphs, not just core ones. They review the dailies and ensure all items that have come up are being acted on and they verify tickets are not stalling.
Once a month, we do SQL Reviews of systems, unless we have been asked to do them more frequently based on release schedules. Using mk-query-digest with general logs, slow logs from microsecond-patched servers, or tcpdumps, we put together a list of the most-executed and the slowest queries, and provide our clients with a list of recommendations: index and datatype changes, data model changes, caching suggestions and query rewrites. Once these are provided, we follow up regularly to ensure that crucial changes are implemented in a timely manner.
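The core trick mk-query-digest uses is worth understanding: each query is normalized to a "fingerprint" with its literals stripped, so thousands of similar statements collapse into one line of the report. A toy Python sketch of that idea (not the real tool, which does far more):

```python
import re
from collections import Counter

def fingerprint(query):
    """Normalize a query so that variants differing only in literals match."""
    q = query.lower().strip()
    q = re.sub(r"'[^']*'", "?", q)   # string literals -> ?
    q = re.sub(r"\b\d+\b", "?", q)   # numeric literals -> ?
    q = re.sub(r"\s+", " ", q)       # collapse whitespace
    return q

def top_queries(queries, limit=10):
    """Rank fingerprints by execution count, most frequent first."""
    counts = Counter(fingerprint(q) for q in queries)
    return counts.most_common(limit)
```

mk-query-digest applies the same grouping but also aggregates execution time, lock time and rows examined per fingerprint, which is what makes the slowest-queries list possible.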
Then we have activities that we do quarterly. Some clients prefer these be done more frequently and that is fine by us! These include:
Recovery tests if regular refreshes of other environments don't test this.
Ensuring tools are at up to date versions.
Ensuring runbooks and documentation are up to date.
Capacity reviews of existing workloads and hardware.
Security audits to ensure that changes since the last audit do not violate security policy.
Monitoring and trending audits to ensure that appropriate checks are in place and that all needed graphs are graphing.
This is where our proactive service starts. Where it ends depends only on our clients' needs. Any opportunity to solve an issue before it is noticed is an opportunity that we relish and that our clients appreciate. Of course, the real test of our mettle is the number of pages waking us up in the middle of the night. If we have to wear our asbestos suits to work, then we are not doing our job!
Tonight a client restarted some instances on EC2, and the instances came back with different IPs.
This broke replication, because the slaves were trying to connect to the masters using IPs that no longer existed.
The fix is easy: just CHANGE MASTER TO using the new IP (or the hostname).
There is a small gotcha, though: one you probably already know, but that is easy to forget.
From http://dev.mysql.com/doc/refman/5.1/en/change-master-to.html :
If you specify the MASTER_HOST or MASTER_PORT option, the slave assumes that the master server is different from before (even if the option value is the same as its current value). In this case, the old values for the master binary log file name and position are considered no longer applicable, so if you do not specify MASTER_LOG_FILE and MASTER_LOG_POS in the statement, MASTER_LOG_FILE='' and MASTER_LOG_POS=4 are silently appended to it.
That is: CHANGE MASTER TO MASTER_HOST='new_ip_address' is NOT enough; in fact, it will break replication, because replication will restart from the first binlog.
You need to specify all the parameters required to set up replication: MASTER_USER, MASTER_PASSWORD, MASTER_LOG_FILE and MASTER_LOG_POS.
To get the replication position (where replication stopped), check the output of SHOW SLAVE STATUS, looking for Master_Log_File and Read_Master_Log_Pos (or, to resume from the last executed event, Relay_Master_Log_File and Exec_Master_Log_Pos), or check the error log.
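Putting it together, the full statement on the slave looks something like this (host, credentials and coordinates are placeholder values, not from the incident):

```sql
-- On the slave: note where replication stopped
SHOW SLAVE STATUS\G
--   Master_Log_File: mysql-bin.000123
--   Read_Master_Log_Pos: 456789

STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST     = '10.1.2.3',           -- the new IP (or hostname)
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = 'secret',
  MASTER_LOG_FILE = 'mysql-bin.000123',
  MASTER_LOG_POS  = 456789;
START SLAVE;
```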
A side note for the curious ones: there was no downtime, as the master/slave are really a master/master pair. The "slave" failing to connect to the "master" was the write master trying to connect to the stand-by master.
MySQL (lighter) - http://github.com/oldmoe/mysqlplus, http://github.com/brianmario/mysql2
Drizzle (lightest) - http://www.drizzle.org
What makes it columnar? Well, a primary key is really a row key, and then you have column families, which are groups of columns stored together (each column's values are stored together, as opposed to the rows being stored together). You can set an expiry for a column family too, after which the data expires (which is why it's great for Google).
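A toy Python sketch of that storage shape may help (hypothetical class names, and a real store keeps cells sorted and on disk): values live grouped by column family, keyed by row key, and a family can carry a TTL after which its data is treated as gone.

```python
import time

class ColumnFamily:
    """Toy column-family store: cells grouped by column, keyed by row key."""
    def __init__(self, ttl=None):
        self.ttl = ttl      # seconds; None means values never expire
        self.cells = {}     # row_key -> (value, written_at)

    def put(self, row_key, value):
        self.cells[row_key] = (value, time.time())

    def get(self, row_key):
        value, written_at = self.cells[row_key]
        if self.ttl is not None and time.time() - written_at > self.ttl:
            return None     # expired, as if the data were gone
        return value
```

The point of the layout: a scan over one family touches only that family's values, never whole rows.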
HBase - http://hbase.apache.org - Google's BigTable implementation, which was born of Hadoop (the Java mapreduce engine). If you want to use HBase in production, use Thrift (http://thrift.apache.org, which Cassandra also uses). This is CP, but configurable to AP. Does sequential reads and column versioning, with a strong but flexible columnar schema.
Cassandra - hybrid. Node architecture like Dynamo, data structure like BigTable with column families - http://cassandra.apache.org - Good for hundreds of nodes in the same data center; if there are more than that, or different data centers, use HBase (that's what Digg and Facebook are running into). In Cassandra you set up your schemas with an XML file, not with DDL. Benefits: sequential reads of ordered keys, and it also has versioning. It's AP, configurable to CP.
MongoDB (AP focused - master/slave)
http://www.mongodb.org - created to be huge (huMONGous). Made to be partitioned and distributed, and to support ad hoc queries. Wasn't built to be durable.
CouchDB (AP focused - master/master) - http://couchdb.apache.org - Not made to be distributed originally; it was meant to be very durable.
MapReduce in Mongo is an ad hoc query, comfortable for relational-database people. In CouchDB, you make views and then request data from those views.
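Both systems share the same map/reduce shape, which a small Python sketch can illustrate (toy code, not either database's API): map() emits (key, value) pairs per document, and reduce() folds each key's values. MongoDB runs this ad hoc at query time; CouchDB computes it once and stores the result as a view.

```python
from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, fold each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}
```

For example, summing a counter per user is one map_fn emitting (user, n) and sum as the reduce_fn.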
Riak - The most "architecturally cool" database out there. It's a dynamo implementation that is purely REST based. It's a key-value store, but it's not descriptive enough -- it has map-reduce built in, metadata and links you can walk. You can store ANYTHING in riak -- not just text. example: getting a JPG file from the web and putting it as the value for the key "firefox.jpg". Neat demo.
Riak has a ring and eventual consistency; you can pull nodes in and take nodes out without having to invalidate all the ids. It has quorum consistency, which blows Eric's mind, but we didn't have...
memcached - don't use it
Kyoto Cabinet - don't use it
Redis - use it - http://redis.io - it can handle lists, hashes, can intersect the value of 2 keys (such as person and pet, to find out who owns which set).
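The set-intersection trick is Redis's SINTER command, which intersects two sets server-side. A toy in-memory Python sketch of the "person and pet" example above (key names are made up; Python sets stand in for Redis sets):

```python
# Two hypothetical Redis sets, modeled as Python sets.
store = {
    "people:dog_owners": {"alice", "bob", "carol"},
    "people:cat_owners": {"bob", "carol", "dave"},
}

def sinter(key_a, key_b):
    """Rough equivalent of: SINTER key_a key_b"""
    return store[key_a] & store[key_b]
```

Running sinter("people:dog_owners", "people:cat_owners") yields the people in both sets, with no data pulled back to the client for the intersection itself.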
Graph datastores - you walk the graph instead of querying or doing mapreduce.
FlockDB - distributed; "unless you're Twitter, you don't need to use it". It's not really possible to distribute a graph database: you can't walk it doing full node traversals, you can only walk edges (you can do friends, but not friends of friends, etc.).
Slides are available at https://github.com/coderoshi/holy-grail-dbs
Hacker Lounge - open 22 hours a day (closed between 6 and 8 for cleaning).
Session notes link on each page takes you to the wiki.
#osb11 is the official hashtag
Friday is unconference (I won't be here on Friday).
What is open source citizenship? Nominate people for open source citizenship awards by the registration desk.
Now the keynote, "Hacking for Freedom: Join the Doocracy" by Peter Fein.
Hacker, hacktivist: names are important, names have power. Peter uses the word "hack" to mean doing something clever with technology. "If we cannot see, we cannot act."
Do-ocracy. You. Just. Do.
Telecomix (http://www.telecomix.org/) built mirrors to blocked sites and helped folks tweet. They are the yin to Anonymous' yang: they build and facilitate communication rather than destroy and take down sites. They go lower tech when they need to as well, setting up fax and reverse-fax lines, for example.
The last 6-8 months have shown how dramatic an effect the Internet can have on freedom fighting. Egypt pulled fiber optic cable to censor the 'net; in the Western world we have censorship too, and it's more subtle. It's not just WikiLeaks: there is a PROTECT-IP act going through the US Congress that would allow Homeland Security to seize domain names under the guise of copyright issues.
"If we can't communicate, we can't organize. If we can't organize, we can't resist. If we can't resist, we become subjects [not citizens]."
We are seeing a sort of info war of nation-states vs. the Internet. "The Internet didn't cause the protests in the Middle East [North Africa, really], but it did facilitate them." For me, though: Napster and Limewire did not cause piracy, they just facilitated piracy, and both companies no longer exist...
Peter introduces the concept of a "disorganization" and different tactics to organize volunteers.
Radical openness - Anonymous chat rooms can be found by searching Google.
Adhocracy - put a team together to do a certain task.
At the end of the day, what matters is what you get done, and that is the core principle of the doocracy.
He points to Critical Mass as a disorganization: there's no hierarchy, no centralized leadership, people just go out and do. He points to Burning Man as another, and Open Source Bridge as yet another. People show up, they set up stuff, they ask what they can do to help.
Then he plays a clip from Lawrence Lessig speaking at OSCON 2002 about copyright: "If you don't do something now...this freedom will be taken away.....If you can't fight for your freedom, you don't deserve it."
Overall, the keynote was good, but for me it was not particularly informational, inspirational, nor entertaining. I enjoyed it, and it reminded me of a lot of things that have been happening, but nothing I had forgotten. Maybe that's because I already subscribe to the doocracy way? I can see how it might spark a non-doer to go out and do, but that was not the effect it had on me.
An interesting start to a conference that has less in the way of MySQL content and more of the "soft" topics, which really are important to me.
This post has sample configurations, and an update, to the Nagios plugin called mysql_health_check.pl that PalominoDB has developed. Presentation slides, the presentation video, the whitepaper and the download link to the actual Nagios check are always up-to-date at http://palominodb.com/about-us/projects.
There has been exciting news for the Nagios plugin PalominoDB developed for MySQL -- we now have --critical and --warning options for all modes (for --varcomp mode we only had a critical option before).
Here are some sample configurations (which are also seen in the slides):
[edited June 8th to correct typo; the figure given is per YEAR, not per month]
Efficiency in Time and Cost -
The average salary for a full-time DBA is $80-120K per year, not counting payroll and benefit costs. You can retain PalominoDB for your senior needs at only $37,200 a year based on a 20 hour retainer. This leaves the balance for you to invest in a staff of flexible, competent, systems administrators who can handle day to day tasks. A retainer with PalominoDB includes access to our senior and principal personnel, engineers, and systems administrators. Whatever your need, PalominoDB can fulfill it. As you have access to an entire team, you are never left short-handed on holidays or during flu season...and we will never leave you for another company.
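The retainer arithmetic checks out if the 20-hour retainer is taken as monthly (our reading; the implied hourly rate below is an inference, not a number from the post):

```python
# Sanity-checking the quoted $37,200/year figure, assuming a
# 20-hour-per-month retainer. The $155/hour rate is implied, not stated.
HOURS_PER_MONTH = 20
MONTHS = 12
implied_hourly_rate = 37_200 / (HOURS_PER_MONTH * MONTHS)
annual_retainer = implied_hourly_rate * HOURS_PER_MONTH * MONTHS
```

That works out to an implied rate of $155/hour, well under loaded full-time DBA cost.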
Team integration -
PalominoDB isn’t just a remote database services company. From day one, we jump in and work as hard as we can to become part of your team. We don’t sit in the wings, waiting for you to give us work or for your systems to page us. We plug into your team’s IRC room and get into your monitoring systems. We provide constant communication through the use of instant messaging and email. We attend your team meetings or we set up meetings with you - and we ensure that they are efficient meetings. Our company integrates into your systems rather than forcing you to integrate with ours. We are confident you will come to see us as a part of your full-time workforce rather than a group of consultants.
Proactive service -
We review systems daily. We look for anomalies in workloads. We check error logs, and we ensure successful backups. We research the benefits of upgraded binaries, new parameters and technologies, and how they can be applied for our clients. If you aren’t actively generating work for us, we use those hours for proactive work - SQL Reviews, Capacity Reviews and Backup/Recovery tests. We understand how busy small businesses can be, and we make sure to keep tickets moving. We do not require babysitting; it is our job to free up your time so you can focus on the rest of your infrastructure. We can even provide help and recommendations on the rest of your environments.
Extensive Experience -
The professionals at PalominoDB have been managing production systems since before the dot com era. We've seen explosive growth, and we have scaled companies through years, not months, of development. Our experience is not just with particular technologies, but with operational process, change management, incident and problem management, documentation, and configuration management. We see the big picture, and we use our breadth of experience to work within it. And, bottom line, we follow our mantra of three key principles: keep production safe, keep production simple, and document and share everything.