MySQL

Video: MySQL 5.5 Overview by Bill Papp

On Monday at the February 2011 Boston MySQL User Group Meeting, Bill Papp of Oracle shared a presentation about what's new in MySQL 5.5.  The video is already up on YouTube and is also embedded at the end of this post.

The slides are available as a PDF from http://technocation.org/files/doc/2011_02_14_MySQL_55_Overview.pdf

The partitioning article Giuseppe wrote that is mentioned at the very end of the talk is at http://dev.mysql.com/tech-resources/articles/mysql_55_partitioning.html

Video:

MySQL 5.5 explained at the Boston MySQL User Group on January 10th, 2011

Next week, Monday January 10th 2011, MySQL's Bill Papp will be speaking about MySQL 5.5.  This is not to be missed!  Pizza and soda will be served, so please RSVP if you plan on attending.  Attendance is free.

If you are near the Boston area, you can RSVP for the meeting at http://www.meetup.com/mysqlbos/calendar/15169986/.  That URL also has all the details about the location (an MIT classroom).  To RSVP anonymously, please log in to the Meetup site with the e-mail address "admin at sheeri dot com" and the password "guest".

And yes, we will be video recording the session, so if you are too far away to attend you can still see the presentation.  If you have any specific questions about MySQL 5.5 that are not answered by either http://dev.mysql.com/tech-resources/interviews/thomas-ulin-mysql-55.html or http://dev.mysql.com/tech-resources/articles/introduction-to-mysql-55.html feel free to leave a comment here and I will make sure to ask the question of Bill.

Video: MySQL Cluster Overview by Bill Papp

At the December 2010 Boston MySQL User Group meeting a few days ago, Oracle/MySQL sales engineer Bill Papp gave a presentation on 'An Overview of MySQL Cluster'.

Slides are online at http://technocation.org/files/doc/2010_12MySQLClusterOverview.ppt

The video is on YouTube at http://www.youtube.com/watch?v=-XLc8OA8UOY.

Here is the description Bill gives of this presentation:

Now, more than ever, databases have to scale linearly to handle high spikes in usage, deliver 24 x 7 continuous availability and low latency. At the same time, many workloads need to maintain the data consistency delivered by transactional integrity.

MySQL Cluster is the industry's leading real-time, write-scalable transactional database, combining 99.999% availability with the low TCO of open source. With a distributed shared-nothing architecture and no single point of failure, MySQL Cluster can scale linearly to meet the unprecedented demands of next generation web, e-commerce, and telecommunications / network services.

Typical use cases in web services include:

* User profile management for service authentication and authorization

* eCommerce (customer and order data, payment processing, product inventory tracking, etc.)

* Web Session Management

* Social Networks (user profiles, authentication, real-time status updates, etc.)

* On-Line Content Delivery (e.g. user data, content catalogs, subscription processing & management)

* Near Real-Time Data Analysis (using Geographic Replication to update MyISAM, InnoDB, etc. tables with real-time transactional updates from MySQL Cluster) 

MySQL 5.5 is GA!

Today it was announced that MySQL 5.5 is GA.  This is very exciting news; we have not had a GA announcement since MySQL 5.1 two years ago.  There are tons of new features, mostly performance enhancements, that I am looking forward to.  Here is why I am excited about MySQL 5.5:
  • First and foremost:  the fact that it exists in a GA release.  [Note: a MySQL/Sun/Oracle timeline follows.  To skip it and go straight to the actual features in MySQL 5.5 I am excited about, jump ahead to "The New Features" below.]  Sun bought MySQL in January 2008, and said it was not going to change anything in the first year.  MySQL 5.1 was released in November of 2008, so presumably that was on target with what MySQL AB would have done had they not been bought out by Sun.
A few months later, in April 2009, a "preview" version of MySQL 5.4 was announced.  Simultaneously, it was announced that Oracle was buying Sun (and thus, was acquiring MySQL).  This acquisition did not actually happen until January 2010, however, so all of the development in 2009 was still under Sun.
Then in April 2010, a year after the MySQL 5.4 announcement, we were again teased, this time with an announcement of a MySQL 5.5 beta version.  We had heard this song and dance just a year ago, and all we knew at that point was that we had absolutely no confidence in what the version number of the next GA release would actually be.  The official word was "MySQL 5.5" but I was feeling "I'll believe it when I see it".
So I am particularly happy to see Oracle doing good work in the MySQL sphere.  As much as Oracle has a lot to learn about the open source road and has some tough times ahead of it in that regard, I am glad that, at least for now, Oracle is getting MySQL right.

The New Features

  • Better InnoDB I/O usage - One of the biggest advantages of the Percona build is getting more background threads.  In MySQL 5.1 and earlier, there is one background thread each for reading and writing.  In MySQL 5.5, you can configure how many you want; the default is 4 of each.  The manual says the values are numeric but does not show the permitted range (though I believe the range is 1-64, based on the Percona build documentation).  The variable names are innodb_read_io_threads and innodb_write_io_threads; they are global and static, so changes require a restart.
  • innodb_io_capacity allows you to change the number of I/O operations per second (IOPS).  Previously the value was compiled in at 100 IOPS.  In MySQL 5.5 it defaults to 200 and can be changed.  The manual shows a minimum value of 100 but no maximum (the Percona build documentation shows a maximum of 999999999).  Global and dynamic.
  • The default storage engine for new tables is InnoDB rather than MyISAM.
  • There is support for an interface for an optional semisynchronous replication plugin.  Semisynchronous replication means that a master will not consider a commit completed until at least one slave shows that it has logged the transaction.
  • The Performance Schema, a new feature for monitoring MySQL Server execution at a low level.  The performance_schema variable is off by default and is global and static, meaning the server needs a restart to turn it on.  The manual states that the performance_schema is low overhead, though I am curious to see exactly what the overhead is.
What excites me about the performance schema is that up until now almost all of the internal monitoring for MySQL has been counters showing very basic information.  The exceptions to this (for me) are some of the numbers in SHOW ENGINE INNODB STATUS and Jeremy Cole's* SHOW PROFILE/S syntax.  The performance schema has crazy levels of detail similar to SHOW PROFILE/S, but on a global server level.  This is definitely going to challenge me and help me be a better DBA!
  • Partitioning is much improved.  This is not surprising, since partitioning was new in 5.1 and not as feature-rich as I would have liked it to be (although even in the 5.1 tree it has improved greatly).  Specifically, they have added two new partitioning types: RANGE COLUMNS and LIST COLUMNS.  These enable optimized partitioning of ranges and lists based on dates, datetimes, and strings.  The partitioning key for ranges/lists can be based on up to 16 fields, and there is partition pruning optimization when the different columns in the partitioning key are compared to constants.
For example, if you have a table partitioned by the RANGE COLUMN type on a date and a username, a query that is looking for rows from a particular username (string) after a particular date will be optimized.  Read more at http://dev.mysql.com/doc/refman/5.5/en/partitioning-columns.html.
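A sketch of that example (table, column, and partition names here are made up for illustration, not taken from the manual):

```sql
-- Hypothetical table partitioned by RANGE COLUMNS on a date and a username.
CREATE TABLE user_events (
    username   VARCHAR(16) NOT NULL,
    event_date DATE NOT NULL
)
PARTITION BY RANGE COLUMNS (event_date, username) (
    PARTITION p0 VALUES LESS THAN ('2010-01-01', 'm'),
    PARTITION p1 VALUES LESS THAN ('2010-01-01', MAXVALUE),
    PARTITION p2 VALUES LESS THAN ('2011-01-01', 'm'),
    PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);

-- A query comparing the partitioning columns to constants can be pruned
-- to a subset of the partitions:
SELECT COUNT(*)
  FROM user_events
 WHERE event_date >= '2010-06-01' AND username = 'zelda';
```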

  • Pluggable MySQL Authentication - MySQL 5.5 supports plugin development for external authentication.  I would be much more excited about this if there were actually plugins developed for authentication via LDAP, Kerberos, PAM and Active Directory.  I am fairly sure that given the strength of the MySQL community plugins will be developed fairly quickly, though. 
  • Proxy users allow one person to have the permissions of another.  This is important to be able to grant permissions to externally authorized users.  Part of me thinks that the MySQL ecosystem will come up with clever hacks using this new feature.
One note:  lots of deprecated variables have been removed, so check your error logs if MySQL won’t start up after an upgrade.  For example, table_type has been removed in favor of storage_engine.  Similarly, the TYPE option for CREATE/ALTER TABLE no longer works, which might break your internal scripts.  LOAD TABLE ... FROM MASTER is gone (it was MyISAM-only anyway) and BACKUP/RESTORE TABLE is gone (very, very old).
The old, non-secure “put the replication parameters in the my.cnf” days are gone: --master-host, --master-user, --master-password, --master-port, etc. have been removed after being deprecated for a while.
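The supported way to set those coordinates is at runtime with CHANGE MASTER TO (the host name and credentials below are placeholders, not real values):

```sql
CHANGE MASTER TO
    MASTER_HOST='master.example.com',  -- placeholder host
    MASTER_PORT=3306,
    MASTER_USER='repl',                -- placeholder replication user
    MASTER_PASSWORD='repl_password';   -- placeholder password
START SLAVE;
```

The coordinates are persisted by the server in master.info, so nothing sensitive needs to live in my.cnf.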
Some tools also have removed deprecated features, including mysqlbinlog, mysqldump and mysqld_multi.
All in all I am extremely excited about this release.  I wonder if a lot of the improved performance comes from adding the features implemented in the Percona build; I also wonder if Percona is going to come out with a MySQL 5.5 build.
* Apparently, Jeremy Cole is now a MySQL Database Architect for Twitter (starting Nov 15 2010).  

Video: Database Change Management by Jacob Nikom

At the November 2010 Boston User Group, Jacob Nikom presented his solution for database change management using MySQL. This system is designed to keep a repository of tables and data that an application needs to function. For example, an application might need at least one administrative user, or a table with a list of countries. This is NOT a tool for database configuration management.

The slides are available at: http://technocation.org/files/doc/CatalogVersionTool_MySQL_User_Group_02.ppt

Watch the video on YouTube at: http://www.youtube.com/watch?v=RmGtsKXdVko

Zabbix Templates for MySQL

We recently had a client come to us and ask us to improve their MySQL monitoring in Zabbix.  So we did.  The approach we took was to port the MySQL script from the superb mysql-cacti-templates project to work with Zabbix.  This works out well because Zabbix is like cacti and nagios combined, and what we wound up with are templates that can alert us when, for example, InnoDB uncheckpointed bytes start climbing rapidly.

In addition to the above benefit, you also get every MySQL graph that comes from the mysql-cacti-templates project, making Zabbix with appaloosa-zabbix-templates* a first-class replacement for cacti when it comes to MySQL trending.

Finally,  the same client is also generously donating time to create templates for Memcached/Membase and Bind9. So, look forward to more out of this project as we go forward into next year.

* Did you notice we really like our horses, here at PalominoDB?

Percona XtraDB Information_schema Changing

I've been working a lot with XtraDB versions of MySQL over the last year, and more and more I have begun to leverage the INFORMATION_SCHEMA instrumentation that has been put in place.

Today, I went into a system with this install (admittedly a very recent one):

Server version: 5.1.51-rel11.5-log Percona Server (GPL), 11.5, Revision 132

I had been considering upgrading RAM on the server, so I went to look at how the InnoDB buffer pool was currently being used.  The INFORMATION_SCHEMA tables seem to have changed dramatically, particularly the innodb_buffer_pool-related tables.

At this point, I'm at a loss of how to get information on which tables/indexes are using space (and how much) in the buffer pool.  I had used INNODB_BUFFER_POOL_PAGES_INDEX to do this in the past.  Hopefully documentation will be forthcoming on the new table structures.  Admittedly, I haven't dug too deeply into the new structures (since they are not documented).  I'm curious if anyone else has worked with the newer table structures:

mysql> SHOW CREATE TABLE INFORMATION_SCHEMA.INNODB_BUFFER_POOL_PAGES_INDEX\G
*************************** 1. row ***************************
       Table: INNODB_BUFFER_POOL_PAGES_INDEX
Create Table: CREATE TEMPORARY TABLE `INNODB_BUFFER_POOL_PAGES_INDEX` (
  `index_id` bigint(21) unsigned NOT NULL DEFAULT '0',
  `space_id` bigint(21) unsigned NOT NULL DEFAULT '0',
  `page_no` bigint(21) unsigned NOT NULL DEFAULT '0',
  `n_recs` bigint(21) unsigned NOT NULL DEFAULT '0',
  `data_size` bigint(21) unsigned NOT NULL DEFAULT '0',
  `hashed` bigint(21) unsigned NOT NULL DEFAULT '0',
  `access_time` bigint(21) unsigned NOT NULL DEFAULT '0',
  `modified` bigint(21) unsigned NOT NULL DEFAULT '0',
  `dirty` bigint(21) unsigned NOT NULL DEFAULT '0',
  `old` bigint(21) unsigned NOT NULL DEFAULT '0',
  `lru_position` bigint(21) unsigned NOT NULL DEFAULT '0',
  `fix_count` bigint(21) unsigned NOT NULL DEFAULT '0',
  `flush_type` bigint(21) unsigned NOT NULL DEFAULT '0'
) ENGINE=MEMORY DEFAULT CHARSET=utf8
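Given the columns shown above, one sketch for at least seeing which indexes occupy the most buffer pool space is to aggregate by index_id (mapping an index_id back to a table and index name is the part that still needs the new structures to be documented):

```sql
-- Aggregate buffer pool pages per index, largest consumers first.
SELECT index_id,
       COUNT(*)                 AS pages,
       SUM(data_size)/1024/1024 AS dataMb
  FROM INFORMATION_SCHEMA.INNODB_BUFFER_POOL_PAGES_INDEX
 GROUP BY index_id
 ORDER BY dataMb DESC
 LIMIT 10;
```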

Events and Replication, part 1

MySQL events have been GA for two years.  Events are not used as much as I would like -- often organizations will have cron jobs or scheduled tasks to perform ETL, maintenance, and small aggregation tasks.  For example, a website that shows how many users are logged in might update a "cache table" with a count of active sessions every 5 minutes.
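That cache-table example might be written as an event like this (the event, table, and column names are made up for illustration):

```sql
-- Requires the event scheduler to be on:  SET GLOBAL event_scheduler=ON;
CREATE EVENT update_active_sessions
    ON SCHEDULE EVERY 5 MINUTE
    DO
      REPLACE INTO active_sessions_cache (id, active_count, updated_at)
      SELECT 1, COUNT(*), NOW()
        FROM sessions
       WHERE last_seen > NOW() - INTERVAL 5 MINUTE;
```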

Events are great because they are a part of the database.  They only run when the database is running, and they are backed up with the database.  There is no need to store a password in a file or in a cron listing.

Events and replication can be a tricky matter.  Events can change data, and data changes are replicated.  In the example of an event that updates a table with a count of active sessions every 5 minutes, let's assume that the event itself would run on a primary master (write) server.  All slaves (including a secondary master) would receive the table update.

The tricky part is that CREATE EVENT statements are replicated.  There are three ways to avoid an event that changes data running on more than just the master:

0) Do not replicate the CREATE EVENT, or replicate it but then delete the event on any slaves.  I do not like this approach because if you promote one of the slaves to be a master, the event will not be in place.

1) Replicate the CREATE EVENT, knowing that the SLAVESIDE_DISABLED status will be set even if you do not specify DISABLE ON SLAVE in the CREATE statement.  If you promote a slave to be a master, all you need to do is ALTER EVENT event_name ENABLE for each event that needs to be turned on.  This is the method I prefer, even though it still requires knowing to turn on the events when a slave is promoted, and to turn off events on the old master.

2) Replicate the CREATE EVENT, and enable the event right away, but set the event scheduler off.  This is easier if there are lots of events, because all that needs to be done when a slave is promoted to a master is:

mysql> SET GLOBAL event_scheduler=ON;

However, this means that you cannot have any events running on the slave itself -- either all the events run, or none do.
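With option 1, a promotion might look like the following (the event name is a hypothetical data-changing event, for illustration):

```sql
-- On the newly promoted master: enable each data-changing event.
ALTER EVENT update_active_sessions ENABLE;

-- On the old master (now a slave): disable it so the change does not run twice.
ALTER EVENT update_active_sessions DISABLE;
```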

Part 2 will discuss some specific issues I have run into with events and replication.

Oracle is not removing InnoDB!

I was asked "What's the deal with Oracle removing InnoDB?"  I had not heard this, so I did some research.  It took less than 5 minutes to figure out what happened, and it was easy to see where the confusion is.

On the MySQL products page at http://mysql.com/products/ the matrix of MySQL editions includes "MySQL Classic" which is free, "MySQL Standard" which costs $2k per year, "MySQL Enterprise" which costs $5k per year and "MySQL Cluster Carrier Grade" which costs $10k per year.

Indeed, the "MySQL Classic" does not include InnoDB.  What happened was that folks assumed that, because it was free, it was the MySQL Community edition we all know and love.

This is not true.  How do I know?  Because just above the matrix is a set of links to each edition, and if you click the "MySQL Classic" link you get to http://mysql.com/products/classic/ which explains "MySQL Classic Edition is the ideal embedded database for ISVs, OEMs and VARs developing read-intensive applications using the MyISAM storage engine."

So calm down, folks. 

How I Find MySQL Fragmentation

Sarah Novotny recently posted InnoDB Tablespace Fragmentation - Find it and Fix it, so I thought I would share how I determine MySQL fragmentation.  This works for MyISAM and InnoDB tables, and will work whether or not innodb_file_per_table is in use.

The basic idea is to compare the size of the file(s) on disk with the size in the database metadata.  The DATA_LENGTH and INDEX_LENGTH fields of the TABLES table in the INFORMATION_SCHEMA database have size information calculated by the storage engine.  There are a few points to note:

  • Sometimes querying INFORMATION_SCHEMA can take a long time, and in rare cases has been reported to crash MySQL.  I have written about how you can tell how risky an INFORMATION_SCHEMA query might be.
  • MyISAM tables - In MyISAM, the metadata is exact, and the files on disk are split into data (.MYD) and index (.MYI).  Compare the size of the .MYD to DATA_LENGTH and compare the size of the .MYI file to INDEX_LENGTH.  You can calculate exactly how fragmented each table is, and decide which tables to defragment.  Sample query:

SELECT TABLE_SCHEMA, TABLE_NAME, DATA_LENGTH/1024/1024 AS dataMb, INDEX_LENGTH/1024/1024 AS indexMb FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='MyISAM';

  • InnoDB tables when using a centralized file for data and indexes (aka "not using innodb_file_per_table") - Because all of the data and indexes are in one file, you have to compare the sum of the size of all InnoDB tables with the size of the ibdata file(s).  However, because InnoDB's metadata is estimated, you cannot use a direct comparison.  What I do is get a percentage difference between the size on disk and the size reported by the metadata, and if it's more than about a 20% difference then the table is likely fragmented enough to justify defragmentation.  Sample query:

SELECT SUM(DATA_LENGTH+INDEX_LENGTH)/1024/1024 AS sizeMb FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB';

  • InnoDB tables when using one data file per table (aka "using innodb_file_per_table") - As explained in the previous example, InnoDB uses estimated sizes for metadata, not exact numbers like MyISAM.  So again I figure out the percentage difference, this time between each table's .ibd file and its metadata, and if it's more than 20% then I will recommend defragmentation.  Sample query:

SELECT TABLE_SCHEMA, TABLE_NAME, (DATA_LENGTH+INDEX_LENGTH)/1024/1024 AS sizeMb FROM INFORMATION_SCHEMA.TABLES WHERE ENGINE='InnoDB';

Note that every database is different.  I use the 20% as a guideline, not a hard-and-fast rule.  It really depends on how "wrong" InnoDB's estimate of the size is.
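To make the guideline concrete, here is a minimal sketch (in Python; the helper names are my own, not from any tool) of the percentage comparison, with the 20% from above as an adjustable threshold:

```python
def fragmentation_pct(size_on_disk_bytes, metadata_bytes):
    """Percentage by which the file on disk exceeds the size reported by
    INFORMATION_SCHEMA (DATA_LENGTH + INDEX_LENGTH)."""
    if metadata_bytes <= 0:
        raise ValueError("metadata size must be positive")
    return (size_on_disk_bytes - metadata_bytes) / metadata_bytes * 100.0


def worth_defragmenting(size_on_disk_bytes, metadata_bytes, threshold_pct=20.0):
    """True when the difference exceeds the (adjustable) 20% guideline."""
    return fragmentation_pct(size_on_disk_bytes, metadata_bytes) > threshold_pct
```

For example, a .ibd file that is 50% larger on disk than its metadata reports would clearly be flagged, while one only 10% larger would not.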

Fragmentation occurs when variable-length data is updated, and when variable-length rows are deleted.  You can probably guess which tables have the most fragmentation if you know what queries are being run.  If you want to get a better sense of how often updates and deletes are being run, I highly recommend using mk-query-digest against a sample set of binary logs.
