Conferences

What Community Resources should be at Oracle OpenWorld?

A short while ago I posted about the Oracle OpenWorld Schedule Matrix of MySQL sessions (in PDF and HTML formats).  We have printed up a (small) number of schedules to have on hand at the MySQL Community kiosk at the User Group Pavilion in Moscone West.

Yes, you read that correctly -- the User Group Pavilion will include a MySQL Community kiosk this year!  Sarah and I have been coordinating the effort to staff the kiosk and figure out what we need to provide.

Sadly, it's just a kiosk (same as all the other User Group organizations get), so we cannot have a ton of flyers there.  To that end, we have created a QR code that resolves to www.kimtag.com/MySQL, which is where we are putting many links.  

We'd like your help figuring out what we have missed.  To keep the list of links as short and relevant as possible, we have used aggregate links wherever we could: for example, we link to planet.mysql.com instead of individual blogs, and we list only the major conferences with more than 500 expected attendees.  The links at www.kimtag.com/MySQL at the time of writing are:

- MySQL sessions at OOW

- Planet MySQL

- dev.mysql.com (docs, etc)

- mysql.com

- MySQL User Groups (forge.mysql.com list) - so if you have a user group, make sure to update the forge page!

- Percona Live 2012 Conference & Expo

- MySQL videos on YouTube

- IOUG MySQL Council

- OurSQL Podcast Blog

- OurSQL iTunes link

- MySQL Experts podcast

- Book: MySQL Administrator's Bible*

- Book: High Performance MySQL

- Book: Expert PHP/MySQL

- Book: MySQL High Availability

 

If you think of a link we should put on there, please comment below.

 

For what it's worth, the paper we will have will be:

- The current day's schedule

- A flyer about Percona Live 2012 MySQL Conference & Expo

- A poster of the QR code and a few small paper slips with the QR code

- IOUG MySQL Council business cards

And even that is stretching it, as there will be a laptop at the kiosk provided by Oracle and the kiosk is 24 inches x 24 inches, about 61 centimeters x 61 centimeters.
* Note that I have ordered the books with the MySQL Administrator's Bible first because it's for beginner/intermediate users, whereas High Performance MySQL is for intermediate/advanced users.

Videos from OSCon Data and OSCon 2011

There are 28 videos, all linked below, on the OSCon and OSCon Data 2011 playlist that I have put online for free (with permission from the presenters and O'Reilly).  O'Reilly videos are available from the conference proceedings website.  Probably the best way to find all the videos in one place is to search for the 'oscon' tag on YouTube.

How do I choose what talks to film?  Well, to make it easiest on me, I choose what room to film, and then all I have to do is change the tapes every session.  This minimizes (but does not completely eliminate) technical issues.  For OSCon Data it was simple - there were 5 rooms, O'Reilly was professionally recording one room, another room was "Products and Services", which left 3 rooms -- and I had 3 video cameras*.
Following is the list of videos I took, in alphabetical order.  Each link takes you to the YouTube page, which shows the presenters, description, and links to the slides (if available) and official O'Reilly Conference page:


* If there had been no technical difficulties whatsoever, there would be 38 videos on this list - 2 from OSCon (which are on the list) and 36 from OSCon Data.  Unfortunately, 10 videos did not come out - either I missed the tape change, the audio could not be heard, or permission was not given by the presenters.  Note that in the last case, presenters just never responded -- I did not have one presenter withhold permission, though a few have not responded to my request for permission.

MySQL Content at Oracle OpenWorld - Session Matrix

While the online content catalog and schedule builder are great tools to help plan out what sessions I want to see at Oracle OpenWorld, what I really want is a matrix of only the MySQL content, preferably one that clearly shows all the sessions per time period.

So I decided to make the matrix myself - view the HTML online at http://technocation.org/files/doc/2011_OOW_MySQL_Content.html

Or download a PDF (one page per day) at http://technocation.org/files/doc/2011_OOW_MySQL_Content.pdf

If you have feedback, please let me know in the comments or via the e-mail address on the matrix.  These documents are for personal use unless other arrangements have been made.

To see full descriptions, click on a speaker's name to be sent to the content catalog's page for that speaker, then click on the session to get the full description.

Disclosure: Truth About MySQL 2012 Conference Planning

I love Percona Live.  I think it is a great meeting of the minds.  However, I do not think it is a good replacement for the big April MySQL conference.  In fact, neither does Baron Schwartz:

"The conference is organized and owned by MySQL, not the users. It isn’t a community event. It isn’t about you and me first and foremost. It’s about a company trying to successfully build a business, and other companies paying to be sponsors and show their products in the expo hall."
Baron Schwartz, April 23, 2008.
http://www.xaprb.com/blog/2008/04/23/like-it-or-not-it-is-the-mysql-conference-and-expo/

Switch "MySQL" for "Percona", and that is exactly what Baron said in today's announcement:

"Emphasis on business. We need a place where vendors, both open-source and closed-source, can showcase their products and services. This is the hand that feeds all of us. It’s good for Percona’s business, and it’s good for everyone else’s too."

...except in 2008, Baron was calling for a community conference because what was good for business was not good for the community.  

Let me pull out that last sentence from the quote.  Read it to yourself, and tell me if you feel warm fuzzies replacing "Percona" with "Oracle" or "MySQL":

"It’s good for _______’s business, and it’s good for everyone else’s too."

Here is a question for you:  Do you think Oracle will send lots of engineers to talk about current and future plans for MySQL to a conference that is open about saying "It is good for our business?"  

It does not matter if it is Percona, Blue Gecko, or PalominoDB: if Oracle has any business sense (and they make quite a bit of money, so signs are that they do), they will not send engineers to a competitor's company-branded conference.

What does Percona's founder, Peter Zaitsev, have to say about the conference?  He's not really happy about it either:

"I would like to see the conference which is focused on the product users interests rather than business interests of any particular company (or personal interests of small group of people), I would like it to be affordable so more people can attend and I’d like to see it open so everyone is invited to contribute and process is as open as possible. "
Peter Zaitsev, April 23, 2008
http://www.mysqlperformanceblog.com/2008/04/23/conference-for-mysql-users/

Peter, I would like to see that too.  In fact, a small group of folks, including Giuseppe and myself, tried to make that happen, and we had SkySQL, MariaDB, Oracle and IOUG all supporting us. 

Giuseppe called for disclosure about the conference, so I will disclose this:  Baron was not truthful when he said "To the best of our knowledge, no one else was planning one".

Giuseppe asked for full disclosure, so here is a copy and paste of a Skype conversation I had with Percona's Tom Basil on June 29th, 2011:

 

 

Sheeri K. Cabral 6/29/11 12:10 PM 

I think we should develop some ideas in case O'Reilly doesn't end up having a MySQL conference....last year the announcement was late, and it was in May, and I'm starting to think they might not be doing a conference this year, since we haven't heard anything yet.

6/29/11 12:10 PM

if that's the case, I'd like to have a conference anyway, and I'd like to explore options with you, because we need a community-run conference (not Collaborate, but maybe a co-located summit).  And obviously you've had success with Percona Live, but a multi-day conference is really different.

6/29/11 12:11 PM

(FWIW I told Colin Charles in May that I was willing to help co-chair the conference, so I'm still willing to give my support in that area).

Tom Basil 6/29/11 12:14 PM 

Sheeri, can't talk now

Sheeri K. Cabral 6/29/11 12:14 PM 

*nod*  can we schedule a conf call maybe?

Tom Basil 6/29/11 12:14 PM 

Headed out in just few min

6/29/11 12:14 PM

yes, next week

6/29/11 12:14 PM

We tried to schedule conference calls, but for almost 6 weeks we were pushed back.

I had indeed offered to chair or co-chair a conference with Colin Charles, but I simply cannot lend a lot of logistical support to a Percona-branded conference.  I volunteer a lot, but it is for the benefit of the MySQL Community, not for the benefit of a company I do not work for.

I hope that my openness and candor do not blacklist me from Percona events; I am a popular and sought-after speaker at MySQL events and it will be a loss to the community if that happens.

Percona should keep its Percona Live series.  However, the world's most popular open source database deserves and needs exactly what Peter, Percona's founder, said just a few years ago: "the conference which is focused on the product users interests rather than business interests of any particular company".

I believe Percona Live Santa Clara will be a successful event, and I will try to be a part of it.  I hope it will be as successful for the community as it will be for Percona's business.

 

Final Videos of Open DB Camp Online:

The final videos from Open DB Camp back in May in Sardinia, Italy are now online.  The full matrix of sessions, videos and slides can be found on the schedule page.

Hands on JDBC by Sandro Pinna - video

"MySQL Plugins, What are They? How you can use them to do wonders" by Sergei Golubchik of MariaDB - video

The State of Open Source Databases by Kaj Arnö of SkySQL - video

Coming soon, videos from OSCon Data!

More Videos from Open DB Camp

I have gotten to uploading more of the videos from Open DB camp in Sardinia, Italy back in May:

Henrik Ingo speaks about Xtrabackup Manager - video

Linas Virbalas speaks about "Flexible Replication: MySQL -> PostgreSQL, PostgreSQL to MySQL, PostgreSQL to PostgreSQL" - video - slideshare slides

MySQL to MongoDB replication (hackfest results) - video 

Robert Hodges of Continuent speaks about Multi-Master Replication: Problems, Solutions and Arguments - video

There are a few more videos from Open DB Camp to put up, then I start to put up the content from OSCon Data!

Liveblogging at OSCON Data: Drizzle, Virtualizing and Scaling MySQL for the Future

Brian Aker presents "Drizzle, Virtualizing and Scaling MySQL for the Future" at OSCon Data 2011

http://drizzle.org

irc.freenode.net #drizzle

http://blog.krow.net

@brianaker

2005 MySQL 5.0 released - web developers wanted tons of features that were not in the release (making replication better for instance)

2008 Sun buys MySQL

2008 MySQL 6.0 is forked to become Drizzle

2009 Oracle buys Sun

2010 Drizzle developers leave Oracle

2011 First GA release, Drizzle7

MySQL's Architecture - monolithic kernel, not very modular, lots of interdependence.

Drizzle has a microkernel, which includes a listener, parser, optimizer, executor, storage system, logging/error reporting.

Drizzle can accept SQL and HTTP blob streaming, and memcached and gearman can easily talk to Drizzle.

Drizzle has tried to have no "gotchas"

- If you make a new field with NOT NULL, MySQL makes new values NULL.  Drizzle does not do this.

- No hacky ALTER TABLE

- Real datetime (64 bit), including microseconds

- IPV6 (apparently this is a strong reason for people switching, to support IPV6)

- No updates that complete halfway

- Default character set is UTF-8, default collation is utf8_general_ci (in MySQL the default charset is latin1 and the default collation is latin1_swedish_ci; "ci" means case insensitive)

Replication

- In MySQL, replication is kind of hacky [this is my summary and opinion, but it's basically what Brian said]

- Drizzle is Google Protocol Buffer Based

- Replicates row transformations

- Integrates with RabbitMQ, Cassandra, Memcached, Gearman -- right now.

DML and MySQL binary logs analog:

- DML is stored transactionally by delta in Drizzle

- InnoDB is already logging, no need to add another log for the binary log.  So it just logs DML to the transaction log.

LibDrizzle

- supports Drizzle, MySQL, SQLite

- Asynchronous

- BSD, so Lawyer-free

What else?

- No cost authentication (pam, ldap, htaccess, ...)

- Table functions (new data dictionary, including performance and thread information).  INFORMATION_SCHEMA in Drizzle is *exactly* what's specified in the SQL standard.

- Data types - native type for UUID, boolean, all known types (except SET, because it's broken by design)

- Assertions are in Drizzle, you can ask what the type of the result of combining multiple data types will be.

- About 80 conflicts in the Drizzle parser as opposed to about 200 in the MySQL parser

Roadmap - Drizzle7+

- Replication - faster than MySQL and also allows multiple masters.

Virtualization:

Virtualizing a database gives you about a 40% performance hit.  How can costs be cut?  In MySQL 5.0 the Instance Manager was created to solve that but it hasn't really been worked on.  Drizzle has worked on virtualizing databases internally within Drizzle.

- So drizzle now has catalogs.  

- One catalog has its own set of users, its own schema with tables, etc.

- A catalog is its own sandbox; there is no syntax that allows you to connect from one catalog to another, so there are no security problems.

- Cuts the 30/40% hit from virtualizing

- Single instance maintenance - only 1 OS and 1 database to configure, unlike VMs

    - Currently only one database configuration so there's one global config for shared memory such as innodb buffer pool, but that will change in the future.

- Still allows for I/O spread on SAN/NAS

 

In Drizzle 7.1 - Percona's xtrabackup supports Drizzle, and ships with Drizzle.  xtrabackup supports full and partial backups with no locking, and is a single solution for point-in-time recovery.  Because the transaction log is stored in the database, replication is automatically consistent with the database.  It currently does not do incremental backups with the transaction logs, but that's planned for the future.

DBQP:

- consolidates standard testing tasks, server/test management, reporting, REGRESSION TESTING

- extended architecture allows for complex testing scenarios

- pluggable - supports new testing tools

- randgen, sql-bench, crashme, sysbench, standard drizzle-test-run suite

- Keeping tools and testing configurations in-tree facilitates testing for everyone

- supported by SkySQL

 

Dynamic SQL/execute()

- New UTF-8 parser

- Being extended to allow for plugging in application servers.

 

>120 developers since day 1

avg 26-36 per month that commit

 

Bugs database - http://bugs.launchpad.net/drizzle

Liveblogging at OSCON Data: MongoDB Schema Design

Dwight Merriman gives "MongoDB Schema Design" at OSCon Data 2011.

@dmerr

 

RDBMS / MongoDB

relational / document-oriented

database / database

table / collection

row / JSON (BSON) document

index / index

column / field (dynamic/not predeclared)

SQL / Mongo query language (JSON)

Join / Embedding & Linking

Primary Key / _id field

 

Schema design is coupled with what you want to do:

- Dynamic queries

- Secondary indexes

- Atomic updates

- Map Reduce

Considerations:

- no joins

- atomic operations are at the single document level only

- types of queries/updates to do

- sharding

- speed

 

This is the commandline mongo interface but all this can be done in any (modern) programming language.

post = {author: "Herge",
date: new Date(),
text: "Destination Moon",
tags: ["comic", "adventure"]}

> db.posts.insert(post)

"posts" is the collection name.  Documents are analogous to rows but can be more complex.  Documents for one collection are grouped together.

> db.posts.find()

{ _id: ObjectId("4c4ba5c0672...."),
author: "Herge",
date: new Date(),
text: "Destination Moon",
tags: ["comic", "adventure"]}

_id must exist and must be unique.  If you don't supply an _id, one will be generated for you: a 12-byte BSON ObjectId, shorter than a normal UUID, but that's OK because it does not need to be globally unique, just unique on this db cluster.
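
To make the "12 bytes, unique within the cluster" idea concrete, here is a rough sketch in plain JavaScript of one way such an id can be built: a timestamp, some per-process random bytes, and a counter. The exact byte layout and the name `makeId` are assumptions for illustration only; this is not the driver's or the server's actual code.

```javascript
// Illustrative 12-byte id generator (assumed layout, NOT MongoDB's implementation):
// 4 bytes of timestamp (seconds) + 5 random per-process bytes + 3-byte counter.
const processBytes = Array.from({ length: 5 }, () => Math.floor(Math.random() * 256));
let counter = Math.floor(Math.random() * 0xffffff);

function makeId(now = Date.now()) {
  const bytes = [];
  const seconds = Math.floor(now / 1000);
  // Timestamp first, big-endian, so ids sort roughly by creation time.
  for (let shift = 24; shift >= 0; shift -= 8) {
    bytes.push((seconds >>> shift) & 0xff);
  }
  // Random bytes chosen once per process keep different processes apart.
  bytes.push(...processBytes);
  // The counter keeps ids from the same process and same second apart.
  counter = (counter + 1) % 0x1000000;
  for (let shift = 16; shift >= 0; shift -= 8) {
    bytes.push((counter >>> shift) & 0xff);
  }
  // 12 bytes rendered as 24 hex characters.
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

const idA = makeId();
const idB = makeId();
console.log(idA, idB);
```

Nothing here needs a central coordinator, which is why the id can be smaller than a UUID and still be unique enough for one cluster.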

Secondary index, on "author":

> db.posts.ensureIndex({author: 1}) // "1" means ascending, -1 is descending

> db.posts.find({author: 'Herge'}).explain() // shows you the explain plan

 

Multi-key indexes

//build an index on the "tags" array

> db.posts.ensureIndex({tags: 1})

Arrays are exploded and every element of the array will be indexed, and added separately to the B-tree data structure of the index.

> db.posts.find({tags: 'comic'})

MongoDB assumes, when you query an array, that you mean you're looking for an array item that matches.
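
The "exploding" of arrays can be pictured with a small stand-in written in plain JavaScript. This is only a sketch: it uses a `Map` instead of a B-tree, and the function name `buildMultikeyIndex` and the sample documents are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for a multikey index; not MongoDB code.
const posts = [
  { _id: 1, text: "Destination Moon", tags: ["comic", "adventure"] },
  { _id: 2, text: "Explorers on the Moon", tags: ["comic"] },
  { _id: 3, text: "Untagged draft" },
];

function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    // An array contributes one index entry per element; a scalar contributes one.
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      if (key === undefined) continue; // this sketch skips documents without the field
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const tagIndex = buildMultikeyIndex(posts, "tags");
const comicIds = tagIndex.get("comic");
const adventureIds = tagIndex.get("adventure");
console.log(comicIds, adventureIds);
```

Looking up a single tag is then one index probe, which is why querying an array field by one of its elements can use the index.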

 

Query operators

Conditional operators:

 

$ne, $in, $nin, $mod, $all, $size, $exists, $type, $lt, $lte, $gt, $gte

Update operators:

$set, $inc, $push, $pop, $pull, $pushAll, $pullAll

Extending the Schema:

new_comment = {author: "Kyle",
date: new Date(),
text: "great book!",
votes: 5}

db.posts.update( {text: "Destination Moon"}, // this is the WHERE filter
{ '$push': {comments: new_comment}, // do this
'$inc': {comments_count: 1}}) // and do this

If you push the comments array without it being there, it will create it without a problem.
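
For anyone who wants to see those semantics spelled out, here is a minimal sketch in plain JavaScript of what $push and $inc do to a single document, including creating the array or counter when it is missing. The helper `applyUpdate` is hypothetical and only covers these two operators; the real server does much more.

```javascript
// Hypothetical helper that mimics $push and $inc on one in-memory document.
function applyUpdate(doc, update) {
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (doc[field] === undefined) doc[field] = []; // a missing array is created
    if (!Array.isArray(doc[field])) {
      throw new Error("cannot $push to non-array field " + field);
    }
    doc[field].push(value);
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    doc[field] = (doc[field] || 0) + amount; // a missing counter starts at 0
  }
  return doc;
}

const post = { author: "Herge", text: "Destination Moon" };
applyUpdate(post, {
  $push: { comments: { author: "Kyle", text: "great book!", votes: 5 } },
  $inc: { comments_count: 1 },
});
console.log(post);
```

Both changes land on the same document in one call, which is the point of combining them: the comment and its counter cannot drift apart.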

 

> db.posts.ensureIndex({"comments.author":1 })

> db.posts.find({"comments.author": "Kyle"})

> db.posts.find({"comments.text": "good book"})

The 'dot' operator

 

Find all posts that have a comment with more than 50 votes:
> db.posts.find({"comments.votes": {$gt: 50}})
Not as robust as all the operators in SQL, but it's pretty good, and more concise than SQL.  Over time more expressions will be added.
Find all posts that have a comment with more than 50 votes, ordered by author ascending:
> db.posts.find({"comments.votes": {$gt: 50}}).sort({author: 1})
No functional indexes (indexes on functions of fields).
If you index a field that a document does not have, that document is indexed with a null value for the field (which is necessary because not all documents have the same fields).
From a schema design standpoint, the point of MongoDB is to make the documents rich.  He puts up an example of a sales order, with many line items, an address field that has name, street, zip, cc field that has number, exp date.
There is model inheritance, for instance if you have
> db.shapes.find()
{_id: 1, type: "circle", area: 3.14, radius: 1}
{_id: 2, type: "square", area: 4, d: 2}
{_id: 3, type: "rect", area: 10, length: 5, width: 2}
All shapes have area, but the other dimensions are different based on the shape.
> db.shapes.find ({radius: {$gt: 0}})
-- automatically finds only circles.
Note that this avoids the need to join for 1:many and many:many relationships, as in the relational model.
That was embedding, now let's talk about linking.
- done client-side

 

So for instance, a one to many relationship might look like this -- in addition to the collection for posts, a collection for authors with author info:
// collection authors
{ _id: "Herge",
email: "h@foo.com",
karma: 14.142 
}
> var p = db.posts.findOne()
> var author = db.authors.findOne({_id:p.author})
> print(author.email)
If it's a "contains" relationship you want to embed
If you need more flexibility than that, link
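
Because linking is done client-side, the "join" is just a second lookup in application code. A minimal sketch in plain JavaScript, with in-memory collections standing in for the database (the names `findAuthorForPost` and `postsWithAuthors` are made up for illustration):

```javascript
// In-memory stand-ins for the authors and posts collections (illustration only).
const authors = new Map([
  ["Herge", { _id: "Herge", email: "h@foo.com", karma: 14.142 }],
]);
const posts = [
  { _id: 1, author: "Herge", text: "Destination Moon" },
  { _id: 2, author: "Unknown", text: "Orphaned post" },
];

// The link is the author's _id stored in the post; following it is a second query.
function findAuthorForPost(post) {
  return authors.get(post.author) ?? null;
}

// A client-side "join": one extra lookup per post.
function postsWithAuthors(postList) {
  return postList.map((p) => ({ ...p, authorInfo: findAuthorForPost(p) }));
}

const joined = postsWithAuthors(posts);
console.log(joined[0].authorInfo.email);
```

The second post shows the trade-off: nothing enforces that a link points at an existing document, so the client has to handle a missing author itself.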
Rich documents are usually easy to query

 

Rich documents are great for performance

 

Rich documents give more atomicity capability
{
account: "abc",
debits: 21,
credits: 11
}
> db.stuff.update({account:'abc'},
{$inc: {debits: 21, credits: -11}}) // there is no $dec operator; $inc with a negative value decrements
Caching is based on 4k pages, so if you have very small documents, that can be a problem if you are pulling from many collections.
Trees in MongoDB:
{ comments: [
   { author: "Kyle", text: "...",
     replies: [
            { author: "Fred", text: "...",
                replies: []}
       ]}
]}
Mongo doesn't search recursively so while this is great for display, not great for search.
> t = db.mytree;
> t.find()
{ "_id" : "a" }
{ "_id" : "b", "ancestors" : [ "a" ], "parent" : "a" }
{ "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
{ "_id" : "f", "ancestors" : [ "a", "e" ], "parent" : "e" }
{ "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
> t.ensureIndex( { ancestors : 1 } )
> // find all descendants of b:
> t.find( { ancestors : 'b' })
{ "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }
> // get all ancestors of f:
> anc = db.mytree.findOne({_id:'f'}).ancestors
[ "a", "e" ]
> db.mytree.find( { _id : { $in : anc } } )
{ "_id" : "a" }
{ "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
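
The ancestors arrays have to be maintained by the application. As a rough sketch in plain JavaScript, here is how they could be derived from parent pointers for the same tree as above, and how the "all descendants" query falls out of them. The helper names are made up for illustration.

```javascript
// Parent pointers for the same tree as the mytree collection above.
const parents = { a: null, b: "a", c: "b", d: "b", e: "a", f: "e", g: "d" };

// Walk up the parent chain, collecting ancestors from the root down.
function ancestorsOf(id) {
  const chain = [];
  for (let p = parents[id]; p; p = parents[p]) chain.unshift(p);
  return chain;
}

// Build documents shaped like the ones stored in the collection.
const tree = Object.keys(parents).map((id) => {
  const doc = { _id: id };
  if (parents[id]) {
    doc.ancestors = ancestorsOf(id);
    doc.parent = parents[id];
  }
  return doc;
});

// Same idea as t.find({ancestors: 'b'}): any document listing id as an ancestor.
function descendantsOf(id) {
  return tree.filter((d) => (d.ancestors || []).includes(id)).map((d) => d._id);
}

console.log(ancestorsOf("g"), descendantsOf("b"));
```

Storing the full ancestor list costs some space and extra work when a node moves, but it turns a recursive search into a single indexed lookup.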
Limit of document size is 16 MB per document.  Was 4 MB, will keep increasing probably based on Moore's Law.  This is arbitrary just as a safety measure.
BSON is not compressed due to wanting to scan quickly.
Can do Queueing too, use the findAndModify method to find the highest priority job and mark as in-progress - see http://www.mongodb.org/display/DOCS/findAndModify+Command
The operation is fast, however the entire document is locked -- not ideal, but concurrency is getting better and better in MongoDB.
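
The queue pattern boils down to "find the best candidate and mark it in one step". A minimal sketch in plain JavaScript against an in-memory array; the field names `priority`, `state` and `worker` and the function `claimNextJob` are assumptions for illustration, not part of the findAndModify command.

```javascript
// In-memory stand-in for a jobs collection (illustration only).
const jobs = [
  { _id: 1, priority: 2, state: "queued" },
  { _id: 2, priority: 9, state: "queued" },
  { _id: 3, priority: 5, state: "in-progress" },
];

// Find the highest-priority queued job and mark it as taken in the same step.
// In the database, doing both in one command is what stops two workers
// from claiming the same job.
function claimNextJob(queue, worker) {
  let best = null;
  for (const job of queue) {
    if (job.state !== "queued") continue;
    if (best === null || job.priority > best.priority) best = job;
  }
  if (best === null) return null; // nothing left to do
  best.state = "in-progress";
  best.worker = worker;
  return best;
}

const first = claimNextJob(jobs, "w1");
const second = claimNextJob(jobs, "w2");
const third = claimNextJob(jobs, "w3");
console.log(first, second, third);
```

Here the single-threaded JavaScript runtime plays the role of the document lock: no other worker can sneak in between the find and the update.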
MongoDB tries to update in place; if you are adding to the document such that it no longer fits in its allocation unit, it has to be deleted and reinserted.  This is expensive, hence the allocation unit: an adaptive padding factor based on the collection.

Video: Henrik Ingo's "Buildbot & EC2: how to qa and package your db product"

And another video from OpenDBCamp is online....Today it is from Henrik Ingo - "Buildbot & EC2: how to qa and package your db product".

I cannot seem to find the slides Henrik used, but the video is now online at http://www.youtube.com/watch?v=07tsdSvR5C0.

By the way, there are plans to video *all* of the sessions at OSCon Data next week (MySQL and otherwise), which is what made me look in my "video to do" folder earlier this week and realize that I had not yet put up all the OpenDBCamp videos!

Video: High Performance Search with BlackRay Data Engine

I realized yesterday that I never did finish putting up the videos from this year's Open Database Camp back in May, so I'm working on finishing that in the next few weeks.

Today I put up High Performance Search with BlackRay Data Engine - Felix Schupp.  The slides are on the web at http://www.slideshare.net/fschupp/blackray-the-open-source-data-engine-2011683, and the video is on YouTube at http://www.youtube.com/watch?v=oCB3ZXfc8Rs
