Integrating Aegir with Linux and FTP

Due to the insane cost of bandwidth in Australia (compared to the rest of the developed world), I've recently decided to move some of our hosting clients to Linode. This means they can move more data more cheaply and I don't need to come up with (and administer) a bandwidth accounting system for my Australian-based web VM.

We pretty much exclusively use Drupal for hosting clients, so to make management a bit easier I decided to use Ægir on the new Linode. Installation was a relative breeze, after a quick google to find out how to specify that I didn't want to use Apache and wanted to use a separate server as a dedicated MySQL host.

The problem (there is always a problem) arose when I needed to give a hosting client access to their Drupal installation, so they could manage themes and site-specific modules. Just adding an account and providing SSH access was out of the question, as all sites are stored under a single system user. Anyone logging in with permissions to edit their own Drupal can then also edit all other sites and even the Ægir installation itself.

FTP would be a solution, as FTP accounts can be chrooted (locked into a specific directory) quite easily, but I didn't want to have to manage a list of FTP accounts separate from the Drupals in Ægir. That isn't the lazysysadmin way.

After a bit of thought I remembered that on an older web host, I had happily used libnss-mysql and libpam-mysql, which integrate accounts defined in a (any) MySQL database with the Linux system. The trick is to get MySQL to cough up the account information in the correct format, so the system can parse it as if these accounts were normal system users.
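
The queries themselves depend on your schema, but the shape of the trick looks like this: libnss-mysql is configured with SQL statements that must return the seven passwd(5) fields, in order. A minimal sketch, with a hypothetical ftp_users table standing in for wherever the account data actually lives:

-- A sketch of the lookup libnss-mysql might run for getpwnam().
-- The ftp_users table and its columns are hypothetical; adjust to your schema.
SELECT username,                          -- login name
       'x',                               -- password placeholder; PAM checks the real hash
       uid,                               -- numeric user id, in a range above your system users
       gid,                               -- numeric group id
       username,                          -- GECOS field
       CONCAT('/var/aegir/', username),   -- home directory, which doubles as the FTP chroot
       '/bin/false'                       -- no shell; these accounts are FTP-only
FROM ftp_users
WHERE username = 'alice';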

Migrating users and content profiles

For some time now I've been working on a Drupal site that consists mainly of scraped content from a proprietary, ASP-based CMS from the late nineties. The Simple HTML DOM Parser, used from within a drush script, has been invaluable. It made scraping the old site content and importing it as Drupal nodes a relative breeze. (No access to the database used by the CMS, boo!)

Part of setting up the new site is importing users and their content profile nodes from a different Drupal site that was set up a year or two ago to manage an event.

I had hoped there would be a way for me to export these users and their profile nodes from one Drupal to the other, but while I found modules to export one or the other, I might still end up with profile nodes that were no longer related to their users. Of course, that's pretty useless.

When I remembered I was also supposed to add all these users to a simplenews newsletter, the proverbial light bulb came on.

Redmine with MariaDB

I'm in the process of setting up Redmine (version 1.0-stable) on an Ubuntu 8.04 virtual machine. Getting a recent enough RubyGems and Rails is less fun than you might imagine, but the big issue I came across was a bug in the database model, which makes MySQL 5.1 (MariaDB 5.1 in my case) barf on installation.

There is a fix, but since I am running from a git clone, I didn't want to download and apply a diff file to that repository. A quick google found what I needed: the git cherry-pick command. It allows you to grab a single commit and apply its changes to your branch. In my case:

git cherry-pick a628b0f186cf4d182ce5cee1a497ad42c5246406

Because every commit has a unique identifier, when 1.0.1 is released and I update my git clone, I won't suffer merge conflicts from this already-applied change. Lovely :-)

fast paging in the real world

Some time ago I attended the "Optimisation by Design" course from Open Query. In it, Arjen teaches how writing better queries and schemas can make your database access much faster (and more reliable). One such way of optimising things is by adding appropriate query hints or flags. These hints are magic strings that control how a server executes a query or how it returns results.

An example of such a hint is SQL_CALC_FOUND_ROWS. You use it in a SELECT query with a LIMIT clause. It instructs the server to select a limited number of rows, but also to calculate the total number of rows that would have been returned without the LIMIT clause in place. That total is stored in a session variable, which can be retrieved via SELECT FOUND_ROWS(). That call simply reads the variable and clears it on the server; it doesn't have to look at any table or index data, so it's very fast.
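
For example (the articles table and its columns are made up for illustration):

-- Fetch one page of results and have the server tally the full count as it goes.
SELECT SQL_CALC_FOUND_ROWS id, title
FROM articles
WHERE published = 1
ORDER BY created DESC
LIMIT 20, 10;   -- rows 21-30, i.e. page 3 at 10 rows per page

-- Read back the total; this touches no table or index data.
SELECT FOUND_ROWS();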

This is useful when queries are used to generate pages of data where a user can click a specific page number or click previous/next page. In this case you need the total number of rows to determine how many pages you need to generate links for.

The traditional way is to first run a SELECT COUNT(*) query and then select the rows you want, with LIMIT. If you don't use a WHERE clause in your query, this can be pretty fast on MyISAM, as it keeps a magic variable that contains the number of rows in a table. On InnoDB, however, which is my storage engine of choice, there is no such variable, and consequently it's not pretty fast.
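
With the same made-up articles table, the traditional version needs two passes:

-- First count everything that matches...
SELECT COUNT(*) FROM articles WHERE published = 1;

-- ...then fetch just the page you want.
SELECT id, title
FROM articles
WHERE published = 1
ORDER BY created DESC
LIMIT 20, 10;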

Simplenews Mass Subscribe

Simplenews is a quick and easy way to set up and manage newsletters on Drupal, but if you add it to a site which already has users, you might want a way to quickly subscribe all your users to your newsletter.

Rather than click through all users and manually subscribe them - which can be a little bit boring if you have thousands of users - you can run a few SQL queries to mass-subscribe everyone in one go.
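
The queries boil down to a couple of INSERT ... SELECT statements. A sketch, assuming the Drupal 6-era Simplenews schema (the simplenews_subscriptions and simplenews_snid_tid tables) and a made-up newsletter term ID of 5; verify both against your own database before running anything:

-- Create a subscription row for every active user that doesn't have one yet.
-- Table names are from the Drupal 6-era Simplenews schema; check yours first.
INSERT INTO simplenews_subscriptions (mail, uid, activated)
SELECT u.mail, u.uid, 1
FROM users u
WHERE u.uid > 0
  AND u.status = 1
  AND NOT EXISTS (SELECT 1 FROM simplenews_subscriptions s WHERE s.uid = u.uid);

-- Attach each subscription to the newsletter; term ID 5 is hypothetical.
INSERT INTO simplenews_snid_tid (snid, tid)
SELECT s.snid, 5
FROM simplenews_subscriptions s
WHERE NOT EXISTS
  (SELECT 1 FROM simplenews_snid_tid t WHERE t.snid = s.snid AND t.tid = 5);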

MySQL > YourSQL

Since I started doing the occasional consulting job for Open Query, I've seen a lot of MySQL servers that have been installed once and then forgotten about. This gave me the idea to do a short presentation about some basic MySQL server configuration. The first go was at DrupalCampMelbourne and I recently tried (and failed) to cram it into a three minute lightning talk slot at the LUV September meeting.

The title of the talk is (now) MySQL > YourSQL. I chose this not because I think that MySQL is better than the $other_database you use or because I may or may not run a newer version of MySQL on better hardware, but because I use InnoDB and not MyISAM as the default table format. More importantly, I do not run the server with the shipped default configuration.

These configuration tips are also included in the MySQL section of Pro Linux System Administration.

Most distributions ship MySQL with a configuration file that sees you running a server optimised for a system with 32MB of RAM. That was great in 1996, but these days your Nike+ shoes have more capacity than that, so it makes sense to optimise the configuration - or at least to make it not suck quite as badly.

I chose InnoDB as the default storage engine not because it gives me a magic increase in performance or transaction support, but because of the improved reliability that comes from its ability to recover from crashes automatically, which it does far better than MyISAM.

All the following settings go in the [mysqld] section of your my.cnf file.
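
To give a flavour of what ends up there, a minimal sketch; the settings and values below are illustrative, so size them to your machine and workload:

[mysqld]
# Use InnoDB for new tables unless a table explicitly asks for something else.
default-storage-engine = InnoDB

# Cache for InnoDB data and indexes; the single most effective knob.
# On a dedicated database server, 50-80% of RAM is a common starting point.
innodb_buffer_pool_size = 512M

# Larger logs smooth out write bursts. Changing this on 5.x needs a clean
# shutdown and moving the old ib_logfile* files out of the way.
innodb_log_file_size = 64M

# One tablespace file per table, instead of a single ever-growing ibdata1.
innodb_file_per_table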

False assumptions with MySQL

I spent far too much time this past week trying to find out why a Drupal was not assigning authenticated user permissions to users who clicked on the validation link they were emailed. Instead, it would simply remove the temporary unverified user permission.

It's supposed to do the latter, but then also follow it up with the former. It turns out the problem was a Drupal bug that was triggered by my MySQL setup.

Drupal and transient MySQL errors

A while ago Arjen Lentz blogged about transient MySQL errors that can occur when using a transactional storage engine, like, say, InnoDB.

Error: No error occurred.

Since I'm a fan of the reliability and automated recovery that InnoDB provides, I use it for all the Drupals that I host. However, on a very busy site this may lead to deadlocks. These in turn lead to users seeing errors, which is something I'd like to avoid, especially if the error could be prevented.

replicants

I found myself with some spare time the other day and decided that my current MySQL backup strategy is not the best in the world. The MySQL server is a virtual machine in a Brisbane datacenter and it's backed up via a script that calls mysqldump on each installed database and dumps the content to (compressed) files. These files then get sucked down via rdiff-backup.

This is fine in principle, but does mean it's possible for me to lose 24 hours' worth of data due to an accidental '--; DROP TABLE students.