How to set up a KVM server the fast way

This is a very short guide on how to get a KVM server up and running. It assumes that

  • you want to run a KVM server with at least one virtual machine,
  • your KVM server gets an IP address in your network,
  • your virtual machine(s) get an IP address from your network, so you can use bridging instead of NAT (using NAT instead of bridging is an easy task but not part of this howto),
  • you can use LVM for disk space allocation on your KVM master (using other disk space allocation methods like image files is easy, too, but not part of this howto).

Get the server running

I assume you are able to install an Ubuntu server from scratch and set up an LVM environment. Actually this can be done by mostly accepting the defaults during the Ubuntu server setup. I'd suggest you install at least Ubuntu 10.04 (Lucid) or newer.
If you continue reading here, you should have a running, up-to-date Ubuntu server with network connectivity and preferably access via ssh.

Get the network up and running

For the bridged network you need to install the bridge utilities and change your network configuration. First install the package:
$ sudo apt-get install bridge-utils
Now add a bridge named "br0" (this only has to be done once):
$ sudo brctl addbr br0
Now change your /etc/network/interfaces so it uses the bridge br0. This step actually sets up br0 instead of eth0. Think of eth0 as being just the physical transport added to the virtual bridge interface.
# The loopback network interface
auto lo
iface lo inet loopback
 
auto eth0
iface eth0 inet manual
 
auto br0
iface br0 inet static
address 192.168.1.100
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
gateway 192.168.1.1
bridge_ports eth0
bridge_fd 9
bridge_hello 2
bridge_maxage 12
bridge_stp off

Please make sure you don't forget to set "eth0" to "iface eth0 inet manual" as shown above. This is needed because you want to prevent eth0 from fetching an address via DHCP while it still has to be there as the physical layer of your bridge. After you have set up the bridge, either restart your network (sudo /etc/init.d/networking restart) or reboot your server. If you are already accessing your server via ssh, be warned that a misconfiguration might lock you out.
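Once the network is back up, it is worth verifying the bridge before you go on. A quick check could look like this (the bridge id shown below is just an example):

$ brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.001122334455       no              eth0
# br0, not eth0, should now carry the IP address
$ ifconfig br0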

Install KVM

Now it's time to install KVM and some useful helper applications:
$ sudo apt-get install qemu-kvm ubuntu-vm-builder uml-utilities \
  virtinst
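
If you want to make sure everything is in place before continuing, here are two quick optional checks (the exact output varies):

# should print a number greater than 0 if your CPU supports
# hardware virtualization
$ egrep -c '(vmx|svm)' /proc/cpuinfo
# should print an (empty) instance table instead of an error
$ virsh -c qemu:///system list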

That's all: you already have a KVM server now. Time to…

Install your first virtual machine

We are going to set up a 100 GB logical volume for the guest, download an Ubuntu ISO and create a machine with 2 GB of RAM and 4 cores:

# create an empty 100 GB logical volume
$ sudo lvcreate --size 100G --name guest1 vg0
# download an Ubuntu iso
$ wget http://..../
# create the machine
$ sudo virt-install --connect qemu:///system -n guest1 -r 2048 \
 --vcpus=4 -f /dev/mapper/vg0-guest1 --network=bridge:br0 \
 --vnc --accelerate -v -c ./SOMEUBUNTUISO.iso \
 --os-type=linux --os-variant=ubuntuKarmic --noautoconsole
# please note: "ubuntuKarmic" is currently the most recent
# virt-install defaults scheme - just use this if in doubt.

Get a VNC connection

KVM uses VNC to give you a graphical interface to your machines. The good thing about this is that it enables you to use graphical installers (and yes, even Windows) without problems. As even the Ubuntu server boots into a graphical mode in the beginning, VNC is a great fit here.

I assume you are working on a remote server. KVM gives every guest it launches its own VNC instance with a new, incremented port, starting at 5900. So let's tunnel via ssh:

ssh user@remotekvmhost -L 5900:localhost:5900

You connect to your remote KVM host via ssh and open an SSH tunnel for port 5900. Now start your preferred VNC client locally and let it connect to either display "0" or port 5900, which means the same thing in VNC.
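
If you are unsure which display a given guest got, virsh can tell you. A typical session might look like this (guest and host names are examples):

# ask libvirt for the VNC display of the guest
$ virsh vncdisplay guest1
:0
# tunnel the matching port (5900 plus the display number) ...
$ ssh user@remotekvmhost -L 5900:localhost:5900
# ... and connect locally
$ vncviewer localhost:0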

From now on you should see your server on a VNC display. Install it like you'd install any other server. The networking is bridged, so you can even use DHCP if that is offered in your network.

Please make sure you install the package "acpid" inside your KVM guest, otherwise you won't be able to shut down the guest from the master (as that is done via an ACPI event):

# make sure "acpid" is installed in the *guest* machine
sudo apt-get install acpid

After the installation you can manage your KVM guests using the following commands:

# list running instances
$ virsh list
# start an instance
$ virsh start INSTANCENAME
# stop an instance politely
$ virsh shutdown INSTANCENAME
# immediately destroy a running instance
$ virsh destroy INSTANCENAME
# edit the config file for an instance
$ virsh edit INSTANCENAME
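
One more virsh call worth knowing, though not strictly needed for this howto, is the autostart flag, which brings a guest up automatically when the host boots:

# start the instance automatically on host boot
$ virsh autostart INSTANCENAME
# remove the autostart flag again
$ virsh autostart --disable INSTANCENAME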

Mounting the LVM volumes

As you might have noticed, your virtual guests' LVM volumes cannot be mounted directly on the master, as they contain their own partition tables. If you need access to a guest's filesystem from the master, though, you have to create some device nodes. There is a great tool called "kpartx" that can create and delete these device nodes for you. It's as easy as this:

# install kpartx
$ sudo apt-get install kpartx
# make sure the virtual guest is switched off!
# create device nodes
$ sudo kpartx -a /dev/mapper/vg0-guest1
# check /dev/mapper for new device nodes and mount/unmount them
# after you are done, delete the nodes
$ sudo kpartx -d /dev/mapper/vg0-guest1

Please note that this method also works with other block devices like image files containing partition tables. You might only run into trouble when your LVM volume contains its own LVM setup. If that is the case, play around with pvscan, vgscan and lvscan after using kpartx. Be brave, but be warned that backing up data first is always a great idea.
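
For that nested LVM case, a session could look roughly like this; the guest's volume group name ("guestvg") and volume name ("root") are assumptions:

# create partition device nodes for the guest volume
$ sudo kpartx -a /dev/mapper/vg0-guest1
# scan for physical volumes, volume groups and logical volumes
$ sudo pvscan && sudo vgscan && sudo lvscan
# activate the guest's volume group and mount one of its volumes
$ sudo vgchange -ay guestvg
$ sudo mount /dev/guestvg/root /mnt
# when done: unmount, deactivate and remove the nodes again
$ sudo umount /mnt
$ sudo vgchange -an guestvg
$ sudo kpartx -d /dev/mapper/vg0-guest1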

Alternative Management Interfaces

In case you really need a GUI for your management needs, check out "virt-manager". You can install it on your desktop and remotely manage running instances:

$ sudo apt-get install virt-manager

You should check RedHat's "Virtual Machine Manager" page, though. It might be a good idea to manually compile and install a more recent version and rely on the setup howtos there. Personally I prefer the plain text console here, as it helps me act quickly and from everywhere when problems occur.
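
virt-manager can also talk to the libvirt daemon on the server through an SSH tunnel, so you don't have to run it there. The host name below is, again, just an example:

# connect the local virt-manager to the remote KVM host via ssh
$ virt-manager -c qemu+ssh://user@remotekvmhost/system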

Conclusion

Nowadays it's fairly easy to set up a KVM server. As KVM/libvirt-enabled guests are quite fast, it's a nice and easy way of hosting virtual machines. I have been running about a dozen virtual machines on three hardware servers for two years now without any serious problems.

Recovering Linux file permissions

I recently ran into a server where somebody had accidentally issued a "chown -R www-data:www-data /var". So all files and directories within /var were chowned to www-data, which effectively means a complete system fuckup, as everything from logging over mail and caching to databases relies on a correct setup there. Sadly this was a remote production server, so I had to find a quick solution to get at least a state good enough for the next days.

I started poking around for a possibility to reset file permissions based on .deb package details. There are at least approaches to this (the method described there misses a pre-download of all installed .deb packages), and I remember running a program years ago that checked file permissions based on .deb files, though I did not find it again. Nonetheless this approach lacks the ability to handle files created by applications at runtime. Files in /var/log, for instance, don't have to be declared in a .deb file but urgently need the right permissions.

So I came to a different approach: cloning permissions. By chance we had a quite similar server running, meaning the same Linux distribution and nearly the same services installed. I wrote a one-liner to save the file permissions on the healthy server:

$ find /var -printf "%p;%u;%g;%m\n" > permissions.txt

The command writes a text file with the following format:

dir/filename;user;group;mode

Please note: I started using ":" as a separator but noticed that at least some Perl related files have a double colon in their name.
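
To give you an idea, here are a few lines of a resulting permissions.txt; the paths and modes are made up for illustration:

/var/log;root;root;755
/var/log/syslog;syslog;adm;640
/var/lib/mysql;mysql;mysql;700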

Now I only needed a simple shell script that sets the file permissions on the broken server based on the text file we just generated. It came down to this:

#!/bin/bash

# read permissions.txt line by line; each line looks like
# dir/filename;user;group;mode
while IFS=";" read -r FILE USER GROUP MODE
do
	chown "${USER}:${GROUP}" "${FILE}"
	chmod "${MODE}" "${FILE}"
done < permissions.txt

The script reads every line of the text file, splits its content into variables and sets the user and group via "chown" as well as the mode via "chmod". It doesn't check whether a directory or file exists before chowning/chmodding it, as that actually doesn't matter: if it's not there, the commands simply fail without doing anything harmful.

After you've run this, it's a good idea to restart all services and start watching the log files. You have to take care of all services that rely on fast changing files in /var. For instance, a mail daemon puts a lot of unique file names into /var/spool, and the script above won't be able to take care of that. You have to double check database directories like /var/lib/mysql, hosted repositories and so on. But the script will provide you with a state where most services are at least running, and you get an idea of how to fix the remaining directories. It might be helpful to search for suspicious files, for example:

$ find /var -user www-data
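
Depending on your setup, variations of that search help spotting leftovers:

# files still owned by the wrong group
$ find /var -group www-data
# files changed within the last hour that might need a second look
$ find /var -mmin -60 -ls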

RubyGems 9.9.9 packaged – Fake install RubyGems on Debian/Ubuntu

For a lot of reasons I often rely on a mixture of a Debian/Ubuntu pre-packaged Ruby with a self-compiled RubyGems. It helps in situations where you don't care that much about the Ruby interpreter itself but need an up-to-date RubyGems. While this is easy to install, you might run into trouble when installing packages that depend on Ruby and RubyGems, namely packages like "rubygems", "rubygems1.8" and "rubygems1.9".

After unsuccessfully playing around with dpkg for a while (you can put packages on "hold", which prevents them from being installed automatically), I came to the conclusion that the best way is to install a fake package that is empty but satisfies the dependencies.

So, here it is: the shiny new RubyGems 9.9.9, which provides rubygems, rubygems1.8 and rubygems1.9 right away. Just install it (e.g. with dpkg) and you'll be able to install packages that rely on a rubygems package.

In case you want to play around with the package and customize it to your needs, e.g. only provide rubygems1.8 or rubygems1.9, take the following steps:

1. Install equivs

$ sudo apt-get install equivs

2. create a control file

$ equivs-control rubygems

3. edit the control file

$ vim rubygems

You can compare the default settings in the control file with the output of e.g. "apt-cache show rubygems". The crucial field is "Provides:", where you can put a comma separated list of packages you want to fake install. Choose a high version for the "Version:" field, as this marks the package as newer than the distribution's own package and prevents the package manager from replacing it.

Section: universe/interpreters
Priority: optional
Homepage: http://www.screenage.de/blog/
Standards-Version: 3.6.2
 
Package: rubygems
Version: 9.9.9
Maintainer: Caspar Clemens Mierau <[email protected]>
Provides: rubygems1.8,rubygems1.9,rubygems
Architecture: all
Description: Fake RubyGems replacement
 This is a fake meta package satisfying rubygems dependencies.
 .
 This package can be used when you installed a packaged ruby but want
 to use rubygems from source and still rely on software that depends
 on ruby and rubygems.

4. build the package

$ equivs-build rubygems
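
equivs-build drops the resulting .deb into the current directory. The exact filename depends on your control file, so the one below is an assumption:

# install the freshly built fake package
$ sudo dpkg -i rubygems_9.9.9_all.deb
# verify that apt now considers the dependency satisfied
$ apt-cache policy rubygems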

P.S.: You can also use equivs to easily build meta packages containing a list of packages you want to install in one go, e.g. for semi-automated server bootstrapping.
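
A minimal control file for such a meta package could look like this; the package name and the package list are, of course, just an example:

Package: server-base
Version: 1.0
Maintainer: Your Name <you@example.org>
Depends: vim, screen, htop, openssh-server
Architecture: all
Description: Meta package installing our standard server tools
 Installing this package pulls in the listed tools in one go.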

Bootstrapping a Puppet agent/master on Ubuntu

Though it's really great that Puppet made it into Ubuntu's main repository, the provided version is rather outdated, which prevents you from using advanced language features when writing your manifests. So sooner or later you end up installing Puppet manually. In order to speed up the installation I stripped it down to the following:

install agent:

$ bash < <(wget -qO - https://bit.ly/install-puppet-agent)

install master:

$ bash < <(wget -qO - https://bit.ly/install-puppet-master)

The call fetches the most recent version of the install script from GitHub, installs Ubuntu's Ruby (which is good enough for running Puppet), fetches an upstream version of gem itself, updates it to the most recent version and finally installs the Puppet gem.
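
If you prefer doing it by hand, the rough equivalent of what the script does looks like this (a sketch, not the literal script content):

# Ubuntu's ruby is good enough for running Puppet
$ sudo apt-get install ruby rubygems
# bring RubyGems itself up to date
$ sudo gem install rubygems-update
$ sudo update_rubygems
# finally install Puppet as a gem
$ sudo gem install puppet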

You can, of course, also download, review and run the scripts manually. Just have a look at https://github.com/moviepilot/puppet/tree/master/tools

Slides from the "From MySQL to MariaDB" presentation

As announced, I held a short talk on switching from the MySQL community edition (especially 5.1) to MariaDB (currently 5.2.6) at this year's LinuxTag in Berlin.

Here are the (German) slides for reference:


Please note: there are a lot of good English slides around. If you want to give a talk on MariaDB, the "Beginner's Guide" might be a good start:

A Beginner’s Guide to MariaDB Presentation

Using backuppc as a dirty distributed shell

BackupPC is a neat server-based backup solution. In Linux environments it is often used in combination with rsync over ssh and, let's be honest, often with fairly lazy sudo or root rights for the rsync over ssh connection. This has a lot of disadvantages, but at least you can use such a setup as a cheap distributed shell, as a well-maintained BackupPC server might have access to a lot of your servers.

I wrote a small wrapper that reads the BackupPC configuration (especially as packaged for Debian/Ubuntu) and iterates through the hosts, allowing you to issue commands over every valid connection. So far I have used it for listing used ssh keys, checking OS patch levels and even for small system manipulations.

#!/bin/bash
SSH_KEY="-i /var/lib/backuppc/.ssh/id_rsa"
SSH_LOGINS=( $(grep "root" /etc/backuppc/hosts | \
 awk '{print "root@"$1" "}' | \
 sed ':a;N;$!ba;s/\n//g') )

for SSH_LOGIN in "${SSH_LOGINS[@]}"
do
 HOST=$(echo "${SSH_LOGIN}" | awk -F"@" '{print $2}')
 echo "--------------------------------------------"
 echo "checking host: ${HOST}"
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} ${SSH_LOGIN} "$1"
done

You can easily change this to your needs (e.g. changing login user, adding sudo and so on).

$ ./exec_remote_command.sh "date"
--------------------------------------------
checking host: a.b.com
Mo 9. Mai 15:40:26 CEST 2011
--------------------------------------------
checking host: b.b.com
[...]

Make sure to quote your command, especially when it contains options or pipes, so the script can handle the whole command line as one argument.
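
For example, options and pipes survive the trip only when quoted:

# the whole pipeline is passed to the script as one argument
$ ./exec_remote_command.sh "dpkg -l | grep ssh"
# options belong inside the quotes as well
$ ./exec_remote_command.sh "df -h /var"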

A younger sister of that script is the following ssh key checker, which lists and sorts the ssh keys used on the systems by their key comment (feel free to extend it to include the key itself):

#!/bin/bash

SSH_KEY="-i /var/lib/backuppc/.ssh/id_rsa"
SSH_LOGINS=( $(grep "root" /etc/backuppc/hosts | \
 awk '{print "root@"$1" "}' | \
 sed ':a;N;$!ba;s/\n//g') )

for SSH_LOGIN in "${SSH_LOGINS[@]}"
do
 HOST=$(echo "${SSH_LOGIN}" | awk -F"@" '{print $2}')
 echo "--------------------------------------------"
 echo "checking host: ${HOST}"
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} ${SSH_LOGIN} \
 "cut -d: -f6 /etc/passwd | xargs -i{} egrep -s \
 '^ssh-' {}/.ssh/authorized_keys {}/.ssh/authorized_keys2" | \
 cut -f 3- -d " " | sort
 ssh -C -qq -o "NumberOfPasswordPrompts=0" \
 -o "PasswordAuthentication=no" ${SSH_KEY} ${SSH_LOGIN} \
 "egrep -s '^ssh-' /etc/skel/.ssh/authorized_keys \
 /etc/skel/.ssh/authorized_keys2" | cut -f 3- -d " " | sort
done

A sample output of the script:

$ ./check_keys.sh 2>/dev/null
--------------------------------------------
checking host: a.b.com
[email protected] 
backuppc@localhost
some random key comment
--------------------------------------------
checking host: b.b.com
[...]

That’s all for now. Don’t blame me for doing it this way – I am only the messenger :)

Ubuntu (Berlin) Global Jam at c-base and Daniel Holbach’s notebook

Members of "Ubuntu Berlin" met yesterday at c-base for the Ubuntu Global Jam. While it was nice to see new and international faces showing up and to introduce newcomers to advanced Launchpad usage, my main attraction of the day was Daniel Holbach's notebook. He asserted it runs Maverick and starts up within five seconds, which made me laugh at first, as my netbook's startup time tripled from Lucid to Maverick to roughly 45 seconds (which I assume will change back before the release).

Ubuntu Berlin at Ubuntu Global Jam (c-base) - August 2010

Ubuntu (Berlin) Global Jam at c-base

To make it short: between bug triaging and patching, Daniel showed the startup procedure two or three times on his X61s (with a solid state disk, one has to add), and as promised it started up in five seconds after Grub. Actually this is nothing more than a fast booting notebook, but it shows the results of the focussed efforts of the last one and a half years. Remember the initial "10s" posting and the bunch of changes it took.

So I am happy looking forward to improvements for Maverick on my netbook. And yes: I am happy with 10 seconds, too.

[update]

Daniel noted that it's an X61s, not a T61. Changed.

Ubuntu Berlin @ LinuxTag 2010 – pickings

On Saturday evening this year's LinuxTag, Europe's largest open source fair, closed its doors. As LinuxTag takes place in Berlin, "Ubuntu Berlin" was happy to support it with different activities. Let me sum up some of them:

1) Ubuntu Community booth

The "ubuntu Deutschland e.V." and the Ubuntu and Kubuntu communities presented their work at a community hosted booth. A bunch of Ubuntu Berlin members supported the booth, answered hundreds of questions and helped with the proceedings.

2) Talks

Saturday featured a lot of Ubuntu focussed talks. For the first time "Ubuntu Berlin" and its members hosted a remarkable number of them, e.g.:

  • Anselm Helbig – Ubuntu Berlin Lightning Talks (tmux)
  • Benjamin Drung – Ubuntu in 50 minutes, Packaging for Debian and Ubuntu
  • Caspar Clemens Mierau – Ubuntu Berlin Lightning Talks (Vimperator), Gnome-Do, Ubuntu in 50 minutes
  • Daniel Holbach – Fixing Bugs in Ubuntu (see slides), Ubuntu in 50 minutes, Ubuntu Berlin Lightning Talks (liblaunchpad)
  • Marcel Eichner – Ubuntu Berlin Lightning Talks (Franklin)
  • Torsten Franz – Supporting Ubuntu

Ubuntu in 50 minutes talk at LinuxTag 2010
minutes before the „Ubuntu in 50 minutes“ talk

3) Interviews for Radio Tux

Several members of Ubuntu Berlin gave interviews for the well known German podcast "Radio Tux". Daniel's talk on bug jamming has already been released. Be sure to check the archive for further releases within the next days.

4) LinuxTag BBQ

Last but not least: the "End of LinuxTag BBQ" at c-base, sponsored by Canonical, was a great success again. Three barbecues and about ten "Grillmeister" (you cannot translate this) provided more than a hundred visitors with tasty food. Members of various open source communities had interesting chats while relaxing at the banks of the Spree.

As Ubuntu Berlin's support for LinuxTag has continuously increased over the last years, I am sure next year will be even better. We'll see.

My Mom Runs Ubuntu – Update for Ada Lovelace Day

Just a couple of days ago I wrote about the upcoming "Ada Lovelace Day", celebrating women in technology on the 24th of March, which happens to be today. The day calls for blog posts about this topic, so here we go with an Ubuntu flavored version.

At Ubucon 2008 in Göttingen, a German Ubuntu user conference, I noticed that a lot of people hanging around discussed how nicely their parents, and especially their moms, are using Ubuntu. No hassle, no further explanations needed: it just worked. And often they don't even know or notice that they are using Ubuntu, as they just use standard software like Firefox and Thunderbird. As this user group doesn't belong to the tweeting, facebooking, social web society, I decided to found a Launchpad group named "My Mom Runs Ubuntu" that has no further meaning than this: joining it means your mother runs Ubuntu. A simple way of giving this "silent" user group at least a number and a marker on a map.

I was surprised how fast the member list grew and happy to see that its members came, and still come, from all over the world, as you can see below:

"My Mom Runs Ubuntu" global map (map by Google)

"My Mom Runs Ubuntu" global map (map by Google)

So this is not about a single heroine in technology, it is about a general movement: I am convinced that especially Ubuntu, with its focus on an intuitive interface, keeps the entry level very low and therefore attracts user groups that might be a surprise for a lot of people. I know dozens of techie people stating that free operating systems are way too complicated for ordinary users. When told about "My Mom Runs Ubuntu" they run out of reasons. There is nothing more convincing about free software than people just using it on a daily basis without the need to tell everybody, as they simply take it as normal. I am sure this user group will continue to grow.

So, if you already have a Launchpad account and your mom runs Ubuntu, too, give her a voice by simply joining the group.

And if you think this post misses real techie heroines, check the "Ubuntu Women" project, featuring some of the most active members of the Ubuntu community.

Ada Lovelace Day Logo

A quick note on MySQL troubleshooting and MySQL replication

PLEASE NOTE: I am currently reviewing and extending this document.

While caring for a remarkable number of MySQL server instances, troubleshooting becomes a common task. Some of the following hints might be of interest to you.

Recovering a crashed MySQL server

After a server crash (meaning the system itself or just the MySQL daemon), corrupted table files are quite common. You'll see this when checking /var/log/syslog, as the MySQL daemon checks tables during its startup.

Apr 17 13:54:44 live1 mysqld[2613]: 090417 13:54:44 [ERROR]
  /usr/sbin/mysqld: Table './database1/table1' is marked as
  crashed and should be repaired

The MySQL daemon just told you that it found a broken MyISAM table. Now it's up to you to fix it. You might already know that there is the "REPAIR" statement, so a lot of people open phpMyAdmin, select database and table(s) and run REPAIR statements from there. The problem with this is that in most cases your system is already back in production: the website is up and the MySQL server is already serving a bunch of requests, so a REPAIR run gets slowed down dramatically. Consider taking your website down for the REPAIR. It will be faster, and it's definitely smarter not to deliver web pages based on corrupted tables.

The other disadvantage of the above method is that you probably just shut down your web server, so phpMyAdmin is down as well, or you have dozens of databases and tables, making it a hard task to cycle through them. The command line is the better choice in this case.

If you only have a small number of corrupted tables, you can use the "mysql" client utility and run something like:

$ mysql -u root -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.0.75-0ubuntu10 (Ubuntu)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> REPAIR TABLE database1.table1;
+--------------------+--------+----------+----------+
| Table              | Op     | Msg_type | Msg_text |
+--------------------+--------+----------+----------+
| database1.table1   | repair | status   | OK       |
+--------------------+--------+----------+----------+
1 row in set (2.10 sec)

This works, but there is a better way: first, using OPTIMIZE in combination with REPAIR is suggested, and second, there is a command line tool made for exactly these jobs. Consider this call:

$ mysqlcheck -u root -p --auto-repair --check --optimize database1
Enter password:
database1.table1      OK
database1.table2      Table is already up to date

As you see, MySQL just checked the whole database and tried to repair and optimize it.

The great thing about "mysqlcheck" is that it can also be run against all databases in one go without the need of getting a list of them in advance:

$ mysqlcheck -u root -p --auto-repair --check --optimize \
  --all-databases

Of course you need to consider that optimizing all your databases and tables might just take too long if you have huge tables. On the other hand, a complete run saves you from worrying about a possibly missed table.

[update]

nobse pointed out in the comments that it's worth having a look at the automatic MyISAM repair options in MySQL. So have a look at them if you want to automate recovery:

option_mysqld_myisam-recover
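
For reference, the corresponding my.cnf snippet could look like this; check the documentation of your MySQL version before copying it:

[mysqld]
# check MyISAM tables automatically when they are opened and
# repair them if needed, keeping a backup of changed data files
myisam-recover = BACKUP,FORCE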

Recovering a broken replication

MySQL replication is an easy method of load balancing database queries to multiple servers or just continuously backing up data. Though it is not hard to set up, troubleshooting it might be a hard task. A common reason for a broken replication is a server crash; the replication partner notices that there are broken queries. Or even worse: the MySQL slave just assumes there is an error though there is none. I just ran into the latter when a developer executed a "DROP VIEW" on a non-existing view on the master. The master simply returns an error and ignores the statement. But as this query got replicated to the slave, the slave thinks it cannot apply a query and immediately stops replication. This is just one example of a possible error (and a hint to use "IF EXISTS" as often as possible).

Actually all you want to do now is tell the slave to ignore just one query. All you need to do is stop the slave, tell it to skip one query and start the slave again:

$ mysql -u root -p
mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> START SLAVE;

That's all there is to it.

Recreating databases and tables the right way

In the next topic you'll recreate databases. A common mistake when dropping and recreating tables and databases is forgetting about all the settings they had, especially charsets, which can run you into trouble later on ("Why do all these umlauts show up scrambled?"). The best way of recreating tables and databases, or creating them on other systems, is therefore the "SHOW CREATE" statement. You can use "SHOW CREATE DATABASE database1" or "SHOW CREATE TABLE database1.table1" to get a CREATE statement with all current settings applied.

mysql> show create database database1;
+-----------+--------------------------------------------------------------------+
| Database  | Create Database                                                    |
+-----------+--------------------------------------------------------------------+
| database1 | CREATE DATABASE `database1` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+-----------+--------------------------------------------------------------------+
1 row in set (0.00 sec)

The important part in this case is the "comment" after the actual CREATE statement. It is executed only on compatible MySQL server versions and makes sure you are running utf8 on the database.
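
The same works on table level. The table definition below is, of course, just an example:

mysql> SHOW CREATE TABLE database1.table1\G
*************************** 1. row ***************************
       Table: table1
Create Table: CREATE TABLE `table1` (
  `id` int(11) NOT NULL auto_increment,
  PRIMARY KEY  (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)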

Keep this in mind and it might save you a lot of trouble.

Fixing replication when master binlog is broken

When your MySQL master crashes, there is a slight chance that your master binlog gets corrupted. This means that the slaves won't receive updates anymore, stating:

[ERROR] Slave: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 0

You might be lucky if only the slave's relay log is corrupted, as you can fix this with the steps mentioned above. But a corrupted binlog on the master might not be fixable, even though the databases themselves can be repaired. Depending on your time you can try the "SQL_SLAVE_SKIP_COUNTER" approach from above, but often the only way is to set up the replication from scratch.

Setting up replication from scratch

There are circumstances forcing you to start replication from scratch. For instance, you have a server going live for the first time, and all those test imports don't need to be replicated to the slave anymore, as this might last hours. My quick notes for this (consider backing up your master database before!):

slave: STOP SLAVE;
slave: RESET SLAVE;
slave: SHOW CREATE DATABASE datenbank;
slave: DROP DATABASE datenbank;
slave: CREATE DATABASE datenbank;

master: SHOW CREATE DATABASE datenbank;
master: DROP DATABASE datenbank;
master: CREATE DATABASE datenbank;
master: RESET MASTER;

slave: CHANGE MASTER TO MASTER_USER="slave-user", \
MASTER_PASSWORD="slave-password", MASTER_HOST="master.host";
slave: START SLAVE;

You have just restarted replication from scratch. Check "SHOW SLAVE STATUS" on the slave and "SHOW MASTER STATUS" on the master.
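
On the slave, the two fields you want to see are these (output heavily shortened):

mysql> SHOW SLAVE STATUS\G
[...]
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]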

Deleting unneeded binlog files

Replication needs binlog files, a MySQL file format for storing database changes in a binary format. Sometimes it is hard to decide how many of the binlog files you want to keep on the server, possibly getting you into disk space trouble. Therefore, deleting binlog files that have already been transferred to the slave might be a smart idea when running low on space.

First you need to know which binlog files the slave has already fetched. You can find out by having a look at "SHOW SLAVE STATUS;" on the slave. Then log into the MySQL master and run something like:

mysql> PURGE BINARY LOGS TO 'mysql-bin.010';

You can even do this based on a date:

mysql> PURGE BINARY LOGS BEFORE '2008-04-02 22:46:26';
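
If you would rather not purge by hand at all, MySQL can expire old binlogs on its own. Add something like this to your my.cnf (the number of days is up to you):

[mysqld]
# automatically remove binlog files older than seven days
expire_logs_days = 7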

Conclusion

The above hints might save you some time when recovering or troubleshooting a MySQL server. Please note that these are hints, and you have to make sure, at any time, that your data has an up-to-date backup. Nothing will help you more.