I make the kernel crash a lot. To debug those crashes, I add a lot of printk()s, recompile the kernel, and make it crash again. I repeat this until I fix the crash. This is time-consuming, especially when the crash is bad enough that the system is unusable. For every cycle, I have to:
- Gather all of the information I need about the crash
- Reboot the system
- Boot to a good kernel
- Recompile, or copy the kernel over
- Reboot again
- Load the new kernel
On some hardware, each of these reboots can take several minutes. Virtually all (>90%) of that time is spent waiting for the firmware, before the bootloader and long before the kernel loads. If I could get this down to one reboot per cycle instead of two, it would drastically reduce the time each cycle takes. Ideally, I'd also like to be compiling in parallel with the system booting.
My solution to this is to use a separate build machine and iPXE (I used to use GRUB for this). In short, iPXE can boot your system from the network. I use it in place of the "Boot to a good kernel" and "Recompile, or copy the kernel over" steps.
Step 1: Put your kernel image on a web server
There are many ways to do this, but here's how I do it:
- Make sure ~/public_html exists and that the web server can serve content from it. If your web server uses a different prefix, adjust the paths accordingly.
- On my compile server, I take a copy of /sbin/installkernel and put it in ~/bin/installkernel.
- When I type "make install" in a kernel tree, it runs this script. My script places all of the kernel images in ~/public_html/.
- Ensure that you can fetch kernel images by navigating to this path on the web server in your browser.
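As a rough illustration of the script described above, here is a minimal sketch of what a ~/bin/installkernel replacement could look like. It is not my actual script. It assumes the usual x86 calling convention, where "make install" passes the kernel version and the image path as the first two arguments. The function name publish_kernel and the WEB_ROOT override are made up for this example.

```shell
#!/bin/sh
# Sketch of a ~/bin/installkernel replacement (illustrative only).
# Assumed calling convention from "make install" on x86:
#   installkernel <version> <image> <System.map> [<install-dir>]

# publish_kernel <version> <image>
# Copies the image into the web root under a versioned name and a fixed name.
publish_kernel() {
    version="$1"
    image="$2"
    # WEB_ROOT is a hypothetical override; the default is ~/public_html.
    dest="${WEB_ROOT:-$HOME/public_html}"

    mkdir -p "$dest" || return 1
    cp "$image" "$dest/vmlinuz-$version" || return 1
    # Fixed name, so the URL that the test machine boots from never changes.
    cp "$image" "$dest/vmlinuz" || return 1
    # The web server usually runs as another user, so make the files readable.
    chmod a+r "$dest/vmlinuz-$version" "$dest/vmlinuz"
}

# Only act when invoked with the arguments that "make install" passes.
if [ "$#" -ge 2 ]; then
    publish_kernel "$1" "$2"
fi
```

Keeping both a versioned copy and a fixed "vmlinuz" name means the newest build is always at the same URL, while older images stay around in case you need to go back to one.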
Step 2: Get, Compile, and Install iPXE
git clone http://git.ipxe.org/ipxe.git
cd ipxe/src
make -j4 bin/ipxe.lkrn
cp bin/ipxe.lkrn /boot
Step 3: Get iPXE the information it needs to boot
iPXE needs two bits of information to boot:
- an IP address
- A URL to boot from
Those can be assigned at compile time, passed in by a DHCP server, or passed in at boot-time. I prefer to let the DHCP server assign the IP address, but I pass in the URL at boot-time from GRUB by putting the following in menu.lst:
title iPXE uuid
kernel /boot/ipxe.lkrn && dhcp && chain http://220.127.116.11/~dave/ipxe.script
Remember to fill in your IP address and URL with appropriate values for your server.
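If you would rather take the compile-time route, iPXE's build accepts an EMBED= variable that names a script to bake into the image. The following is a minimal sketch of that approach, not something I use myself. The helper name write_embed_script, the file name boot.ipxe, and the build.example host are placeholders.

```shell
#!/bin/sh
# Sketch: bake the "dhcp" and "chain" steps into the iPXE image at compile time.

# write_embed_script <script-url> <output-file>
# Writes a small iPXE script that gets an address via DHCP and then
# chains to the script at <script-url>.
write_embed_script() {
    url="$1"
    out="$2"
    {
        printf '#!ipxe\n'
        printf 'dhcp\n'
        printf 'chain %s\n' "$url"
    } > "$out"
}

# Example (placeholder URL; substitute your own server), run from ipxe/src:
#   write_embed_script 'http://build.example/~dave/ipxe.script' boot.ipxe
#   make -j4 bin/ipxe.lkrn EMBED=boot.ipxe
```

With the commands embedded, the GRUB entry only needs to load /boot/ipxe.lkrn. The cost is that changing the URL means rebuilding iPXE, which is why I prefer passing it in from menu.lst.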
Step 4: Tell iPXE from where to fetch a kernel image and what arguments to pass
Note that the GRUB entry above references a URL on a web server, so we need to make sure that file exists. The kernel command-line in it should be copied from your menu.lst.
$ cat ~/public_html/ipxe.script
#!gpxe
kernel http://18.104.22.168/~dave/vmlinuz root=
ro debug profile=2 console=ttyS0,115200
Again, that will differ for your systems. You can either build the initrd each time you compile a kernel, or just ensure that the kernel image doesn't need any modules at all, and use an arbitrary initrd. Note that the whole kernel command-line is now in this file. That means that you can edit it on the web server with vim or emacs instead of having to do it on the GRUB command-line. I greatly prefer this to trying to use GRUB's console.
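Since the boot script is just a text file on the build machine, it can also be regenerated from the shell whenever the kernel arguments change. Here is a small sketch of that idea. The function name write_boot_script and the build.example URL are placeholders, and the script it writes mirrors the two-line layout shown above.

```shell
#!/bin/sh
# Sketch: regenerate the boot script served from ~/public_html.

# write_boot_script <kernel-url> <kernel-cmdline> <output-file>
# Writes the script to a temporary file first and then renames it, so a
# machine that is booting never fetches a half-written file.
write_boot_script() {
    kernel_url="$1"
    cmdline="$2"
    out="$3"
    tmp="$out.tmp.$$"
    {
        printf '#!gpxe\n'
        printf 'kernel %s %s\n' "$kernel_url" "$cmdline"
    } > "$tmp" && mv "$tmp" "$out"
}

# Example (placeholder URL and arguments; substitute your own):
#   write_boot_script 'http://build.example/~dave/vmlinuz' \
#       'ro debug console=ttyS0,115200' ~/public_html/ipxe.script
```

This makes it easy to keep a few command-line variants around and switch between them without touching the machine under test.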
iPXE also has support for variables. You can do fun things like fetch a file with the system's IP address in the filename: