Btrfs RAID Setup

May 02 2014, published under Linux

We got a new server to set up for building binary packages for DataMill. It already had a RAID array configured, and Linux automatically takes control of the member disks at boot. If you get errors such as "unable to open /dev/sdb1: Device or resource busy" or "error checking /dev/sdc1 status: No such file or directory", the first thing to do is to run fdisk and delete all partitions. Then reboot your live CD (for example, the System Rescue CD) with the boot parameters nodmraid nomdadm. After the reboot, I stopped the md array that was still active and continued on with partitioning and formatting.
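
Stopping arrays by hand gets tedious when several are assembled automatically. As a minimal sketch, the helper below lists the arrays that /proc/mdstat reports as active. The function name active_md_arrays is illustrative (it is not part of mdadm), and the parsing assumes the "mdN : active ..." line format shown in the session that follows.

```shell
# Sketch: list active md arrays from mdstat-formatted text.
# active_md_arrays is a hypothetical helper name, not an mdadm command.
active_md_arrays() {
    mdstat="${1:-/proc/mdstat}"
    # Array lines look like "md3 : active raid1 sda[0] sdb[1]".
    awk '$2 == ":" && $3 == "active" { print "/dev/" $1 }' "$mdstat"
}

# Usage (as root; this stops every active array):
#   for md in $(active_md_arrays); do mdadm --stop "$md"; done
```

The helper only reads text, so it is safe to try on a saved copy of /proc/mdstat before running the loop for real.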

root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sda[0] sdb[1]
955692672 blocks [2/2] [UU]

unused devices: <none>
root@sysresccd /root % mdadm --stop /dev/md3
mdadm: stopped /dev/md3
root@sysresccd /root % fdisk -l
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x6ac24fb3
Device Boot Start End Blocks Id System
Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x4de339dc
Device Boot Start End Blocks Id System
Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x52af89fe
Device Boot Start End Blocks Id System
Disk /dev/sdd: 15.5 GB, 15504900096 bytes, 30283008 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0001d5e6
Device Boot Start End Blocks Id System
/dev/sdd1 * 1 30283007 15141503+ c W95 FAT32 (LBA)
root@sysresccd /root % fdisk /dev/sda
The device presents a logical sector size that is smaller than
the physical sector size. Aligning to a physical sector (or optimal
I/O) size boundary is recommended, or performance may be impacted.
Welcome to fdisk (util-linux 2.22.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): n
Partition type:
p primary (0 primary, 0 extended, 4 free)
e extended
Select (default p):
Using default response p
Partition number (1-4, default 1):
Using default value 1
First sector (2048-1953525167, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-1953525167, default 1953525167): +500M
Partition 1 of type Linux and of size 500 MiB is set
Command (m for help): n
Partition type:
p primary (1 primary, 0 extended, 3 free)
e extended
Select (default p):
Using default response p
Partition number (1-4, default 2):
Using default value 2
First sector (1026048-1953525167, default 1026048):
Using default value 1026048
Last sector, +sectors or +size{K,M,G} (1026048-1953525167, default 1953525167): +2G
Partition 2 of type Linux and of size 2 GiB is set
Command (m for help): n
Partition type:
p primary (2 primary, 0 extended, 2 free)
e extended
Select (default p):
Using default response p
Partition number (1-4, default 3):
Using default value 3
First sector (5220352-1953525167, default 5220352):
Using default value 5220352
Last sector, +sectors or +size{K,M,G} (5220352-1953525167, default 1953525167):
Using default value 1953525167
Partition 3 of type Linux and of size 929 GiB is set
Command (m for help): p
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes, 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x6ac24fb3
Device Boot Start End Blocks Id System
/dev/sda1 2048 1026047 512000 83 Linux
/dev/sda2 1026048 5220351 2097152 83 Linux
/dev/sda3 5220352 1953525167 974152408 83 Linux
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 82
Changed system type of partition 2 to 82 (Linux swap / Solaris)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
root@sysresccd /root % sfdisk -d /dev/sda > part_table
root@sysresccd /root % sfdisk /dev/sdb < part_table
Checking that no-one is using this disk right now ...
OK
Disk /dev/sdb: 121601 cylinders, 255 heads, 63 sectors/track
Old situation:
Units: cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/sdb1 0 - 0 0 0 Empty
/dev/sdb2 0 - 0 0 0 Empty
/dev/sdb3 0 - 0 0 0 Empty
/dev/sdb4 0 - 0 0 0 Empty
New situation:
Units: sectors of 512 bytes, counting from 0
Device Boot Start End #sectors Id System
/dev/sdb1 2048 1026047 1024000 83 Linux
/dev/sdb2 1026048 5220351 4194304 82 Linux swap / Solaris
/dev/sdb3 5220352 1953525167 1948304816 83 Linux
/dev/sdb4 0 - 0 0 Empty
Warning: partition 1 does not end at a cylinder boundary
Warning: partition 2 does not start at a cylinder boundary
Warning: partition 2 does not end at a cylinder boundary
Warning: partition 3 does not start at a cylinder boundary
Warning: partition 3 does not end at a cylinder boundary
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table
Re-reading the partition table ...
If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
root@sysresccd /root % sfdisk /dev/sdc < part_table
Checking that no-one is using this disk right now ...
OK
Disk /dev/sdc: 121601 cylinders, 255 heads, 63 sectors/track
Old situation:
Units: cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/sdc1 0 - 0 0 0 Empty
/dev/sdc2 0 - 0 0 0 Empty
/dev/sdc3 0 - 0 0 0 Empty
/dev/sdc4 0 - 0 0 0 Empty
New situation:
Units: sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/sdc1 2048 1026047 1024000 83 Linux
/dev/sdc2 1026048 5220351 4194304 82 Linux swap / Solaris
/dev/sdc3 5220352 1953525167 1948304816 83 Linux
/dev/sdc4 0 - 0 0 Empty
Warning: partition 1 does not end at a cylinder boundary
Warning: partition 2 does not start at a cylinder boundary
Warning: partition 2 does not end at a cylinder boundary
Warning: partition 3 does not start at a cylinder boundary
Warning: partition 3 does not end at a cylinder boundary
Warning: no primary partition is marked bootable (active)
This does not matter for LILO, but the DOS MBR will not boot this disk.
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
root@sysresccd /root % mkfs.btrfs -d raid5 /dev/sda1 /dev/sdb1 /dev/sdc1
/dev/sda1 appears to contain an existing filesystem (btrfs).
Error: Use the -f option to force overwrite.
root@sysresccd /root % mkfs.btrfs -f -d raid5 /dev/sda1 /dev/sdb1 /dev/sdc1
SMALL VOLUME: forcing mixed metadata/data groups
ERROR: With mixed block groups data and metadata profiles must be the same
root@sysresccd /root % mkfs.btrfs -f -d raid5 /dev/sda3 /dev/sdb3 /dev/sdc3
Error: unable to open /dev/sda3: Device or resource busy
root@sysresccd /root % cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md3 : active raid1 sdb3[1] sda3[0]
955692672 blocks [2/2] [UU]

unused devices: <none>
root@sysresccd /root % mdadm --stop /dev/md3
mdadm: stopped /dev/md3
root@sysresccd /root % mkfs.btrfs -f -d raid5 /dev/sda3 /dev/sdb3 /dev/sdc3

WARNING! - Btrfs v3.12 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Turning ON incompat feature 'raid56': raid56 extended format
adding device /dev/sdb3 id 2
adding device /dev/sdc3 id 3
fs created label (null) on /dev/sda3
nodesize 16384 leafsize 16384 sectorsize 4096 size 2.72TiB
Btrfs v3.12
root@sysresccd /root % mkfs.btrfs -f -O ^extref -d raid5 /dev/sda3 /dev/sdb3 /dev/sdc3

WARNING! - Btrfs v3.12 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

Turning ON incompat feature 'raid56': raid56 extended format
adding device /dev/sdb3 id 2
adding device /dev/sdc3 id 3
fs created label (null) on /dev/sda3
nodesize 16384 leafsize 16384 sectorsize 4096 size 2.72TiB
Btrfs v3.12
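
The fdisk warning in the session above concerns disks whose physical sectors (4096 bytes) are larger than their logical sectors (512 bytes). As a rough sketch, the start sectors chosen above can be checked for alignment with shell arithmetic. The helper name is_aligned is illustrative, and the default sector sizes are the ones fdisk reported for /dev/sda.

```shell
# Sketch: check that a partition's start sector falls on a physical-sector boundary.
# Defaults assume 512-byte logical and 4096-byte physical sectors (as on /dev/sda above).
is_aligned() {
    start_sector="$1"
    logical="${2:-512}"
    physical="${3:-4096}"
    [ $(( start_sector * logical % physical )) -eq 0 ]
}

# Start sectors of the three partitions created in the session.
for start in 2048 1026048 5220352; do
    if is_aligned "$start"; then
        echo "$start aligned"
    else
        echo "$start NOT aligned"
    fi
done
```

All three start sectors are multiples of 8, so each partition begins on a 4096-byte boundary.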

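When creating the btrfs volume for root, I turned off the extended hard link feature with -O ^extref. For the small /boot volume, I dropped the -d raid5 option and ended up creating it with

mkfs.btrfs -f /dev/sda1 /dev/sdb1 /dev/sdc1

to avoid the "With mixed block groups data and metadata profiles must be the same" error. mkfs.btrfs forces mixed block groups on small volumes, so a data profile given with -d must match the metadata profile. There are other options for creating a btrfs volume, such as specifying RAID levels for data (-d) and metadata (-m) separately.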
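
As an illustration of setting both profiles explicitly, the sketch below only prints a mkfs.btrfs command line rather than running it. The helper name btrfs_mkfs_cmd and the choice of raid1 are assumptions for the example; whether Btrfs v3.12 accepts this on a 500 MiB volume was not tried in the session above.

```shell
# Sketch: print (do not run) a mkfs.btrfs command with matching data and metadata profiles.
# btrfs_mkfs_cmd is a hypothetical helper; -d and -m are the mkfs.btrfs profile options.
btrfs_mkfs_cmd() {
    data_profile="$1"
    meta_profile="$2"
    shift 2
    echo "mkfs.btrfs -f -d $data_profile -m $meta_profile $*"
}

# Same profile for data and metadata, as mixed block groups require.
btrfs_mkfs_cmd raid1 raid1 /dev/sda1 /dev/sdb1 /dev/sdc1
```

Printing the command first makes it easy to review the device list before anything is overwritten.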

I mounted the btrfs volumes with -o compress=zlib during the install. Compression is a mount option, so for files written or overwritten after a reboot to stay compressed, the same option must be included in /etc/fstab.

/dev/sda1               /boot           btrfs           compress=zlib,noauto,noatime    0 0
/dev/sda3               /               btrfs           compress=zlib,noatime  0 0
/dev/sda2               none            swap            sw              0 0
/dev/sdb2               none            swap            sw              0 0
/dev/sdc2               none            swap            sw              0 0
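
It is easy to forget the option on one of the entries. As a small sketch, the helper below prints the mount points of btrfs entries in an fstab-style file that have no compress option. The name btrfs_without_compress is illustrative, and the check only looks at the fourth (options) field.

```shell
# Sketch: print mount points of btrfs entries that lack a compress option.
# btrfs_without_compress is a hypothetical helper name.
btrfs_without_compress() {
    fstab="${1:-/etc/fstab}"
    awk '
        $1 ~ /^#/ || NF < 4 { next }   # skip comments and incomplete lines
        # Wrap the options in commas so every option is delimited on both sides.
        $3 == "btrfs" && ("," $4 ",") !~ /,compress(-force)?[=,]/ { print $2 }
    ' "$fstab"
}

# Usage: btrfs_without_compress /etc/fstab
```

With the fstab shown above, the helper should print nothing, since both btrfs entries carry compress=zlib.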

When compiling the kernel, the md RAID and LVM (device mapper) options are not necessary, since btrfs provides its own multi-device RAID. LVM-like capabilities, such as growing a volume, are also available.
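
As a rough sketch of growing a volume, the helper below prints the btrfs commands that would add devices to a mounted filesystem and then rebalance it. The helper name btrfs_grow_cmds and the device /dev/sde3 are hypothetical; nothing is executed.

```shell
# Sketch: print (do not run) the commands to grow a mounted btrfs volume.
# btrfs_grow_cmds and /dev/sde3 are assumptions for illustration.
btrfs_grow_cmds() {
    mnt="$1"
    shift
    for dev in "$@"; do
        echo "btrfs device add $dev $mnt"
    done
    # Rebalancing spreads existing data across the new device(s).
    echo "btrfs balance start $mnt"
}

btrfs_grow_cmds / /dev/sde3
```
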
I used dracut to generate an initramfs that supports booting from btrfs RAID, with the following modifications to the dracut configuration (/etc/dracut.conf, or a *.conf file in /etc/dracut.conf.d).

# PUT YOUR CONFIG HERE OR IN separate files named *.conf
# in /etc/dracut.conf.d
# SEE man dracut.conf(5)

# Sample dracut config file

#logfile=/var/log/dracut.log
#fileloglvl=6

# Exact list of dracut modules to use.  Modules not listed here are not going
# to be included.  If you only want to add some optional modules use
# add_dracutmodules option instead.
#dracutmodules+=""

# dracut modules to omit
#omit_dracutmodules+=""

# dracut modules to add to the default
add_dracutmodules+="btrfs"

# additional kernel modules to the default
#add_drivers+=""

# list of kernel filesystem modules to be included in the generic initramfs
filesystems+="btrfs"

# build initrd only to boot current hardware
#hostonly="yes"
#

# install local /etc/mdadm.conf
mdadmconf="no"

# install local /etc/lvm/lvm.conf
lvmconf="no"

# A list of fsck tools to install. If it's not specified, module's hardcoded
# default is used, currently: "umount mount /sbin/fsck* xfs_db xfs_check
# xfs_repair e2fsck jfs_fsck reiserfsck btrfsck". The installation is
# opportunistic, so non-existing tools are just ignored.
#fscks=""

# inhibit installation of any fsck tools
nofscks="yes"

# mount / and /usr read-only by default
#ro_mnt="no"

# set the directory for temporary files
# default: /var/tmp
#tmpdir=/tmp
use_fstab="yes"
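
Only six lines in the listing above are active settings. As a sketch, they could live in a separate drop-in file instead of the edited sample. The file name btrfs.conf and the helper write_dracut_btrfs_conf are assumptions; the spaces inside the quotes of the += lists follow the recommendation in dracut.conf(5) for appended lists.

```shell
# Sketch: write only the active settings to a drop-in file.
# The default path and the function name are hypothetical.
write_dracut_btrfs_conf() {
    conf="${1:-/etc/dracut.conf.d/btrfs.conf}"
    cat > "$conf" <<'EOF'
# btrfs RAID root: no mdadm or LVM configuration needed
add_dracutmodules+=" btrfs "
filesystems+=" btrfs "
mdadmconf="no"
lvmconf="no"
nofscks="yes"
use_fstab="yes"
EOF
}

# Usage (as root): write_dracut_btrfs_conf
```

Keeping the changes in a drop-in leaves the packaged sample file untouched, which makes later upgrades easier to merge.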

I then ran the command

dracut --hostonly --force 'initramfs-genkernel-x86_64-3.12.13-gentoo' 3.12.13-gentoo

which overwrote the file /boot/initramfs-genkernel-x86_64-3.12.13-gentoo.
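
For later kernel updates, the same invocation can be parameterized by kernel version. This is only a sketch: the helper names are illustrative, the genkernel naming scheme and the /boot location are taken from the file name above, and the command is printed rather than run.

```shell
# Sketch: build the dracut command line for a given kernel version.
# initramfs_path and dracut_cmd are hypothetical helper names.
initramfs_path() {
    arch="$1"
    kver="$2"
    echo "/boot/initramfs-genkernel-$arch-$kver"
}

dracut_cmd() {
    kver="$1"
    arch="${2:-x86_64}"
    echo "dracut --hostonly --force $(initramfs_path "$arch" "$kver") $kver"
}

dracut_cmd 3.12.13-gentoo

# Afterwards, lsinitrd (shipped with dracut) can list the image contents, e.g.
#   lsinitrd /boot/initramfs-genkernel-x86_64-3.12.13-gentoo | grep -i btrfs
```
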


Planning For the Next Version of Fiddle Salad: How I Nearly Jumped to My Next Project

The work done on Fiddle Salad this month would not have been possible without last month’s planning. Furthermore, Fiddle Salad would not have been my idea if I had not invested time in building Python Fiddle. Python Fiddle was really the end product of 9 years of dreaming about running a high-performance computer, and the result of my experience using Gentoo Linux. So I bought a computer to build Python Fiddle, which also turned out to be necessary to run the latest IDE and development tools to build Fiddle Salad. When I started working with the Python interpreter in JavaScript, it was horrendously slow. It took about 20 seconds to load and took up almost 1GB of memory. No text editor except Vim with syntax highlighting turned off was quick enough to edit the 12MB source code file.

Fiddle Salad is an evolution of both the original idea and code base that belonged to Python Fiddle. Now it is really Fiddle Salad that’s driving the development of Python Fiddle, because they share much of the code base. 

So this is the third major milestone, which I almost gave up on before I embarked on it. A day or two before I planned to start work on it, I suddenly noticed huge, discouraging signs. They came as shocking surprises. For example, I discovered a hidden option in an application I had often used before that provided some of the functionality I was going to build. If that wasn’t enough, the application was actually quite popular, and many people probably knew about that feature. As another example, I discovered another application that was more innovative in certain aspects than the one I planned to build. There were still more examples, but they aren’t worth repeating here.

Out of habit, I reached for my next plan and the best tools I had available. I then realized that I would be throwing away about 8 months of work and the plans for this month, which had worked out so well. Although I had no reason and no incentive at all to work on Fiddle Salad, I did so only because I enjoyed every moment of it. I believe that’s what we are all here for, the very drumbeat of the universe.

In the end, those serious signs got swallowed up by my project, as I managed to either include their ideas or integrate them right into it. Fiddle Salad is really the culmination and peak of all live web development environments, combining the best features of all of them with those from my imagination.
