Using CentOS 5.2 stateless Linux support on a flash-based root filesystem
Notes on using stateless Linux support with a compact flash based root filesystem.
The stateless Linux support in CentOS v5.2 is provided by the initscripts (8.45.19.1.EL-1.el5) package. According to comments found through Google, the stateless Linux support is intended for live images. Stateless Linux provides support for:
- a read-only root filesystem
- putting temporary files in a temporary filesystem
- mounting read/write persistent state from a local filesystem or NFS
Stateless Linux Documentation
Documentation generated from reverse engineering the scripts.
Files
The files & directories involved in a stateless linux configuration are:
File/Directory | Description |
---|---|
/etc/sysconfig/readonly-root | the top level configuration file |
/etc/rwtab | a configuration file for the list of files and directories that should be mounted in the temporary read-write filesystem |
/etc/rwtab.d/ | a directory of rwtab configuration files |
/.snapshot | the default mount point for the stateless configuration filesystem [Note: This is /var/lib/stateless/state on later versions of the initscripts] |
/var/lib/stateless/writable | the default mount point for the temporary read-write filesystem |
<STATE_MOUNT>/etc | this directory must be present in the state device. The script checks that this directory is present. |
<STATE_MOUNT>/files | the list of files/directories to mount from the state filesystem, one per line. Each file/directory must exist in both the state filesystem and the root filesystem [Gotcha: If the file is to be mounted in a directory that is also listed in the rwtab configuration file, it also needs to be present in the tmpfs] |
/etc/statetab | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] |
/etc/statetab.d | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] |
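The requirements on the state filesystem described in the table can be checked before rebooting. The following is a minimal sketch, not part of initscripts: the function name `check_state_files` and its optional second argument (an alternative root directory, so the check can be tried against a copy) are assumptions made for illustration.

```shell
# check_state_files STATE_MOUNT [ROOT]
# Report problems with a state filesystem mounted at STATE_MOUNT:
#   - STATE_MOUNT/etc must exist (rc.sysinit tests for it)
#   - every path listed in STATE_MOUNT/files must exist both under
#     ROOT (default: the live root) and under STATE_MOUNT
# Returns 0 when everything is in place, 1 otherwise.
check_state_files() {
    state_mount=$1
    root=${2:-}
    rc=0
    if [ ! -d "$state_mount/etc" ]; then
        echo "missing: $state_mount/etc"
        rc=1
    fi
    if [ ! -f "$state_mount/files" ]; then
        echo "missing: $state_mount/files"
        return 1
    fi
    # Same parsing as rc.sysinit: skip comment lines, split on whitespace.
    for entry in $(grep -v '^#' "$state_mount/files"); do
        if [ ! -e "$root$entry" ]; then
            echo "not in root: $entry"
            rc=1
        fi
        if [ ! -e "$state_mount$entry" ]; then
            echo "not in state: $entry"
            rc=1
        fi
    done
    return $rc
}
```

For example, `check_state_files /.snapshot` checks the default state mount point against the running system.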
Kernel parameters
The following kernel parameters are supported:
Parameter | Description |
---|---|
'readonlyroot' | overrides the 'READONLY' setting in /etc/sysconfig/readonly-root with the value 'yes' |
'noreadonlyroot' | overrides the 'READONLY' setting in /etc/sysconfig/readonly-root with the value 'no'. This parameter does not override the 'TEMPORARY_STATE' setting. |
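The way the two parameters interact can be sketched as follows. This mirrors the order of the checks in the rc.sysinit excerpt in the appendices; `contains` is a stand-in written for this example in place of the `strstr` helper that rc.sysinit uses, and `resolve_readonly` is a hypothetical name.

```shell
# contains STRING SUBSTRING
# Stand-in for the strstr helper used by rc.sysinit: plain substring match.
contains() {
    case "$1" in
        *"$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# resolve_readonly CMDLINE CONFIGURED
# Print the effective READONLY value for a kernel command line, given the
# value configured in /etc/sysconfig/readonly-root.
resolve_readonly() {
    cmdline=$1
    readonly_value=$2
    if contains "$cmdline" readonlyroot; then
        readonly_value=yes
    fi
    # "noreadonlyroot" also contains "readonlyroot", so the first test
    # matches it too; this second test has to come last to win.
    if contains "$cmdline" noreadonlyroot; then
        readonly_value=no
    fi
    echo "$readonly_value"
}

resolve_readonly "ro root=LABEL=/ readonlyroot" no      # prints: yes
resolve_readonly "ro root=LABEL=/ noreadonlyroot" yes   # prints: no
resolve_readonly "ro root=LABEL=/" no                   # prints: no
```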
Readonly-root configuration
The configuration file /etc/sysconfig/readonly-root supports the following variables:
Variable Name | Values | Description |
---|---|---|
READONLY | yes / no | Whether to enable support for 'Stateless Linux'. |
TEMPORARY_STATE | yes / no | Whether to mount the files/directories listed in the rwtab configuration files into a temporary filesystem. Implied to be enabled if READONLY is 'yes'. |
RW_MOUNT | <directory> [default '/var/lib/stateless/writable'] | The mount point for the temporary scratch writable space. The script attempts three ways of mounting this, in order: an /etc/fstab entry for the mount point; a local partition labelled RW_LABEL; a tmpfs. |
RW_LABEL | <filesystem label> [default 'stateless-rw'] | Label on local filesystem which can be used for temporary scratch space. Note: UUIDs are not supported. |
RW_OPTIONS | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] | |
STATE_MOUNT | <directory> [default '/.snapshot', or '/var/lib/stateless/state' on later versions] | Where to mount the persistent data. The script attempts three ways of mounting this, in order: an /etc/fstab entry for the mount point; a local partition labelled STATE_LABEL; NFS, using CLIENTSTATE. |
STATE_LABEL | <filesystem label> [default 'stateless-state'] | The label for the partition with persistent data. |
STATE_OPTIONS | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] | |
CLIENTSTATE | | Used to mount the NFS state filesystem (the script mounts $CLIENTSTATE/$HOSTNAME on STATE_MOUNT). |
A nearly read-only root filesystem
These notes are aimed at using a compact flash based root filesystem, where the truly read-only root filesystem feature is not required. Limiting write cycles is a good thing, but it is useful to keep the convenience of being able to install/update packages and edit configuration.
Given a root filesystem backed by a simple flash device, it is desirable to limit the number of write cycles performed. To this end, configure the machine to:
- mount the root filesystem so that it doesn't write atimes
- log to another host (because /var/log is in the temporary filesystem and will be lost on reboot)
- use 'TEMPORARY_STATE=yes' in /etc/sysconfig/readonly-root
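The last item can be done with an editor; as a rough sketch, it can also be scripted. The helper name `set_sysconfig_var` is hypothetical, and the function assumes the simple NAME=value layout of the default /etc/sysconfig/readonly-root shown in the appendices.

```shell
# set_sysconfig_var FILE NAME VALUE
# Replace an existing "NAME=..." line in FILE, or append one if absent.
# The file is rewritten in place (via a temporary copy next to it) so
# that its ownership and permissions are kept.
set_sysconfig_var() {
    file=$1
    name=$2
    value=$3
    if grep -q "^$name=" "$file"; then
        sed "s|^$name=.*|$name=$value|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        echo "$name=$value" >> "$file"
    fi
}
```

Run as root, `set_sysconfig_var /etc/sysconfig/readonly-root TEMPORARY_STATE yes` would enable the temporary filesystem while leaving READONLY untouched.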
noatime
Use the 'noatime' (no access time) mount option on the root filesystem:
LABEL=/ / ext3 noatime 1 1
Note: The CentOS 5.2 'util-linux' package doesn't support the 'relatime' mount option.
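After a reboot it is worth confirming that the option took effect. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the function names are made up for this example, and the optional argument allows a saved copy of /proc/mounts to be checked instead.

```shell
# root_mount_options [MOUNTS_FILE]
# Print the options of the last non-rootfs entry mounted on "/".
# MOUNTS_FILE defaults to /proc/mounts.
root_mount_options() {
    awk '$2 == "/" && $3 != "rootfs" { opts = $4 }
         END { if (opts != "") print opts }' "${1:-/proc/mounts}"
}

# root_has_noatime [MOUNTS_FILE]
# Succeed if the root filesystem is mounted with the noatime option.
root_has_noatime() {
    case ",$(root_mount_options "$@")," in
        *,noatime,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `root_has_noatime && echo "root is noatime"`.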
Accessing the 'real' root filesystem
The root filesystem has various files and directories mounted on it, thus obscuring the 'real' files. I added a mount for the root filesystem so that it was (easily) possible to edit files like /etc/fstab. I added the following to the fstab (Note: You must edit the real fstab file, not the one on the temporary filesystem).
/ /mnt/root none bind 0 0
Temporary filesystem
The scripts make three attempts to mount the temporary filesystem: first from an /etc/fstab entry for RW_MOUNT, then from a partition labelled RW_LABEL, and finally as a tmpfs. With no additional configuration, the last attempt creates a tmpfs with default options. On a machine with no swap, it is a good idea to size the tmpfs explicitly (the default maximum size is half the physical RAM, which is fine when you have swap).
Provide an '/etc/fstab' entry for the temporary filesystem:
tmpfs /var/lib/stateless/writable tmpfs noauto,size=128M 0 0
Note: Consider sizing '/dev/shm' in the /etc/fstab configuration.
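To judge whether an explicit size is needed, the default limit can be worked out from /proc/meminfo. A small sketch, assuming the standard "MemTotal: <n> kB" line; `default_tmpfs_kb` is a hypothetical helper name.

```shell
# default_tmpfs_kb [MEMINFO_FILE]
# Print half of MemTotal in kB, i.e. the size limit a tmpfs gets when no
# size= option is given. MEMINFO_FILE defaults to /proc/meminfo.
default_tmpfs_kb() {
    awk '$1 == "MemTotal:" { print int($2 / 2) }' "${1:-/proc/meminfo}"
}
```

The 128M used in the fstab entry above corresponds to 131072 kB, which can be compared against the value printed by `default_tmpfs_kb`.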
Monitoring
Use inotify-tools to monitor filesystem write access. Install inotify-tools directly from the Dag repository (given it is only one package, don't bother installing the RPMForge yum repo).
# rpm -Uvh http://rpmforge.sw.be/redhat/el5/en/i386/rpmforge/RPMS/inotify-tools-3.13-1.el5.rf.i386.rpm
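Once installed, `inotifywait` from that package can watch directories for writes. The wrapper below is only a sketch: the choice of events is a suggestion, and the `DRY_RUN` switch is a hypothetical convenience added for this example (it prints the command instead of running it), not an inotify-tools feature.

```shell
# watch_writes DIR...
# Recursively monitor the given directories and report events that
# indicate a write (modify, create, delete, move, attribute change).
# With DRY_RUN=1 the inotifywait command line is printed, not executed.
watch_writes() {
    set -- inotifywait -m -r \
        -e modify -e create -e delete -e move -e attrib "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `watch_writes /etc /var/lib` runs until interrupted. Watching a large tree recursively can run into the kernel's limit on the number of inotify watches, so it is better to name specific directories than to watch `/`.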
Once the machine has been restarted, it is possible to view the effect of the stateless Linux configuration by viewing '/proc/mounts'. The content in '/etc/mtab' is incomplete since most of the mounts are performed with the '-n' option.
$ cat /proc/mounts
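To see exactly which mounts were made with '-n', the two files can be compared. A minimal sketch, assuming both files use the usual layout with the mount point in the second field; `mounts_not_in_mtab` is a name invented for this example.

```shell
# mounts_not_in_mtab [PROC_MOUNTS] [MTAB]
# Print each mount point that appears in PROC_MOUNTS (default
# /proc/mounts) but not in MTAB (default /etc/mtab), once each.
mounts_not_in_mtab() {
    awk 'FILENAME == ARGV[1] { seen[$2] = 1; next }
         !($2 in seen) && !printed[$2]++ { print $2 }' \
        "${2:-/etc/mtab}" "${1:-/proc/mounts}"
}
```

On a stateless configuration the output should include the temporary filesystem mount point and the bind mounts created from the rwtab entries.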
Links
- StatelessLinux
- inotify-tools
- initscripts releases
- http://lxr.linux.no/linux/Documentation/filesystems/tmpfs.txt
Appendices
/etc/sysconfig/readonly-root
# Set to 'yes' to mount the system filesystems read-only.
READONLY=no
# Set to 'yes' to mount various temporary state as either tmpfs
# or on the block device labelled RW_LABEL. Implied by READONLY
TEMPORARY_STATE=no
# Place to put a tmpfs for temporary scratch writable space
RW_MOUNT=/var/lib/stateless/writable
# Label on local filesystem which can be used for temporary scratch space
RW_LABEL=stateless-rw
# Label for partition with persistent data
STATE_LABEL=stateless-state
# Where to mount to the persistent data
STATE_MOUNT=/.snapshot
/etc/rwtab
dirs /var/cache/man
dirs /var/gdm
dirs /var/lock
dirs /var/log
dirs /var/run
empty /tmp
empty /var/cache/foomatic
empty /var/cache/logwatch
empty /var/cache/mod_ssl
empty /var/cache/mod_proxy
empty /var/cache/php-pear
empty /var/cache/systemtap
empty /var/db/nscd
empty /var/lib/dav
empty /var/lib/dhcp
empty /var/lib/dhclient
empty /var/lib/php
empty /var/lib/ups
empty /var/tmp
empty /var/tux
files /etc/adjtime
files /etc/fstab
files /etc/mtab
files /etc/ntp.conf
files /etc/resolv.conf
files /etc/lvm/.cache
files /var/account
files /var/arpwatch
files /var/cache/alchemist
files /var/lib/iscsi
files /var/lib/logrotate.status
files /var/lib/ntp
files /var/lib/xen
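Each rwtab line is a type followed by a path. Going by the mount_empty, mount_dirs and mount_files functions in the rc.sysinit excerpt in these appendices, the three types differ in what is copied into the temporary filesystem before the bind mount. The following dry-run sketch only prints what would be mounted; `rwtab_plan` is a hypothetical helper and performs no copying or mounting.

```shell
# rwtab_plan RW_MOUNT FILE...
# For every recognised entry in the given rwtab files, print
# "<type> <path> -> <location in the temporary filesystem>".
#   empty: only the path itself is created (contents are not copied)
#   dirs:  the directory tree is recreated, without the files in it
#   files: the file or directory is copied in full
# Unlike rc.sysinit, this does not check that the path exists.
rwtab_plan() {
    rw_mount=$1
    shift
    for file in "$@"; do
        [ -f "$file" ] || continue
        while read -r kind target; do
            case "$kind" in
                empty|dirs|files)
                    echo "$kind $target -> $rw_mount$target"
                    ;;
                *)
                    # Comments and unknown types are ignored.
                    ;;
            esac
        done < "$file"
    done
}
```

For example, `rwtab_plan /var/lib/stateless/writable /etc/rwtab /etc/rwtab.d/*` lists every bind mount the boot script would attempt.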
initscripts v8.45.19.1 /etc/rc.sysinit (CentOS v5.2)
This is the section of the init script relating to the stateless Linux support.
READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
    . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
    READONLY=yes
    [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
fi
if strstr "$cmdline" noreadonlyroot ; then
    READONLY=no
fi
if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then
    mount_empty() {
        if [ -e "$1" ]; then
            echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_dirs() {
        if [ -e "$1" ]; then
            mkdir -p "$RW_MOUNT$1"
            # fixme: find is bad
            find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_files() {
        if [ -e "$1" ]; then
            cp -a --parents "$1" "$RW_MOUNT"
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    # Common mount options for scratch space regardless of
    # type of backing store
    mountopts=
    # Scan partitions for local scratch storage
    rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -o device | awk '{ print ; exit }')
    # First try to mount scratch storage from /etc/fstab, then any
    # partition with the proper label. If either succeeds, be sure
    # to wipe the scratch storage clean. If both fail, then mount
    # scratch storage via tmpfs.
    if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    else
        mount -n -t tmpfs $mountopts none "$RW_MOUNT"
    fi
    for file in /etc/rwtab /etc/rwtab.d/* ; do
        is_ignored_file "$file" && continue
        [ -f $file ] && cat $file | while read type path ; do
            case "$type" in
                empty)
                    mount_empty $path
                    ;;
                files)
                    mount_files $path
                    ;;
                dirs)
                    mount_dirs $path
                    ;;
                *)
                    ;;
            esac
            [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
        done
    done
    # In theory there should be no more than one network interface active
    # this early in the boot process -- the one we're booting from.
    # Use the network address to set the hostname of the client. This
    # must be done even if we have local storage.
    ipaddr=
    if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
        ipaddr=$(ip addr show to 0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
        if [ -n "$ipaddr" ]; then
            eval $(ipcalc -h $ipaddr 2>/dev/null)
            hostname ${HOSTNAME}
        fi
    fi
    # Clients with read-only root filesystems may be provided with a
    # place where they can place minimal amounts of persistent
    # state. SSH keys or puppet certificates for example.
    #
    # Ideally we'll use puppet to manage the state directory and to
    # create the bind mounts. However, until that's all ready this
    # is sufficient to build a working system.
    # First try to mount persistent data from /etc/fstab, then any
    # partition with the proper label, then fallback to NFS
    state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -o device | awk '{ print ; exit }')
    if mount $mountopts "$STATE_MOUNT" > /dev/null 2>&1 ; then
        /bin/true
    elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1; then
        /bin/true
    elif [ -n "$CLIENTSTATE" ]; then
        # No local storage was found. Make a final attempt to find
        # state on an NFS server.
        mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
    fi
    if [ -d $STATE_MOUNT/etc ]; then
        # Copy the puppet CA's cert from the r/o image into the
        # state directory so that we can create a bind mount on
        # the ssl directory for storing the client cert. I'd really
        # rather have a unionfs to deal with this stuff
        cp --parents -f -p /var/lib/puppet/ssl/certs/ca.pem $STATE_MOUNT 2>/dev/null
        # In the future this will be handled by puppet
        for i in $(grep -v "^#" $STATE_MOUNT/files); do
            if [ -e $i ]; then
                mount -n -o bind $STATE_MOUNT/${i} ${i}
            fi
        done
    fi
fi
initscripts v8.86 /etc/rc.sysinit
READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
    . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
    READONLY=yes
    [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
    [ -z "$STATE_MOUNT" ] && STATE_MOUNT=/var/lib/stateless/state
fi
if strstr "$cmdline" noreadonlyroot ; then
    READONLY=no
fi
if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then
    mount_empty() {
        if [ -e "$1" ]; then
            echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_dirs() {
        if [ -e "$1" ]; then
            mkdir -p "$RW_MOUNT$1"
            find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_files() {
        if [ -e "$1" ]; then
            cp -a --parents "$1" "$RW_MOUNT"
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    # Common mount options for scratch space regardless of
    # type of backing store
    mountopts=
    # Scan partitions for local scratch storage
    rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -l -o device)
    # First try to mount scratch storage from /etc/fstab, then any
    # partition with the proper label. If either succeeds, be sure
    # to wipe the scratch storage clean. If both fail, then mount
    # scratch storage via tmpfs.
    if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    else
        mount -n -t tmpfs $RW_OPTIONS $mountopts none "$RW_MOUNT"
    fi
    for file in /etc/rwtab /etc/rwtab.d/* ; do
        is_ignored_file "$file" && continue
        [ -f $file ] && cat $file | while read type path ; do
            case "$type" in
                empty)
                    mount_empty $path
                    ;;
                files)
                    mount_files $path
                    ;;
                dirs)
                    mount_dirs $path
                    ;;
                *)
                    ;;
            esac
            [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
        done
    done
    # In theory there should be no more than one network interface active
    # this early in the boot process -- the one we're booting from.
    # Use the network address to set the hostname of the client. This
    # must be done even if we have local storage.
    ipaddr=
    if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
        ipaddr=$(ip addr show to 0.0.0.0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
        if [ -n "$ipaddr" ]; then
            eval $(ipcalc -h $ipaddr 2>/dev/null)
            hostname ${HOSTNAME}
        fi
    fi
    # Clients with read-only root filesystems may be provided with a
    # place where they can place minimal amounts of persistent
    # state. SSH keys or puppet certificates for example.
    #
    # Ideally we'll use puppet to manage the state directory and to
    # create the bind mounts. However, until that's all ready this
    # is sufficient to build a working system.
    # First try to mount persistent data from /etc/fstab, then any
    # partition with the proper label, then fallback to NFS
    state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -l -o device)
    if mount $mountopts $STATE_OPTIONS "$STATE_MOUNT" > /dev/null 2>&1 ; then
        /bin/true
    elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1; then
        /bin/true
    elif [ ! -z "$CLIENTSTATE" ]; then
        # No local storage was found. Make a final attempt to find
        # state on an NFS server.
        mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
    fi
    if [ -w "$STATE_MOUNT" ]; then
        mount_state() {
            if [ -e "$1" ]; then
                [ ! -e "$STATE_MOUNT$1" ] && cp -a --parents "$1" "$STATE_MOUNT"
                mount -n --bind "$STATE_MOUNT$1" "$1"
            fi
        }
        for file in /etc/statetab /etc/statetab.d/* ; do
            is_ignored_file "$file" && continue
            [ ! -f "$file" ] && continue
            if [ -f "$STATE_MOUNT/$file" ] ; then
                mount -n --bind "$STATE_MOUNT/$file" "$file"
            fi
            for path in $(grep -v "^#" "$file" 2>/dev/null); do
                mount_state "$path"
                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
            done
        done
        if [ -f "$STATE_MOUNT/files" ] ; then
            for path in $(grep -v "^#" "$STATE_MOUNT/files" 2>/dev/null); do
                mount_state "$path"
                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
            done
        fi
    fi
fi