
Using CentOS 5.2 stateless Linux support on a flash-based root filesystem

Notes on using stateless Linux support with a CompactFlash-based root filesystem.

The stateless Linux support in CentOS v5.2 is provided by the initscripts (8.45.19.1.EL-1.el5) package. According to comments found through Google, the stateless Linux support is intended for live images. Stateless Linux provides support for:

  • a read-only root filesystem
  • putting temporary files in a temporary filesystem
  • mounting read/write persistent state from a local filesystem or NFS

 

Stateless Linux Documentation

This documentation was derived by reverse engineering the init scripts.

Files

The files and directories involved in a stateless Linux configuration are:

File/Directory
    Description
/etc/sysconfig/readonly-root
    The top-level configuration file.
/etc/rwtab
    A configuration file listing the files and directories that should be mounted in the temporary read-write filesystem.
/etc/rwtab.d/
    A directory of rwtab configuration files.
/.snapshot
    The default mount point for the stateless configuration filesystem. [Note: This is /var/lib/stateless/state in later versions of initscripts.]
/var/lib/stateless/writable
    The default mount point for the temporary read-write filesystem.
<STATE_MOUNT>/etc
    This directory must be present on the state device. The script checks that it exists.
<STATE_MOUNT>/files
    The list of files/directories to mount, one per line. Each must exist in both the state filesystem and the root filesystem. [Gotcha: If the file is to be mounted in a directory that is also listed in an rwtab configuration file, it needs to be present in the tmpfs.]
/etc/statetab
    [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1.]
/etc/statetab.d
    [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1.]
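
As a rough illustration of the state device layout described above, the sketch below builds the tree in a throwaway directory and walks the 'files' list the way the init script does. The variable 'state' stands in for <STATE_MOUNT>, and /etc/ssh is only an example path; neither comes from the original configuration.

```shell
#!/bin/sh
# Sketch only: build the layout rc.sysinit expects on the state device.
# 'state' stands in for <STATE_MOUNT>; /etc/ssh is just an example path.
state=$(mktemp -d)

# The script only processes the state device if <STATE_MOUNT>/etc exists.
mkdir -p "$state/etc/ssh"

# One path per line; lines starting with '#' are skipped.
cat > "$state/files" <<'EOF'
# persistent paths, bind-mounted over the root filesystem at boot
/etc/ssh
EOF

# Same filter the init script uses: grep -v "^#" drops comment lines.
for path in $(grep -v "^#" "$state/files"); do
    if [ -e "$state$path" ]; then
        echo "would bind-mount $state$path on $path"
    else
        echo "missing in state tree: $path"
    fi
done
```

On a real system the same tree would be created on the partition labelled with STATE_LABEL (or the NFS export), not in a temporary directory.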

 

Kernel parameters

The following kernel command-line parameters are supported:

Parameter
    Description
'readonlyroot'
    Overrides the 'READONLY' setting in /etc/sysconfig/readonly-root with the value 'yes'.
'noreadonlyroot'
    Overrides the 'READONLY' setting in /etc/sysconfig/readonly-root with the value 'no'. Setting this parameter will not override the setting of 'TEMPORARY_STATE'.
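
The init script tests the kernel command line with a substring match, so 'noreadonlyroot' also satisfies the 'readonlyroot' test before the second test resets READONLY. The sketch below replays that logic outside of the boot process. The strstr function here is a stand-in written for this example (assumed to behave as a plain substring test), and the command line is hypothetical.

```shell
#!/bin/sh
# Stand-in for the strstr helper used by rc.sysinit; assumed here to be
# a plain substring test (returns 0 when $2 occurs anywhere in $1).
strstr() {
    case "$1" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical kernel command line for the example.
cmdline="ro root=LABEL=/ noreadonlyroot"

READONLY=
RW_MOUNT=
if strstr "$cmdline" readonlyroot; then      # also matches 'noreadonlyroot'
    READONLY=yes
    [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
fi
if strstr "$cmdline" noreadonlyroot; then    # the later test wins
    READONLY=no
fi
echo "READONLY=$READONLY RW_MOUNT=$RW_MOUNT"
```

The net effect is that READONLY ends up as 'no', RW_MOUNT still receives its default, and TEMPORARY_STATE is left as configured.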

 

Readonly-root configuration

The configuration file /etc/sysconfig/readonly-root supports the following variables:

Variable Name (Values)
    Description
READONLY (yes | no)
    Whether to enable support for 'Stateless Linux'.
TEMPORARY_STATE (yes | no)
    Whether to mount the files/directories listed in the rwtab configuration files into a temporary filesystem. Implied to be enabled if READONLY is 'yes'.
RW_MOUNT (<directory>) [default '/var/lib/stateless/writable']
    The mount point for the temporary scratch writable space. The script attempts three ways of mounting this, in order:
      1. mount using the device and options defined in /etc/fstab. This allows options to be set in the fstab file.
      2. mount using the filesystem label defined in RW_LABEL
      3. mount a tmpfs filesystem
RW_LABEL (<filesystem label>) [default 'stateless-rw']
    Label on a local filesystem which can be used for temporary scratch space. Note: UUIDs are not supported.
RW_OPTIONS
    [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1.]
STATE_MOUNT (<directory>) [default '/.snapshot', or '/var/lib/stateless/state' in later versions]
    Where to mount the persistent data. The script attempts three ways of mounting this, in order:
      1. mount using the device and options defined in /etc/fstab
      2. mount using the filesystem label defined in STATE_LABEL
      3. mount an NFS filesystem (if CLIENTSTATE is defined)
STATE_LABEL (<filesystem label>) [default 'stateless-state']
    The label of the partition with persistent data.
STATE_OPTIONS
    [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1.]
CLIENTSTATE
    Used to mount the NFS state filesystem. The script mounts $CLIENTSTATE/$HOSTNAME on STATE_MOUNT.

 

A nearly read-only root filesystem

These notes are aimed at a CompactFlash-based root filesystem, where a truly read-only root filesystem is not required. Limiting write cycles is a good thing, but it is useful to keep the convenience of being able to install or update packages and edit configuration.

Given a root filesystem backed by a simple flash device, it is desirable to limit the number of write cycles performed. To this end, configure the machine so that:

  • the root filesystem doesn't write atimes
  • logs are sent to another host (because /var/log will be lost on reboot)
  • 'TEMPORARY_STATE=yes' is set in /etc/sysconfig/readonly-root
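
A minimal sketch of /etc/sysconfig/readonly-root for this nearly read-only setup follows. It assumes the shipped defaults are otherwise kept; only the TEMPORARY_STATE line differs from the stock file.

```shell
# /etc/sysconfig/readonly-root (sketch for a nearly read-only flash root)

# Leave the root filesystem itself writable...
READONLY=no
# ...but bind-mount the rwtab entries onto temporary storage.
TEMPORARY_STATE=yes
# Shipped default for the scratch mount point, left unchanged.
RW_MOUNT=/var/lib/stateless/writable
```

Because the file is sourced by rc.sysinit, it must remain valid shell: no spaces around the '=' signs.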

 

noatime

Use the 'noatime' (no access time) mount option on the root filesystem: 
LABEL=/                 /                       ext3    noatime 1 1

Note: The CentOS 5.2 'util-linux' package doesn't support the 'relatime' mount option.
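
To confirm that the option took effect, the mount options can be read back from /proc/mounts. The helper below is a sketch written for this note (the name has_noatime is made up); it takes the mount point and, optionally, a mounts-format file to read instead of /proc/mounts.

```shell
#!/bin/sh
# Sketch: report whether a mount point carries the noatime option.
# Usage: has_noatime <mountpoint> [mounts-file]
# The helper name is made up; the file defaults to /proc/mounts.
has_noatime() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }      # the last matching entry wins
        END { exit (("," opts ",") ~ /,noatime,/ ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

if has_noatime /; then
    echo "/ is mounted noatime"
else
    echo "/ still updates access times"
fi
```

The option can also be applied without a reboot by running 'mount -o remount,noatime /' as root.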

Accessing the 'real' root filesystem

The root filesystem has various files and directories mounted over it, which obscure the 'real' files underneath. To make it easy to edit files like /etc/fstab, I added a bind mount of the root filesystem to the fstab; the /mnt/root directory must exist. (Note: You must edit the real fstab file, not the one on the temporary filesystem.)

/                       /mnt/root               none    bind            0 0

 

Temporary filesystem

The script makes three attempts to mount a temporary filesystem (see RW_MOUNT above). With no additional configuration, the first two attempts fail and the last one creates a default tmpfs filesystem. On a machine with no swap, it is a good idea to set a size limit on the tmpfs: the default maximum size is half the physical RAM, which is fine when swap is available.

Provide an '/etc/fstab' entry for the temporary filesystem:

tmpfs     /var/lib/stateless/writable tmpfs noauto,size=128M 0 0 

Note: Consider sizing '/dev/shm' in the /etc/fstab configuration.
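
To judge what size to set, it helps to know the limit that applies when no size is given. The sketch below derives the half-of-RAM default from MemTotal and then shows current usage; the helper name tmpfs_default_mib is made up for this example, and it accepts an alternative meminfo-format file as an optional argument.

```shell
#!/bin/sh
# Sketch: print the default tmpfs ceiling (half of MemTotal) in MiB.
# Usage: tmpfs_default_mib [meminfo-file]
# The helper name is made up; the file defaults to /proc/meminfo.
tmpfs_default_mib() {
    # MemTotal is reported in kB: halve it, then convert to MiB.
    awk '$1 == "MemTotal:" { print int($2 / 2 / 1024) }' "${1:-/proc/meminfo}"
}

echo "default tmpfs limit: $(tmpfs_default_mib) MiB"

# Current usage of the scratch space and /dev/shm, if they are mounted.
df -h /var/lib/stateless/writable /dev/shm 2>/dev/null || true
```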

Monitoring

Use inotify-tools to monitor filesystem write access. Install the inotify-tools package directly from the Dag (RPMforge) repository; since it is only one package, there is no need to install the RPMForge yum repo.

# rpm -Uvh http://rpmforge.sw.be/redhat/el5/en/i386/rpmforge/RPMS/inotify-tools-3.13-1.el5.rf.i386.rpm
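
Once the package is installed, inotifywait can report writes as they happen. The wrapper below is only a sketch: the name watch_writes and the choice of directories in the usage comment are assumptions, not part of the original setup.

```shell
#!/bin/sh
# Sketch: watch for writes under the given directories until interrupted.
# 'watch_writes' is a made-up wrapper; inotifywait comes from inotify-tools.
watch_writes() {
    # -m: keep running, -r: recurse into subdirectories,
    # -e: only report events that change the filesystem
    inotifywait -m -r -e modify,create,delete,move,attrib "$@"
}

# Example (run as root, stop with Ctrl-C):
#   watch_writes /etc /root /usr /var/lib
```

Recursive watches on large trees can take a while to set up and may run into the per-user inotify watch limit, so it is usually better to name a few directories than to watch '/'.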

Once the machine has been restarted, the effect of the stateless Linux configuration can be seen by viewing '/proc/mounts'. The content of '/etc/mtab' is incomplete because most of the mounts are performed with the '-n' option.

$ cat /proc/mounts
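
When the scratch space is a tmpfs, the bind-mounted rwtab paths should show up in /proc/mounts with a filesystem type of tmpfs, so filtering on that column gives a quick list of what is no longer written to flash. This is a sketch under that assumption (it will not match if the scratch space is a labelled partition); the helper name tmpfs_mounts is made up.

```shell
#!/bin/sh
# Sketch: list mount points backed by tmpfs.
# Usage: tmpfs_mounts [mounts-file]
# The helper name is made up; the file defaults to /proc/mounts.
tmpfs_mounts() {
    # Field 3 is the filesystem type, field 2 the mount point.
    awk '$3 == "tmpfs" { print $2 }' "${1:-/proc/mounts}" | sort -u
}

tmpfs_mounts
```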

Appendices

/etc/sysconfig/readonly-root

# Set to 'yes' to mount the system filesystems read-only.
READONLY=no
# Set to 'yes' to mount various temporary state as either tmpfs
# or on the block device labelled RW_LABEL. Implied by READONLY
TEMPORARY_STATE=no
# Place to put a tmpfs for temporary scratch writable space
RW_MOUNT=/var/lib/stateless/writable
# Label on local filesystem which can be used for temporary scratch space
RW_LABEL=stateless-rw
# Label for partition with persistent data
STATE_LABEL=stateless-state
# Where to mount to the persistent data
STATE_MOUNT=/.snapshot

/etc/rwtab

dirs    /var/cache/man
dirs    /var/gdm
dirs    /var/lock
dirs    /var/log
dirs    /var/run

empty   /tmp
empty   /var/cache/foomatic
empty   /var/cache/logwatch
empty   /var/cache/mod_ssl
empty   /var/cache/mod_proxy
empty   /var/cache/php-pear
empty   /var/cache/systemtap
empty   /var/db/nscd
empty   /var/lib/dav
empty   /var/lib/dhcp
empty   /var/lib/dhclient
empty   /var/lib/php
empty   /var/lib/ups
empty   /var/tmp
empty   /var/tux

files   /etc/adjtime
files   /etc/fstab
files   /etc/mtab
files   /etc/ntp.conf
files   /etc/resolv.conf
files   /etc/lvm/.cache
files   /var/account
files   /var/arpwatch
files   /var/cache/alchemist
files   /var/lib/iscsi
files   /var/lib/logrotate.status
files   /var/lib/ntp
files   /var/lib/xen
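
The three entry types differ in how much of the original path is copied into the scratch space before it is bind-mounted. The sketch below mimics that copying in a throwaway directory, without cpio and without performing any mounts; the variables demo, root and scratch and the copy_* function names are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only: mimic what each rwtab entry type copies into the scratch
# area. Uses a throwaway directory instead of the real root, no mounts.
demo=$(mktemp -d)
root="$demo/root"; scratch="$demo/scratch"
mkdir -p "$root/var/log/httpd" "$root/var/tmp/junk" "$root/etc" "$scratch"
echo "old log" > "$root/var/log/messages"
echo "nameserver 192.0.2.1" > "$root/etc/resolv.conf"

# empty: only the directory itself is recreated; its contents are dropped
copy_empty() { mkdir -p "$scratch$1"; }

# dirs: the directory tree is recreated, but no files are copied
copy_dirs() {
    (cd "$root" && find ".$1" -type d) | while read -r d; do
        mkdir -p "$scratch/$d"
    done
}

# files: the file (or directory) is copied together with its contents
copy_files() {
    mkdir -p "$scratch$(dirname "$1")"
    cp -a "$root$1" "$scratch$1"
}

copy_empty /var/tmp
copy_dirs  /var/log
copy_files /etc/resolv.conf

# Show what ended up in the scratch area.
(cd "$scratch" && find . | sort)
```

In the real script each copy is followed by 'mount -n --bind' of the scratch copy over the original path, which is why anything under an 'empty' or 'dirs' entry appears to start fresh after every boot.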

initscripts v8.45.19.1 /etc/rc.sysinit (CentOS v5.2)

This is the small section of the init script that relates to the stateless Linux support.

READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
        . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
        READONLY=yes
        [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
fi
if strstr "$cmdline" noreadonlyroot ; then
        READONLY=no
fi

if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then

        mount_empty() {
                if [ -e "$1" ]; then
                        echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        mount_dirs() {
                if [ -e "$1" ]; then
                        mkdir -p "$RW_MOUNT$1"
                        # fixme: find is bad
                        find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        mount_files() {
                if [ -e "$1" ]; then
                        cp -a --parents "$1" "$RW_MOUNT"
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        # Common mount options for scratch space regardless of
        # type of backing store
        mountopts=

        # Scan partitions for local scratch storage
        rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -o device | awk '{ print ; exit }')

        # First try to mount scratch storage from /etc/fstab, then any
        # partition with the proper label.  If either succeeds, be sure
        # to wipe the scratch storage clean.  If both fail, then mount
        # scratch storage via tmpfs.
        if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
                rm -rf "$RW_MOUNT" > /dev/null 2>&1
        elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
                rm -rf "$RW_MOUNT"  > /dev/null 2>&1
        else
                mount -n -t tmpfs $mountopts none "$RW_MOUNT"
        fi

        for file in /etc/rwtab /etc/rwtab.d/* ; do
                is_ignored_file "$file" && continue
                [ -f $file ] && cat $file | while read type path ; do
                        case "$type" in
                                empty)
                                        mount_empty $path
                                        ;;
                                files)
                                        mount_files $path
                                        ;;
                                dirs)
                                        mount_dirs $path
                                        ;;
                                *)
                                        ;;
                        esac
                        [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
                done
        done

        # In theory there should be no more than one network interface active
        # this early in the boot process -- the one we're booting from.
        # Use the network address to set the hostname of the client.  This
        # must be done even if we have local storage.
        ipaddr=
        if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
                ipaddr=$(ip addr show to 0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
                if [ -n "$ipaddr" ]; then
                        eval $(ipcalc -h $ipaddr 2>/dev/null)
                        hostname ${HOSTNAME}
                fi
        fi

        # Clients with read-only root filesystems may be provided with a
        # place where they can place minimal amounts of persistent
        # state.  SSH keys or puppet certificates for example.
        #
        # Ideally we'll use puppet to manage the state directory and to
        # create the bind mounts.  However, until that's all ready this
        # is sufficient to build a working system.

        # First try to mount persistent data from /etc/fstab, then any
        # partition with the proper label, then fallback to NFS
        state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -o device | awk '{ print ; exit }')
        if mount $mountopts "$STATE_MOUNT" > /dev/null 2>&1 ; then
                /bin/true
        elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1;  then
                /bin/true
        elif [ -n "$CLIENTSTATE" ]; then
                # No local storage was found.  Make a final attempt to find
                # state on an NFS server.

                mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
        fi

        if [ -d $STATE_MOUNT/etc ]; then
                # Copy the puppet CA's cert from the r/o image into the
                # state directory so that we can create a bind mount on
                # the ssl directory for storing the client cert.  I'd really
                # rather have a unionfs to deal with this stuff
                cp --parents -f -p /var/lib/puppet/ssl/certs/ca.pem $STATE_MOUNT 2>/dev/null

                # In the future this will be handled by puppet
                for i in $(grep -v "^#" $STATE_MOUNT/files); do
                        if [ -e $i ]; then
                                mount -n -o bind $STATE_MOUNT/${i} ${i}
                        fi
                done
        fi
fi

initscripts v8.86 /etc/rc.sysinit

READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
        . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
        READONLY=yes
        [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
        [ -z "$STATE_MOUNT" ] && STATE_MOUNT=/var/lib/stateless/state
fi
if strstr "$cmdline" noreadonlyroot ; then
        READONLY=no
fi

if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then

        mount_empty() {
                if [ -e "$1" ]; then
                        echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        mount_dirs() {
                if [ -e "$1" ]; then
                        mkdir -p "$RW_MOUNT$1"
                        find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        mount_files() {
                if [ -e "$1" ]; then
                        cp -a --parents "$1" "$RW_MOUNT"
                        mount -n --bind "$RW_MOUNT$1" "$1"
                fi
        }

        # Common mount options for scratch space regardless of
        # type of backing store
        mountopts=

        # Scan partitions for local scratch storage
        rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -l -o device)

        # First try to mount scratch storage from /etc/fstab, then any
        # partition with the proper label.  If either succeeds, be sure
        # to wipe the scratch storage clean.  If both fail, then mount
        # scratch storage via tmpfs.
        if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
                rm -rf "$RW_MOUNT" > /dev/null 2>&1
        elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
                rm -rf "$RW_MOUNT"  > /dev/null 2>&1
        else
                mount -n -t tmpfs $RW_OPTIONS $mountopts none "$RW_MOUNT"
        fi

        for file in /etc/rwtab /etc/rwtab.d/* ; do
                is_ignored_file "$file" && continue
                [ -f $file ] && cat $file | while read type path ; do
                        case "$type" in
                                empty)
                                        mount_empty $path
                                        ;;
                                files)
                                        mount_files $path
                                        ;;
                                dirs)
                                        mount_dirs $path
                                        ;;
                                *)
                                        ;;
                        esac
                        [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
                done
        done

        # In theory there should be no more than one network interface active
        # this early in the boot process -- the one we're booting from.
        # Use the network address to set the hostname of the client.  This
        # must be done even if we have local storage.
        ipaddr=
        if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
                ipaddr=$(ip addr show to 0.0.0.0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
                if [ -n "$ipaddr" ]; then
                        eval $(ipcalc -h $ipaddr 2>/dev/null)
                        hostname ${HOSTNAME}
                fi
        fi

        # Clients with read-only root filesystems may be provided with a
        # place where they can place minimal amounts of persistent
        # state.  SSH keys or puppet certificates for example.
        #
        # Ideally we'll use puppet to manage the state directory and to
        # create the bind mounts.  However, until that's all ready this
        # is sufficient to build a working system.

        # First try to mount persistent data from /etc/fstab, then any
        # partition with the proper label, then fallback to NFS
        state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -l -o device)
        if mount $mountopts $STATE_OPTIONS "$STATE_MOUNT" > /dev/null 2>&1 ; then
                /bin/true
        elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1;  then
                /bin/true
        elif [ ! -z "$CLIENTSTATE" ]; then
                # No local storage was found.  Make a final attempt to find
                # state on an NFS server.

                mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
        fi

        if [ -w "$STATE_MOUNT" ]; then

                mount_state() {
                        if [ -e "$1" ]; then
                                [ ! -e "$STATE_MOUNT$1" ] && cp -a --parents "$1" "$STATE_MOUNT"
                                mount -n --bind "$STATE_MOUNT$1" "$1"
                        fi
                }

                for file in /etc/statetab /etc/statetab.d/* ; do
                        is_ignored_file "$file" && continue
                        [ ! -f "$file" ] && continue

                        if [ -f "$STATE_MOUNT/$file" ] ; then
                                mount -n --bind "$STATE_MOUNT/$file" "$file"
                        fi

                        for path in $(grep -v "^#" "$file" 2>/dev/null); do
                                mount_state "$path"
                                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
                        done
                done

                if [ -f "$STATE_MOUNT/files" ] ; then
                        for path in $(grep -v "^#" "$STATE_MOUNT/files" 2>/dev/null); do
                                mount_state "$path"
                                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
                        done
                fi
        fi
fi