Using CentOS 5.2 stateless Linux support on a flash based root filesystem
Notes on using stateless Linux support with a compact flash based root filesystem.
The stateless Linux support in CentOS v5.2 is provided by the initscripts (8.45.19.1.EL-1.el5) package. According to comments found through Google, the stateless Linux support is intended for live images. Stateless Linux provides support for:
- a read-only root filesystem
- putting temporary files in a temporary filesystem
- mounting read/write persistent state from a local filesystem or NFS
Stateless Linux Documentation
Documentation generated from reverse engineering the scripts.
Files
The files and directories involved in a stateless Linux configuration are:
| File/Directory | Description |
|---|---|
| /etc/sysconfig/readonly-root | the top level configuration file |
| /etc/rwtab | a configuration file for the list of files and directories that should be mounted in the temporary read-write filesystem |
| /etc/rwtab.d/ | a directory of rwtab configuration files |
| /.snapshot | the default mount point for the stateless configuration filesystem [Note: This is /var/lib/stateless/state on later versions of the initscripts] |
| /var/lib/stateless/writable | the default mount point for the temporary read-write filesystem |
| <STATE_MOUNT>/etc | this directory must be present in the state device. The script checks that this directory is present. |
| <STATE_MOUNT>/files | the list of files/directories to mount. The files must be listed one per line. The files/directories must exist in both the state filesystem and the root filesystem [Gotcha: If the file is to be mounted in a directory that is also listed in the rwtab configuration file, it needs to be present in the tmpfs] |
| /etc/statetab | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] |
| /etc/statetab.d | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] |
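The requirements on the state filesystem listed above can be checked before rebooting. The following is a minimal sketch and not part of initscripts: `check_state_dir` is a hypothetical helper, and it assumes the state filesystem is already mounted at the directory given as its first argument. The optional second argument is a root prefix that exists only so the check can be tried against a scratch tree.

```shell
#!/bin/bash
# check_state_dir STATE_DIR [ROOT]
# Hypothetical helper: verify that a mounted state filesystem has the layout
# the CentOS 5.2 rc.sysinit expects. ROOT defaults to "" (the real root).
check_state_dir() {
    local state=$1 root=${2:-} path rc=0

    if [ ! -d "$state/etc" ]; then
        echo "missing directory: $state/etc"
        rc=1
    fi
    if [ ! -f "$state/files" ]; then
        echo "missing file list: $state/files"
        return 1
    fi
    # Each listed path must exist in both the root and the state filesystem.
    while read -r path || [ -n "$path" ]; do
        case "$path" in ''|'#'*) continue ;; esac
        [ -e "$root$path" ]  || { echo "not in root: $path";  rc=1; }
        [ -e "$state$path" ] || { echo "not in state: $path"; rc=1; }
    done < "$state/files"
    return $rc
}
```

For example, `check_state_dir /.snapshot` prints one line per problem and returns non-zero if anything is missing.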
Kernel parameters
The following kernel parameters are supported:
| Parameter | Description |
|---|---|
| 'readonlyroot' | override the 'READONLY' parameter of the /etc/sysconfig/readonly-root configuration with the value 'yes' |
| 'noreadonlyroot' | override the 'READONLY' parameter of the /etc/sysconfig/readonly-root configuration with the value 'no'. Setting this value will not override the setting of 'TEMPORARY_STATE'. |
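The override order can be sketched as follows. This mirrors the two `strstr` checks in the rc.sysinit excerpt in the appendices; `contains` is a stand-in for `strstr` and assumes it does plain substring matching, and `resolve_readonly` is a hypothetical name used only for illustration. Because 'noreadonlyroot' contains 'readonlyroot', both checks match for it and the second one wins.

```shell
#!/bin/bash
# contains STRING SUBSTRING: true if SUBSTRING occurs in STRING.
# Stand-in for the strstr helper used by rc.sysinit (assumed to be a plain
# substring match).
contains() {
    case "$1" in *"$2"*) return 0 ;; *) return 1 ;; esac
}

# resolve_readonly CMDLINE CONFIGURED_VALUE
# Print the effective READONLY value for a kernel command line.
resolve_readonly() {
    local cmdline=$1 readonly_value=$2
    contains "$cmdline" readonlyroot   && readonly_value=yes
    contains "$cmdline" noreadonlyroot && readonly_value=no
    echo "$readonly_value"
}

resolve_readonly "ro root=LABEL=/ readonlyroot" no      # prints yes
resolve_readonly "ro root=LABEL=/ noreadonlyroot" yes   # prints no
resolve_readonly "ro root=LABEL=/" no                   # prints no
```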
Readonly-root configuration
The configuration file /etc/sysconfig/readonly-root supports the following variables:
| Variable Name | Values | Description |
|---|---|---|
| READONLY | yes / no | Whether to enable support for 'Stateless Linux'. |
| TEMPORARY_STATE | yes / no | Whether to mount the files/directories listed in the rwtab configuration files into a temporary filesystem. Implied to be enabled if READONLY is 'yes'. |
| RW_MOUNT | <directory> [default=/var/lib/stateless/writable] | The mount point for the temporary scratch writable space. The script tries three options for mounting this, in order: an /etc/fstab entry for the mount point; a partition labelled RW_LABEL; a tmpfs. |
| RW_LABEL | <filesystem label> [default 'stateless-rw'] | Label on local filesystem which can be used for temporary scratch space. Note: UUIDs are not supported. |
| RW_OPTIONS | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] | |
| STATE_MOUNT | <directory> [default '/.snapshot', or '/var/lib/stateless/state' on later versions] | Where to mount the persistent data. The script tries three options for mounting this, in order: an /etc/fstab entry for the mount point; a partition labelled STATE_LABEL; an NFS mount of $CLIENTSTATE/$HOSTNAME (only if CLIENTSTATE is set). |
| STATE_LABEL | <filesystem label> [default 'stateless-state'] | The label for the partition with persistent data. |
| STATE_OPTIONS | [Note: Not supported in CentOS 5.2 initscripts v8.45.19.1] | |
| CLIENTSTATE | <NFS location> | Used to mount the NFS state filesystem; the script mounts $CLIENTSTATE/$HOSTNAME. |
A nearly read-only root filesystem
These notes are aimed at using a compact flash based root filesystem, where the truly read-only root filesystem feature is not required. Limiting write cycles is a good thing, but so is keeping the convenience of being able to install/update packages and edit configuration.
Given a root filesystem backed by a simple flash device, it is desirable to limit the number of write cycles performed. To this end, configure the machine so that:
- the root filesystem doesn't write atimes
- logs are sent to another host (because /var/log will be lost on reboot)
- 'TEMPORARY_STATE=yes' is set in /etc/sysconfig/readonly-root
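The last item can be scripted. This is a minimal sketch, assuming GNU sed (for `-i`) and that the configuration file uses plain `NAME=value` lines as in the appendix; `enable_temporary_state` is a hypothetical helper name.

```shell
#!/bin/bash
# enable_temporary_state [CONFIG_FILE]
# Set TEMPORARY_STATE=yes in a readonly-root style config file, leaving every
# other line untouched. Appends the variable if the file does not define it.
enable_temporary_state() {
    local conf=${1:-/etc/sysconfig/readonly-root}
    if grep -q '^TEMPORARY_STATE=' "$conf"; then
        sed -i 's/^TEMPORARY_STATE=.*/TEMPORARY_STATE=yes/' "$conf"
    else
        echo 'TEMPORARY_STATE=yes' >> "$conf"
    fi
}
```

READONLY is left at 'no', so the root filesystem stays writable while the rwtab entries move to the temporary filesystem on the next boot.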
noatime
Use the 'noatime' (no access time) mount option on the root filesystem:
LABEL=/ / ext3 noatime 1 1
Note: The CentOS 5.2 'util-linux' package doesn't support the 'relatime' mount option.
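Whether the option is actually in effect can be read back from /proc/mounts. A minimal sketch, assuming the usual /proc/mounts column layout (device, mount point, type, options); `root_has_noatime` is a hypothetical helper name.

```shell
#!/bin/bash
# root_has_noatime [MOUNTS_FILE]
# Succeeds if the root filesystem entry in a /proc/mounts style file lists the
# noatime option. If "/" appears more than once (for example a rootfs entry
# followed by the real device), the last entry decides.
root_has_noatime() {
    local mounts=${1:-/proc/mounts}
    awk '$2 == "/" { opts = $4 }
         END { if (("," opts ",") ~ /,noatime,/) exit 0; exit 1 }' "$mounts"
}
```

On a running system, `mount -o remount,noatime /` should apply the option without a reboot, after which `root_has_noatime && echo ok` can confirm it.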
Accessing the 'real' root filesystem
The root filesystem has various files and directories mounted over it, which obscures the 'real' files. To make it (easily) possible to edit files like /etc/fstab, I added a bind mount of the root filesystem by adding the following to the fstab (Note: You must edit the real fstab file, not the one on the temporary filesystem).
/ /mnt/root none bind 0 0
Temporary filesystem
The scripts make three attempts to set up the temporary filesystem: an /etc/fstab entry for RW_MOUNT, then a partition labelled RW_LABEL, and finally a default tmpfs. With no additional configuration, the last option is used. On a machine with no swap, it might be a good idea to size the tmpfs (the default maximum size is half the physical RAM, which is great when you have swap).
Provide an '/etc/fstab' entry for the temporary filesystem:
tmpfs /var/lib/stateless/writable tmpfs noauto,size=128M 0 0
Note: Consider sizing '/dev/shm' in the /etc/fstab configuration.
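One way to pick a size is to derive it from the installed RAM. The sketch below reads MemTotal from /proc/meminfo and prints a value suitable for the `size=` option; the default divisor of 4 is an arbitrary illustration, not a recommendation from initscripts, and `suggest_tmpfs_size` is a hypothetical helper name.

```shell
#!/bin/bash
# suggest_tmpfs_size [MEMINFO_FILE] [DIVISOR]
# Print a size= value in MiB equal to MemTotal divided by DIVISOR.
# MemTotal is reported in kB, so divide by 1024 to get MiB.
suggest_tmpfs_size() {
    local meminfo=${1:-/proc/meminfo} divisor=${2:-4}
    awk -v d="$divisor" '/^MemTotal:/ { printf "%dM\n", $2 / 1024 / d }' "$meminfo"
}

# Print a candidate fstab line for the scratch space.
echo "tmpfs /var/lib/stateless/writable tmpfs noauto,size=$(suggest_tmpfs_size) 0 0"
```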
Monitoring
Use inotify-tools to monitor filesystem write access. Install inotify-tools directly from the Dag repository (given it is only one package, don't bother installing the RPMForge yum repo).
# rpm -Uvh http://rpmforge.sw.be/redhat/el5/en/i386/rpmforge/RPMS/inotify-tools-3.13-1.el5.rf.i386.rpm
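With the package installed, `inotifywait` can report writes as they happen. A minimal sketch: `watch_writes` is a hypothetical wrapper, and the set of events chosen here is an assumption about what counts as a write worth watching.

```shell
#!/bin/bash
# watch_writes DIR...
# Report write-type events under the given directories until interrupted.
# -m keeps watching after the first event, -r recurses into subdirectories,
# and each -e selects one event type to report.
watch_writes() {
    inotifywait -m -r -e modify -e create -e delete -e move -e attrib "$@"
}
```

For example, `watch_writes /etc /var`. Recursive watches on large trees take a while to set up and can run into the kernel's per-user watch limit, so it is usually better to name specific directories than to watch all of '/'.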
Once the machine has been restarted, it is possible to view the effect of the stateless Linux configuration by viewing '/proc/mounts'. The content in '/etc/mtab' is incomplete since most of the mounts are performed with the '-n' option.
$ cat /proc/mounts
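To pick out just the paths that ended up on the temporary filesystem, the output can be filtered by filesystem type. This sketch assumes the scratch space is a tmpfs (the fallback case), so that the bind mounts appear in /proc/mounts with type 'tmpfs'; if the scratch space is a labelled partition they will show that partition's filesystem type instead. `list_tmpfs_mounts` is a hypothetical helper name.

```shell
#!/bin/bash
# list_tmpfs_mounts [MOUNTS_FILE]
# Print the mount point of every tmpfs entry in a /proc/mounts style file.
list_tmpfs_mounts() {
    local mounts=${1:-/proc/mounts}
    awk '$3 == "tmpfs" { print $2 }' "$mounts"
}
```

Note that unrelated tmpfs mounts such as /dev/shm are listed too.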
Links
- StatelessLinux
- inotify-tools
- initscripts releases
- http://lxr.linux.no/linux/Documentation/filesystems/tmpfs.txt
Appendices
/etc/sysconfig/readonly-root
# Set to 'yes' to mount the system filesystems read-only.
READONLY=no
# Set to 'yes' to mount various temporary state as either tmpfs
# or on the block device labelled RW_LABEL. Implied by READONLY
TEMPORARY_STATE=no
# Place to put a tmpfs for temporary scratch writable space
RW_MOUNT=/var/lib/stateless/writable
# Label on local filesystem which can be used for temporary scratch space
RW_LABEL=stateless-rw
# Label for partition with persistent data
STATE_LABEL=stateless-state
# Where to mount to the persistent data
STATE_MOUNT=/.snapshot
/etc/rwtab
dirs /var/cache/man
dirs /var/gdm
dirs /var/lock
dirs /var/log
dirs /var/run
empty /tmp
empty /var/cache/foomatic
empty /var/cache/logwatch
empty /var/cache/mod_ssl
empty /var/cache/mod_proxy
empty /var/cache/php-pear
empty /var/cache/systemtap
empty /var/db/nscd
empty /var/lib/dav
empty /var/lib/dhcp
empty /var/lib/dhclient
empty /var/lib/php
empty /var/lib/ups
empty /var/tmp
empty /var/tux
files /etc/adjtime
files /etc/fstab
files /etc/mtab
files /etc/ntp.conf
files /etc/resolv.conf
files /etc/lvm/.cache
files /var/account
files /var/arpwatch
files /var/cache/alchemist
files /var/lib/iscsi
files /var/lib/logrotate.status
files /var/lib/ntp
files /var/lib/xen
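Each rwtab line is a type followed by a path. The sketch below summarises what the three types mean, based on a reading of the mount_empty, mount_dirs and mount_files functions in the rc.sysinit excerpts in this document; `describe_rwtab` is a hypothetical helper that only prints a description and does not mount anything.

```shell
#!/bin/bash
# describe_rwtab: read rwtab entries on stdin and print, for each, what is
# copied into the scratch space before it is bind mounted over the original
# path.
describe_rwtab() {
    local type path
    while read -r type path; do
        case "$type" in
            empty) echo "$path: only the path itself, contents start empty" ;;
            dirs)  echo "$path: the directory tree, without files" ;;
            files) echo "$path: a full copy (cp -a)" ;;
            *)     ;;  # anything else, including comments, is ignored
        esac
    done
}

describe_rwtab < /etc/rwtab
```

Site-specific entries use the same format and can go in their own file under /etc/rwtab.d/ (for example a file named 'local', a made-up name).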
initscripts v8.45.19.1 /etc/rc.sysinit (CentOS v5.2)
This is a small section of the init script relating to the stateless Linux support.
READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
    . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
    READONLY=yes
    [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
fi
if strstr "$cmdline" noreadonlyroot ; then
    READONLY=no
fi
if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then
    mount_empty() {
        if [ -e "$1" ]; then
            echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_dirs() {
        if [ -e "$1" ]; then
            mkdir -p "$RW_MOUNT$1"
            # fixme: find is bad
            find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_files() {
        if [ -e "$1" ]; then
            cp -a --parents "$1" "$RW_MOUNT"
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    # Common mount options for scratch space regardless of
    # type of backing store
    mountopts=
    # Scan partitions for local scratch storage
    rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -o device | awk '{ print ; exit }')
    # First try to mount scratch storage from /etc/fstab, then any
    # partition with the proper label. If either succeeds, be sure
    # to wipe the scratch storage clean. If both fail, then mount
    # scratch storage via tmpfs.
    if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    else
        mount -n -t tmpfs $mountopts none "$RW_MOUNT"
    fi
    for file in /etc/rwtab /etc/rwtab.d/* ; do
        is_ignored_file "$file" && continue
        [ -f $file ] && cat $file | while read type path ; do
            case "$type" in
                empty)
                    mount_empty $path
                    ;;
                files)
                    mount_files $path
                    ;;
                dirs)
                    mount_dirs $path
                    ;;
                *)
                    ;;
            esac
            [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
        done
    done
    # In theory there should be no more than one network interface active
    # this early in the boot process -- the one we're booting from.
    # Use the network address to set the hostname of the client. This
    # must be done even if we have local storage.
    ipaddr=
    if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
        ipaddr=$(ip addr show to 0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
        if [ -n "$ipaddr" ]; then
            eval $(ipcalc -h $ipaddr 2>/dev/null)
            hostname ${HOSTNAME}
        fi
    fi
    # Clients with read-only root filesystems may be provided with a
    # place where they can place minimal amounts of persistent
    # state. SSH keys or puppet certificates for example.
    #
    # Ideally we'll use puppet to manage the state directory and to
    # create the bind mounts. However, until that's all ready this
    # is sufficient to build a working system.
    # First try to mount persistent data from /etc/fstab, then any
    # partition with the proper label, then fallback to NFS
    state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -o device | awk '{ print ; exit }')
    if mount $mountopts "$STATE_MOUNT" > /dev/null 2>&1 ; then
        /bin/true
    elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1; then
        /bin/true
    elif [ -n "$CLIENTSTATE" ]; then
        # No local storage was found. Make a final attempt to find
        # state on an NFS server.
        mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
    fi
    if [ -d $STATE_MOUNT/etc ]; then
        # Copy the puppet CA's cert from the r/o image into the
        # state directory so that we can create a bind mount on
        # the ssl directory for storing the client cert. I'd really
        # rather have a unionfs to deal with this stuff
        cp --parents -f -p /var/lib/puppet/ssl/certs/ca.pem $STATE_MOUNT 2>/dev/null
        # In the future this will be handled by puppet
        for i in $(grep -v "^#" $STATE_MOUNT/files); do
            if [ -e $i ]; then
                mount -n -o bind $STATE_MOUNT/${i} ${i}
            fi
        done
    fi
fi
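The copy step of mount_files is what makes a path appear under the scratch space with its full original path, which is why bind-mount sources in /proc/mounts all sit below RW_MOUNT. The sketch below reproduces only that copy step so it can be tried as an ordinary user; `copy_like_mount_files` is a hypothetical name, the bind mount is deliberately left out, and GNU cp is assumed (for `--parents`).

```shell
#!/bin/bash
# copy_like_mount_files PATH SCRATCH_DIR
# Perform only the copy step of mount_files: reproduce PATH, including its
# parent directories, below SCRATCH_DIR, and print the resulting location.
copy_like_mount_files() {
    local path=$1 scratch=$2
    [ -e "$path" ] || return 0      # rc.sysinit silently skips missing paths
    cp -a --parents "$path" "$scratch"
    echo "$scratch$path"            # what rc.sysinit would bind mount over PATH
}
```

For example, with an existing scratch directory /tmp/scratch, `copy_like_mount_files /etc/ntp.conf /tmp/scratch` creates and prints /tmp/scratch/etc/ntp.conf.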
initscripts v8.86 /etc/rc.sysinit
READONLY=
if [ -f /etc/sysconfig/readonly-root ]; then
    . /etc/sysconfig/readonly-root
fi
if strstr "$cmdline" readonlyroot ; then
    READONLY=yes
    [ -z "$RW_MOUNT" ] && RW_MOUNT=/var/lib/stateless/writable
    [ -z "$STATE_MOUNT" ] && STATE_MOUNT=/var/lib/stateless/state
fi
if strstr "$cmdline" noreadonlyroot ; then
    READONLY=no
fi
if [ "$READONLY" = "yes" -o "$TEMPORARY_STATE" = "yes" ]; then
    mount_empty() {
        if [ -e "$1" ]; then
            echo "$1" | cpio -p -vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_dirs() {
        if [ -e "$1" ]; then
            mkdir -p "$RW_MOUNT$1"
            find "$1" -type d -print0 | cpio -p -0vd "$RW_MOUNT" &>/dev/null
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    mount_files() {
        if [ -e "$1" ]; then
            cp -a --parents "$1" "$RW_MOUNT"
            mount -n --bind "$RW_MOUNT$1" "$1"
        fi
    }
    # Common mount options for scratch space regardless of
    # type of backing store
    mountopts=
    # Scan partitions for local scratch storage
    rw_mount_dev=$(blkid -t LABEL="$RW_LABEL" -l -o device)
    # First try to mount scratch storage from /etc/fstab, then any
    # partition with the proper label. If either succeeds, be sure
    # to wipe the scratch storage clean. If both fail, then mount
    # scratch storage via tmpfs.
    if mount $mountopts "$RW_MOUNT" > /dev/null 2>&1 ; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    elif [ x$rw_mount_dev != x ] && mount $rw_mount_dev $mountopts "$RW_MOUNT" > /dev/null 2>&1; then
        rm -rf "$RW_MOUNT" > /dev/null 2>&1
    else
        mount -n -t tmpfs $RW_OPTIONS $mountopts none "$RW_MOUNT"
    fi
    for file in /etc/rwtab /etc/rwtab.d/* ; do
        is_ignored_file "$file" && continue
        [ -f $file ] && cat $file | while read type path ; do
            case "$type" in
                empty)
                    mount_empty $path
                    ;;
                files)
                    mount_files $path
                    ;;
                dirs)
                    mount_dirs $path
                    ;;
                *)
                    ;;
            esac
            [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
        done
    done
    # In theory there should be no more than one network interface active
    # this early in the boot process -- the one we're booting from.
    # Use the network address to set the hostname of the client. This
    # must be done even if we have local storage.
    ipaddr=
    if [ "$HOSTNAME" = "localhost" -o "$HOSTNAME" = "localhost.localdomain" ]; then
        ipaddr=$(ip addr show to 0.0.0.0/0 scope global | awk '/[[:space:]]inet / { print gensub("/.*","","g",$2) }')
        if [ -n "$ipaddr" ]; then
            eval $(ipcalc -h $ipaddr 2>/dev/null)
            hostname ${HOSTNAME}
        fi
    fi
    # Clients with read-only root filesystems may be provided with a
    # place where they can place minimal amounts of persistent
    # state. SSH keys or puppet certificates for example.
    #
    # Ideally we'll use puppet to manage the state directory and to
    # create the bind mounts. However, until that's all ready this
    # is sufficient to build a working system.
    # First try to mount persistent data from /etc/fstab, then any
    # partition with the proper label, then fallback to NFS
    state_mount_dev=$(blkid -t LABEL="$STATE_LABEL" -l -o device)
    if mount $mountopts $STATE_OPTIONS "$STATE_MOUNT" > /dev/null 2>&1 ; then
        /bin/true
    elif [ x$state_mount_dev != x ] && mount $state_mount_dev $mountopts "$STATE_MOUNT" > /dev/null 2>&1; then
        /bin/true
    elif [ ! -z "$CLIENTSTATE" ]; then
        # No local storage was found. Make a final attempt to find
        # state on an NFS server.
        mount -t nfs $CLIENTSTATE/$HOSTNAME $STATE_MOUNT -o rw,nolock
    fi
    if [ -w "$STATE_MOUNT" ]; then
        mount_state() {
            if [ -e "$1" ]; then
                [ ! -e "$STATE_MOUNT$1" ] && cp -a --parents "$1" "$STATE_MOUNT"
                mount -n --bind "$STATE_MOUNT$1" "$1"
            fi
        }
        for file in /etc/statetab /etc/statetab.d/* ; do
            is_ignored_file "$file" && continue
            [ ! -f "$file" ] && continue
            if [ -f "$STATE_MOUNT/$file" ] ; then
                mount -n --bind "$STATE_MOUNT/$file" "$file"
            fi
            for path in $(grep -v "^#" "$file" 2>/dev/null); do
                mount_state "$path"
                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
            done
        done
        if [ -f "$STATE_MOUNT/files" ] ; then
            for path in $(grep -v "^#" "$STATE_MOUNT/files" 2>/dev/null); do
                mount_state "$path"
                [ -n "$SELINUX_STATE" -a -e "$path" ] && restorecon -R "$path"
            done
        fi
    fi
fi
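One practical difference in v8.86 is that mount_state seeds the state filesystem itself: a path is copied there only if it is not already present, so the state no longer has to be populated by hand, and data saved on a previous boot is not overwritten. A minimal sketch of just that copy step, with the bind mount omitted so it can be tried as an ordinary user; `seed_state` is a hypothetical name and GNU cp is assumed.

```shell
#!/bin/bash
# seed_state PATH STATE_DIR
# The copy step of mount_state: copy PATH (with its parent directories) into
# STATE_DIR unless a copy already exists there.
seed_state() {
    local path=$1 state=$2
    [ -e "$path" ] || return 0
    [ -e "$state$path" ] || cp -a --parents "$path" "$state"
}
```

Running it twice against the same state directory leaves the first copy in place, even if the original file has changed in between.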

