Preface
This is a repost of Defensive BASH programming by Kfir Lavi from 2012-11-14.
The original blog post has a tendency to go offline quite often. It is a set of Katas I regularly go back to when writing bash-scripts.
Intro
Here are my Katas for creating BASH programs that work. Nothing is new here, but from my experience pepole like to abuse BASH, forget computer science and create a Big ball of mud from their programs. Here I provide methods to defend your programs from braking, and keep the code tidy and clean.
Immutable global variables
- Try to keep globals to minimum
- UPPER_CASE naming
- readonly decleration
- Use globals to replace cryptic $0, $1, etc.
- Globals I allways use in my programs:
readonly PROGNAME=$(basename $0)
readonly PROGDIR=$(readlink -m $(dirname $0))
readonly ARGS="$@"
Everything is local
All variables should be local.
change_owner_of_file() {
local filename=$1
local user=$2
local group=$3
chown $user:$group $filename
}
change_owner_of_files() {
local user=$1; shift
local group=$1; shift
local files=$@
local i
for i in $files
do
chown $user:$group $i
done
}
- Self documenting parameters
- Usually for loop use i variable, so it is very important that you declare it as local.
- Local does not work on global scope.
main()
- Help keep all variables local
- Intuitive for functional programming
- The only global command in the code is: main
main() {
local files="/tmp/a /tmp/b"
local i
for i in $files
do
change_owner_of_file kfir users $i
done
}
main
Everything is a function
- The only code that is running globaly is:
- Global declarations that are immutable.
- main
- Keep code clean
- Procedures become descriptive
main() {
local files=$(ls /tmp | grep pid | grep -v daemon)
}
temporary_files() {
local dir=$1
ls $dir \
| grep pid \
| grep -v daemon
}
main() {
local files=$(temporary_files /tmp)
}
- Second example is much better. Finding files is the problem of temporary_files() and not of main()’s. This code is also testable, by unit testing of temporary_files().
- If you try to test the first example, you will mish mash finding temporary files with main algorithm.
test_temporary_files() {
local dir=/tmp
touch $dir/a-pid1232.tmp
touch $dir/a-pid1232-daemon.tmp
returns "$dir/a-pid1232.tmp" temporary_files $dir
touch $dir/b-pid1534.tmp
returns "$dir/a-pid1232.tmp $dir/b-pid1534.tmp" temporary_files $dir
}
As we see, this test does not concern main().
Debugging functions
- Run program with -x flag:
bash -x my_prog.sh
- Debug just a small section of code using set -x and set +x, which will print debug info just for the current code wrapped with set -x … set +x.
temporary_files() {
local dir=$1
set -x
ls $dir \
| grep pid \
| grep -v daemon
set +x
}
- Printing function name and its arguments:
temporary_files() {
echo $FUNCNAME $@
local dir=$1
ls $dir \
| grep pid \
| grep -v daemon
}
So calling the function:
temporary_files /tmp
will print the standard output:
temporary_files /tmp
Code clarity
What does this code do?
main() {
local dir=/tmp
[[ -z $dir ]] \
&& do_something...
[[ -n $dir ]] \
&& do_something...
[[ -f $dir ]] \
&& do_something...
[[ -d $dir ]] \
&& do_something...
}
main
Let your code speak:
is_empty() {
local var=$1
[[ -z $var ]]
}
is_not_empty() {
local var=$1
[[ -n $var ]]
}
is_file() {
local file=$1
[[ -f $file ]]
}
is_dir() {
local dir=$1
[[ -d $dir ]]
}
main() {
local dir=/tmp
is_empty $dir \
&& do_something...
is_not_empty $dir \
&& do_something...
is_file $dir \
&& do_something...
is_dir $dir \
&& do_something...
}
main
Each line does just one thing
- Break expression with backslash “\"
For example:
temporary_files() {
local dir=$1
ls $dir | grep pid | grep -v daemon
}
Can be written much cleaner:
temporary_files() {
local dir=$1
ls $dir \
| grep pid \
| grep -v daemon
}
- Symbols at the start of the line indented
Bad example of symbols at the end:
temporary_files() {
local dir=$1
ls $dir | \
grep pid | \
grep -v daemon
}
Good example where we clearly see the connection between lines and the connecting rods:
print_dir_if_not_empty() {
local dir=$1
is_empty $dir \
&& echo "dir is empty" \
|| echo "dir=$dir"
}
Printing usage
Don’t do this:
echo "this prog does:..."
echo "flags:"
echo "-h print help"
It should be a function:
usage() {
echo "this prog does:..."
echo "flags:"
echo "-h print help"
}
echo is repeated in each line. For that we have Here Document:
usage() {
cat <<- EOF
usage: $PROGNAME options
Program deletes files from filesystems to release space.
It gets config file that define fileystem paths to work on, and whitelist rules to
keep certain files.
OPTIONS:
-c --config configuration file containing the rules. use --help-config to see the syntax.
-n --pretend do not really delete, just how what you are going to do.
-t --test run unit test to check the program
-v --verbose Verbose. You can specify more then one -v to have more verbose
-x --debug debug
-h --help show this help
--help-config configuration help
Examples:
Run all tests:
$PROGNAME --test all
Run specific test:
$PROGNAME --test test_string.sh
Run:
$PROGNAME --config /path/to/config/$PROGNAME.conf
Just show what you are going to do:
$PROGNAME -vn -c /path/to/config/$PROGNAME.conf
EOF
}
Pay attention that there should be real tab ‘\t’ in the start of the line for each line. In vim you can use this replace command if your tab is 4 spaces:
:s/^ /\t/
Command line arguments
Here is an example to comlement the usage function above. I got this from Kirk’s blog post - bash shell script to use getopts with gnu style long positional parameters:
cmdline() {
# got this idea from here:
# http://kirk.webfinish.com/2009/10/bash-shell-script-to-use-getopts-with-gnu-style-long-positional-parameters/
local arg=
for arg
do
local delim=""
case "$arg" in
#translate --gnu-long-options to -g (short options)
--config) args="${args}-c ";;
--pretend) args="${args}-n ";;
--test) args="${args}-t ";;
--help-config) usage_config && exit 0;;
--help) args="${args}-h ";;
--verbose) args="${args}-v ";;
--debug) args="${args}-x ";;
#pass through anything else
*) [[ "${arg:0:1}" == "-" ]] || delim="\""
args="${args}${delim}${arg}${delim} ";;
esac
done
#Reset the positional parameters to the short options
eval set -- $args
while getopts "nvhxt:c:" OPTION
do
case $OPTION in
v)
readonly VERBOSE=1
;;
h)
usage
exit 0
;;
x)
readonly DEBUG='-x'
set -x
;;
t)
RUN_TESTS=$OPTARG
verbose VINFO "Running tests"
;;
c)
readonly CONFIG_FILE=$OPTARG
;;
n)
readonly PRETEND=1
;;
esac
done
if [[ $recursive_testing || -z $RUN_TESTS ]]; then
[[ ! -f $CONFIG_FILE ]] \
&& eexit "You must provide --config file"
fi
return 0
}
You can use it like this, using the immutable ARGS variable we defined at the top:
main() {
cmdline $ARGS
}
main
Unit testing
- Very important in higher-level language
- Use shuint2 for unit testing
test_config_line_paths() {
local s='partition cpm-all, 80-90,'
returns "/a" "config_line_paths '$s /a, '"
returns "/a /b/c" "config_line_paths '$s /a:/b/c, '"
returns "/a /b /c" "config_line_paths '$s /a : /b : /c, '"
}
config_line_paths() {
local partition_line="$@"
echo $partition_line \
| csv_column 3 \
| delete_spaces \
| column 1 \
| colons_to_spaces
}
source /usr/bin/shunit2
Here is another example using the df command:
DF=df
mock_df_with_eols() {
cat <<- EOF
Filesystem 1K-blocks Used Available Use% Mounted on
/very/long/device/path
124628916 23063572 100299192 19% /
EOF
}
test_disk_size() {
returns 1000 "disk_size /dev/sda1"
DF=mock_df_with_eols
returns 124628916 "disk_size /very/long/device/path"
}
df_column() {
local disk_device=$1
local column=$2
$DF $disk_device \
| grep -v 'Use%' \
| tr '\n' ' ' \
| awk "{print \$$column}"
}
disk_size() {
local disk_device=$1
df_column $disk_device 2
}
Here I have exception, for testing, I declare DF in the global scope not readonly. This is because of shunit2 not allowing to change global scope functions.