Parameter Expansion
Parameter Expansion (or, more colloquially, Variable Expansion) always seems like a Perl-ish punctuation-heavy mess, which it is. But it's also very useful.
Using a fallback value and stripping the start or end of a filename (the jobs of basename and dirname) are all things you should be doing in the shell.
Note
It is correctly Parameter Expansion as the operations can be equally applied to the Positional Parameters, the automatic variables ($1, $2, $3 etc.) representing the arguments to the script or function.
Basics
If we're going to do anything more useful than return a variable's value we'll have to start typing more. Instead of typing $var you'll need to get used to typing ${var}, i.e. wrap the variable's name in braces. That's also a handy way to allow us to prefix/suffix a variable's value with some other text:
echo This script has been running for ${SECONDS}s
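To see why the braces matter, here is a minimal sketch (the names file and file_backup are made up for illustration). Without braces the shell reads the longest possible variable name, so it looks up a different variable entirely:

```shell
file=report
unset file_backup               # make sure the longer name really is unset

wrong="$file_backup"            # the shell looks up file_backup, not file
right="${file}_backup"          # the braces end the name at file

printf 'wrong=[%s] right=[%s]\n' "$wrong" "$right"
# prints: wrong=[] right=[report_backup]
```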
Tests
We can do some tests on the variable:
- Substitute a default value if the variable is empty or unset
- Assign a default value if the variable is empty or unset
- Use an alternative value if the variable is set
- Raise an error if the variable is empty or unset
These all take the form:
${ *parameter* [:] *operator* *value* }
where
*parameter* is the name of the variable
there is an optional colon (:). If it is present the test applies when the variable is either empty or unset. If it is absent, the test applies only when the variable is unset.
For almost all purposes you should include the colon as, except in a few cases, your variable expanding out to nothing will result in your script erroring. Whether dir is empty or unset matters not as cd ${dir} has the same unintended consequence.
*operator* to be discussed below.
*value* is the default/replacement text.
Actually, *value* has multiple expansions applied to it (including Parameter Expansion, ie this, all over again!). Most of the time, *value* will either be a simple string or a simple variable expansion (without all this clever gubbins).
Except where noted, we will always use the : in our tests.
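The effect of the colon is easiest to see side by side. This is a small sketch using the - operator (described next) with a throwaway variable called var; the other operators treat the colon the same way:

```shell
unset var
unset_plain="${var-default}"    # unset: the default is used
unset_colon="${var:-default}"   # unset: the default is used

var=
empty_plain="${var-default}"    # set but empty: the empty value is kept
empty_colon="${var:-default}"   # set but empty: the default is used

printf "'%s' '%s' '%s' '%s'\n" "$unset_plain" "$unset_colon" "$empty_plain" "$empty_colon"
# prints: 'default' 'default' '' 'default'
```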
Substitute a Default Value
We might have installed a product which requires a PRODUCT_HOME variable to be set. Our script might say:
PRODUCT_HOME=/product/is/here
However, the user may have their own special version installed (a beta release, a stable release, perhaps) so when our script runs we should allow them the opportunity to use their version of the product. Our installed version, in /product/is/here, should be the default if they haven't already set PRODUCT_HOME:
PRODUCT_HOME="${PRODUCT_HOME:-/product/is/here}"
splitting that up to make it easier to read:
PRODUCT_HOME :- /product/is/here
Here:
- the *parameter* name is PRODUCT_HOME
- we are using :- (rather than just -) to mean "substitute a default value if the parameter's value is empty or unset"
- the value we want to substitute is /product/is/here
- we want to double-quote the value as good practice in case someone has passed a space character (even if variable assignment semantics mean we don't have to).
As another example, if we were writing some path manipulating functions we are likely to want to pass to the function the character used as the path separator. We know that some of the paths we manipulate use different separators: Unix generally uses colon (:) as a path separator; Windows a semicolon (;); TCL uses a space in TCLLIBPATH.
Most of the time we decide, unilaterally, that we are in Unix-land and we don't want to keep passing : as an argument. We need to substitute a default value if one is not passed, which is the job of the hyphen (-) operator:
sep="${3:-:}"
splitting that up to make it easier to read:
3 :- :
Here:
- the *parameter* name is 3, the third positional parameter
- we are using :- (rather than just -) to mean "substitute a default value if the parameter's value is empty or unset"
- the value we want to substitute is (confusingly) :
- we want to double-quote the value as good practice in case someone has passed a space character (even if variable assignment semantics mean we don't have to).
We can test this on the command line:
% unset var
% echo "'${var}' becomes '${var:-:}'"
'' becomes ':'
% var=
% echo "'${var}' becomes '${var:-:}'"
'' becomes ':'
% var=' '
% echo "'${var}' becomes '${var:-:}'"
' ' becomes ' '
Assign a Default Value
It's a bit less obvious to see how this is useful but let's see the = operator in action:
% unset var
% echo "var is '${var:=one}'"
var is 'one'
% echo "var is '${var:=two}'"
var is 'one'
In other words, only the first default assignment has taken place as once it has taken place, the variable has been assigned to!
Note
You cannot use this on Positional Parameters or other Special Variables.
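One place := tends to earn its keep is at the top of a script, paired with the : command (which expands its arguments and then does nothing), to give settings a default only if the caller hasn't supplied one. A minimal sketch, where LOG_DIR and both paths are hypothetical:

```shell
unset LOG_DIR
: "${LOG_DIR:=/tmp/logs}"       # LOG_DIR was unset, so it is assigned
first="$LOG_DIR"

LOG_DIR=/var/log/myapp
: "${LOG_DIR:=/tmp/logs}"       # LOG_DIR already has a value, so nothing changes
second="$LOG_DIR"

echo "$first then $second"
# prints: /tmp/logs then /var/log/myapp
```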
Use an Alternate Value
This is counterintuitive and yet + is probably one of the most useful tests. If the variable is empty or unset, nothing is substituted, otherwise *value* is substituted.
How is that of any use?
Suppose we are back to manipulating paths again. A common template is something like:
PATH=${PATH}:/bin
which looks pretty good but what if PATH was empty or unset? We'd be left with :/bin and as we all know an empty directory element in PATH is treated as ., the current directory. When you type ls you may get a surprise.
What we want to say is that if PATH has a non-empty value then combine ${PATH}: and /bin otherwise just use /bin on its own. Well, that looks like:
PATH=${PATH:+${PATH}:}/bin
splitting that up to make it easier to read:
PATH :+ ${PATH}:
Here:
- the *parameter* name is PATH
- we are using :+ for the operator
- the value we want to substitute is (confusingly) ${PATH}:
If we wanted to prefix the new element to PATH it would mean combining /bin and :${PATH} which looks like:
PATH=/bin${PATH:+:${PATH}}
splitting that up to make it easier to read:
PATH :+ :${PATH}
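Putting :+ together with the default separator from earlier gives a small path-building function. This is only a sketch: path_append is a made-up name, and it prints the combined value rather than modifying any variable:

```shell
# path_append CURRENT NEW [SEP]
# Print CURRENT with NEW appended, using SEP (default ":") between them.
path_append() {
    local current="$1" new="$2" sep="${3:-:}"
    # The separator is only emitted when CURRENT is non-empty.
    printf '%s\n' "${current:+${current}${sep}}${new}"
}

path_append /usr/bin /bin        # prints: /usr/bin:/bin
path_append ""       /bin        # prints: /bin
path_append /tcl/a   /tcl/b " "  # prints: /tcl/a /tcl/b
```

Because an empty CURRENT contributes nothing, there is no stray leading separator and so no accidental current-directory element.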
Display an Error
The operator ? will display *value* and, if you're in a script, will exit. You can use it to enforce the setting of some variables:
% unset var
% : ${var:?is empty or unset}
bash: var: is empty or unset
If this was a script it would additionally print the line number the error occurred on and exit with 1.
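If you want to see the exit behaviour without losing your own script, one option is to run the check in a subshell and look at whether it succeeded. A sketch, with REQUIRED_SETTING as a hypothetical variable name:

```shell
unset REQUIRED_SETTING

# The subshell exits when the expansion fails; the outer script carries on.
if ( : "${REQUIRED_SETTING:?must be set}" ) 2>/dev/null; then
    status_before=present
else
    status_before=missing
fi

REQUIRED_SETTING=yes
if ( : "${REQUIRED_SETTING:?must be set}" ) 2>/dev/null; then
    status_after=present
else
    status_after=missing
fi

echo "$status_before $status_after"
# prints: missing present
```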
Length
How long is a parameter's value?
echo ${#parameter}
for example:
% var=three
% echo ${#var}
5
Substrings
Given a value in a variable you might want to access substrings. The syntax, confusingly, uses : though it should be obviously different to the tests, above, as there is no operator.
The syntax is:
${parameter:offset}
${parameter:offset:length}
where offset and length are Arithmetic Expressions.
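A few quick illustrations of both forms, using a throwaway string and a hypothetical index variable i. Offsets count from zero, and because the offset is an Arithmetic Expression we can use i without a $ prefix:

```shell
var=parameter
i=2

from_offset="${var:4}"       # from offset 4 to the end
first_five="${var:0:5}"      # 5 characters from the start
middle="${var:2:3}"          # 3 characters from offset 2
computed="${var:i+1:2}"      # the offset is arithmetic: i+1 is 3

echo "$from_offset $first_five $middle $computed"
# prints: meter param ram am
```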
First/Last Characters
First character:
% var=one
% echo ${var:0:1}
o
Last character:
% var=one
% echo ${var: -1}
e
Note
Be careful of the whitespace here as you need to distinguish between an offset of -1 and Substituting a Default Value, :- of 1: ${var:-1}
We could utilize the length of the value, ${#var}, instead to find the last character:
% echo ${var: ${#var} - 1}
Though you might think it's harder to read.
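The whitespace trap mentioned in the note is worth trying for yourself. In this sketch the same three characters give very different results depending on the space; wrapping the offset in parentheses is another way to keep the shell from reading :- as the default-value operator:

```shell
var=one
spaced="${var: -1}"      # substring from offset -1: the last character
parens="${var:(-1)}"     # same offset, protected by parentheses instead
no_space="${var:-1}"     # default-value test: var is set, so its value is used

unset var
fallback="${var:-1}"     # default-value test: var is unset, so 1 is used

echo "$spaced $parens $no_space $fallback"
# prints: e e one 1
```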
Chunking a String
Netmasks are often displayed as long hexadecimal strings which we could do with in a dotted-quad format:
netmask=${1:-ffffffe0}
for (( offset=0 ; offset < ${#netmask} ; offset += 2 )) ; do
    hex_byte=${netmask:${offset}:2}
    ip[ ${#ip[*]} ]=$(( 16#${hex_byte} ))
done
The only magic here is convincing the shell's Arithmetic to convert a base 16 number (ie. a hex number) into decimal: 16#xx.
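The loop above only collects the decimal values. One way to finish the job, sketched here as a function with the made-up name hex_to_dotted_quad, is to join the collected values with dots by setting IFS locally and expanding the array with *:

```shell
# hex_to_dotted_quad [HEX]
# Print a hex netmask (default ffffffe0) in dotted-quad form.
hex_to_dotted_quad() {
    local netmask="${1:-ffffffe0}" offset
    local -a octets=()
    for (( offset = 0; offset < ${#netmask}; offset += 2 )); do
        # Take two hex digits at a time and convert them from base 16.
        octets+=( $(( 16#${netmask:offset:2} )) )
    done
    local IFS=.                  # "${octets[*]}" joins on the first char of IFS
    printf '%s\n' "${octets[*]}"
}

hex_to_dotted_quad ffffffe0      # prints: 255.255.255.224
```

This sketch assumes the input is an even number of valid hex digits; it does no validation.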
String Manipulation
Far and away the most common string manipulation is to remove something from the start or end of a string. The most common examples of that are manipulating pathnames. A pathname of /full/path/to/foo.exe has a:
- basename of foo.exe
- dirname of /full/path/to
- extension of .exe
and so on. Most people will needlessly call out to the commands basename and dirname when they have no need to.
The string manipulation operators have similar formats:
${parameter *operator* *pattern* }
Where the operator can be: # and ## for removing a leading pattern; % and %% for removing a trailing pattern; and / and // for performing substitutions.
Bash 4 introduced the ^, ^^, , and ,, operators for changing the case of characters in the string. It also corrected the slightly misleading // operator.
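As a quick sketch of the case-changing operators (this needs Bash 4 or later, so it will not work in an older Bash such as version 3; the variable names are just for illustration):

```shell
name="hello world"
shout="LOUD NOISES"

first_up="${name^}"      # upper-case the first character
all_up="${name^^}"       # upper-case every character
first_down="${shout,}"   # lower-case the first character
all_down="${shout,,}"    # lower-case every character

printf '%s\n' "$first_up" "$all_up" "$first_down" "$all_down"
# prints, one per line: Hello world, HELLO WORLD, lOUD NOISES, loud noises
```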
Note
Mnemonics for remembering what operator does what are few and far between. On North American keyboards, # is to the left of % so you might remember # as doing things to the start of the string and % to the end.
In all cases there are two variants, say, # and ##. The difference is that the single character operator affects the shortest matching pattern, the double character operator affects the longest matching pattern.
The pattern is expanded to produce a pattern just as in Pathname Expansion.
Remove Leading Patterns
# is the operator for removing leading patterns:
% p=/full/path/to/foo.exe
% echo ${p#*/}
full/path/to/foo.exe
% echo ${p##*/}
foo.exe
Notice that the single character pattern match was less effective than you might think: the * in */ can match zero characters!
So we can replace calls to basename with ${parameter##*/}.
Playing around a little reveals:
% echo ${p##*.}
exe
so we can discover the file's extension.
However, if we had multiple extensions, eg. foo.tar.gz and we naively used ${p#*.} to get the extension we would fail if any of the pathname components had a . in them:
% p=/full/pa.th/to/foo.tar.gz
% echo ${p##*.}
gz
% echo ${p#*.}
th/to/foo.tar.gz
If you're calculating extensions you really should be operating on the basename not on the full pathname.
Remove Trailing Patterns
% is the operator for removing trailing patterns:
% p=/full/path/to/foo.exe
% echo ${p%/*}
/full/path/to
% echo ${p%%/*}

Notice that the double character pattern match was far more effective than you might think: the second command prints an empty line because the /* in %%/* can match the whole string!
We can replace calls to dirname with ${parameter%/*}.
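One caveat: ${parameter%/*} and dirname disagree at the edges. With no slash in the value the expansion leaves it unchanged (dirname gives .), and for a file directly under / the expansion gives an empty string (dirname gives /). If that matters, a small wrapper can cover both cases. This is a sketch only, dir_of is a made-up name, and it does not try to handle trailing slashes:

```shell
# dir_of PATH
# Print the directory part of PATH using only parameter expansion.
dir_of() {
    local p="$1"
    case "$p" in
        */*) p="${p%/*}"               # strip the last component
             printf '%s\n' "${p:-/}"   # nothing left means the file was under /
             ;;
        *)   printf '.\n'              # no slash at all: current directory
             ;;
    esac
}

dir_of /full/path/to/foo.exe   # prints: /full/path/to
dir_of /foo.exe                # prints: /
dir_of foo.exe                 # prints: .
```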
Combinations
We can combine several of the above to get the root of the file name:
% p=/full/path/to/foo.exe
% base=${p##*/}
% echo ${base%.*}
foo
or we might have reason to calculate the extension and remove it explicitly:
% p=/full/path/to/foo.exe
% base=${p##*/}
% ext=${base##*.}
% echo ${base%.${ext}}
foo
If we had multiple extensions, eg. foo.tar.gz then we might need to be more careful:
% p=/full/path/to/foo.tar.gz
% base=${p##*/}
% echo ${base%.*}
foo.tar
% echo ${base%%.*}
foo
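Pulling the leading and trailing operators together, here is a sketch that takes a pathname apart in one go (the variable names are illustrative). Note that the extension patterns only do something useful when the basename actually contains a dot; without one, ${base#*.} and ${base##*.} simply return the basename unchanged:

```shell
p=/full/path/to/foo.tar.gz

dir="${p%/*}"            # /full/path/to
base="${p##*/}"          # foo.tar.gz
root="${base%%.*}"       # foo      (everything before the first dot)
all_exts="${base#*.}"    # tar.gz   (everything after the first dot)
last_ext="${base##*.}"   # gz       (everything after the last dot)

echo "$dir | $root | $all_exts | $last_ext"
# prints: /full/path/to | foo | tar.gz | gz
```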
Substituting in Strings
Maybe we want to rename some files, .tar.gz files to .tgz, say. Here we use the / operator and pattern is actually:
[*modifier*] *pattern* / *replacement*
which will, without a modifier, replace the first instance of pattern with replacement.
The optional modifier is one of:
- / meaning substitute all instances of pattern with replacement
- # substitute pattern with replacement if pattern occurs at the start of the string
- % substitute pattern with replacement if pattern occurs at the end of the string
The / modifier replaces every instance of pattern with replacement rather than just the first, and the # and % modifiers anchor the pattern in the same way as the # and % operators.
% p=/full/path/to/foo.tar.gz
% echo ${p/.tar.gz/.tgz}
/full/path/to/foo.tgz
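To compare the modifiers side by side, here is a sketch on a throwaway string with a repeated pattern:

```shell
s=banana

first="${s/an/AN}"       # no modifier: first match only
all="${s//an/AN}"        # / modifier: every match
at_start="${s/#ba/BA}"   # # modifier: only if the string starts with ba
at_end="${s/%na/NA}"     # % modifier: only if the string ends with na
no_match="${s/#na/NA}"   # anchored at the start but s starts with ba: unchanged

echo "$first $all $at_start $at_end $no_match"
# prints: bANana bANANa BAnana banaNA banana
```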
Warning
If your pathname contained the same string as your extension that you're trying to modify you will get bitten:
% p=/full/path/to/tar_files/foo.tar
% echo ${p/tar/tgz}
/full/path/to/tgz_files/foo.tar
which probably isn't what you want.
In this instance we could have used the % modifier to only replace a trailing tar:
% echo ${p/%.tar/.tgz}
/full/path/to/tar_files/foo.tgz
More complicated examples might require that you deconstruct the pathname completely into dirname, the file's root name and various extensions.
Notice that we are not changing the value of the parameter, we are only modifying the transient copy of the value that is about to be used in an assignment, echo'd or whatever.
Our renaming example might be:
mv "${p}" "${p/%.tar.gz/.tgz}"
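Before letting mv loose on real files it can be reassuring to print what would happen first. A dry-run sketch, where plan_rename is a made-up name and the paths are only examples:

```shell
# plan_rename FILE...
# Print the mv command that would rename each .tar.gz file to .tgz.
plan_rename() {
    local f
    for f in "$@"; do
        case "$f" in
            *.tar.gz) printf 'mv -- %s %s\n' "$f" "${f/%.tar.gz/.tgz}" ;;
        esac                     # anything else is silently skipped
    done
}

plan_rename /full/path/to/foo.tar.gz /full/path/to/notes.txt
# prints: mv -- /full/path/to/foo.tar.gz /full/path/to/foo.tgz
```

The printed lines are for reading, not for feeding back to the shell: filenames containing spaces are not quoted in this output.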
Arrays
We can do this with arrays too? Oh yes. Broadly, where we used parameter in the expressions above we can use arrayname[ *subscript* ].
Here, subscript can be an individual element of the array or * or @ to mean every element in the array.
subscript is subject to Arithmetic Expansion including, very usefully, allowing us to use shell variables without the $ prefix. This leads to the very much more readable:
for ((i=0; i < ${#a[*]}; i++)) ; do
    echo ${a[i]}
done
where ${a[i]} replaces ${a[$i]}. You can imagine that if you are doing something much more complex (arithmetically) that is a big thing for legibility.
If we perform the expansion against every element in the array then the result of the expansion is itself a list.
Note
As we're returning a list we should be double-quoting it to ensure we preserve whitespace!
We can even use the Positional Parameters by using * or @ as the parameter.
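For instance, inside a function the arguments can be processed in one expansion. A sketch with two made-up functions, one stripping a suffix from every argument and one taking a slice of them:

```shell
# Print each argument with a trailing .tar.gz removed, one per line.
strip_tgz() {
    printf '%s\n' "${@%.tar.gz}"
}

# Print every argument from the second onwards, one per line.
rest() {
    printf '%s\n' "${@:2}"
}

strip_tgz foo.tar.gz bar.tar.gz   # prints foo then bar
rest one two three                # prints two then three
```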
Tests
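Actually, these are of limited use on an array. They work on an individual element, e.g. ${a[1]:-default}, but with * or @ the test is made once against the expansion as a whole rather than element by element. You can't have everything!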
Length
The length of an element of an array:
% a=( one two three )
% echo ${#a[2]}
5
The length of the array itself:
% echo ${#a[*]}
3
Substrings
A substring of an element of the array:
% a=( one two three )
% echo ${a[2]:2}
ree
A sub-array (or splice) of the array itself:
% echo "${a[@]:1}"
two three
% echo "${a[@]:0:2}"
one two
Manipulating Array Elements
All the string manipulation elements work as you would expect:
% a=( one two three )
% echo ${a[2]%e}
thre
% a=( one two three )
% echo "${a[@]%e}"
on two thre
This latter is particularly handy with our filename example:
% files=( foo.tar.gz bar.tar.gz )
% echo "${files[@]/%.tar.gz/.tgz}"
foo.tgz bar.tgz
Of course, we can play with IFS as well:
% IFS=,
% echo "${files[*]/%.tar.gz/.tgz}"
foo.tgz,bar.tgz
Note
Don't forget to reset IFS.
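One way to avoid having to remember is to change IFS only inside a function, where local confines the change. A sketch, with join_with_commas as a made-up name:

```shell
# join_with_commas ITEM...
# Print the arguments as a single comma-separated line.
join_with_commas() {
    local IFS=,                  # only changed for the duration of the function
    printf '%s\n' "$*"           # "$*" joins on the first character of IFS
}

files=( foo.tar.gz bar.tar.gz )
joined="$(join_with_commas "${files[@]/%.tar.gz/.tgz}")"
echo "$joined"
# prints: foo.tgz,bar.tgz
```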