Arrays
Bash has arrays (OK, not version 1 of Bash) and so, like any good programmers, we should be using them everywhere we have a list of things.
The first thing to note is that the Positional Parameters ($1, $2, $3 etc.) are an array which you can reference with the familiar $* and $@.
* and @
Important
This is very important so we'll cover it first.
When we're talking about arrays we will be mostly using the subscripts (ie. indexes into the array) * and @. They both mean the same, ie. all elements of the array, but they are used in ever so slightly different ways when they are referenced inside double-quotes.
When they are referenced inside double-quotes they have the following meanings:
- * means return all the elements of the array joined into a single word, separated by the first character of IFS
- @ means return each element of the array as though it was double-quoted itself, ie. preserve whitespace.
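If they are not used inside double-quotes then they have exactly the same meaning:
- return each element of the array as a separate word, each of which is then subject to Word Splitting. Printed with echo, the resulting words appear separated by a space character.
Note
That is a space character, not the first character of IFS.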
Let's see all that in action with the positional parameters:
% set -- one two 'three  and four'
Here we should have three Positional Parameters: one; two; and three  and four (notice the double spacing before four):
% echo $#
3
% echo $1
one
% echo $3
three and four
Notice that we've lost the second space character in the value of $3. That's because the order of expansion is, amongst others, Parameter Expansion then Word Splitting. So the parameter is expanded to three  and four, which is then split by the whitespace characters in IFS (the default, as we've not said otherwise) into the three words three, and, and four, which are passed to echo to print. echo merely prints its arguments with a single space character in between.
Of course we can fix that by double-quoting what we want echoed:
% echo "$3"
three  and four
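echo hides where one argument ends and the next begins. As a small sketch (the variable names unquoted and quoted are made up for this example), printf repeats its format once per argument, which makes the Word Splitting visible:

```shell
#!/usr/bin/env bash
set -- one two 'three  and four'

# printf reuses '<%s>' for every argument it receives.
unquoted=$(printf '<%s>' $3)      # $3 is split into three words
quoted=$(printf '<%s>' "$3")      # "$3" stays a single word

echo "$unquoted"
echo "$quoted"
```

The first line printed should show three bracketed words, the second a single bracketed value with the double spacing intact.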
As to * and @:
% echo $*
one two three and four
% echo $@
one two three and four
As suggested, no difference -- and we've lost the second space character again.
OK, * and IFS:
% IFS=:
% echo $*
one two three  and four
% echo "$*"
one:two:three  and four
Whoa! Three things have happened here:
- In the first case, echo $*, we seem to have recovered the space character again. Well, we would do, as we've changed IFS so that a space character no longer delimits words. We could have said:
% IFS=":${IFS}"
% echo $*
one two three and four
- In the second case, echo "$*", we have : separating the Positional Parameters, which is what we expect.
- Somewhat less obviously, echo has been passed a single argument: notice that the extra space character has been preserved.
OK, @ and double-quotes:
% echo $@
one two three and four
% echo "$@"
one two three  and four
Well, we can see the extra space in the double-quoted example but otherwise they look fairly similar. Looks can be deceptive:
% for x in $@ ; do echo $x ; done
one
two
three
and
four
That is completely wrong!
Fortunately (and remembering to double-quote $x as well!):
% for x in "$@" ; do echo "$x" ; done
one
two
three  and four
Phew! We've not lost anything.
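Rather than eyeballing echo's output, the differences above can be checked by counting the arguments each form produces. The following is only a sketch: count_args and star_demo are hypothetical helpers, not part of Bash, and IFS is changed with local so that the change does not leak into the rest of the shell.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

set -- one two 'three  and four'

# Default IFS: unquoted $@ is split on whitespace, "$@" is not.
unquoted_count=$(count_args $@)
quoted_count=$(count_args "$@")

# star_demo is also hypothetical; it joins its arguments with ':'.
star_demo() {
    local IFS=:                        # only affects this function
    star_count=$(count_args "$*")      # "$*" is always a single word
    star_value="$*"
}
star_demo "$@"

echo "unquoted \$@: $unquoted_count arguments"
echo "quoted \"\$@\": $quoted_count arguments"
echo "quoted \"\$*\": $star_count argument ($star_value)"
```

With the default IFS the unquoted form should report five arguments, the double-quoted @ form three, and the double-quoted * form one.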
Best Practice
Can we summarize anything here?
- If you are going to use the value of anything:
  - use @
  - double-quote it
  - unless you are going to do some funky trick with IFS and joining the elements together, in which case you'll be using * and double-quoting it. Using this trick is rare, though.
- To visually distinguish between when things must be double-quoted (as above) and when you are counting things, use * for counting. Annoyingly, $# is the count of the Positional Parameters, which looks quite different from ${#name[*]}, the length of an array.
So, "${name[@]}" for the values and ${#name[*]} for the length of the array. And you can quickly see whether it should have been double-quoted or not. @ means double-quote.
You'll use "${name[*]}" a vanishingly small number of times and always in conjunction with modifying IFS -- you'll know when you do.
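As a rough illustration of the summary above, the sketch below uses "${name[@]}" to walk the values, ${#name[*]} for the length, and "${name[*]}" with a changed IFS to join the elements. The array name fruits and the other variable names are invented for this example; the IFS change is confined to a subshell so it cannot affect anything else.

```shell
#!/usr/bin/env bash
fruits=( apple 'blood orange' cherry )

# Length: * for counting, no double-quotes needed.
count=${#fruits[*]}

# Values: @ and double-quotes, so 'blood orange' stays one element.
seen=0
for f in "${fruits[@]}" ; do
    seen=$(( seen + 1 ))
done

# The rare "${name[*]}" case: join with the first character of IFS.
# The $( ... ) subshell keeps the IFS change from leaking out.
csv=$( IFS=, ; echo "${fruits[*]}" )

echo "$count elements, $seen seen in the loop, joined: $csv"
```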
Named Arrays
We want to avoid using the Positional Parameters as much as possible as they are the arguments to the script/function. We want to preserve them whenever possible. For lots of things we'll have a list of values which we want to lump around en masse, for which we need an array.
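One common way to keep the Positional Parameters intact is to copy them into a named array and work on the copy. This is a minimal sketch; the function name process and the array name args are made up for illustration.

```shell
#!/usr/bin/env bash
process() {
    # Copy the arguments; the double-quotes keep each one whole.
    local args=( "$@" )

    # Work on the copy: drop its first element, leaving "$@" untouched.
    args=( "${args[@]:1}" )

    echo "original: $# copy: ${#args[*]} first of copy: ${args[0]}"
}

result=$(process one two 'three  and four')
echo "$result"
```

The function still sees all three of its original arguments after the copy has been shortened.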
Initialization
% a=( one two 'three and four' )
There's some syntactic sugar there, =( and ), but it looks pretty reasonable.
If we were working another way we might have had to do:
% a[0]=one
% a[1]=two
% a[2]='three and four'
To get the same result.
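To check that the two forms really are equivalent, the sketch below builds one array each way and compares the lengths, the keys (the ${!name[*]} form described in the Keys section) and the values. The second array name b and the flag same are hypothetical names used only here.

```shell
#!/usr/bin/env bash
# Compound assignment.
a=( one two 'three and four' )

# Element-by-element assignment.
b[0]=one
b[1]=two
b[2]='three and four'

# Compare element counts, keys and values of the two arrays.
same=yes
[ "${#a[*]}" -eq "${#b[*]}" ] || same=no
[ "${!a[*]}" = "${!b[*]}" ] || same=no
for i in "${!a[@]}" ; do
    [ "${a[$i]}" = "${b[$i]}" ] || same=no
done

echo "same: $same"
```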
Expansion
Parameter Expansion covers all the fun things you can do with arrays much like ordinary variables so there's no need to repeat that here.
Keys
Keys? Indexes of the array. Do we need to know this? We will in the next section and it's very handy if you're using Bash 4's associative arrays.
% echo ${!a[*]}
0 1 2
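The key list is most useful in a loop. Here is a small sketch, assuming Bash 4 or later for declare -A; the associative array colour and its contents are invented for the example.

```shell
#!/usr/bin/env bash
a=( one two 'three and four' )

# Indexed array: keys come back in ascending numeric order.
pairs=""
for k in "${!a[@]}" ; do
    pairs+="$k=${a[$k]};"
done
echo "$pairs"

# Associative array (Bash 4+): keys are strings and their order is not guaranteed.
declare -A colour=( [apple]=red [banana]=yellow )
for k in "${!colour[@]}" ; do
    echo "$k is ${colour[$k]}"
done
key_count=${#colour[*]}
```

Following the advice above, the loops use @ inside double-quotes for the keys and * for the count.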
Unsetting
You can unset an entire array as you might expect.
You can also unset individual elements:
% unset a[1]
% echo ${#a[*]}
2
% echo ${a[*]}
one three and four
Looks good but wait:
% echo ${a[2]}
three and four
Oh oh! The length of the array is now reported as two and yet we can pull out the third element?
Oh yes and, indeed, oh no. Now we have no idea what the index of the last element of the array is. Well, we know the keys:
% echo ${!a[*]}
0 2
so we could pluck the last one off that list (array!):
% keys=( ${!a[*]} )
% echo ${keys[ ${#keys[*]} - 1 ]}
2
Yuk, how does that work again? The line is:
% echo ${keys[ *subscript* ]}
where subscript is:
${#keys[*]} - 1
ie. the length of the keys array (2) minus 1. Equivalent to:
% echo ${keys[1]}
2
Which is just ugly. The moral here is: don't unset any elements in arrays unless you know what you're doing.
Don't set them to empty either, as that will affect you in a similar way when you use the values.
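If elements do have to be removed, one possible way out of the mess above is to re-pack the array so that its keys are contiguous again. The sketch below is one option among several; the variable names sparse_keys, last_key, last_value and packed_keys are illustrative only.

```shell
#!/usr/bin/env bash
a=( one two 'three and four' )
unset 'a[1]'          # quoted so the shell cannot treat a[1] as a filename pattern

# The keys are now sparse.
sparse_keys="${!a[*]}"

# Find the last key by walking the key list; the final value of k wins.
for k in "${!a[@]}" ; do
    last_key=$k
done
last_value=${a[$last_key]}

# Re-pack: expanding the values into a fresh array renumbers them from 0.
a=( "${a[@]}" )
packed_keys="${!a[*]}"

echo "sparse: $sparse_keys last: $last_value packed: $packed_keys"
```

After the re-pack, the length of the array minus one is once again the index of the last element.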