Arrays

You can use arrays in Bash. You should use arrays in Bash.

Bash has arrays (OK, not version 1 of Bash) and so, like any good programmers, we should be using them wherever we have a list of things.

The first thing to note is that the Positional Parameters ($1, $2, $3 etc.) are an array which you can reference with the familiar $* and $@.

`*` and `@`

Important

This is very important so we'll cover it first.

When we're talking about arrays we will be mostly using the subscripts (ie. indexes into the array) * and @. They both mean the same, ie. all elements of the array, but they are used in ever so slightly different ways when they are referenced inside double-quotes.

When they are referenced inside double-quotes they have the following meanings:

* means return all the elements of the array as a single word, separated by the first character of IFS
@ means return each element of the array as a separate word, as though it was double-quoted itself, ie. preserve whitespace.

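If they are not used inside double-quotes then they have exactly the same meaning:

return each element of the array as a separate word, each of which then undergoes Word Splitting.

Note

The elements are not joined by the first character of IFS. When the result is passed to echo you will see a space character between the words because that is how echo separates its arguments.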

Let's see all that in action with the positional parameters:

% set -- one two 'three and  four'

Here we should have three Positional Parameters: one; two; and 'three and  four' (notice the double spacing before four):

% echo $#
3
% echo $1
one
% echo $3
three and four

Notice that we've lost the second space character in the value of $3. That's because the order of expansion is, amongst others, Parameter Expansion then Word Splitting. So the parameter is expanded to 'three and  four', which is then split on the whitespace characters in IFS (the default, as we've not said otherwise) into three words (three, and, four) which are passed to echo to print. echo merely prints its arguments with a single space character in between.

Of course we can fix that by double-quoting what we want echoed:

% echo "$3"
three and  four
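
The same effect can be seen by counting the arguments a command actually receives, rather than eyeballing what echo prints. A minimal sketch, assuming the default IFS and a hypothetical helper called count_args (it is not part of Bash):

```shell
# Requires Bash. count_args is a hypothetical helper for this sketch:
# it prints the number of arguments it was called with.
count_args() {
    echo "$#"
}

set -- one two 'three and  four'

unquoted=$(count_args $3)     # Word Splitting applies: three, and, four
quoted=$(count_args "$3")     # no Word Splitting: a single argument

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

With the default IFS the unquoted call should report three arguments and the quoted call one.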

As to * and @:

% echo $*
one two three and four
% echo $@
one two three and four

As suggested, no difference -- and we've lost the second space character again.

OK, * and IFS:

% IFS=:
% echo $*
one two three and  four
% echo "$*"
one:two:three and  four

Whoa! Three things have happened here:

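In the first case, echo $*, we seem to have recovered the second space character. Well, we would, as we've changed IFS so that a space character no longer delimits words. Had we kept the default whitespace characters in IFS as well, by saying the following instead of IFS=:, the space would have been lost again: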
```
% IFS=":${IFS}"
% echo $*
one two three and four
```
In the second case, echo "$*", we have : separating the Positional Parameters, which is what we expect.
Somewhat less obviously, "$*" has expanded to a single argument, which is why the extra space character has been preserved.
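
The joining behaviour of "$*" is the basis of a common way to build delimited strings. A minimal sketch, assuming a hypothetical helper called join_by (the name is made up for this example); declaring IFS as local keeps the change from leaking into the rest of the script:

```shell
# Requires Bash. join_by is a hypothetical helper for this sketch.
# Usage: join_by SEPARATOR ELEMENT...
join_by() {
    local IFS="$1"        # local, so the caller's IFS is left untouched
    shift                 # drop the separator, leaving only the elements
    printf '%s\n' "$*"    # elements joined by the first character of IFS
}

a=( one two 'three and  four' )
joined=$(join_by : "${a[@]}")
echo "$joined"
```

Only the first character of the separator is used as the joiner, so this sketch is limited to single-character separators.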

OK, @ and double-quotes:

% echo $@
one two three and four
% echo "$@"
one two three and  four

Well, we can see the extra space in the double-quoted example but otherwise they look fairly similar. Looks can be deceptive:

% for x in $@ ; do echo $x ; done
one
two
three
and
four

That is completely wrong!

Fortunately (and remembering to double-quote $x as well!):

% for x in "$@" ; do echo "$x" ; done
one
two
three and  four

Phew! We've not lost anything.
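
The same distinction matters whenever a function passes its arguments on to something else. A minimal sketch, assuming the default IFS; show_args, forward_unquoted and forward_quoted are hypothetical names invented for this example:

```shell
# Requires Bash. All function names here are hypothetical.
# show_args wraps each argument it receives in <...> so that the
# boundaries between arguments are visible.
show_args() {
    printf '<%s>' "$@"
    printf '\n'
}

forward_unquoted() { show_args $@; }     # arguments are re-split on IFS
forward_quoted()   { show_args "$@"; }   # arguments are passed on intact

unquoted=$(forward_unquoted one two 'three and  four')
quoted=$(forward_quoted one two 'three and  four')

echo "$unquoted"
echo "$quoted"
```

The unquoted version should hand on five arguments and the quoted version the original three.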

Best Practice

Can we summarize anything here?

If you are going to use the value of anything:
1. use @
2. double-quote it

Unless you are going to do some funky trick with IFS to join the elements together, in which case you'll be using * and double-quoting it. That trick is rarely needed, though.

To visually distinguish between when things must be double-quoted (as above) and when you are merely counting things, use * for counting. Annoyingly, $# is the count of the Positional Parameters, which is visually different from the length of an array.

So, "${name[@]}" for the values and ${#name[*]} for the length of the array. And you can quickly see whether it should have been double-quoted or not. @ means double-quote.

You'll use "${name[*]}" a vanishingly small number of times and always in conjunction with modifying IFS -- you'll know when you do.
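
Put together, the convention might look like the following. This is only a sketch; the array name files and its contents are made up for illustration:

```shell
# Requires Bash. The array name and its contents are hypothetical.
files=( 'report.txt' 'my notes.txt' 'todo list.md' )

count=${#files[*]}            # * for counting: no quoting needed
seen=0
for f in "${files[@]}"; do    # @ plus double-quotes for the values
    seen=$(( seen + 1 ))
    printf '%d/%d: %s\n' "$seen" "$count" "$f"
done
```

Each file name, spaces included, should come out of the loop as a single value.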

Named Arrays

We want to avoid using the Positional Parameters as much as possible as they are the arguments to the script/function. We want to preserve them whenever possible. For lots of things we'll have a list of values which we want to lug around en masse, and for that we need an array.

Initialization

% a=( one two 'three and  four' )

There's some syntactic sugar there, =( and ), but it looks pretty reasonable.

If we were working another way we might have had to do:

% a[0]=one
% a[1]=two
% a[2]='three and  four'

To get the same result.
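
An array can also be built up incrementally with the += operator, which appends elements and is available in Bash 3.1 and later. A minimal sketch that builds the same array both ways and checks they match; the variable names b and same are made up for this example:

```shell
# Requires Bash 3.1 or later for +=.
a=( one two 'three and  four' )

b=()                            # start with an empty array
b+=( one )                      # += appends to the end
b+=( two 'three and  four' )

# Compare the two arrays element by element.
same=yes
[ "${#a[*]}" -eq "${#b[*]}" ] || same=no
for i in "${!a[@]}"; do         # "${!a[@]}" expands to the indexes of a
    [ "${a[$i]}" = "${b[$i]}" ] || same=no
done
echo "same: $same"
```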

Expansion

Parameter Expansion covers all the fun things you can do with arrays much like ordinary variables so there's no need to repeat that here.

Keys

Keys? Indexes of the array. Do we need to know this? We will in the next section and it's very handy if you're using Bash 4's associative arrays.

% echo ${!a[*]}
0 1 2
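
The keys are most useful when you need both the index and the value in a loop. A minimal sketch; the array name lines is made up for this example. The same "${!name[@]}" expansion works for Bash 4's associative arrays, although there the order of the keys is not something to rely on:

```shell
# Requires Bash. The array name lines is hypothetical.
a=( one two 'three and  four' )

lines=()
for i in "${!a[@]}"; do             # iterate over the indexes, not the values
    lines+=( "$i=${a[$i]}" )
done
printf '%s\n' "${lines[@]}"
```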

Unsetting

You can unset an entire array as you might expect.

You can also unset individual elements:

% unset 'a[1]'   # quoted so the shell cannot treat a[1] as a filename pattern
% echo ${#a[*]}
2
% echo ${a[*]}
one three and four

Looks good but wait:

% echo ${a[2]}
three and four

Oh oh! The length of the array is now reported as two and yet we can pull out the third element?

Oh yes and, indeed, oh no. Now we have no idea what the index of the last element of the array is. Well, we know the keys:

% echo ${!a[*]}
0 2

so we could pluck the last one off that list (array!):

% keys=( "${!a[@]}" )
% echo ${keys[ ${#keys[*]} - 1 ]}
2

Yuk, how does that work again? The line is:

% echo ${keys[ *subscript* ]}

where subscript is:

${#keys[*]} - 1

ie. the length of the keys array (2) minus 1. Equivalent to:

% echo ${keys[1]}
2

Which is just ugly. The moral here is: don't unset any elements in arrays unless you know what you're doing.

Nor should you set elements to the empty string, as that will trip you up in a similar way when you use the values.
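
If an element does have to be removed, one way to get back to a gap-free array is to re-assign the array from its own values, which renumbers the indexes from 0. A minimal sketch, assuming the default IFS; the variable names last_key, last_value and packed_keys are made up for this example:

```shell
# Requires Bash. Variable names other than a and keys are hypothetical.
a=( one two 'three and  four' )
unset 'a[1]'                           # leaves a gap: the keys are now 0 and 2

keys=( "${!a[@]}" )
last_key=${keys[ ${#keys[*]} - 1 ]}    # index of the last element
last_value=${a[$last_key]}

a=( "${a[@]}" )                        # re-pack: indexes are renumbered from 0
packed_keys="${!a[*]}"

echo "last key before re-packing: $last_key"
echo "keys after re-packing: $packed_keys"
```

Re-packing changes the index of every element after the gap, so it is only appropriate when nothing else is holding on to the old indexes.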
