Shell Scripting For Speed

Inspired by a forum thread

Choose which shell you use; that's the "#!/bin/..." line at the top of the script:

  • dash: very small & fast, but limited.
    You will have to use more external commands to accomplish things.
  • bash: bulkier, takes longer to load, maybe slightly slower, but you can accomplish much more with shell builtins: arrays, string manipulation... Often the call to awk or sed is no longer required.
  • sh: the default shell; which shell that actually is depends on your setup. When you use this one, it is best to write portable (POSIX) code, which is practically the same as coding for dash.
  • Of course this list isn't complete. There are many more shells: ksh, zsh, etc.

Once you have made that choice, it should influence your coding quite a lot: more external commands with dash, or extensive use of bash's internal capabilities (arrays, to name only one).
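
To illustrate how that choice shapes the code, here is a minimal sketch of one small task (taking the third field of a colon-separated line) done both ways. The sample line and the variable names are made up for this example; the second variant relies on bash arrays and the here-string, so it will not run under dash.

```shell
#!/bin/bash
# Made-up sample record in /etc/passwd style.
line='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

# Portable approach (also works in dash): a pipe plus the external cut command.
uid_external=$(printf '%s\n' "$line" | cut -d: -f3)

# Bash-only approach: the read builtin splits the line into an array.
IFS=: read -r -a fields <<< "$line"
uid_builtin=${fields[2]}    # arrays are zero-indexed, so [2] is the third field

printf '%s %s\n' "$uid_external" "$uid_builtin"
```

Both variables end up holding the same value; the difference is only in how many processes had to be started to get there.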

Be aware of what external commands are - each call to an external command slows your script down:

  • the command needs to be read from the hard drive
  • if it's inside a command substitution or at either end of a pipe, it also starts a new subshell (see here and here, etc.)
  • it is often overkill for the task required
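
As a small sketch of the "overkill" point: splitting a path into directory and file name is often done with basename and dirname, but parameter expansion can do the same for ordinary paths. The path below is a made-up example, and the expansions used here are POSIX, so this should work in dash as well as bash.

```shell
#!/bin/sh
# Made-up example path.
path='/var/log/app/report.txt'

# External commands: each call runs in a command substitution and starts a process.
name_external=$(basename "$path")
dir_external=$(dirname "$path")

# Parameter expansion: the shell does the same work itself.
name_builtin=${path##*/}    # remove the longest prefix ending in "/"
dir_builtin=${path%/*}      # remove the shortest suffix starting with "/"

printf '%s %s\n' "$name_builtin" "$dir_builtin"
```

The two approaches are not identical in every edge case: for a path with a trailing slash, or one without any slash at all, basename and dirname give different answers than the expansions do, so this only pays off when you know what your input looks like.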

Be equally aware of what the internal commands (or builtins) of the shell in question are. Quite a few commands (e.g. echo) are both internal and external. A hint: if you have a terminal open that runs the shell in question (help is itself a bash builtin, so this hint applies to bash), and you can use help somecommand, then it's a builtin, and that's what the shell will default to.
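
Inside a script, the same check can be done with bash's type builtin. The following is a minimal sketch for bash only (the -t option is a bash feature); kind_of is a hypothetical helper name chosen for this example, and the sed line assumes sed is installed.

```shell
#!/bin/bash
# kind_of is a hypothetical helper: it prints how bash would resolve a command name.
kind_of() {
    type -t "$1"
}

echo_kind=$(kind_of echo)   # "builtin": bash runs it without starting a program
sed_kind=$(kind_of sed)     # "file": an external program found via PATH
if_kind=$(kind_of if)       # "keyword": part of the shell syntax itself

printf 'echo=%s sed=%s if=%s\n' "$echo_kind" "$sed_kind" "$if_kind"
```

In an interactive bash session, type -a echo additionally lists every match, which shows the builtin first and the external echo after it.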

So, reading the shell's man page might help. That's a realistic task with e.g. dash, but not with bash, whose man page is very long.

Disclaimer: I like to use bash. Knowing its capabilities allows me to make do with almost no piping to sed, awk etc. I reckon that in the end this is much faster. Just imagine you have one of those hideous multi-pipe one-liners inside a loop that parses a long file, or that needs to be executed at short intervals: that's many, many subshells opened, external commands read, and subshells closed again in rapid succession. If I can replace all of that with bash variables and string manipulation, it will save a lot of resources.
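
A rough sketch of that situation, assuming a made-up key=value input that stands in for the long file. The function names are hypothetical. The first loop pipes every line through cut twice; the second gets the same result with parameter expansion alone.

```shell
#!/bin/bash
# Made-up sample data standing in for a long file.
input='name=demo
mode=fast
retries=3'

# Slow variant: every line goes through two pipelines with an external cut.
parse_with_pipes() {
    while IFS= read -r line; do
        key=$(printf '%s\n' "$line" | cut -d= -f1)
        value=$(printf '%s\n' "$line" | cut -d= -f2-)
        printf '%s -> %s\n' "$key" "$value"
    done
}

# Fast variant: parameter expansion only, no external command per line.
parse_with_builtins() {
    while IFS= read -r line; do
        key=${line%%=*}     # drop everything from the first "=" onwards
        value=${line#*=}    # drop everything up to and including the first "="
        printf '%s -> %s\n' "$key" "$value"
    done
}

out_pipes=$(printf '%s\n' "$input" | parse_with_pipes)
out_builtins=$(printf '%s\n' "$input" | parse_with_builtins)
printf '%s\n' "$out_builtins"
```

With three lines the difference is not noticeable. To see it, feed both functions a file with a few thousand lines and compare them with the time keyword.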

I recommend that everyone who chooses bash read up on string manipulation. There are quite a few resources out there, and you may well end up on this page, with the obligatory disclaimer that it's outdated. Maybe this one for a quick reference, or this one.
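
To give a first taste, here are a few of the expansions those resources cover, as a minimal sketch. The sample string and variable names are made up; the substring and replace forms are bash features and will not work in dash.

```shell
#!/bin/bash
s='hello world'

len=${#s}               # length of the string
sub=${s:6:5}            # substring: 5 characters starting at offset 6
first=${s/o/0}          # replace the first "o" only
all=${s//o/0}           # replace every "o", similar to sed 's/o/0/g'
no_prefix=${s#hello }   # strip a leading pattern

printf '%s|%s|%s|%s|%s\n' "$len" "$sub" "$first" "$all" "$no_prefix"
```

This prints 11|world|hell0 world|hell0 w0rld|world, all without starting a single external command.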

And you definitely need to check out Greg's Wiki.