Intro to Bash Scripting
Notes from Datacamp Course
The content is based on DataCamp’s Intro to Bash Scripting.
Intro
Bash stands for Bourne Again Shell
Bash was developed in the 1980s. It is often the default shell on Unix systems and Macs. Unix is the backbone of the internet, which is why all major cloud providers have command-line interfaces to their products.
Prerequisites:
- Understand what the command line is (terminal, shell)
- Understand basic commands such as cat, grep, sed, etc. (check out the bash shell cheat sheet and this explanation of sed)
Basic commands
- (e)grep: filters input based on regex pattern matching
- cat: concatenates file contents line-by-line
- tail / head: give only the last / first -n lines
- wc: does a word or line count (flags: -w -l)
- sed: pattern-matched string replacement
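A minimal sketch of these commands in action. The sample text and the variable names are made up for illustration and are not part of the course material.

```bash
# Hypothetical sample input: four lines of text held in a variable.
sample=$'banana\napple\ncarrot\napple'

# head / tail: first and last N lines.
first_two=$(printf '%s\n' "$sample" | head -n 2)
last_one=$(printf '%s\n' "$sample" | tail -n 1)

# wc -l: count lines (tr strips the padding some wc versions add).
line_count=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')

# grep -c: count the lines that match the pattern.
apples=$(printf '%s\n' "$sample" | grep -c 'apple')

# sed: replace the first match of the pattern on each line,
# then keep only the last line of the result.
replaced=$(printf '%s\n' "$sample" | sed 's/apple/pear/' | tail -n 1)

echo "$first_two"
echo "$last_one"
echo "$line_count"
echo "$apples"
echo "$replaced"
```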
REGEX
Regular expressions are a vital skill for Bash scripting. Great site to test your expressions.
- [] creates a set, for example: [afct]
- ^ as the first character inside the brackets inverts the set, for example: [^afct]
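As a small illustration of a set and an inverted set with grep, assuming a made-up word list (the words and variable names are hypothetical):

```bash
# Hypothetical word list, one word per line.
words=$'fat\ncat\nhat\nbat'

# [fc]at matches "fat" and "cat": the set allows f or c before "at".
in_set=$(printf '%s\n' "$words" | grep '[fc]at')

# [^fc]at matches any character other than f or c before "at".
not_in_set=$(printf '%s\n' "$words" | grep '[^fc]at')

echo "$in_set"
echo "$not_in_set"
```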
Piping
- | is used for piping, for example: sort | uniq -c
Examples
fruits.txt
banana
apple
carrot
1
grep 'a' fruits.txt
banana
apple
carrot
2
grep 'p' fruits.txt
apple
3
grep '[pc]' fruits.txt
apple
carrot
4
cat fruits.txt | sort | uniq -c | head -n 3
1 apple
1 banana
1 carrot
Bash Scripts
General structure
- A bash script usually begins with #!/usr/bash. This line is called the shebang and tells the interpreter that this is a bash script. The path after #! must be where bash lives on your system (often /bin/bash); which bash can be used to check where bash is
- Middle of script contains the code
- .sh is conventionally used as file extension
- bash script_name.sh runs the script
- ./script_name.sh runs the script if the shebang is the first line and the file is executable
Example
#!/usr/bash
echo "Hello world"
echo "Goodbye world"
./eg.sh
or
bash eg.sh
Hello world
Goodbye world
Each line in your Bash script can be a shell command. Thus, you can also include pipes in your Bash script.
Three streams for programs
- STDIN (standard input): stream of data into the program
- STDOUT (standard output): stream of data out of the program
- STDERR (standard error): errors in the program
The streams come from and write out to the terminal.
Adding 2> /dev/null to a script call redirects STDERR to /dev/null, which discards it (1> /dev/null would do the same for STDOUT).
Example
sports.txt
football
basketball
swimming
cat sports.txt 1> new_sports.txt
cat new_sports.txt
football
basketball
swimming
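A short sketch of how the two output streams can be separated. The report function is hypothetical, and 2>&1 (not covered in these notes) points STDERR at wherever STDOUT currently goes.

```bash
# Hypothetical helper that writes one line to each output stream.
report() {
  echo "this goes to STDOUT"
  echo "this goes to STDERR" >&2
}

# Discard STDERR, capture STDOUT.
only_out=$(report 2> /dev/null)

# Point STDERR at the capture first, then discard STDOUT.
only_err=$(report 2>&1 1> /dev/null)

echo "$only_out"
echo "$only_err"
```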
Arguments (ARGV)
Pass arguments by adding them, separated by spaces, after the script execution call.
- ARGV is the array of all the arguments given to the program
- Each argument can be accessed via the $ notation: the first argument as $1, the second as $2, etc.
- $@ and $* give all the arguments in ARGV
- $# gives the length (number) of arguments
Example
#!/usr/bash
echo $1
echo $2
echo $@
echo "There are $# arguments"
bash args.sh one two three four five
one
two
one two three four five
There are 5 arguments
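The same notation also works inside a function, which makes it easy to try out without a separate script file. A minimal sketch, with show_args as a hypothetical name:

```bash
# Hypothetical function; $1, $2, $* and $# behave as they do in a script.
show_args() {
  echo "first: $1"
  echo "second: $2"
  echo "all: $*"
  echo "count: $#"
}

show_args one two three
```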
Basic variables in Bash
Quotes and backticks
firstname='Cynthia'
lastname='Liu'
echo "Hi there" $firstname $lastname
Hi there Cynthia Liu
- The $ is crucial for bash to treat something as a variable and not a string
- Do not add spaces around the = sign
- Single quotes ('sometext') = Shell interprets what is between the quotes literally
- Double quotes ("sometext") = Shell interprets literally except for $ and backticks
- Backticks (`sometext`) = creates a shell-within-a-shell; Shell runs the command and captures STDOUT back into a variable
Examples
Single quotes
now_var='NOW'
now_var_singlequote='$now_var'
echo $now_var_singlequote
$now_var
Double quotes
now_var='NOW'
now_var_doublequote="$now_var"
echo $now_var_doublequote
NOW
Typing the following command into the terminal returns the current date
date
Tue Jan 18 10:51:50 CET 2022
Backticks
By using backticks around date, you can invoke a shell-within-a-shell
rightnow_doublequote="The date is `date`."
echo $rightnow_doublequote
The date is Tue Jan 18 10:51:50 CET 2022.
A shell-within-a-shell can also be achieved by using $(date)
rightnow_doublequote="The date is $(date)."
echo $rightnow_doublequote
The date is Tue Jan 18 10:51:50 CET 2022.
Numeric variables in Bash
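Numbers are not natively supported in Bash the way they are in languages such as Python: typing 1 + 4 at the prompt gives an error rather than 5.
expr
- expr is a useful utility program (like cat or grep)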
expr 1 + 4
5
- expr cannot handle decimal places
bc
- bc (basic calculator) opens a calculator which can handle decimal places
- bc can be used in piping by sending a string
echo "5 + 7.5" | bc
12.5
- bc has a scale argument for defining the number of decimal places
echo "10 / 3" | bc
3
echo "scale=3; 10 / 3" | bc
3.333
dog_name='Roger'
dog_age=6
echo "My dog's name is $dog_name and he is $dog_age years old"
Double parentheses notation (not for decimals)
expr 5 + 7
12
echo $((5 + 7))
12
model1=87.65
model2=89.20
echo "The total score is $(echo "$model1 + $model2" | bc)"
echo "The average score is $(echo "($model1 + $model2) / 2" | bc)"
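A brief sketch of the $(( )) notation with variables, showing why it is not suited to decimals. The variable names and values are made up for illustration.

```bash
apples=7
people=2

# Integer arithmetic: variables can be used inside $(( )).
total=$((apples + 3))

# Division truncates, so the .5 is lost.
share=$((apples / people))

# The remainder is available through the modulo operator.
left_over=$((apples % people))

echo "$total $share $left_over"
```

When the fractional part matters, piping the expression to bc with a scale, as in the earlier examples, is the alternative.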
Arrays in Bash
1) Normal numerical-indexed structure
- equivalent to a list in Python
Option 1 (declare)
declare -a my_first_array
Option 2 (brackets and spaces)
my_array=(1 3 5 2)
Return full array
echo ${my_array[@]}
1 3 5 2
Return length of an array
echo ${#my_array[@]}
Accessing array elements
echo ${my_array[2]}
5
Manipulating array elements
my_array[1]=999
echo ${my_array[@]}
1 999 5 2
Slicing arrays
- Use array[@]:N:M to slice out a subset of the array
- N is the starting index
- M is the number of elements to be returned
my_array=(15 20 300 42 23 2 4 33 54 67 66)
echo ${my_array[@]:3:2}
42 23
Appending to arrays
- Use array+=(elements)
my_array=(300 42 23 2 4 33 54 67 66)
my_array+=(10)
echo ${my_array[@]}
300 42 23 2 4 33 54 67 66 10
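The array operations above can be combined with a loop. A minimal sketch, assuming a hypothetical scores array:

```bash
scores=(15 20 300 42)

# Append one element, then overwrite the element at index 0.
scores+=(23)
scores[0]=99

# Loop over every element; quoting keeps each element intact.
total=0
for score in "${scores[@]}"; do
  total=$((total + score))
done

# Length of the array, and a slice of 3 elements starting at index 1.
count=${#scores[@]}
middle="${scores[*]:1:3}"

echo "count=$count total=$total middle=$middle"
```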
2) Associative arrays
- similar to normal array, but with key-value pairs instead of numerical indexes (similar to a dictionary in Python)
- only available in Bash 4 onwards
Multiple line approach
declare -A city_details # Declare first
city_details=([city_name]="New York" [population]=14000000) # Add elements
echo ${city_details[city_name]} # Index using key to return a value
New York
One line approach
declare -A city_details=([city_name]="New York" [population]=14000000)
echo ${city_details[city_name]}
New York
Access the keys
- Use ! to access the keys (their order is not guaranteed)
echo ${!city_details[@]}
city_name population
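Keys and values can be walked together in a loop. A sketch that reuses city_details from above; the country entry is added purely for illustration, and the output is piped through sort because the key order is not guaranteed.

```bash
# Requires Bash 4 or newer.
declare -A city_details=([city_name]="New York" [population]=14000000)

# Add another key after the declaration (illustrative value).
city_details[country]="USA"

# Loop over the keys and look up each value.
summary=$(
  for key in "${!city_details[@]}"; do
    echo "$key=${city_details[$key]}"
  done | sort
)

echo "$summary"
```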
IF statements
Basic structure
if [ CONDITION ]; then
# some code
else
# some other code
fi
- Spaces between square brackets and conditional elements
- Semi-colon after close bracket ];
Strings
- == for equal to
- != for not equal to
x="Queen"
if [ "$x" == "King" ]; then
echo "$x is a King!"
else
echo "$x is not a King"
fi
Queen is not a King
Arithmetic IF statements
1) Double-parenthesis structure
x=10
if (($x > 5)); then
echo "$x is more than 5!"
fi
10 is more than 5!
2) Square brackets with flags
Arithmetic bash conditional flags
- -eq for equal to (==)
- -ne for not equal to (!=)
- -lt for less than (<)
- -le for less than or equal to (<=)
- -gt for greater than (>)
- -ge for greater than or equal to (>=)
x=10
if [ $x -gt 5 ]; then
echo "$x is more than 5!"
fi
10 is more than 5!
Other bash conditional flags
- -e if the file exists
- -s if the file exists and has a size greater than zero
- -r if the file exists and is readable
- -w if the file exists and is writeable
More bash conditional expressions can be found here.
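A small sketch of the file flags, assuming a temporary file created with mktemp (the variable names are hypothetical):

```bash
# Create an empty temporary file to test against.
tmp_file=$(mktemp)

# -e is true as soon as the file exists.
if [ -e "$tmp_file" ]; then exists="yes"; else exists="no"; fi

# -s is false while the file is still empty.
if [ -s "$tmp_file" ]; then filled_before="yes"; else filled_before="no"; fi

# After writing to the file, -s becomes true.
echo "some data" > "$tmp_file"
if [ -s "$tmp_file" ]; then filled_after="yes"; else filled_after="no"; fi

echo "$exists $filled_before $filled_after"
rm -f "$tmp_file"
```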
AND and OR
- && for AND
- || for OR
Chaining conditionals (1)
x=10
if [ $x -gt 5 ] && [ $x -lt 11 ]; then
echo "$x is more than 5 and less than 11!"
fi
Chaining conditionals (2)
x=10
if [[ $x -gt 5 && $x -lt 11 ]]; then
echo "$x is more than 5 and less than 11!"
fi
IF and command-line programs
Option 1
words.txt
echo "Hello world!"
if grep -q Hello words.txt; then
echo "Hello is inside!"
fi
Hello is inside!
Option 2
if $(grep -q Hello words.txt); then
echo "Hello is inside!"
fi
- -q stands for quiet, so grep doesn’t print the matched lines like it normally does. It just returns true if any lines match.
- When using command-line programs like grep in IF statements, there is no need for square brackets
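The pattern can be wrapped in a function and given an else branch. A sketch under the assumption of a temporary file standing in for words.txt; contains_word is a hypothetical name.

```bash
# Hypothetical helper: report whether a word occurs in a file.
contains_word() {
  if grep -q "$1" "$2"; then
    echo "$1 is inside!"
  else
    echo "$1 is missing"
  fi
}

# Temporary stand-in for words.txt.
tmp_file=$(mktemp)
echo "Hello world!" > "$tmp_file"

found=$(contains_word Hello "$tmp_file")
missing=$(contains_word Goodbye "$tmp_file")

echo "$found"
echo "$missing"
rm -f "$tmp_file"
```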
FOR loops
Basic structure
for x in 1 2 3
do
echo $x
done
1
2
3
Number ranges
1) Brace expansion
- {START..STOP..INCREMENT}
for x in {1..5..2}
do
echo $x
done
1
3
5
2) Three expression
- double parenthesis
for ((x=2;x<=4;x+=2))
do
echo $x
done
2
4
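Both range styles can drive ordinary loop bodies. A minimal sketch that accumulates results instead of printing each value (the variable names are hypothetical):

```bash
# Brace expansion: visits 1, 3, 5.
odd_sum=0
for x in {1..5..2}; do
  odd_sum=$((odd_sum + x))
done

# Three-expression loop: visits 2, 4.
even_list=""
for ((x = 2; x <= 4; x += 2)); do
  even_list+="$x "
done

echo "$odd_sum"
echo "$even_list"
```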
Glob expansions
- * for pattern-matching expansions in for loops
for book in books/*
do
echo $book
done
books/book1.txt
books/book2.txt
Shell-within-a-shell
Let’s assume the following folder structure:
books/
|+--AirportBook.txt
|+--CattleBook.txt
|+--FairMarketBook.txt
|+--LOTR.txt
|+--file.csv
Loop through the results of a call to shell-within-a-shell using $():
for book in $(ls books/ | grep -i 'air')
do
echo $book
done
AirportBook.txt
FairMarketBook.txt
WHILE statements
Syntax
x=1
while [ $x -le 3 ];
do
echo $x
((x+=1))
done
1
2
3
Make sure not to create an infinite loop!
x=1
while [ $x -le 3 ];
do
echo $x
done
This will print out 1 forever!
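A sketch of a loop that is guaranteed to end because the counter changes on every pass. The seen variable is only there to record what the loop visited.

```bash
x=1
seen=""

# The counter is updated on every pass, so the condition eventually fails.
while [ "$x" -le 3 ]; do
  seen+="$x "
  ((x += 1))
done

echo "$seen"
echo "$x"
```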
CASE statements
Why CASE statements?
Clearer and more efficient once IF statements get complex
A complex IF statement
if grep -q 'sydney' $1; then
mv $1 sydney/
fi
if grep -Eq 'melbourne|brisbane' $1; then
rm $1
fi
if grep -q 'canberra' $1; then
mv $1 "IMPORTANT_$1"
fi
CASE statement structure
case 'STRINGVAR' in
PATTERN1)
COMMAND1;;
PATTERN2)
COMMAND2;;
*)
DEFAULT COMMAND;;
esac
From IF to CASE
case $(cat $1) in
*sydney*)
mv $1 sydney/ ;;
*melbourne*|*brisbane*)
rm $1 ;;
*canberra*)
mv $1 "IMPORTANT_$1" ;;
*)
echo "No cities found" ;;
esac
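To try the pattern matching without moving or deleting files, the same CASE structure can be run on plain strings. A sketch with a hypothetical classify_city function that only prints what it would do:

```bash
# Hypothetical helper: classify a line of text by the city it mentions.
classify_city() {
  case "$1" in
    *sydney*)
      echo "move to sydney/" ;;
    *melbourne*|*brisbane*)
      echo "delete" ;;
    *canberra*)
      echo "mark as important" ;;
    *)
      echo "no cities found" ;;
  esac
}

classify_city "flight to sydney"
classify_city "train to brisbane"
classify_city "bus to perth"
```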
Basic Functions
Syntax
Option 1
function_name () {
#function_code
return #something
}
Option 2
function function_name {
#function_code
return #something
}
Calling a function
function print_hello () {
echo "Hello world!"
}
print_hello
Hello world!
Example
temp_f=30
function convert_temp () {
temp_c=$(echo "scale=2; ($temp_f - 32) * 5 / 9" | bc)
echo $temp_c
}
convert_temp
-1.11
Arguments, return values, and scope
- $1 notation to access arguments
- $@ and $* give all the arguments in ARGV
- $# gives the length (number) of arguments
function print_filename {
echo "The first file was $1"
for file in "$@"
do
echo "This file has name $file"
done
}
print_filename "LOTR.txt" "mod.txt" "A.py"
The first file was LOTR.txt
This file has name LOTR.txt
This file has name mod.txt
This file has name A.py
Scope
Global by default
Scope in programming refers to how accessible a variable is.
- Global: something is accessible anywhere in the program, including inside FOR loops, IF statements, functions etc.
- Local: something is only accessible in a certain part of the program
In Bash, all variables are global by default.
function print_filename {
first_filename=$1
}
print_filename "LOTR.txt" "model.txt"
echo $first_filename
LOTR.txt
Restricting scope
- Use local to restrict variable scope
function print_filename {
local first_filename=$1
}
print_filename "LOTR.txt" "model.txt"
echo $first_filename
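The two behaviours can be compared side by side. A minimal sketch, with set_names and both variable names being hypothetical:

```bash
set_names() {
  global_name="$1"        # still visible after the function returns
  local local_name="$2"   # discarded when the function returns
}

set_names "LOTR.txt" "model.txt"

# The quotes around each value make the empty one easy to spot.
echo "global_name='$global_name'"
echo "local_name='$local_name'"
```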
Return values
The return option in Bash is only meant to determine if the function was a success (0) or failure (other values 1-255). It is captured in the global variable $?
function function_2 {
echlo
}
function_2
echo $?
function_2: command not found: echlo
127
Returning correctly
function convert_temp {
echo $(echo "scale=2; ($1 - 32) * 5 / 9" | bc)
}
converted=$(convert_temp 30)
echo "30F in Celsius is $converted C"
30F in Celsius is -1.11 C
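Both mechanisms can be used together: echo for the result, return for success or failure. A sketch with a hypothetical safe_divide function that uses integer arithmetic so it does not depend on bc:

```bash
# Hypothetical helper: echo the result, use return only for the status.
safe_divide() {
  if [ "$2" -eq 0 ]; then
    echo "cannot divide by zero" >&2
    return 1
  fi
  echo $(( $1 / $2 ))
}

# Capture the echoed result with a shell-within-a-shell.
result=$(safe_divide 10 2)
echo "10 / 2 = $result"

# $? holds the status of the failed call; it is copied right away
# because the next command would overwrite it.
status=0
safe_divide 10 0 2> /dev/null || status=$?
echo "status for 10 / 0: $status"
```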
Cron to schedule scripts
Cron (from chronos) has been part of unix-like systems since the 70’s.
A crontab is a file that contains cronjobs, which each tell cron what code to run and when.
Crontab - the driver of cronjobs
crontab -l
crontab: no crontab for user
Crontab and cronjob structure
There are 5 stars to set, one for each time unit.
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday;
# │ │ │ │ │ 7 is also Sunday on some systems)
# │ │ │ │ │
# │ │ │ │ │
# * * * * * <command to execute>
The default, *, means every.
Example
5 1 * * * bash myscript.sh
- Minutes star is 5 (5 minutes past the hour). Hours star is 1 (after 1am). The last three are *, so every day and month
- Overall: run every day at 1:05am
15 14 * * 7 bash myscript.sh
- Minutes star is 15 (15 minutes past the hour). Hours star is 14 (after 2pm). Next two are * (every day of month, every month of year). Last star is day 7 (on Sundays).
- Overall: run at 2:15pm every Sunday
Advanced cronjob structure
Run something multiple times per day or every X time increments.
- Use a comma for specific time intervals: 15,30,45 * * * * will run at the 15, 30 and 45 minute marks for whatever hours are specified by the second star
- Use a slash for every X increment: */15 * * * * runs every 15 minutes
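To make the five-field layout concrete, a cronjob line can be split into its parts with read. This is only an illustrative sketch: describe_cron is a hypothetical helper, it does not validate the schedule, and it is not part of cron itself.

```bash
# Hypothetical helper: split one cronjob line into its five time fields
# and the command that follows them.
describe_cron() {
  local minute hour day_of_month month day_of_week job
  read -r minute hour day_of_month month day_of_week job <<< "$1"
  echo "minute=$minute hour=$hour day_of_month=$day_of_month month=$month day_of_week=$day_of_week job=$job"
}

# Single quotes stop the shell from expanding the stars as globs.
describe_cron '15 14 * * 7 bash myscript.sh'
describe_cron '*/15 * * * * bash myscript.sh'
```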
Create your first cronjob
Good tutorial for Mac users.
Let’s schedule a script called extract_data.sh to run every morning at 1:30am:
- In terminal type crontab -e to edit your list of cronjobs
  - It may ask what editor to use. nano is an easy option with a less steep learning curve than vi (vim)
- Create the cronjob:
  - 30 1 * * * extract_data.sh
- Exit the editor to save it
  - In nano (on Mac) you would use ctrl + o, then enter, then ctrl + x to exit
  - You will see a message crontab: installing new crontab
- Check it is there by running crontab -l
30 1 * * * extract_data.sh