Script 1 of the online course "An introduction to agent-based modeling with Python" by Claudius Gräbner

For the course homepage and links to the accompanying videos (in German) see: http://claudius-graebner.com/introabmdt.html

For the English version (currently without videos) see: http://claudius-graebner.com/introabmen.html

Last updated July 18 2019

Introduction to Python 1

The very basics

This is the script for the video introduction to the Python programming language. It is addressed to beginners who have no previous experience in programming.

In any case, I suggest you have a quick look at a primer on what programming languages do and how they work in theory. This will help you understand their functioning.

But let's get started!

The first thing you will need to understand is how to issue commands to Python. If you type something into the console and press enter, you tell Python to execute (or 'to compute') what you have just typed.

If what you have typed before is consistent with the Python syntax, Python will obey and execute the command:

In [1]:
2 + 2
Out[1]:
4
In [2]:
5*12
Out[2]:
60
In [3]:
print("Hello!")
Hello!

Yet, if you write a command that is not consistent with the Python syntax, or contains any other kind of error, your computer is going to complain and throw an error message at you - something you might not understand at the moment, but something that is actually pretty useful:

In [4]:
printsomething!
  File "<ipython-input-4-73950305a598>", line 1
    printsomething!
                  ^
SyntaxError: invalid syntax

Python can be used for a wide range of tasks, such as machine learning, data visualization, econometrics, agent-based modeling, text mining, web scraping, etc.

Yet we will start low and first use the console as a simple calculator.

For example, you can ask Python to solve basic mathematical operations. To do so, you can make use of basic mathematical operators. Operators are symbols that tell Python to carry out a particular form of computation:

In [5]:
2 + 2 # Addition
Out[5]:
4
In [6]:
2 - 4 # Subtraction
Out[6]:
-2
In [7]:
2*2 # Multiplication
Out[7]:
4
In [8]:
2/2 # Division
Out[8]:
1.0
In [9]:
2**2 # Power
Out[9]:
4
In [10]:
2**0.5 # Taking roots
Out[10]:
1.4142135623730951

Note that I have written some stuff after a '#'. In Python '#' indicates a comment. This means that everything that comes after the '#' will not be executed by the computer. So you can use it to make annotations and explanations of your code.

Later, when you write models, commenting on them extensively is essential if you ever want others to understand your code, or get back into your code after some time of absence!
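The operators introduced above can also be combined in a single expression. The following short sketch is not part of the original script; it only illustrates that Python follows the usual mathematical order of operations and that parentheses change this order. The names a, b, and c are made up for this illustration.

```python
# Multiplication binds more strongly than addition
a = 2 + 3 * 4      # evaluated as 2 + (3 * 4)

# Parentheses change the order of evaluation
b = (2 + 3) * 4

# The power operator binds more strongly than multiplication
c = 2 * 3 ** 2     # evaluated as 2 * (3 ** 2)

print(a, b, c)
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit.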

Writing commands into the console is useful only in some situations, for example when you use Python as a simple calculator. But this is not what the language is built for!

Whenever you want Python to solve some more complex tasks, you are obliged to write scripts.

Scripts are text documents that contain Python code. It is easy to tell your computer to execute these scripts. This basically means that you ask your computer to read the script and execute every line of the code - almost as if you had written every line into the console and pressed enter (you can actually copy-paste the code into the console and press enter, but this is usually a bad idea, for many reasons).

When you write a script containing the following lines and execute it you might be surprised by the outcome:

In [11]:
3*4
5 + 9
9 - 10
Out[11]:
-1

Python only displays the result of the last line. (If you run a script outside of a notebook or console, nothing is displayed at all unless you print it.) Whenever you want to see the results of all calculations, you must tell Python to print them.

This can be done by the print function. Functions are pre-defined algorithms that do stuff for you. The print function prints whatever you pass to it as an input. To pass input to a function, write the function name, followed by brackets containing the input, also called the arguments of the function:

In [12]:
print(3*4)
print(5 + 9)
print(9 - 10)
12
14
-1

Since we are now working in a script you can provide Python with some more complex tasks. In particular, you can issue some mathematical task, let Python store the result in an object, and then process this object further.

This is really useful:

In [13]:
x = 5 + 12
y = 8**2
z = 6*x + (100-y)
print(z)
138

An association in Python is created via the = operator (this is called an assignment). Whatever is written on the right side of = gets associated with the name on the left side of the =. This name is also called an identifier.

For example, in the previous code chunks, the letters 'x', 'y', and 'z' were identifiers. If you call them via the console, they help Python to identify the object you want to get. In the case of 'x', the 'identified' object is the integer 17:

In [14]:
x
Out[14]:
17

There are a few rules for valid identifiers: basically, any combination of lowercase (a to z) or uppercase (A to Z) letters, digits (0 to 9) and underscores (_) can be used for an identifier.

Yet, there are also a few exceptions:

  • You must not use keywords as identifiers. Keywords are reserved words that play a very particular and important role in Python, and must not be overwritten. The Python version used here has 33 keywords (newer versions add a few more, such as async and await), and we will learn how to use some of them later:
In [15]:
import keyword
keyword.kwlist
Out[15]:
['False',
 'None',
 'True',
 'and',
 'as',
 'assert',
 'break',
 'class',
 'continue',
 'def',
 'del',
 'elif',
 'else',
 'except',
 'finally',
 'for',
 'from',
 'global',
 'if',
 'import',
 'in',
 'is',
 'lambda',
 'nonlocal',
 'not',
 'or',
 'pass',
 'raise',
 'return',
 'try',
 'while',
 'with',
 'yield']
  • Identifiers must not start with a digit.

  • Identifiers must not contain symbols other than letters, digits, and underscores. Special symbols such as '!' are not admissible (which is why operators cannot be part of an identifier).
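If you are unsure whether a name is a valid identifier, you can let Python check the rules listed above. The following is a minimal sketch that is not part of the original script. It uses the string method isidentifier and the function keyword.iskeyword from the standard library; the example names such as "agent_1" are made up. Note that isidentifier only checks which characters are used, so keywords have to be tested separately.

```python
import keyword

# Only letters, digits and underscores, not starting with a digit
valid_name = "agent_1".isidentifier()

# Starts with a digit
starts_with_digit = "1agent".isidentifier()

# Contains the special symbol '!'
contains_symbol = "agent!".isidentifier()

# 'for' consists of letters only, but it is a keyword
for_is_keyword = keyword.iskeyword("for")

print(valid_name, starts_with_digit, contains_symbol, for_is_keyword)
```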

Additionally, there are some conventions. For example, simple associations should not start with an uppercase letter (this is reserved for the definition of classes, more on this later).

Also, it is not advisable to use names that are already associated with functions, such as print. If you associate something else with the name print, you override the reference to the print function - and you most likely want to avoid this!

To see the association of the name print with the print function, just type print without the brackets:

In [16]:
print
Out[16]:
<function print>

While we have so far only used numbers, there is nothing that prevents us from letting Python print other things, such as strings:

In [17]:
print("Hello!")
Hello!

But what is actually the difference between a 'number' and a 'string'?

To better understand this, we need to learn about data types.

Data types

For your programming work it is important to know about different types. This is because several operators, functions or keywords only work for some types, or their effect on different types is different.

Therefore, we now learn about some of the most common and important types. For the sake of simplicity (and theoretically not 100% correct), we distinguish between primitive types and containers, whereby containers are collections of potentially numerous primitive types.

The primitive types we will have a look at are:

  • integers (such as 1)
  • floats (such as 1.0)
  • strings (such as "One")
  • Boolean values (such as 'True' and 'False')

The containers we consider at this stage are:

  • lists
  • tuples
  • sets
  • dictionaries

To get the type of an object you might use the function type:

In [18]:
type(3)
Out[18]:
int

Fact for the future: This essentially tells us that 3 is an instance of the class int. Classes are blueprints for certain objects with particular properties and will become absolutely essential for agent-based modelling later. We will learn how to define our own classes in one of the next sessions.

Integers

Integers are abbreviated as int in Python.

In [19]:
type(3)
Out[19]:
int

On integers you can perform all the typical mathematical operations, such as addition, subtraction, multiplication, etc.

Also, integers are the output of many functions, e.g. functions that count stuff (more on this below).

Floats

Like integers, floats are numbers, but they come with decimal places:

In [20]:
type(3.5)
Out[20]:
float

As for integers, all conventional mathematical operations are defined for floats.

It is also possible to transform integers to floats, and vice versa:

In [21]:
x = 2
type(x) # an integer
y = 2.5 
type(y) # a float
x = float(x) # transforms x into a float
y = int(y) # transforms y into an integer

y
Out[21]:
2

Keep in mind that while transforming an integer to a float usually comes without problems, some information is lost when transforming floats to integers: all the decimal places are simply removed.

Thus, when you sum floats and integers, the result will be a float, such that no information gets lost - even if the result has only zero decimal places:

In [22]:
z = 2 + 8.0 # an integer plus a float
type(z)
Out[22]:
float
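As noted above, transforming a float into an integer removes the decimal places. The following sketch is not part of the original script and only illustrates what 'removing' means in practice: int cuts the decimal places off and does not round. The built-in function round is shown for comparison; the variable names are made up.

```python
# int() cuts off the decimal places, also for negative numbers
cut_positive = int(2.9)
cut_negative = int(-2.9)

# round() goes to the nearest integer instead
rounded = round(2.9)

print(cut_positive, cut_negative, rounded)
```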

You might ask yourself why Python distinguishes between integers and floats, and does not simply use floats only. There are mathematical and logical reasons for this. As in mathematics, integers are required whenever you want to count stuff, or when you want to tell Python to do something x times (see below), or if you want to use them as indices. We will soon encounter several examples where the distinction gets illustrated hands on.

Strings

Strings usually behave very differently than integers and floats.

For example, not all the conventional mathematical operations are defined for strings:

In [23]:
x = "a string" # strings are defined by quotation marks at the beginning and the end
y = "another string"

x - y
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-b0bac9f2b604> in <module>()
      2 y = "another string"
      3 
----> 4 x - y

TypeError: unsupported operand type(s) for -: 'str' and 'str'

Other mathematical operations, however, work for strings, but in a different way than on floats or integers:

In [24]:
x + y
Out[24]:
'a stringanother string'

Some operations work with integers and strings:

In [25]:
5 * x
Out[25]:
'a stringa stringa stringa stringa string'

Note that this does not work for floats and strings:

In [26]:
5.0 * x
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-2874fe223c30> in <module>()
----> 1 5.0 * x

TypeError: can't multiply sequence by non-int of type 'float'
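Because strings and numbers behave so differently, you often need to convert between them explicitly. The following is a small sketch, not taken from the original script, that uses the built-in functions str, int and float for this purpose. The variable names (n_agents, message, and so on) are made up for the illustration.

```python
n_agents = 5

# Convert the integer to a string before concatenating it with another string
message = "Number of agents: " + str(n_agents)
print(message)

# Convert strings that contain digits into numbers
as_int = int("42")
as_float = float("3.5")
print(as_int + 1, as_float * 2)
```

Without the call to str, the concatenation would fail with a TypeError similar to the ones shown above.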

Boolean values

There are two different boolean values: True and False.

In [27]:
x = True
type(x)
Out[27]:
bool

They naturally occur when you test for relations such as 'greater than', 'smaller than' or 'equal to':

In [28]:
5 > 6
8 == 8.0
Out[28]:
True

Interestingly, boolean values sometimes behave like integers:

In [29]:
True + True + False
Out[29]:
2

Summing up boolean values returns the number of True values. This is useful in many instances.
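One such instance is counting how many conditions are fulfilled. The following sketch is not part of the original script; the wealth values and the threshold are invented for this illustration.

```python
wealth_a = 12
wealth_b = 3
wealth_c = 8
threshold = 5

# Each comparison gives True or False; adding them up counts the True values
n_above = (wealth_a > threshold) + (wealth_b > threshold) + (wealth_c > threshold)
print(n_above)
```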

Lists

We now turn to containers. As indicated above, containers can contain a collection of primitive data types:

In [30]:
x = [1, 50, "Bla", 5.0]
type(x)
Out[30]:
list

A list can contain an arbitrary number of primitive data types, which do not necessarily need to be of the same type.

In fact, a list can also contain other containers, e.g. lists:

In [31]:
y = [4, 2, x, "blubb"]
y
Out[31]:
[4, 2, [1, 50, 'Bla', 5.0], 'blubb']

You can count the number of elements in a list using the function len:

In [32]:
len(y)
Out[32]:
4

If you want to access certain elements of lists you can do this by supplying the corresponding index. There are two things to keep in mind:

  1. The first element of a list has index zero, not one!
  2. Indices must be supplied as integers, not floats!
In [33]:
x = [1,2,3,4]
x[0] # The first element of x
Out[33]:
1
In [34]:
print(y)
y[2]
[4, 2, [1, 50, 'Bla', 5.0], 'blubb']
Out[34]:
[1, 50, 'Bla', 5.0]
In [35]:
x[2.0]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-35-8fda4c5850e9> in <module>()
----> 1 x[2.0]

TypeError: list indices must be integers or slices, not float

You can also mutate lists by replacing certain elements:

In [36]:
print(y)
y[1] = 5000
y
[4, 2, [1, 50, 'Bla', 5.0], 'blubb']
Out[36]:
[4, 5000, [1, 50, 'Bla', 5.0], 'blubb']

You can also nest this operation:

In [37]:
y[2][1] = "New element!"
y
Out[37]:
[4, 5000, [1, 'New element!', 'Bla', 5.0], 'blubb']

But this does not work if the index is equal to or greater than the length of the list:

In [38]:
x = [1,2,3]
x[3]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-38-14f0aaac68dc> in <module>()
      1 x = [1,2,3]
----> 2 x[3]

IndexError: list index out of range

But you can access the last element of the list by using negative indices:

In [39]:
print(x)
x[-1]
[1, 2, 3]
Out[39]:
3
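The range of valid indices can be summarized in a few lines. This is a minimal sketch that is not part of the original script: for a list of length n, the valid indices run from 0 to n - 1, and the valid negative indices run from -1 to -n. The variable names are made up.

```python
x = [1, 2, 3]

first = x[0]
last = x[len(x) - 1]      # the largest valid index is len(x) - 1
also_last = x[-1]         # -1 always refers to the last element
also_first = x[-len(x)]   # the smallest valid negative index

print(first, last, also_last, also_first)
```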

You can concatenate lists by using addition:

In [40]:
x = [1,2,3]
y = [4,5,6]
x+y
Out[40]:
[1, 2, 3, 4, 5, 6]

And you can also 'multiply' lists (again, only using integers):

In [41]:
3 * x
Out[41]:
[1, 2, 3, 1, 2, 3, 1, 2, 3]

Tuples

In principle, tuples are similar to lists, with one important distinction: lists are mutable, tuples are not.

In [42]:
l = [1,2,3,4, "bla"]
l[2] = "Stuff"
print(l)
[1, 2, 'Stuff', 4, 'bla']
In [43]:
# But:
t = (1,2,3,4, "bla") # Note the different brackets used to define tuples!
t[2] = "Stuff"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-38593d60cde2> in <module>()
      1 # But:
      2 t = (1,2,3,4, "bla") # Note the different brackets used to define tuples!
----> 3 t[2] = "Stuff"

TypeError: 'tuple' object does not support item assignment

Testing whether a certain element is in a tuple is also straightforward:

In [44]:
"bla" in t 
Out[44]:
True
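If you need a changed version of a tuple, one possible workaround is to go through a list. The following sketch is not from the original script; it relies on the built-in functions list and tuple, and the names as_list and t_new are made up.

```python
t = (1, 2, 3, 4, "bla")

as_list = list(t)        # a mutable list with the same elements
as_list[2] = "Stuff"
t_new = tuple(as_list)   # a new tuple; the original tuple is untouched

print(t)
print(t_new)
```

Note that this creates a new object rather than mutating the old one, which is exactly the difference between tuples and lists.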

Sets

Sets are created using the set function, or by placing elements into curly brackets. To create an empty set, however, the set function must be used since {} creates an empty dictionary.

In [45]:
set_1 = set([1,2,5,6]) # note that the input is a list, not simply the elements
set_2 = {2,6,1,9}
set_3 = set()

Sets can be distinguished from tuples and lists by the following properties:

  1. They cannot contain duplicate elements
  2. They can contain elements of different data types
  3. The ordering of their members is not fixed
  4. Sets are mutable (just as lists), but indexing does not work
In [46]:
# ad 1: sets cannot contain duplicate elements
set_1 = {1,2,5,3,3,3}
print(set_1)
{1, 2, 3, 5}
In [47]:
# ad 2: sets can contain elements of different data types
set_2 = {2, 5, "bla"}
print(set_2)
{2, 'bla', 5}
In [48]:
# ad 3: The ordering of their members is not fixed
set_2 = {9, 8, 7, 6}
print(set_2)
{8, 9, 6, 7}
In [49]:
# ad 4: Sets are mutable (just as lists), but indexing does not work
set_2 = {2, 5, "bla"}
set_2.add("blubb") # we will learn about the logic behind this syntax below
set_2.remove(2)
print(set_2)
set_2[2]
{'bla', 'blubb', 5}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-ac12eb86453c> in <module>()
      4 set_2.remove(2)
      5 print(set_2)
----> 6 set_2[2]

TypeError: 'set' object does not support indexing

Sets are useful for membership testing, identifying unions, differences or intersections.

In [50]:
set_1 = {1, 5, 9}
set_2 = {99, 1, 4}
3 in set_1
Out[50]:
False
In [51]:
set_1 - set_2 # difference
Out[51]:
{5, 9}
In [52]:
set_2 - set_1 # difference
Out[52]:
{4, 99}
In [53]:
set_1 | set_2  # union
Out[53]:
{1, 4, 5, 9, 99}
In [54]:
set_1 & set_2 # intersection
Out[54]:
{1}
In [55]:
set_1 ^ set_2 # symmetric_difference
Out[55]:
{4, 5, 9, 99}
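The operators used above also exist as methods with more descriptive names, which some people find easier to read. The following sketch is not part of the original script; it repeats the same operations with the set methods difference, union, intersection and symmetric_difference. The result names are made up.

```python
set_1 = {1, 5, 9}
set_2 = {99, 1, 4}

diff = set_1.difference(set_2)             # same as set_1 - set_2
union = set_1.union(set_2)                 # same as set_1 | set_2
common = set_1.intersection(set_2)         # same as set_1 & set_2
sym = set_1.symmetric_difference(set_2)    # same as set_1 ^ set_2

print(diff == set_1 - set_2, union == set_1 | set_2)
print(common == set_1 & set_2, sym == set_1 ^ set_2)
```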

Dictionaries

Dictionaries are among the most useful data types in Python, although their behavior might seem to be a bit counter-intuitive in the beginning. But don't worry, after a bit of practice you will be able to use dictionaries effectively - and this will prove useful in a wide array of situations.

Dictionaries are associations between 'keys' and 'values'. An empty dictionary is created by either {} or the function dict.

In [56]:
dict_1 = {} # alternative: dict_1 = dict()
dict_1
Out[56]:
{}

Then you can fill the dictionary by adding key-value associations like this:

In [57]:
dict_1["First_Key"] = "The value"
print(dict_1["First_Key"])
print(dict_1)
The value
{'First_Key': 'The value'}

Keys must be unique, but values do not need to be unique. This makes sense: think about a dictionary in which you want to look up the key 'super'. If there were multiple keys called 'super', which value would you expect Python to return?

In [58]:
dict_2 = {"super" : "bad", "super" : "good"} # the second key will overwrite the first
dict_2
Out[58]:
{'super': 'good'}

On the other hand, there is no problem with the keys 'super' and 'great' both mapping to the value 'positive':

In [59]:
dict_2 = {"super" : "positive", "great" : "positive"}
dict_2
Out[59]:
{'great': 'positive', 'super': 'positive'}

You can use anything as a key as long as it is an immutable object (which itself contains only immutable objects):

In [60]:
dict_2 = {"super" : "positive", "great" : "positive", 2 : "Two"}
dict_2
Out[60]:
{2: 'Two', 'great': 'positive', 'super': 'positive'}
In [61]:
dict_2 = {"super" : "positive", "great" : "positive", [1,3] : "Two"}
# Does not work because a list is mutable
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-61-248e1c3b28a6> in <module>()
----> 1 dict_2 = {"super" : "positive", "great" : "positive", [1,3] : "Two"}
      2 # Does not work because a list is mutable

TypeError: unhashable type: 'list'
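If you really need a collection of values as a key, a tuple is a possible replacement for the list, since tuples are immutable. The following sketch is not part of the original script, and the name dict_3 is made up for this illustration.

```python
# The list [1, 3] is replaced by the tuple (1, 3)
dict_3 = {"super": "positive", "great": "positive", (1, 3): "Two"}

print(dict_3[(1, 3)])
```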

To identify all different keys, or all the values of a dictionary you can use the methods keys and values (more on the concept of a method below):

In [62]:
dict_2.keys()
Out[62]:
dict_keys(['super', 'great', 2])
In [63]:
dict_2.values()
Out[63]:
dict_values(['positive', 'positive', 'Two'])

This way, it is also easy to check whether a certain key is part of a dictionary:

In [64]:
"super" in dict_2.keys()
Out[64]:
True
In [65]:
"super" in dict_2.values()
Out[65]:
False
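Two related shortcuts may be handy here; the following is a sketch that is not part of the original script. Applying in directly to a dictionary checks its keys, so the call to keys can be omitted. In addition, the dictionary method get looks up a key and returns a fallback value if the key does not exist. The fallback string "unknown" and the result names are made up.

```python
dict_2 = {"super": "positive", "great": "positive", 2: "Two"}

# 'in' applied to the dictionary itself checks the keys
key_found = "super" in dict_2
value_found = "positive" in dict_2.values()

# get() returns the value for a key, or the fallback if the key is missing
known = dict_2.get("super", "unknown")
missing = dict_2.get("awful", "unknown")

print(key_found, value_found, known, missing)
```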

First steps in programming

Functions and methods

Functions

Functions are useful: they take an input (one or more 'arguments'), execute an operation on the input, and return an output.

We have already encountered a number of functions that are defined within the base code of Python (called 'built-in functions' because they are already built into Python).

For example, the print function takes an argument and prints it to the console:

In [66]:
type(print)
Out[66]:
builtin_function_or_method
In [67]:
print(2) # The input is 2
2

The syntax for using functions is always the following: first, type the function name, then directly add the arguments in normal brackets. Some functions take more than one argument. If you do not know what a certain function does, you might use the help function to find out (although the output might still look somewhat cryptic):

In [68]:
help(print)
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.

The easiest way to understand how functions work is to build your own function.

This is done via the keyword def.

Here we will use the following example and go through it one by one:

In [69]:
def test_function(x, y):
    """
    This is a function that sums up
    two numbers and returns the result
    as a float.
    
    Parameters
    ----------
    x : int or float
        A first number
    y : int or float
        A second number

    Returns
    -------
    float
        The sum of the two parameters
    """
    result = float(x + y)
    return result

As you can see, the function definition consists of the following parts:

  1. The keyword def indicates the beginning of the function definition.
  2. The name of the function. In this example, the function name is test_function. This will be what you need to type if you want to call the function later. If you define a new function with the same name as a previously defined one, Python overrides the reference to the older function, so be careful. By convention, function identifiers contain only lowercase letters, with words separated by underscores.
  3. The parameters or arguments of the functions. This is how you can pass inputs to the function once you call it. In this example, the function has two parameters: x and y.
  4. The colon indicates the end of the function header.
  5. The documentation string, also called docstring. It begins and ends with """. Here, one describes what the function does. Although it is optional, it is advisable to write one whenever you write a new function. There are conventions for how to write good docstrings; the example above follows the NumPy docstring style.
  6. Next, in the function body, you write the code that specifies what the function does. Note that indentation is important. Here, we just define a new object called result.
  7. Finally, the return statement specifies what the function actually returns. In this case, the function just returns the object result.

This function is now ready for use:

In [70]:
test_function(5,5)
Out[70]:
10.0

Note that all definitions made within the function definition are local. They are discarded once the function call has finished.

In [71]:
bla = test_function(2,4)
result # will throw an error since 'result' is defined only within the function
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-71-ce1126022d32> in <module>()
      1 bla = test_function(2,4)
----> 2 result # will throw an error since 'result' is defined only within the function

NameError: name 'result' is not defined
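The same logic applies if a name is used both inside and outside of a function. The following sketch is not part of the original script; the function add_numbers is a made-up variant of the function above. It illustrates that the local result inside the function and the result defined outside are two separate things.

```python
result = "outside"

def add_numbers(x, y):
    """Return the sum of x and y as a float."""
    result = float(x + y)   # a local 'result' that exists only during the call
    return result

total = add_numbers(2, 4)

print(total)
print(result)   # the outer 'result' has not been changed by the function call
```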

Sometimes, a function has a default value for some parameters. If no value is passed for them explicitly, the function uses the default value:

In [72]:
def test_function_2(x, y, z=10):
    """
    This is a function that sums up
    two numbers and returns the result
    as a float. 
    By default, the result is also
    multiplied by 10.
    
    Parameters
    ----------
    x : int or float
        A first number
    y : int or float
        A second number
    z : int (optional)
        A third number by which the sum is multiplied

    Returns
    -------
    float
        The sum of the two parameters times z.
    """
    result = float(x + y) * z
    return result
In [73]:
print(
    test_function_2(2,3, z=100) 
)
print(
    test_function_2(2,3) # Now the default value for z is used!
)
500.0
50.0
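There is more than one way to pass arguments to such a function. The following sketch is not part of the original script; scaled_sum is a made-up, shortened variant of the function above. It shows that a default value can also be replaced by a positional argument, and that the order of the arguments does not matter if all of them are passed by name.

```python
def scaled_sum(x, y, z=10):
    """Return (x + y) * z as a float; z defaults to 10."""
    return float(x + y) * z

with_default = scaled_sum(2, 3)          # z takes its default value 10
positional = scaled_sum(2, 3, 100)       # the third position is z
by_name = scaled_sum(y=3, x=2, z=1)      # with names, the order is free

print(with_default, positional, by_name)
```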

Methods

Everything in Python is an object. We will learn more about objects later when we discuss classes, but we will preview some of the content at this point.

Objects in Python have two major components: attributes and methods. Methods are basically functions that are called on the object itself, and take the object as their first argument.

They usually perform operations that are very common for the specific object at hand.

For example, when you work with lists, a very common task is to append an element to the list. Therefore, the object type list has a method append that does exactly this:

In [74]:
list_1 = [1,4,2,6,5]
print(list_1)
list_1.append(3)
print(list_1)
[1, 4, 2, 6, 5]
[1, 4, 2, 6, 5, 3]

Another common task is to sort a list, and this is what the method sort is for:

In [75]:
print(list_1)
list_1.sort()
print(list_1)
list_1.sort(reverse=True) # as with functions, you can also pass keyword arguments to methods
print(list_1)
[1, 4, 2, 6, 5, 3]
[1, 2, 3, 4, 5, 6]
[6, 5, 4, 3, 2, 1]

For the sake of illustration, here are some methods for lists that are quite useful. For other types you might want to look up the relevant documentation.

In [76]:
l = [1, 5, 2, 5, 6, 0]

l.append(15) # appends an element to the list
# print(l) returns: [1, 5, 2, 5, 6, 0, 15]

x = l.count(5) # counts the occurrences of 5 in the list
# print("Nb of occurrences of 5: ", x) # x is 2

l.extend([1, 2, 3]) # appends all elements of another list
# print(l) returns: [1, 5, 2, 5, 6, 0, 15, 1, 2, 3]

x = l.index(5) # where is the first instance of 5 in l?
# print("First occurrence of 5: ", x) # 1

l.insert(3, "Buff") # inserts an element at index 3
# print(l) # returns: [1, 5, 2, 'Buff', 5, 6, 0, 15, 1, 2, 3]

x = l.pop(3) # removes an object from the list and returns it
# print("l: ", l, "\nRemoved element: ", x) # x is Buff

l.remove(5) # Removes the first instance of an object from the list
# print(l) # returns: [1, 2, 5, 6, 0, 15, 1, 2, 3]

l.reverse() # Reverses the list
# print(l) # returns: [3, 2, 1, 15, 0, 6, 5, 2, 1]

l.sort(reverse=False) # Sorts the elements of the list
# print(l) # returns: [0, 1, 1, 2, 2, 3, 5, 6, 15]

It is very important to keep in mind that list methods such as append, sort or reverse do not create a new list, but mutate the original object!
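A typical consequence of this is illustrated by the following sketch, which is not part of the original script. The method sort changes the list in place and returns None, so associating its result with a name is a common beginner's mistake. If you want a sorted copy and keep the original list unchanged, the built-in function sorted is one option. The names returned, list_2 and list_3 are made up.

```python
list_1 = [1, 4, 2, 6, 5]

returned = list_1.sort()   # sorts list_1 in place
print(returned)            # None: the method does not return the sorted list
print(list_1)

list_2 = [1, 4, 2, 6, 5]
list_3 = sorted(list_2)    # creates a new, sorted list
print(list_2)              # the original list keeps its order
print(list_3)
```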

Remarks on the use of functions

There are a number of reasons why writing functions is a good idea:

  1. It makes your code shorter and more transparent. While short code itself is not an indication of clarity, functions can be accompanied by a docstring that explains what a function does. This way, it is easy to make transparent what your code is doing - to others, but also to yourself when you get back to your project after a while.
  2. It helps you to structure your code. This relates to the first point: a function summarizes your ideas about how to solve a certain problem, or how to best conduct a certain routine. Usually you do not want to concern yourself with this question every time the routine is applied. Outsourcing the routine into a function helps you to stay focused on the particular coding challenge you are currently facing.
  3. It makes it easier to correct your code. Suppose you have written a routine that creates a random network. This routine is used frequently throughout your program. If you have copied the routine to every place where it is needed and then identify a mistake in it, you have to correct the mistake in every one of these places. But if you have written a function creating the random network, you need to correct only the function. This is much more reliable and saves a lot of time.
  4. It makes it easier to generalize your code. Once you have written a function it is easy to add a new parameter to it.
  5. It makes it easier to re-use your code. This might not be too much of an advantage in the beginning of your programming career, but you will soon recognize that there are particular challenges you face very often. For example, to save the output of an ABM to a csv file, to create certain output plots, or to create agents with certain characteristics. For many such routines, Python provides you with built-in functions, but often it is a good idea to summarize them in your own function. Whenever you write a new model, you can re-use the functions from a previous model, and thus save time and avoid errors at the same time.

For these, and for many other reasons, there is a principle in software engineering called DRY, which stands for "don't repeat yourself"! Adhering to the DRY principle means avoiding WET solutions, for which writing your own functions is essential. You may read the Wikipedia article on the principle; it's SIF (short, interesting, and fun) and tells you what WET stands for.
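To make the principle more concrete, here is a minimal sketch that is not part of the original script. It computes the mean of two lists, first by repeating the calculation and then by using a function. The wealth lists and the function name mean are invented for this illustration.

```python
wealth_rich = [10, 20, 30]
wealth_poor = [1, 2, 3]

# Repetitive version: the same calculation is written out for every list
mean_rich_wet = sum(wealth_rich) / len(wealth_rich)
mean_poor_wet = sum(wealth_poor) / len(wealth_poor)

# DRY version: the calculation is written once and then re-used
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

mean_rich = mean(wealth_rich)
mean_poor = mean(wealth_poor)

print(mean_rich, mean_poor)
```

If the calculation ever needs to change, for example to handle empty lists, only the function has to be adjusted.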

If/else statements

If/else statements are very convenient and frequently used in programming in general, and in scientific applications in particular:

In [77]:
x = 4
if x == 2:
    print("x equals two!")
elif x < 2:
    print("x is smaller than two")
elif x > 2:
    print("x is bigger than two")
else:
    print("What the heck!?!?")
x is bigger than two

You can also nest if/else statements. This can be useful if you want to check whether a variable is of the right type to perform a certain operation.

But keep in mind that too many nested statements make your code hard to read and can also be bad for its performance.

In [78]:
x = "2"
if type(x)==int: 
    if x == 2:
        print("x equals two!")
    elif x < 2:
        print("x is smaller than two")
    elif x > 2:
        print("x is bigger than two")
    else:
        print("What the heck!?!?")
elif type(x)==str:
    print("Why would you compare strings with ints?")
Why would you compare strings with ints?

You may have noticed the indentation in the if/else statement. In Python, indentation matters (unlike, e.g., in R). This forces you to write well-structured and readable code - at least to some extent ;)
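If/else statements combine well with functions. The following sketch is not part of the original script; the function compare_to_two is made up. It performs the type check from the nested example first and thereby gets along with a single level of indentation inside the function.

```python
def compare_to_two(x):
    """Describe how x relates to the number two."""
    if type(x) != int:
        return "x is not an integer"
    elif x == 2:
        return "x equals two!"
    elif x < 2:
        return "x is smaller than two"
    else:
        return "x is bigger than two"

print(compare_to_two(4))
print(compare_to_two("2"))
```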

Assertions

Assertions are somewhat related to if/else statements, and they are very useful, in particular in the context of more complex models.

Assertions are used to make sure certain mistakes do not happen, or at least do not remain unnoticed.

An assertion allows you to test whether a certain condition holds, and produces an AssertionError if it doesn't. Thus, it stops the program right where the condition is violated, instead of letting it move on with faulty values.

In [79]:
x = [1, 3, 5, -1]
assert len(x)<3, "This list is too long!"
print(len(x))
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-79-af24693e12e2> in <module>()
      1 x = [1, 3, 5, -1]
----> 2 assert len(x)<3, "This list is too long!"
      3 print(len(x))

AssertionError: This list is too long!

Note that you can write your own error text for assertion errors. This facilitates debugging enormously.
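Assertions are often placed at the beginning of a function to check its input. The following is a minimal sketch that is not part of the original script; the function mean_wealth and its error text are made up. Without the assertion, an empty list would lead to a less informative ZeroDivisionError.

```python
def mean_wealth(wealth_list):
    """Return the mean of a non-empty list of wealth levels."""
    assert len(wealth_list) > 0, "The list of wealth levels is empty!"
    return sum(wealth_list) / len(wealth_list)

print(mean_wealth([10, 20, 30]))
```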

Loops

Loops are a feature of almost every programming language. There is a good reason for this: they are incredibly useful since they allow you to automate repetitive tasks.

There are two main types of loops in Python: for-loops and while-loops. We will also encounter something very 'Pythonic': list comprehensions, a very useful tool similar to for-loops that helps you to speed up your code and to make it more readable.

Unfortunately, there is one thing to keep in mind: even if loops are often very useful, they are also slow, and should be avoided whenever possible. List comprehensions are slightly faster, but should also be used sparingly.

For loops

In a for-loop your program performs an action for every element of a given input container. Let's look at an example:

In [80]:
loop_list = ["This", "is", "awesome", "!"]
for element in loop_list:
    print(element, end=' ') # end=' ' makes print() end with a space instead of \n
# Guess the output!
This is awesome ! 

Note that the word after the keyword for is (almost) arbitrary. The following code does exactly the same thing as the code above:

In [81]:
loop_list = ["This", "is", "awesome", "!"]
for bla in loop_list:
    print(bla, end=' ')
This is awesome ! 

For-loops can be nested, and more complicated operations, such as if/else statements, can be conducted within the loop:

In [82]:
loop_list = ["This", "is", "awesome", "!"]
for element in loop_list:
    if element == "awesome":
        print("f****** ", element, end="")
    else:
        print(element, end=' ')
This is f******  awesome! 

A very typical routine is to go through the indices of a list and use the elements of the list as the input for another operation.

For example, we might build a list with the square roots of the integers from 0 to 10 using a for loop! A useful function in this context is range():

In [83]:
list(range(10))
Out[83]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Actually, range() does not produce a list but a range object, another data type, which generates its elements only when they are needed. This is useful to speed up your code (more on this in the notes).

In [84]:
type(range(10))
Out[84]:
range

For now, remember you can use

for i in range(10):
    ...
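
As a minimal sketch of the task mentioned above, the following loop builds a list with the square roots of the integers from 0 to 9. The variable name roots is only chosen for illustration:

```python
roots = []  # start with an empty list
for i in range(10):  # i takes the values 0, 1, ..., 9
    roots.append(i**0.5)  # add the square root of i to the list

print(roots)
```

Each pass through the loop appends one element, so roots ends up with ten entries.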

There is an important distinction between the following two approaches of looping through lists:

In [85]:
l = [1, 2, 3, 4, 5]
l2_ = ["a", "b", "c"]

print("Loop through the elements:")
for i in l:
    print(i, end=" ")
    
print("\nLoop through the indices of the list:")
for i in range(len(l)):
    print(l[i], end=" ")
Loop through the elements:
1 2 3 4 5 
Loop through the indices of the list:
1 2 3 4 5 

Although in this case both procedures give you the same result, it is strongly recommended to use the second option, whenever there is no good argument against it.

The reason is that the first approach often yields undesired results, particularly once your lists become more complex, or when you do more than one thing within a loop. And this will happen often in the context of ABM, since you often use loops to execute operations at every time step.

The following simple example illustrates the advantage of looping through indices.

In [86]:
l_1 = ["a", "b", "c"]
l_2 = [1, 2, 3]
l_3 = [1, 11, 14]

# Now consider you want to do this operation on all lists:
for i in l_1: # This does not work because elements of l_1 are strings
    print("Element called in l_1: ", l_1[i])
    print("Element called in l_2: ", l_2[i])
    print("Element called in l_3: ", l_3[i]) 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-86-b49f5c09a57c> in <module>()
      5 # Now consider you want to do this operation on all lists:
      6 for i in l_1: # This does not work because elements of l_1 are strings
----> 7     print("Element called in l_1: ", l_1[i])
      8     print("Element called in l_2: ", l_2[i])
      9     print("Element called in l_3: ", l_3[i])

TypeError: list indices must be integers or slices, not str

But once you loop through indices, there is no problem:

In [87]:
for i in range(len(l_3)): # This works well:)
    print("Element called in l_1: ", l_1[i])
    print("Element called in l_2: ", l_2[i])
    print("Element called in l_3: ", l_3[i])   
Element called in l_1:  a
Element called in l_2:  1
Element called in l_3:  1
Element called in l_1:  b
Element called in l_2:  2
Element called in l_3:  11
Element called in l_1:  c
Element called in l_2:  3
Element called in l_3:  14

Since this will become an important issue once you are concerned with more advanced tasks, you should start looping through indices rather than elements right from the start!
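
If you need both the index and the element, one option that is not used in the examples above is the built-in function enumerate(), which hands you both at the same time. The following is only a sketch. The list collected is a hypothetical helper to store the results:

```python
l_1 = ["a", "b", "c"]
l_2 = [1, 2, 3]
l_3 = [1, 11, 14]

collected = []  # hypothetical helper list, only for illustration
for i, element in enumerate(l_1):  # i is the index, element is l_1[i]
    print("Index:", i, "Element of l_1:", element, "Element of l_3:", l_3[i])
    collected.append((element, l_2[i], l_3[i]))
```

You still have the index i available to access the other lists, so the advantage of looping through indices is preserved.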

But keep in mind: for loops are slow and should be avoided whenever possible.

List comprehension

Python has an awesome technique called list comprehension.

In [88]:
base_list = [3, 2, 5, 7] 
new_list = [x*10 for x in base_list]
new_list
Out[88]:
[30, 20, 50, 70]
In [89]:
base_list = range(10)
roots = [x**0.5 for x in base_list]
roots
Out[89]:
[0.0,
 1.0,
 1.4142135623730951,
 1.7320508075688772,
 2.0,
 2.23606797749979,
 2.449489742783178,
 2.6457513110645907,
 2.8284271247461903,
 3.0]
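
To see what a list comprehension does, it may help to compare it with a for loop that builds the same list. This is just a sketch. The names new_list_loop and new_list_comprehension are chosen for illustration:

```python
base_list = [3, 2, 5, 7]

# Version 1: a for loop that appends one element at a time
new_list_loop = []
for x in base_list:
    new_list_loop.append(x*10)

# Version 2: the same operation as a list comprehension
new_list_comprehension = [x*10 for x in base_list]

print(new_list_loop == new_list_comprehension)  # True
```

Both versions produce the same list, but the comprehension needs only one line.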

Sometimes it is useful to combine a list comprehension with an if condition.

For example, suppose you want a list with the squares of the even numbers from 0 to 19. You could either use arange(0, 20, 2) from numpy to create an array with the even numbers, and then use list comprehension on this.

Or you can add an if condition using the function floor from the math library (more on libraries below). This function rounds a number down to the nearest integer, so $2\cdot\text{floor}(i/2)=i$ if and only if $i$ is an even number.

In [90]:
from math import floor
even_squares = [i**2 for i in range(20) if 2*floor(i/2) == i]
print(even_squares)
[0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
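
An alternative that does not need the math library is the modulo operator %, which returns the remainder of a division. A number i is even if and only if i % 2 equals 0. The following sketch should give the same list as above:

```python
# i % 2 is the remainder of i divided by 2, which is 0 for even numbers
even_squares = [i**2 for i in range(20) if i % 2 == 0]
print(even_squares)
```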

While loops

The final programming technique we consider is the while loop. While loops are less frequently used than for loops, but may come in handy. In a while loop, a certain operation is repeated as long as a certain condition holds:

In [91]:
counter = 10
stop_condition = 0
while counter >= stop_condition:
    print(counter)
    counter -= 1
10
9
8
7
6
5
4
3
2
1
0

You can also enrich your while loop with an else statement:

In [92]:
counter = 10
stop_condition = 0
while counter >= stop_condition:
    print(counter)
    counter -= 1
else:
    print("BOOOOM!")
10
9
8
7
6
5
4
3
2
1
0
BOOOOM!
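
One detail worth knowing: the else block of a while loop is only executed if the loop ends because its condition became false. If you leave the loop early with the keyword break, the else block is skipped. Here is a small sketch. The variable exploded is a hypothetical flag that only serves to show which branch was executed:

```python
counter = 10
exploded = False  # hypothetical flag, only for illustration
while counter >= 0:
    if counter == 5:
        print("Countdown aborted at", counter)
        break  # leave the loop early, so the else block is skipped
    counter -= 1
else:
    exploded = True
    print("BOOOOM!")

print(exploded)  # False
```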

If you want to mess up your computer, run a while loop without a stop condition;) CTRL+C will save you in such situations.

While-loops are useful when you want to approximate something and repeat the approximation until the error is very small. For example, suppose you want to calculate $\sqrt2$ using an ancient method of the Babylonians, which repeatedly replaces a guess $x$ by the average of $x$ and $2/x$:

In [93]:
error = 0.0001
previous_val = 1.0 * (1 + 2)
new_val = 0.5 * (1 + 2)
approximations = [previous_val, new_val]

while abs(new_val - previous_val) > error: # Continue until the error is small
    previous_val = new_val
    new_val = 0.5 * (new_val + 2/new_val)
    approximations.append(new_val)

print("Approximation: ", new_val, "\n", 
     "Analytic solution: ", 2**0.5, "\n", 
     "Error: ", new_val-2**0.5, "\n")
Approximation:  1.4142135623746899 
 Analytic solution:  1.4142135623730951 
 Error:  1.5947243525715749e-12 
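
The same idea can be wrapped into a function that works for other numbers as well. The following is only a sketch: the function name babylonian_sqrt and its parameters tolerance and start are assumptions made for illustration, and the function assumes that a is a positive number:

```python
def babylonian_sqrt(a, tolerance=0.0001, start=1.0):
    """Approximate the square root of a positive number a."""
    previous_val = start
    new_val = 0.5 * (previous_val + a / previous_val)
    # Continue until two successive approximations are very close
    while abs(new_val - previous_val) > tolerance:
        previous_val = new_val
        new_val = 0.5 * (new_val + a / new_val)
    return new_val

print(babylonian_sqrt(2))
print(babylonian_sqrt(9))
```

Note that the stop condition compares two successive approximations, just as in the example above, because in general you do not know the true value in advance.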

Here is a plot (we will learn how to create plots later):

In [94]:
%matplotlib inline
import matplotlib.pyplot as plt # used for plotting
from matplotlib.ticker import MaxNLocator

plt.clf()
fig, ax = plt.subplots()
ax.plot(range(len(approximations)), approximations, label="Approximation")
ax.plot(range(len(approximations)), [2**0.5]*len(approximations), linestyle="--", label="True value")
ax.legend(loc=1)

ax.set_xlabel("Iteration step")
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
ax.set_ylabel("New approximation")
ax.set_title("Approximation of square root of two")
plt.show()
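[Figure: "Approximation of square root of two". x-axis: Iteration step, y-axis: New approximation. Lines: Approximation, True value (dashed)]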

First remarks on modules and namespaces

Python is organized with modules and packages (some of them also called 'libraries', but do not worry about these differences). This means that Python 'as is' has only limited functionality. However, many people have written modules that you can use to do more stuff.

At the moment you can think of modules as scripts that somebody has written, and that your Python distribution can access. Packages are collections of modules, which are usually used simultaneously. The precise difference is, however, not important at this stage.

For example, in base Python, there is no function to take the square root of a number. Instead, we need to write:

In [95]:
5**0.5
Out[95]:
2.23606797749979

However, there is a module called math that contains many useful function and variable definitions. It also contains a function sqrt. To use it, we need to import the module first. Conventionally, this is done at the very beginning of your script using the keyword import.

In [96]:
import math
math.sqrt(5)
Out[96]:
2.23606797749979

To access the function sqrt, you need to tell Python that it is in the math module first. Therefore, you need to write math.sqrt, and not just sqrt. This is because Python uses different namespaces for the imported modules and your current file.

You can, however, also load the function into your current namespace:

In [97]:
import math
from math import sqrt

sqrt(5)
Out[97]:
2.23606797749979

This can, however, get confusing, so using the prefixes is usually a better idea.

Also, beware of using the command from math import *, which loads everything in the math module into your current namespace. This causes even more confusion!
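
To illustrate where the confusion comes from, consider the following sketch. Here we first define our own function that happens to be called sqrt (a hypothetical example), and then import sqrt from math into the same namespace:

```python
def sqrt(x):
    """A hypothetical function of our own that happens to be called sqrt."""
    return "my own sqrt of " + str(x)

own_result = sqrt(4)
print(own_result)  # my own sqrt of 4

from math import sqrt  # silently replaces our own sqrt in the namespace

shadowed_result = sqrt(4)
print(shadowed_result)  # 2.0
```

Python does not warn you that the name sqrt now refers to a different function. With import math and math.sqrt this cannot happen.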

But the math module not only contains function definitions, but also variable definitions.

Suppose you want to use pi or Euler's number:

In [98]:
print(math.pi)
print(math.e)
3.141592653589793
2.718281828459045

To figure out what is actually contained in a module you can use the dir function:

In [99]:
# dir(math) # returns a sorted list with all things defined in the math module

If called without input, dir returns all variables you have defined so far in your current namespace:

In [100]:
# dir()

Sometimes the name of a module is inconvenient. In this case you might use an alias for this name.

For example, a very common module is called numpy, which allows you to carry out many mathematical operations very efficiently. To avoid writing numpy all the time, it is now a convention to import it as np:

In [101]:
import numpy as np
x = np.array([[1,2], [3,4]])
x
Out[101]:
array([[1, 2],
       [3, 4]])

There are thousands of modules available for Python. Some of them come with the base installation, others need to be installed later. Your Anaconda distribution comes with basically all modules needed for modelling and scientific computing, so you just need to import them.

This way, the functionality of Python keeps growing, and you can be sure that it never gets 'outdated'!

It is easy to write your own modules: write some stuff in a file, save the file in your current working directory. Then it is ready for being imported. This gets useful when you want to break projects into several files. We will get back to this when we work on our first ABM!
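
Here is a rough sketch of how this works. Normally you would simply create the file with your editor. To keep the example self-contained, the code below writes the file itself. The module name my_first_module, the function double, and the use of a temporary directory instead of your working directory are all assumptions made for illustration:

```python
import os
import sys
import tempfile

# Create a directory and write a tiny module file into it
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_first_module.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Tell Python where to look for the module (not needed if the file
# is in your current working directory)
sys.path.insert(0, module_dir)

import my_first_module

result = my_first_module.double(21)
print(result)  # 42
```

As with math.sqrt, the function is accessed via the prefix of the module it lives in.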

Errors

Although pretty annoying in the beginning, you will soon learn to appreciate error messages: they help you to identify mistakes you made unknowingly, and give hints on how to fix them.

Remember: if there were no error messages, the program would just do something weird without telling you, and in the end you would be left wondering why the program does not return what you want it to!

Let us now look at an error message in more detail:

In [102]:
x = 2
y = "Hello!"
x / y
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-102-81f4d6dd309f> in <module>()
      1 x = 2
      2 y = "Hello!"
----> 3 x / y

TypeError: unsupported operand type(s) for /: 'int' and 'str'

In the first line, we see the type of the error. In our case, we have a TypeError: we want to divide two objects, but division is not defined for objects of the type str. Therefore, Python explains to us in the last line that dividing an integer by a string is not supported.

In the middle of the error message, Python tells us where the error occurs: the name of the input file, the module (more on this below), and the exact location in the code.

Often, the error messages contain enough information to resolve the error, but in other instances you must use a debugger. This helps you to delve into the code right before the error occurs, so you can inspect the objects in your program. For example, you can check the types, lengths and contents of the objects, and trace the original source of the error.

This activity is called "debugging" and usually consumes a considerable share of your work time.

We will learn about debugging later in this course.

Sometimes it is not immediately obvious to you why the error occurs. Then you may comment out the line with the error, rerun the program, and see whether the error still persists (and whether the type of error has changed).

But for now, let's have a look at errors that occur frequently.

Common error messages

In [103]:
x = 2
y = "3"
print(x / y)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-103-2bdfa6c0380d> in <module>()
      1 x = 2
      2 y = "3"
----> 3 print(x / y)

TypeError: unsupported operand type(s) for /: 'int' and 'str'

We already encountered this type of error before: a type error means that you want to use an operation on an object, for which - because of its type - this operation is not defined.

A good way to track and understand TypeErrors is to add a print() statement that prints the variable and its type:

print(x, type(x))

In [104]:
x = 3
print(4*z)
NameError: name 'z' is not defined

A name error means that you want Python to follow a reference that does not exist. In the preceding case, z was never assigned any value, so it is not possible to multiply it.

The best way to spot a name error is to first check whether there is a typo in the call, and second to search your code via an editor to find the location where the variable is first defined. If this is after the first call, you have to move the assignment.

In [105]:
"x" = 2
  File "<ipython-input-105-864d3c0f472c>", line 1
    "x" = 2
           ^
SyntaxError: can't assign to literal

A syntax error occurs when Python cannot figure out the syntax of a particular statement. In this case, it is simply impossible to assign a value to a string. If you want to associate a value with a string, you need to use a dictionary.
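
A minimal sketch of the dictionary approach could look as follows. The name values is just chosen for illustration:

```python
values = {"x": 2}  # the string "x" is used as a key for the value 2
values["y"] = 5    # add a second key-value pair

print(values["x"] + values["y"])  # 7
```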

Error messages are usually very informative, but sometimes they can be misleading.

For example, when your indentation is inconsistent, Python raises an IndentationError. Many other programming languages do not have this feature. In R, for example, nested code is indicated by braces. The advantage of the Python approach is that it forces you to write more readable code.

The disadvantage is that sometimes your code does not do what you want, and from the resulting error message it is not directly obvious that the mistake is a wrong indentation.

Consider the following example:

In [106]:
# You want to write a program that checks whether a number works as an index for a list
# If it does, the program should return the corresponding element of the list
l = [1, 2, 3, 4, 5]
x = 3.0
if x > 4:
    print("x is too big. List has only five elements!")
    x = int(x)
l[x]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-106-c60efcab372a> in <module>()
      6     print("x is too big. List has only five elements!")
      7     x = int(x)
----> 8 l[x]

TypeError: list indices must be integers or slices, not float

It is clear that the original mistake was the indentation: converting x to an integer is necessary, regardless of the if statement before! Python informs you about a type error, but in fact it is an indentation mistake: the line with x = int(x) must not be indented!
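
For comparison, here is a sketch of the corrected version, in which the conversion is no longer part of the if block. The variable element is only introduced to store the result:

```python
l = [1, 2, 3, 4, 5]
x = 3.0
if x > 4:
    print("x is too big. List has only five elements!")
x = int(x)  # not indented: executed regardless of the if statement
element = l[x]
print(element)  # 4
```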

This simple example here is pretty clear, but once you write more complex code you will see that such indentation mistakes can be really difficult to spot! So you need to check your code carefully, even in the area before the error.

It is very helpful to familiarize yourself with the Python way of using indentation right from the beginning!

Try/except blocks

A useful way of handling errors and exceptions is the try/except block. This allows you to first try whether some code works. If it does, no problem. But in case it produces a particular error, Python automatically applies another block of code to correct for the error.

If the correction works, the program proceeds normally. If not, it raises an error:

In [107]:
x = "2" # Note that x is of type str
y = 4
try:
    z = y / x
except TypeError:
    print("A type error occurred. Try to resolve by converting x to float.")
    x = float(x)
    z = y / x
print(z)
A type error occurred. Try to resolve by converting x to float.
2.0

In the preceding case, it would not be worth the effort to exit the program only because a string has not been converted to a float. But it is nevertheless useful to add the print statement so that you know that the exception has occurred.

One useful application of the try/except block is when you want to make sure that variable names are free:

In [108]:
x = 2
y = 4
del(x) # Make sure that x does not exist and y exists
try:
    del(x)
except NameError: # if x does not exist...
    print("x did not exist, no need to delete it.")
    pass 
try:
    del(y)
    print("Successfully deleted y!")
except NameError: # if y does not exist...
    pass 
# Now we can be sure that x and y are not associated with any other object
x did not exist, no need to delete it.
Successfully deleted y!

Note that if a different error occurs within the try block, e.g. a NameError because x has not been defined previously, the except TypeError block does not catch it: Python raises this error and exits the program.

Also, in the first example, if x could not be converted to a float, Python would still raise an error, in this case a ValueError from within the except block:

In [109]:
x = "a" # Note that x is of type str and cannot be converted to a float
y = 4
try:
    z = y / x
except TypeError:
    print("A type error occurred. Try to resolve by converting x to float.")
    x = float(x)
    z = y / x
print(z)
A type error occurred. Try to resolve by converting x to float.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-109-d4d9a9e82627> in <module>()
      3 try:
----> 4     z = y / x
      5 except TypeError:

TypeError: unsupported operand type(s) for /: 'int' and 'str'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-109-d4d9a9e82627> in <module>()
      5 except TypeError:
      6     print("A type error occurred. Try to resolve by converting x to float.")
----> 7     x = float(x)
      8     z = y / x
      9 print(z)

ValueError: could not convert string to float: 'a'
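
If you want your program to survive this case as well, you can put a second try/except block inside the first one. The following is only a sketch: the function name safe_divide and the choice to return None when the conversion fails are assumptions made for illustration:

```python
def safe_divide(y, x):
    """Divide y by x. If x has the wrong type, try to convert it to float.
    Return None if x cannot be converted."""
    try:
        return y / x
    except TypeError:
        print("A type error occurred. Try to resolve by converting x to float.")
        try:
            return y / float(x)
        except ValueError:
            print("Could not convert", repr(x), "to float.")
            return None

print(safe_divide(4, "2"))  # 2.0
print(safe_divide(4, "a"))  # None
```

Whether it is a good idea to continue with None depends on your program. Often it is better to let the error stop the program, so that you notice the problem.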

Getting help

Python has very good documentation. If you are not sure about what a particular function does, check the help:

In [110]:
help(sum) # or "sum?"
Help on built-in function sum in module builtins:

sum(iterable, start=0, /)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    
    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.

Some coding guidelines

Writing nice code

There are many coding conventions that aim to assist programmers in writing code that can easily be understood by others. Once you feel more comfortable about dealing with Python, make sure to read the official guideline.

This guideline, known as PEP8, is one of the so-called Python Enhancement Proposals (PEPs), which are published on the official Python website.

It's useful to have a look at them, since they contain some interesting information. I really recommend that you have a look at PEP8, because getting used to this kind of coding right from the beginning will save you a lot of time later on.

Writing comprehensible code

Writing your model in a way that is comprehensible to others is very important.

Keep in mind that one of the major criticisms of simulation models in economics, particularly agent-based models, is a lack of transparency. Writing nice and readable code is therefore very important to present your models in a transparent way. Python is a very readable language, and PEP8 will help you to exploit this readability.

When discussing features of code that make it particularly readable to others (but also to yourself after a while), the following points pop up frequently.

  • Use variable names that tell you what the variable does
  • Start every script with a short description of what the file is meant to do.
  • Add docstrings to functions that explain what they are doing (see PEP8 on how to write good docstrings)

For the sake of readability, you should always try to avoid the following:

  • Abbreviations and variable names that do not relate to the content or the functioning of a variable
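
To make these points more concrete, here is a small sketch of a function with a descriptive name, descriptive variable names and a docstring. The function mean_wealth and its argument agent_wealths are hypothetical and only serve as an illustration:

```python
def mean_wealth(agent_wealths):
    """Return the average wealth of all agents.

    agent_wealths: list of numbers, one entry per agent (must not be empty).
    """
    return sum(agent_wealths) / len(agent_wealths)

print(mean_wealth([10, 20, 30]))  # 20.0
```

Compare this with a function called f that takes an argument called l: the code would do exactly the same, but a reader would have to guess what it is meant for. The docstring is also what is shown when you call help(mean_wealth).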

Outlook

You have managed to learn a great many concepts today. Next week we will cover the following topics:

  • Classes
  • Plotting
  • Matrix algebra and numpy
  • Debugging
  • A first ABM