# Data structures

## Arrays

In Python, the easiest way to implement a 2D array is a list of lists. A 3D array is a list of lists of lists, and so on.

In [1]:
M = [[1, 2], [3, 4]]


You can access the elements by chaining bracket [ ] indexing:

In [2]:
M[0][1]

Out[2]:
2
In [3]:
M = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[0, 9], [0, 0]]]  # 3x2x2 array
M[2][0][1]

Out[3]:
9
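For larger arrays it is impractical to type out every element. The following is a minimal sketch of building a list of lists with a nested comprehension; `make_array` is a hypothetical helper name, not part of the notes above. It also shows a common trap: repeating a row with `*` copies references to one and the same list.

```python
def make_array(rows, cols, fill=0):
    """Return a rows x cols list of lists; every row is a separate list."""
    return [[fill for _ in range(cols)] for _ in range(rows)]

M = make_array(2, 3)
M[0][1] = 7              # changes only the first row
print(M)                 # [[0, 7, 0], [0, 0, 0]]

shared = [[0] * 3] * 2   # both "rows" are the same list object
shared[0][1] = 7
print(shared)            # [[0, 7, 0], [0, 7, 0]]
```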

Exercise: Write a function that prints a 2D array in a table-like format:

1   2
3   4
In [4]:
def array_print(M):
    for i in range(len(M)):
        for j in range(len(M[i])):
            print(M[i][j], end='\t')   # TAB character
        print()

In [5]:
array_print([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

1	2	3
4	5	6
7	8	9

In [6]:
array_print([[1, 2, 3, 3.5], [4, 5, 6, 6.5], [7, 8, 9, 9.5]])

1	2	3	3.5
4	5	6	6.5
7	8	9	9.5
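A possible variant, sketched here under the assumption that you want the table as a string instead of printing it directly: `array_format` is a hypothetical name, and unlike `array_print` it leaves no trailing TAB at the end of a row.

```python
def array_format(M):
    """Return the rows of M as tab-separated lines joined by newlines."""
    lines = []
    for row in M:
        # iterate over the rows directly instead of over indices
        lines.append("\t".join(str(x) for x in row))
    return "\n".join(lines)

print(array_format([[1, 2], [3, 4]]))
```

Returning a string keeps the formatting separate from the output, so the result can also be written to a file or compared in a test.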


## Embedded data structures

Exercise: Suppose a store would like to set up a discount system for regular customers. The store's database holds the name of each customer (a unique string) and their shopping costs so far. The customers are stored in a list; every element of the list is a pair: the customer's name and the list of their shopping costs. For example, one customer entry looks like this:

["Anna", [54, 23, 12, 56, 12, 71]]

The discounts are calculated in the following way (the highest threshold reached applies):

Total shopping cost > 200: $10\%$

Total shopping cost > 500: $15\%$

Total shopping cost > 1000: $20\%$

Otherwise no discount ($0\%$).

Write a function that calculates the discount for every customer.

• The input is the list of customer entries
• The output is a list of length-2 lists: the name and the discount
• For example, one element of the output is ["Anna", 10]

How to begin? Break the task down into subtasks!

1. Decide the discount from the total shopping cost ( 228 -> 10 )
2. Calculate the total shopping cost (sum the list of costs)
3. Do this for every customer and return the results in a result list

There are two ways to achieve this (design patterns):

• top-down: first write the final function assuming that the smaller subtasks (functions) are already done, then do the smaller tasks
• bottom-up: first write the functions for the smaller subtasks, then build the final function from them
In [7]:
# top-down

def discount(customers):
    result = []
    for customer in customers:
        result.append(calculate_discount(customer))
    return result

def calculate_discount(customer):
    name = customer[0]
    total_cost = 0
    for shopping in customer[1]:
        total_cost += shopping
    return [name, discount_from_total(total_cost)]

def discount_from_total(total):
    if total > 1000:
        return 20
    if total > 500:
        return 15
    if total > 200:
        return 10
    return 0

In [8]:
discount([["Anna", [54, 23, 12, 56, 12, 71]],
          ["Bill", [11, 3, 12, 1, 12, 55]],
          ["Hagrid", [111, 545, 343, 56, 12, 66]],
          ["Not_a_wizard", [54, 222, 65, 56, 43, 71]]])

Out[8]:
[['Anna', 10], ['Bill', 0], ['Hagrid', 20], ['Not_a_wizard', 15]]
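The same three functions can also be written in the opposite order, smallest helper first (often called bottom-up). The sketch below is one possible version, not the only one; it assumes the same entry format as above and replaces the explicit summing loop with the built-in `sum()`.

```python
# bottom-up: the smallest helper comes first

def discount_from_total(total):
    """Map a total shopping cost to a discount percentage."""
    if total > 1000:
        return 20
    if total > 500:
        return 15
    if total > 200:
        return 10
    return 0

def calculate_discount(customer):
    """Turn one [name, costs] entry into [name, discount]."""
    name, costs = customer            # unpack the length-2 entry
    return [name, discount_from_total(sum(costs))]

def discount(customers):
    """Apply calculate_discount to every customer entry."""
    return [calculate_discount(c) for c in customers]

print(discount([["Anna", [54, 23, 12, 56, 12, 71]],
                ["Bill", [11, 3, 12, 1, 12, 55]]]))
# [['Anna', 10], ['Bill', 0]]
```

With this order every function can be tried out as soon as it is written, because everything it calls already exists.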

## Tuple

We have already seen strings and lists, which are similar to tuples.

An n-tuple can be created with a comma-separated list in parentheses or with the tuple() function:

In [9]:
t = (1, 5, 6, 2, 1)
print(t[2])
type(t)

6

Out[9]:
tuple
In [10]:
l = [1, 2, 3]
t = tuple(l)
print(t)

(1, 2, 3)


Use a for loop as before:

In [11]:
for e in t:
    print(e, end=" ")

1 2 3

In [12]:
t[1] = 4

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-87b0f225887f> in <module>
----> 1 t[1] = 4

TypeError: 'tuple' object does not support item assignment

Parentheses are also used for grouping operations (like in $2\times(3+4)$), but that is different from the parentheses of a tuple. In some cases you don't have to write the parentheses at all.

In [13]:
x = 2, 3, 4
print(x)

(2, 3, 4)

In [14]:
x, y = 2, 3
print(x)
print(y)

2
3
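Two common uses of this parenthesis-free form are returning several values from a function and swapping variables. The snippet below is a small sketch; `min_max` is a hypothetical helper introduced only for illustration.

```python
def min_max(values):
    """Return the smallest and largest element as a 2-tuple."""
    return min(values), max(values)   # the comma builds the tuple

lo, hi = min_max([54, 23, 12, 56])    # unpack the result into two names
print(lo, hi)                         # 12 56

a, b = 1, 2
a, b = b, a                           # swap without a temporary variable
print(a, b)                           # 2 1
```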


A 1-tuple is different from a single object in parentheses. You can construct a 1-tuple with a trailing comma inside the parentheses.

In [15]:
print(type((1)))
print(type((1,)))

<class 'int'>
<class 'tuple'>


# Mutable and immutable types

The tuple is almost identical to the list, except that it is immutable: you cannot assign to a single element. The list is mutable.

You cannot change a tuple once it is created; you can only create a new one, as with strings:

In [16]:
s = ("h", "e", "l", "l", "o")
print(s[2])
s = ("h", "a", "l", "l", "o")

string = "puppy"
string2 = string[:2] + "ff" + string[4:]
print(string2)

l
puffy

In [17]:
for e in s:
    print(e, end=' ')

h a l l o
In [18]:
s[1] = "e"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-18-6ca1c1f7962b> in <module>
----> 1 s[1] = "e"

TypeError: 'tuple' object does not support item assignment
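The difference matters most when two names refer to the same object. The following sketch (variable names chosen only for this illustration) contrasts an in-place change of a list with the "change" of a tuple, which really builds a new object.

```python
a = [1, 2, 3]
b = a              # b is another name for the same list
b[0] = 99          # in-place change, visible through both names
print(a)           # [99, 2, 3]

t = (1, 2, 3)
u = t
u = (99,) + u[1:]  # builds a new tuple; t is untouched
print(t)           # (1, 2, 3)
print(u)           # (99, 2, 3)
```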

# Dictionary

A dictionary is a collection of pairs, which we call key-value pairs. A value can be of any type, but a key has to be hashable, which in practice means an immutable type such as a number, a string, or a tuple of immutable elements.

You can construct dictionaries with curly brackets { } or with the dict() function.

In [19]:
d = {"puppy": 5}
type(d)

Out[19]:
dict

You can add a pair to the dictionary by assigning the value to the bracketed key.

In [20]:
d = {}             # empty dictionary
d["cat"] = [1, 5]  # add a new key-value pair
d["puppy"] = 3

In [21]:
d

Out[21]:
{'cat': [1, 5], 'puppy': 3}

You can access the elements by their keys in a bracket.

In [22]:
d["cat"]

Out[22]:
[1, 5]

You can use different types of keys (as long as they are immutable). Assigning to an existing key overwrites its value:

In [23]:
d = dict()
d[5] = 7
d[(1, 5)] = "puppy"
d[(1, 5)] = "snake"
d["cat"] = 3.14
print(d)

{5: 7, (1, 5): 'snake', 'cat': 3.14}
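To see what happens with a mutable key, you can try a list. This is a small sketch; the exact wording of the error message depends on the Python version, so it is printed rather than quoted here.

```python
d = {}
d[(1, 5)] = "tuple key works"

try:
    d[[1, 5]] = "list key"      # a list is mutable, hence unhashable
except TypeError as err:
    print("TypeError:", err)

# A tuple is only usable as a key if everything inside it is hashable too.
try:
    d[(1, [5])] = "tuple containing a list"
except TypeError as err:
    print("TypeError:", err)

print(d)                        # {(1, 5): 'tuple key works'}
```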


A for loop iterates over the keys:

In [24]:
for key in d:
    print(key, ":", d[key])

5 : 7
(1, 5) : snake
cat : 3.14
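If you need the key and the value together, the `items()` method of the dictionary yields them as 2-tuples, which can be unpacked directly in the loop header. A short sketch with the same dictionary as above:

```python
d = {5: 7, (1, 5): "snake", "cat": 3.14}

for key, value in d.items():     # each item is a (key, value) tuple
    print(key, ":", value)

pairs = list(d.items())
print(pairs)    # [(5, 7), ((1, 5), 'snake'), ('cat', 3.14)]
```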

In [25]:
customers = {"Anna": [54, 23, 12, 130],
             "Bill": [11, 3, 12, 1, 12, 55],
             "Hagrid": [111, 545, 343, 56, 12, 66],
             "Not_a_wizard": [54, 222, 165, 56]}
print(customers)
print()
print(customers["Bill"])

{'Anna': [54, 23, 12, 130], 'Bill': [11, 3, 12, 1, 12, 55], 'Hagrid': [111, 545, 343, 56, 12, 66], 'Not_a_wizard': [54, 222, 165, 56]}

[11, 3, 12, 1, 12, 55]


Exercise: Write a program that saves the discount for every customer! Rewrite the previous solution!

In [26]:
def discount(customers):
    for customer in customers:
        customers[customer] = {"shl": customers[customer]}
        customers[customer]["d"] = calculate_disc(customers[customer]["shl"])
        print(customer, customers[customer]["d"])

def calculate_disc(shl):
    total_cost = 0
    for shopping in shl:
        total_cost += shopping
    return discount_from_total(total_cost)

def discount_from_total(total):
    if total > 1000:
        return 20
    if total > 500:
        return 15
    if total > 200:
        return 10
    return 0

In [27]:
discount(customers)

Anna 10
Bill 0
Hagrid 20
Not_a_wizard 10

In [28]:
customers.keys()

Out[28]:
dict_keys(['Anna', 'Bill', 'Hagrid', 'Not_a_wizard'])
In [29]:
customers.values()

Out[29]:
dict_values([{'shl': [54, 23, 12, 130], 'd': 10}, {'shl': [11, 3, 12, 1, 12, 55], 'd': 0}, {'shl': [111, 545, 343, 56, 12, 66], 'd': 20}, {'shl': [54, 222, 165, 56], 'd': 10}])
In [30]:
for i in customers.values():
print(i)

{'shl': [54, 23, 12, 130], 'd': 10}
{'shl': [11, 3, 12, 1, 12, 55], 'd': 0}
{'shl': [111, 545, 343, 56, 12, 66], 'd': 20}
{'shl': [54, 222, 165, 56], 'd': 10}
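Note that the `discount` function above rewrites the dictionary in place, so calling it a second time on the same dictionary would fail: the values are no longer lists of costs. One possible alternative, sketched here with the hypothetical name `discount_table`, builds a new dictionary and leaves the input untouched.

```python
def discount_from_total(total):
    """Map a total shopping cost to a discount percentage."""
    if total > 1000:
        return 20
    if total > 500:
        return 15
    if total > 200:
        return 10
    return 0

def discount_table(customers):
    """Return a new dict: name -> {"shl": costs, "d": discount}."""
    result = {}
    for name, costs in customers.items():
        result[name] = {"shl": costs, "d": discount_from_total(sum(costs))}
    return result

customers = {"Anna": [54, 23, 12, 130], "Bill": [11, 3, 12, 1, 12, 55]}
table = discount_table(customers)
print(table["Anna"]["d"])    # 10
print(customers["Anna"])     # [54, 23, 12, 130]  (input unchanged)
```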


## Hash function

The dictionary uses a so-called hash function to calculate where to put the elements. This way it can find every key-value pair quickly.

You can observe the hash values yourself with the hash() function (the values for strings differ from run to run, because Python randomizes string hashing per process):

In [31]:
print(hash((1, 5)))
print(hash(5), hash(0), hash(False), hash(True))
print(hash((5,)))
print(hash("puppy"))
print(hash("puppz"))
print(hash("muppy"))

3713081631939823281
5 0 0 1
3430023387570
-5006443583613666025
2966548838902777491
7008207655977323352
-4478024969822308348


The dictionary relies on a hash function, and similar functions are common in both theoretical and applied computer science. There are advanced algorithm courses about hash functions.

Hash functions are also important in other areas of computer science; see Advances Datastructures and Techniques for Analysis of Algorithms by Katalin Friedl and Gyula Katona.

What is a hash function?

• It assigns a natural number to every piece of data (the value has a fixed-width bit representation: 64, 128, 256, ... bits).
• It is a function, so it assigns the same value to the same data every time.
• It is almost unique: only a few different pieces of data share the same hash value (such collisions are rare).
• It can be calculated quickly.
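To illustrate how a hash value can decide "where to put the elements", here is a deliberately simplified sketch. The `bucket_index` helper and the bucket count of 8 are assumptions made for this example; the real dict implementation is more elaborate.

```python
def bucket_index(key, n_buckets=8):
    """Pick one of n_buckets slots for key from its hash value."""
    return hash(key) % n_buckets

print(bucket_index(5))                             # 5
print(bucket_index(13))                            # 5  (collision: 13 % 8 == 5)
print(bucket_index("cat") == bucket_index("cat"))  # True within one run
print(hash(1) == hash(1.0) == hash(True))          # True: equal values, equal hashes
```

The second line shows a collision: two different keys land in the same slot, so a real hash table needs a strategy for storing both.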

There may be extra requirements in other applications (cryptographic hash functions):

• infeasible to invert (recovering the data from the hash is not impossible, but impractically slow)
• (pseudo)random and non-continuous: similar inputs give unrelated hash values
• infeasible to forge (finding a piece of data that matches a given hash value)

### Applications

• checksum (file hash)
• indexing, fast access (dict)
• pseudo random number generators (cryptography)
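As an illustration of the checksum application, the sketch below uses SHA-256 from the standard `hashlib` module, one of several cryptographic hash functions it offers. The `checksum` wrapper is a hypothetical helper written for this example.

```python
import hashlib

def checksum(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()

a = checksum(b"puppy")
b = checksum(b"puppz")
print(len(a))                    # 64 hex digits = 256 bits
print(a == b)                    # False: one changed letter, different digest
print(a == checksum(b"puppy"))   # True: same data, same digest
```

Unlike the built-in `hash()`, this digest is the same in every run and on every machine, which is what makes it usable for verifying files.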

# Iterating and functions of containers

A data type is iterable if it can be iterated over with a for loop. Examples:

• list
• string (characters)
• tuple
• dict
    for x in <iterable>:
        <do stuff>
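Roughly speaking, a for loop asks the iterable for an iterator and then keeps requesting the next element until there is none left. The sketch below does the same by hand with the built-in `iter()` and `next()` functions.

```python
it = iter("cat")          # ask the string for an iterator
print(next(it))           # c
print(next(it))           # a
print(next(it))           # t
# one more next(it) would raise StopIteration; a for loop stops there

keys = []
for key in {"cat": 1, "puppy": 2}:   # a dict yields its keys
    keys.append(key)
print(keys)               # ['cat', 'puppy']
```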

There are some useful operations and functions that work on many iterables:

In [32]:
# repetition
print("puppy "*3)
print((1, 2, 3)*3)
print([1, 2, 3]*3)

puppy puppy puppy
(1, 2, 3, 1, 2, 3, 1, 2, 3)
[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [33]:
# concatenation (except for dict)
(1, 2) + (2, 4, 6)

Out[33]:
(1, 2, 2, 4, 6)
In [34]:
# universal (boolean and)
print(all((False, True, True, True)))
print(all((0, 1, 1, 1)))

False
False

In [35]:
# existential (boolean or)
any((0, 1, 1, 1))

Out[35]:
True
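Both `all()` and `any()` are often combined with a generator expression, so that a condition is tested on every element. A short sketch, using a cost list like the ones in the earlier exercise:

```python
costs = [54, 23, 12, 56, 12, 71]

print(all(c > 0 for c in costs))     # True: every cost is positive
print(any(c > 70 for c in costs))    # True: 71 is larger than 70
print(any(c > 100 for c in costs))   # False

# Edge cases for an empty iterable
print(all([]))   # True
print(any([]))   # False
```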
In [36]:
# "transpose"
zip([1, 2, 3], [11, 12, 13])

Out[36]:
<zip at 0x7f8dbdaacf08>
In [37]:
list(_)

Out[37]:
[(1, 11), (2, 12), (3, 13)]
In [38]:
array_print(_)    # use our previously written function

1	11
2	12
3	13
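The same idea transposes a whole 2D array stored as a list of lists: the `*` in the call passes every row to `zip()` as a separate argument. A minimal sketch:

```python
M = [[1, 2, 3],
     [4, 5, 6]]

T = [list(column) for column in zip(*M)]   # *M passes each row as an argument
print(T)                                   # [[1, 4], [2, 5], [3, 6]]

# zip stops at the shortest input
print(list(zip([1, 2, 3], "ab")))          # [(1, 'a'), (2, 'b')]
```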

In [39]:
# sum (for numbers, not for strings)
sum((1, 2, 3))

Out[39]:
6
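The reason `sum()` does not work for strings is that it starts adding from 0 and also refuses strings explicitly; the usual tool for strings is the `join()` method. The sketch below also shows the optional start value.

```python
words = ["pup", "py"]

try:
    sum(words)                   # starts from 0, and 0 + "pup" is not defined
except TypeError as err:
    print("TypeError:", err)

print("".join(words))            # puppy
print(sum([[1, 2], [3]], []))    # [1, 2, 3]  (works, but slow for long lists)
print(sum((1, 2, 3), 10))        # 16: the second argument is the start value
```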