Programming strategies

Recursion

Recursive humor:

  • recursion: see recursion
  • To understand recursion first you have to understand recursion.

Recursive abbreviations:

  • GNU: GNU's not Unix,
  • PHP: PHP Hypertext Preprocessor,
  • WINE: WINE Is Not an Emulator,
  • TikZ: TikZ ist kein Zeichenprogramm (German for "TikZ is not a drawing program").

Examples:

  • tower of Hanoi
  • Fibonacci numbers
  • Pascal triangle (binomial coefficients)

Dynamic programming

Dynamic programming means solving a problem

  1. as in recursion, by reducing it to similar but smaller problems,
  2. but avoiding the repeated recursive calls by storing the previously calculated results in a table.

Fibonacci numbers

The Fibonacci numbers are defined with a recursive formula: $F_0=0$, $F_1=1$ and $F_n=F_{n-1}+F_{n-2}$. We will write a function which calculates the Fibonacci numbers, first with recursive function calls.

To measure the runtime of the functions we use the time module's time.time() function. This gives the current time in seconds.

recursive_fib.counter is an attribute of the recursive_fib function object. We use it to count how many times the function is called. Because the counter is stored on the function object itself, it can be changed inside the function without a global declaration, but it has to be initialized before the first call.

Recursive solution

In [1]:
import time

def recursive_fib(n):
    recursive_fib.counter += 1
    if n <= 1:
        return n
    else:
        return recursive_fib(n-1) + recursive_fib(n-2)

recursive_fib.counter = 0
start = time.time()
f = recursive_fib(33)
print(time.time() - start)
print(f"F_33 = {f}, counter = {recursive_fib.counter}")
3.8393850326538086
F_33 = 3524578, counter = 11405773
In [2]:
recursive_fib.counter = 0
f = recursive_fib(5)
print(f"F_5 = {f}, counter = {recursive_fib.counter}")
F_5 = 5, counter = 15

This is terribly slow because the function is called a huge number of times. Even evaluating recursive_fib(5) takes 15 function calls:

                           fib(5)   
                     /               \
               fib(4)                 fib(3)   
             /        \               /     \ 
        fib(3)        fib(2)        fib(2) fib(1)
       /      \      /      \      /      \
    fib(2) fib(1) fib(1) fib(0) fib(1) fib(0)
   /      \
fib(1) fib(0)

With the values:

                          5  
                     /          \
               3                      2   
            /     \                /     \ 
         2           1           1         1
       /   \       /   \       /   \
      1     1     1     0     1     0
   /     \
  1       0
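
The number of calls can also be computed without running the recursion. The following is a minimal sketch (the helper name count_calls is our own, not part of the lecture code): a call with n <= 1 costs one call, every other call costs one call plus the calls needed for the two subproblems.

```python
def count_calls(n):
    """Number of calls the naive recursive Fibonacci makes for argument n."""
    calls = [1, 1]                      # n = 0 and n = 1 each need a single call
    for i in range(2, n + 1):
        # one call for fib(i) itself plus the calls for fib(i-1) and fib(i-2)
        calls.append(1 + calls[i - 1] + calls[i - 2])
    return calls[n]

print(count_calls(5))    # 15
print(count_calls(33))   # 11405773
```

Both values agree with the counters measured above, and the call count grows as fast as the Fibonacci numbers themselves.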

You can avoid most of these function calls by storing the previous results in a list. Such a list is called a memory table.
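
As a sketch of this idea, the recursive function can be kept and only extended with a table of known results. The name memo_fib and the use of a dictionary instead of a list are our own choices for illustration.

```python
def memo_fib(n, table=None):
    """Recursive Fibonacci that looks up already computed values in a table."""
    if table is None:
        table = {0: 0, 1: 1}            # the two starting values F_0 and F_1
    if n not in table:
        # each value is computed once and then stored
        table[n] = memo_fib(n - 1, table) + memo_fib(n - 2, table)
    return table[n]

print(memo_fib(33))   # 3524578
```

Every F_i is now calculated only once, so the number of calls grows linearly with n instead of exponentially.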

Dynamic (iterative) solution

In [3]:
import time

def dynamic_fib1(n):
    f = [0, 1]                  # memory table: f[i] holds F_i
    for i in range(n-1):
        f.append(f[-2]+f[-1])
    return f[n]                 # f[n] instead of f[-1], so that F_0 = 0 is correct too

start = time.time()
f = dynamic_fib1(1000)
print(time.time() - start)
print(f"F_1000 = {f}")
0.000629425048828125
F_1000 = 43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
In [4]:
start = time.time()
for i in range(100):
    dynamic_fib1(1000)
print((time.time() - start)/100)
0.0003206467628479004

It is even more efficient if you store only the last two values.

In [5]:
import time

def dynamic_fib2(n):
    f0, f1 = 0, 1               # f0 = F_0, f1 = F_1
    for i in range(n):
        f0, f1 = f1, f0 + f1    # after k steps: f0 = F_k, f1 = F_(k+1)
    return f0                   # correct for n = 0 as well

print(dynamic_fib2(1000))

start = time.time()
for i in range(100):
    dynamic_fib2(1000)
print((time.time() - start)/100)
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
0.0001365089416503906

Euclidean algorithm

The Euclidean algorithm computes the greatest common divisor of two numbers. The two programs below are very similar: the same step is repeated either with recursive function calls or in a loop. In the iterative version it is enough to keep two values. The parameters of both programs are non-negative integers:

In [6]:
def r_gcd(a, b):
    if b == 0:
        return a
    else:
        return r_gcd(b, a%b)

r_gcd(110, 242)
Out[6]:
22
In [7]:
def d_gcd(a, b):
    while b:   # repeat while b is not equal to 0
        a, b = b, a%b
    return a

d_gcd(110, 242)
Out[7]:
22
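
A quick sanity check, written as a sketch: both versions should agree with each other and with math.gcd from the standard library. The two functions are repeated here in compact form so that the block runs on its own.

```python
import math

def r_gcd(a, b):
    # recursive version: gcd(a, b) = gcd(b, a mod b), and gcd(a, 0) = a
    return a if b == 0 else r_gcd(b, a % b)

def d_gcd(a, b):
    # iterative version: the same step in a loop, keeping only two values
    while b:
        a, b = b, a % b
    return a

for a in range(50):
    for b in range(50):
        assert r_gcd(a, b) == d_gcd(a, b) == math.gcd(a, b)
print("all three functions agree")
```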

Tower of Hanoi

Exercise: there are three rods and a stack of disks of increasing radius on one of them. The rules are:

  • you can move only one disk at a time
  • a bigger disk cannot be placed on top of a smaller disk

Move a stack of disks from one rod to another! (The animation is from tutorialspoint.com.)

[Animation: Tower of Hanoi]

Recursive solution: if you want to move $n$ disks from rod $A$ to rod $B$ then

  • first you have to move the top $n-1$ disks from rod $A$ to rod $C$
  • then move the bottom disk to rod $B$
  • finally move the $n-1$ disks from rod $C$ to their final position on rod $B$

This solution is easy to program, but the recursive function calls make it less efficient than a loop.

Recursive solution

In [8]:
def hanoi(n, from_, to_, third_):
    if n > 0:
        hanoi(n-1, from_, third_, to_)
        print(f"disk no. {n}: {from_} -> {to_}")
        hanoi(n-1, third_, to_, from_)
In [9]:
hanoi(3, "A", "B", "C")
disk no. 1: A -> B
disk no. 2: A -> C
disk no. 1: B -> C
disk no. 3: A -> B
disk no. 1: C -> A
disk no. 2: C -> B
disk no. 1: A -> B
In [10]:
hanoi(2, "A", "B", "C")
disk no. 1: A -> C
disk no. 2: A -> B
disk no. 1: C -> B
In [11]:
hanoi(1, 1, 2, 3)
disk no. 1: 1 -> 2
In [12]:
hanoi(4, 1, 2, 3)
disk no. 1: 1 -> 3
disk no. 2: 1 -> 2
disk no. 1: 3 -> 2
disk no. 3: 1 -> 3
disk no. 1: 2 -> 1
disk no. 2: 2 -> 3
disk no. 1: 1 -> 3
disk no. 4: 1 -> 2
disk no. 1: 3 -> 2
disk no. 2: 3 -> 1
disk no. 1: 2 -> 1
disk no. 3: 3 -> 2
disk no. 1: 1 -> 3
disk no. 2: 1 -> 2
disk no. 1: 3 -> 2
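
To see how the number of moves grows, the same recursion can return the moves instead of printing them. This is only a sketch; the name hanoi_moves is our own. The outputs above have 1, 3, 7 and 15 lines, which suggests $2^n - 1$ moves for $n$ disks.

```python
def hanoi_moves(n, from_, to_, third_):
    """Return the list of moves (disk, source rod, target rod) for n disks."""
    if n == 0:
        return []
    return (hanoi_moves(n - 1, from_, third_, to_)     # clear the way
            + [(n, from_, to_)]                        # move the bottom disk
            + hanoi_moves(n - 1, third_, to_, from_))  # put the rest back on top

for n in range(1, 6):
    print(n, len(hanoi_moves(n, "A", "B", "C")), 2**n - 1)
```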

The non-recursive solution is more efficient but more difficult. You may try to write such a program. It helps if you notice that with odd $n$ the disks are moved between the rods in a loop of three, namely

A <-> B
A <-> C
B <-> C

and with even $n$ between the rods

A <-> C
A <-> B
B <-> C

More info can be found on Wikipedia.
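
One possible non-recursive version is sketched below, based on the loop of three described above. The name iterative_hanoi and the representation of the rods as lists are our own assumptions. In each step the only legal move between the current pair of rods is made.

```python
def iterative_hanoi(n, from_="A", to_="B", third_="C"):
    """Return the moves (disk, source, target) for n disks without recursion."""
    # each rod is a list used as a stack, the last element is the top disk
    rods = {from_: list(range(n, 0, -1)), to_: [], third_: []}
    if n % 2 == 1:
        pairs = [(from_, to_), (from_, third_), (to_, third_)]
    else:
        pairs = [(from_, third_), (from_, to_), (to_, third_)]
    moves = []
    for step in range(2**n - 1):
        x, y = pairs[step % 3]
        # choose the legal direction: the smaller top disk moves to the other rod
        if not rods[x] or (rods[y] and rods[y][-1] < rods[x][-1]):
            x, y = y, x
        disk = rods[x].pop()
        rods[y].append(disk)
        moves.append((disk, x, y))
    return moves

for disk, x, y in iterative_hanoi(3):
    print(f"disk no. {disk}: {x} -> {y}")
```

For three disks this prints the same seven moves as the recursive hanoi(3, "A", "B", "C").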

Exchange

Let's say that you have coins of given denominations: ¢1, ¢2, ¢5, ¢10 and so on.

You want to pay a given amount of money with the least possible number of coins. For example, paying ¢10 with ten ¢1 coins is not optimal, but paying ¢8 as ¢1 + ¢2 + ¢5 is.

Suppose that you have enough coins of every type. Find the least number of coins needed to pay a given amount of money!

The greedy algorithm works in this case for the coins given above:

  • Find the biggest coin which covers not more than the target amount
  • Subtract that from the target amount
  • Repeat this until the target is ¢0

But this algorithm fails, for example, if you have ¢1, ¢5, ¢8, ¢10, ¢20 coins and want to pay ¢24. The greedy algorithm gives ¢20 + ¢1 + ¢1 + ¢1 + ¢1 (five coins), but ¢8 + ¢8 + ¢8 (three coins) is a better solution.
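
The greedy steps can be sketched as follows. The function name greedy_exchange and the choice to return the list of coins (or None if the amount cannot be paid this way) are our own.

```python
def greedy_exchange(money, coins):
    """Pay money by always taking the biggest coin that still fits."""
    used = []
    for coin in sorted(coins, reverse=True):   # from the biggest coin down
        while coin <= money:
            used.append(coin)
            money -= coin
    return used if money == 0 else None        # None: greedy could not pay exactly

print(greedy_exchange(34, [1, 2, 5, 10, 20]))   # [20, 10, 2, 2]
print(greedy_exchange(24, [1, 5, 8, 10, 20]))   # [20, 1, 1, 1, 1]
```

The second call reproduces the non-optimal answer with five coins.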

Recursive solution

Note that r_exchange.counter is an attribute of the function object, which we use to count the number of function calls. The list coins is a global variable.

In [13]:
import time

def r_exchange(money):
    coins.sort()
    r_exchange.counter += 1
    min_coins = float('inf')
    if money in coins:
        return 1
    else:
        for coin in coins:
            if coin > money:
                break
            number_of_coins = 1 + r_exchange(money - coin)
            if number_of_coins < min_coins:
                min_coins = number_of_coins
    return min_coins
In [14]:
r_exchange.counter = 0
coins = [1, 2, 5, 10, 20]
start = time.time()
print(r_exchange(34))
print(time.time() - start)
print(r_exchange.counter)
4
5.436130523681641
4399137
In [15]:
r_exchange.counter = 0
coins = [1, 5, 8, 10, 20]
start = time.time()
print(r_exchange(24))
print(time.time() - start)
print(r_exchange.counter)
3
0.005151271820068359
800

It works, but as the first example shows (more than 4 million function calls for ¢34), it can be very slow. There is a faster solution.

Dynamic programming

We write a dynamic program which calculates the optimal number of coins for every amount of money from 0 up to the target value. In this way you don't have to calculate the same thing several times, as we did in the recursive solution.

Go through the possible coins and try to pay the amount starting with that coin: $$\texttt{optimal}[\texttt{target}] = 1 + \texttt{optimal}[\texttt{target} - \texttt{coin}]$$ Then find the optimum by selecting the smallest of these values over all coins.

For example, with coins ¢1, ¢5, ¢10 and the total amount of ¢24: $$\texttt{optimal}[24]=1+\min\left\{\begin{array}{l}\texttt{optimal}[24- 1]\\\texttt{optimal}[24-5]\\\texttt{optimal}[24-10]\end{array}\right\}$$

With the variable names of the next code: $$\texttt{min_coins}=1+\min\left\{\begin{array}{l}\texttt{memory_table}[24 - 1]\\\texttt{memory_table}[24 - 5]\\\texttt{memory_table}[24 - 10]\end{array}\right\}$$

The memory_table is global, but only because we would like to inspect its content outside the function; otherwise this is not necessary.

In [16]:
def dynamic_exchange(money):
    global memory_table
    memory_table = [0]*(money+1)
    coins.sort()
    for t in range(1, money+1):
        min_coins = float('inf')
        for coin in coins:
            if coin > t:
                break
            if memory_table[t-coin] + 1 < min_coins:
                min_coins = memory_table[t-coin] + 1
        memory_table[t] = min_coins

    return memory_table[money]
In [17]:
coins = [1, 2, 5, 10, 20]
start = time.time()
print(dynamic_exchange(34))
print(time.time() - start)
4
0.00043010711669921875
In [18]:
print(memory_table)
[0, 1, 1, 2, 2, 1, 2, 2, 3, 3, 1, 2, 2, 3, 3, 2, 3, 3, 4, 4, 1, 2, 2, 3, 3, 2, 3, 3, 4, 4, 2, 3, 3, 4, 4]
In [19]:
coins = [1, 5, 8, 10, 20]
start = time.time()
print(dynamic_exchange(24))
print(time.time() - start)
3
0.0013692378997802734
In [20]:
print(memory_table)
[0, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3, 2, 3, 2, 2, 3, 2, 3, 1, 2, 3, 3, 3]
In [21]:
coins = [5, 10, 20, 50]
start = time.time()
print(dynamic_exchange(24))
print(time.time() - start)
inf
0.0051174163818359375
In [22]:
print(memory_table)
[0, inf, inf, inf, inf, 1, inf, inf, inf, inf, 1, inf, inf, inf, inf, 2, inf, inf, inf, inf, 1, inf, inf, inf, inf]
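
The table only tells how many coins are needed. As a sketch of a possible extension (the name exchange_with_coins and the second table last are our own, not part of the lecture code), you can also store which coin gave the optimum for each amount and then walk back from the target.

```python
def exchange_with_coins(money, coins):
    """Return an optimal list of coins for money, or None if it cannot be paid."""
    coins = sorted(coins)
    best = [0] + [float("inf")] * money    # best[t]: least number of coins for t
    last = [None] * (money + 1)            # last[t]: a coin used in an optimum for t
    for t in range(1, money + 1):
        for coin in coins:
            if coin > t:
                break
            if best[t - coin] + 1 < best[t]:
                best[t] = best[t - coin] + 1
                last[t] = coin
    if best[money] == float("inf"):
        return None
    result = []
    while money > 0:                       # walk back through the stored coins
        result.append(last[money])
        money -= last[money]
    return result

print(exchange_with_coins(24, [1, 5, 8, 10, 20]))
print(exchange_with_coins(24, [5, 10, 20, 50]))
```

The first call should give three coins that add up to 24, and the second one None, matching the inf in the table above.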

Partitions

Exercise: In how many ways can you decompose a positive integer into a sum of positive integers, if the order matters? For example, you can write $3 = 1+1+1 = 1+2 = 2+1$, so together with $3$ itself there are four decompositions. We will solve this with recursion and with dynamic programming.

In [23]:
def r_sums(n):
    if n == 0:
        return [[]]
    else:
        sumlist = []
        for i in range(1, n+1):
            li = [i]
            for l in r_sums(n-i):
                sumlist.append(li + l)
    return sumlist

n = 4
print(f"The number {n} has {len(r_sums(n))} decompositions.")
for s in r_sums(n):
    print((" + ".join(str(x) for x in s)))
The number 4 has 8 decompositions.
1 + 1 + 1 + 1
1 + 1 + 2
1 + 2 + 1
1 + 3
2 + 1 + 1
2 + 2
3 + 1
4
In [24]:
def sums_d(n):
    global memory_table
    memory_table = [[[]]]
    if n == 0:
        return memory_table[n]
    for i in range(1, n+1):
        sumlist = [[i]]
        for j in range(1, i):
            for l in memory_table[i - j]:
                sumlist.append([j] + l)
        memory_table.append(sumlist)
    return memory_table[n]
In [25]:
for s in sums_d(n):
    print(" + ".join(str(x) for x in s))
4
1 + 3
1 + 1 + 2
1 + 1 + 1 + 1
1 + 2 + 1
2 + 2
2 + 1 + 1
3 + 1
In [26]:
print(memory_table)
[[[]], [[1]], [[2], [1, 1]], [[3], [1, 2], [1, 1, 1], [2, 1]], [[4], [1, 3], [1, 1, 2], [1, 1, 1, 1], [1, 2, 1], [2, 2], [2, 1, 1], [3, 1]]]
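
If only the number of decompositions is needed, it is enough to store counts instead of lists. A minimal sketch with our own function name count_sums: the first term of the sum can be any j from 1 to i, and the rest is a decomposition of i - j.

```python
def count_sums(n):
    """Number of ordered decompositions of n into positive integers."""
    count = [1]                  # the empty sum is the only decomposition of 0
    for i in range(1, n + 1):
        count.append(sum(count[i - j] for j in range(1, i + 1)))
    return count[n]

for n in range(1, 8):
    print(n, count_sums(n), 2**(n - 1))
```

The two printed columns coincide, which suggests that a positive integer n has $2^{n-1}$ such decompositions.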

Finite-state machines

Escaped characters

How does Python parse a string like this?

In [27]:
print("\"Hello\"\nbackslash: \\not a new line")
"Hello"
backslash: \not a new line

When the parser reads a backslash character, it goes into a listening state, which means that the next character is treated differently.

Let's say that you read the string character-by-character and have a state variable s=0.

  • If you encounter a backslash and s=0: set s=1
  • If you encounter \ n t " or ' and s=1 then
    • you have a special character, increase the counter and
    • reset s=0
  • if you encounter any other character then just set s=0
In [28]:
def escape(sample):
    s = 0
    counter = 0    
    for i in sample:
        if i == "\\":
            if s == 0:
                s = 1
            else:
                print("\\", end=" ")
                counter += 1
                s = 0
        elif i in 'nt"\'':
            if s == 1:
                s = 0
                print(i, end=" ")
                counter += 1
        else:
            s = 0
    print()
    return counter
In [29]:
sample = r"\"Hello\"\nbackslash: \\not a new line"
print((escape(sample)))
" " n \ 
4
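
The same two-state machine can also build the decoded string instead of only counting. The sketch below handles just the escapes listed above; the name unescape and the mapping table are our own, and the real Python parser knows many more escape sequences.

```python
def unescape(sample):
    """Decode the escapes for backslash, n, t and both quote characters."""
    mapping = {"n": "\n", "t": "\t", '"': '"', "'": "'", "\\": "\\"}
    s = 0                        # s = 1 means the previous character was a backslash
    out = []
    for ch in sample:
        if s == 0:
            if ch == "\\":
                s = 1            # listening state: wait for the next character
            else:
                out.append(ch)
        else:
            # characters not in the mapping keep their backslash
            out.append(mapping.get(ch, "\\" + ch))
            s = 0
    if s == 1:
        out.append("\\")         # a lone backslash at the very end
    return "".join(out)

sample = r"\"Hello\"\nbackslash: \\not a new line"
print(unescape(sample))
```

For this sample the printed text is the same as the output of the print call at the beginning of this section.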

Digraphs

A digraph is a pair of characters used to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined. In English, there are several digraphs, like 'sc' that represents /s/ (e.g. in scene) or /ʃ/ (e.g. in conscious), or like 'ng' that represents /ŋ/ (e.g. in thing).

In Hungarian the digraph 'ly' has a double version 'lly'. Count the occurrences of both in a text. You may use three states here: s=0 is the base state, s=1 after an 'l', and s=2 after 'll'.

In [31]:
def ly(sample):
    s = 0
    counter = [0, 0]
    for i in sample:
        if i == 'l':
            if s <= 1:
                s += 1
        elif i == 'y':
            if s == 1:
                counter[0] += 1
            elif s == 2:
                counter[1] += 1
            s = 0
        else:
            s = 0
    return counter
In [32]:
sample = "gally lyuk alma xylofon folyam"
lys = ly(sample)
print("there are {0} 'ly' and {1} 'lly'".format(lys[0],lys[1]))
there are 2 'ly' and 1 'lly'

Parentheses

Count the parentheses in a formula to check whether they are balanced. Start with s=0, increase s whenever you find a "(" and decrease it whenever you find a ")".

If s becomes negative at any point, or if it is not 0 at the end of the string, the formula is not valid.
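
The rule above can be sketched as follows. The function name valid_parentheses is our own, and all characters other than "(" and ")" are simply ignored.

```python
def valid_parentheses(formula):
    """Return True if the parentheses in formula are balanced."""
    s = 0                        # number of currently open parentheses
    for ch in formula:
        if ch == "(":
            s += 1
        elif ch == ")":
            s -= 1
            if s < 0:            # a ")" without a matching "(" before it
                return False
    return s == 0                # every "(" has to be closed at the end

print(valid_parentheses("(a + b) * (c - (d / e))"))   # True
print(valid_parentheses("(a + b))("))                 # False
print(valid_parentheses("((a + b)"))                  # False
```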
