Programming strategies

Recursion

Recursive humor:

  • recursion: see recursion
  • To understand recursion first you have to understand recursion.

Recursive abbreviations:

  • GNU: GNU's not Unix,
  • PHP: PHP Hypertext Preprocessor,
  • WINE: WINE Is Not an Emulator,
  • TikZ: TikZ ist kein Zeichenprogramm.

Examples:

  • tower of Hanoi
  • Fibonacci numbers
  • Pascal triangle (binomial coefficients)

Dynamic programming

Dynamic programming solves a problem in two steps:

  1. as in recursion, reduce the problem to similar but smaller problems,
  2. but instead of recomputing them with recursive calls, store the previously calculated results in a table.

Fibonacci numbers

The Fibonacci numbers are defined with a recursive formula: $F_0 = 0$, $F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$ for $n \geq 2$. We will write a function which calculates the Fibonacci numbers, first with recursive function calls.

To measure the runtime of the functions we use the time module's time.time() function. This gives the current time in seconds.

recursive_fib.counter is an attribute of the recursive_fib function object. We use it to count how many times the function is called. The attribute is reached through the global name of the function, so it behaves like a global variable: its value persists between calls and can be changed inside the function.

Recursive solution

In [1]:
import time

def recursive_fib(n):
    recursive_fib.counter += 1
    if n <= 1:
        return n
    else:
        return recursive_fib(n-1) + recursive_fib(n-2)

recursive_fib.counter = 0
start = time.time()
print(recursive_fib(33))
print(time.time() - start)
print(recursive_fib.counter)
3524578
3.9227097034454346
11405773

This is terribly slow because the function is called more than 11 million times, and most of these calls recompute values that were already calculated. You can avoid most of the function calls just by storing the previous results in a list. This is called a memory table.
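
As a minimal sketch of this idea, the recursive function can look up its results in a table before computing them, so each value is calculated only once. The name memo_fib and its table argument (a dictionary here, instead of a list) are assumptions made for illustration; they do not appear elsewhere in these notes.

```python
def memo_fib(n, table=None):
    # table maps i to the i-th Fibonacci number; a fresh one is
    # created for every top-level call (illustrative design choice)
    if table is None:
        table = {0: 0, 1: 1}
    memo_fib.counter += 1
    if n not in table:
        table[n] = memo_fib(n - 1, table) + memo_fib(n - 2, table)
    return table[n]

memo_fib.counter = 0
print(memo_fib(33))       # same value as recursive_fib(33): 3524578
print(memo_fib.counter)   # 65 calls instead of 11405773
```

The number of calls now grows linearly with n, because the second recursive call at every level finds its result already in the table.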

Iterative solution

In [2]:
import time

def dp_fib1(n):
    f = [0, 1]
    for i in range(n-1):
        f.append(f[-2]+f[-1])
    return f[n]  # f[-1] would wrongly give 1 for n = 0

print(dp_fib1(1000))
start = time.time()
dp_fib1(1000)
print(time.time() - start)
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
0.000667572021484375
In [3]:
start = time.time()
for i in range(100):
    dp_fib1(1000)
print((time.time() - start)/100)
0.00048427343368530275

It uses even less memory if you store only the last two values. As the timing below shows, the running time stays about the same.

In [4]:
import time

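def dp_fib2(n):
    f = [0, 1]  # f always holds two consecutive Fibonacci numbers
    for i in range(n):
        f = [f[1], f[0] + f[1]]
    return f[0]  # after n steps f[0] is the n-th number, also for n = 0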

print(dp_fib2(1000))

start = time.time()
for i in range(100):
    dp_fib2(1000)
print((time.time() - start)/100)
43466557686937456435688527675040625802564660517371780402481729089536555417949051890403879840079255169295922593080322634775209689623239873322471161642996440906533187938298969649928516003704476137795166849228875
0.0005146026611328125

Tower of Hanoi

Exercise: there are three rods, and a stack of disks of increasing radius sits on one of them. The rules are:

  • you can move only one disk at a time
  • a bigger disk cannot be placed on top of a smaller disk

Move a stack of disks from one rod to another!

[Animation: Tower of Hanoi, from tutorialspoint.com]

Recursive solution: if you want to move $n$ disks from rod $A$ to rod $B$ then

  • first you have to move the top $n-1$ disks from rod $A$ to rod $C$
  • then move the bottom disk to rod $B$
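  • finally move the $n-1$ disks from rod $C$ to their final position on rod $B$

This solution is easy to program. It makes $2^n-1$ moves, which is the minimum possible, but every move costs a recursive function call, so it is not that efficient.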

Recursive solution

In [5]:
def hanoi(n, from_, to_, third_):
    if n > 0:
        hanoi(n-1, from_, third_, to_)
        print("disk no. {0}: {1} ==> {2}".format(
            n, from_, to_))
        hanoi(n-1, third_, to_, from_)
In [6]:
hanoi(3, "A", "B", "C")
disk no. 1: A ==> B
disk no. 2: A ==> C
disk no. 1: B ==> C
disk no. 3: A ==> B
disk no. 1: C ==> A
disk no. 2: C ==> B
disk no. 1: A ==> B
In [7]:
hanoi(2, "A", "B", "C")
disk no. 1: A ==> C
disk no. 2: A ==> B
disk no. 1: C ==> B
In [8]:
hanoi(1, 1, 2, 3)
disk no. 1: 1 ==> 2

A non-recursive solution avoids the overhead of the recursive function calls, but its code is harder to write.
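
One possible non-recursive variant, sketched here under the assumption that an explicit stack of pending tasks may replace the call stack. The name hanoi_iterative and the task layout are hypothetical choices for illustration. The function returns the moves as a list instead of printing them.

```python
def hanoi_iterative(n, from_, to_, third_):
    moves = []
    # each task: (number of disks, source, target, spare, is_single_move)
    stack = [(n, from_, to_, third_, False)]
    while stack:
        k, src, dst, aux, is_single_move = stack.pop()
        if k == 0:
            continue
        if is_single_move:
            moves.append((k, src, dst))
        else:
            # pushed in reverse order, because the stack is last in, first out
            stack.append((k - 1, aux, dst, src, False))  # 3. spare -> target
            stack.append((k, src, dst, aux, True))       # 2. move disk k
            stack.append((k - 1, src, aux, dst, False))  # 1. source -> spare
    return moves

for disk, src, dst in hanoi_iterative(2, "A", "B", "C"):
    print("disk no. {0}: {1} ==> {2}".format(disk, src, dst))
```

For two disks this prints the same three moves as hanoi(2, "A", "B", "C") above.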

Exchange

Let's say that you have coins of given types: ¢1, ¢2, ¢5, ¢10 and so on.

You want to pay a given amount of money with the least possible number of coins. For example, paying ¢10 with ten ¢1 coins is not optimal. But paying ¢8 as ¢1 + ¢2 + ¢5 is optimal.

Suppose that you have enough coins of every type. Find the optimal number of coins to pay a given amount of money!

The greedy algorithm works in this case for the coins given above:

  • Find the biggest coin which covers not more than the target amount
  • Subtract that from the target amount
  • Continue until the target is ¢0

But this algorithm fails, for example, if you have ¢1, ¢5, ¢8, ¢10, ¢20 coins and want to pay ¢24. The greedy algorithm gives ¢20 + ¢1 + ¢1 + ¢1 + ¢1 (five coins), but ¢8 + ¢8 + ¢8 (three coins) is a better solution.
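
The greedy steps above can be sketched as follows. The function name greedy_exchange and its return value (the list of coins used, or None when the amount cannot be reached) are assumptions made for this illustration.

```python
def greedy_exchange(money, coins):
    used = []
    # try the coins from the biggest to the smallest
    for coin in sorted(coins, reverse=True):
        # use this coin as long as it does not exceed the remaining amount
        while coin <= money:
            money -= coin
            used.append(coin)
    # a remainder means the greedy choice got stuck
    return used if money == 0 else None

print(greedy_exchange(8, [1, 2, 5, 10]))       # [5, 2, 1]
print(greedy_exchange(24, [1, 5, 8, 10, 20]))  # [20, 1, 1, 1, 1]
```

The second call reproduces the five-coin answer discussed above, where three ¢8 coins would have been enough.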

Recursive solution

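Note that r_exchange.counter is an attribute of the function object, which behaves like a global variable. The function also reads the list of coins from the global variable coins.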

In [9]:
import time

def r_exchange(money):
    coins.sort()
    r_exchange.counter += 1
    min_coins = float('inf')
    if money in coins:
        return 1
    else:
        for coin in coins:
            if coin > money:
                break
            number_of_coins = 1 + r_exchange(money - coin)
            if number_of_coins < min_coins:
                min_coins = number_of_coins
    return min_coins
In [10]:
r_exchange.counter = 0
coins = [1, 2, 5, 10, 20]
start = time.time()
print(r_exchange(34))
print(time.time() - start)
print(r_exchange.counter)
4
4.569389820098877
4399137
In [11]:
r_exchange.counter = 0
coins = [1, 5, 8, 10, 20]
start = time.time()
print(r_exchange(24))
print(time.time() - start)
print(r_exchange.counter)
3
0.0031881332397460938
800

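It works, but the first example needed more than 4 million function calls and over 4 seconds, because the same amounts are evaluated again and again. There is a faster solution.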

Dynamic programming

We write a dynamic program which calculates the optimal number of coins for every amount of money from 0 up to the target value. In this way you don't have to calculate the same thing several times, as we did in the recursive solution.

Let's go through the possible coins and try to pay the amount using that coin first: $$\texttt{optimal}[\texttt{target}] = 1 + \texttt{optimal}[\texttt{target} - \texttt{coin}]$$ Then look for the optimum by selecting the smallest value over all coins.

As a formula, for example with coins ¢1, ¢5, ¢10 and the total amount of ¢24: $$\texttt{optimal}[24]=1+\min\left\{\begin{array}{l}\texttt{optimal}[24-1]\\\texttt{optimal}[24-5]\\\texttt{optimal}[24-10]\end{array}\right\}$$ With the variable names of the next code: $$\texttt{min_coins}=1+\min\left\{\begin{array}{l}\texttt{memory_table}[24-1]\\\texttt{memory_table}[24-5]\\\texttt{memory_table}[24-10]\end{array}\right\}$$

The memory_table will be global, but only because we would like to see its content outside the function.

In [12]:
def dynamic_exchange(money):
    global memory_table
    memory_table = [0]*(money+1)
    coins.sort()
    for t in range(1, money+1):
        min_coins = float('inf')
        for coin in coins:
            if coin > t:
                break
            if memory_table[t-coin] + 1 < min_coins:
                min_coins = memory_table[t-coin] + 1
        memory_table[t] = min_coins

    return memory_table[money]
In [13]:
coins = [1, 2, 5, 10, 20]
start = time.time()
print(dynamic_exchange(34))
print(time.time() - start)
4
0.0014696121215820312
In [14]:
print(memory_table)
[0, 1, 1, 2, 2, 1, 2, 2, 3, 3, 1, 2, 2, 3, 3, 2, 3, 3, 4, 4, 1, 2, 2, 3, 3, 2, 3, 3, 4, 4, 2, 3, 3, 4, 4]
In [15]:
coins = [1, 5, 8, 10, 20]
start = time.time()
print(dynamic_exchange(24))
print(time.time() - start)
3
0.005361080169677734
In [16]:
print(memory_table)
[0, 1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 2, 3, 2, 3, 2, 2, 3, 2, 3, 1, 2, 3, 3, 3]
In [17]:
coins = [5, 10, 20, 50]
start = time.time()
print(dynamic_exchange(24))
print(time.time() - start)
inf
0.0016167163848876953
In [18]:
print(memory_table)
[0, inf, inf, inf, inf, 1, inf, inf, inf, inf, 1, inf, inf, inf, inf, 2, inf, inf, inf, inf, 1, inf, inf, inf, inf]

Partitions

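Exercise: In how many ways can you decompose a positive integer into a sum of positive integers, if the order of the terms matters? For example you can write $3 = 1+1+1 = 1+2 = 2+1$, so together with the single term $3$ itself there are four ways to decompose. (Such ordered decompositions are also called compositions.) We will solve this with recursion and with dynamic programming.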
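
If only the number of decompositions is needed, a table of counts is enough. This is a sketch: the name count_sums is an assumption for illustration. It relies on the observation that the first term can be any $i$ from 1 to $n$, and that the rest is a decomposition of $n-i$. The counts can be compared with the known closed form $2^{n-1}$.

```python
def count_sums(n):
    # table[t] is the number of ordered decompositions of t;
    # the empty sum is the only decomposition of 0
    table = [1]
    for t in range(1, n + 1):
        # choose the first term i, then decompose the remaining t - i
        table.append(sum(table[t - i] for i in range(1, t + 1)))
    return table[n]

for k in range(1, 6):
    print(k, count_sums(k), 2 ** (k - 1))
```

For $n = 3$ this gives 4, in agreement with the example above.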

In [19]:
def sums(n):
    if n == 0:
        return [[]]
    else:
        sumlist = []
        for i in range(1, n+1):
            L = [i]
            for l in sums(n-i):
                sumlist.append(L + l)
    return sumlist

n = 4
print("The number {0} has {1} decompositions.".format(n, len(sums(n))))
for s in sums(n):
    print(" + ".join(str(x) for x in s))
The number 4 has 8 decompositions.
1 + 1 + 1 + 1
1 + 1 + 2
1 + 2 + 1
1 + 3
2 + 1 + 1
2 + 2
3 + 1
4
In [20]:
def sums_d(n):
    global memory_table
    memory_table = [[[]]]
    if n == 0:
        return memory_table[n]
    for i in range(1, n+1):
        sumlist = [[i]]
        for j in range(1, i):
            for l in memory_table[i - j]:
                sumlist.append([j] + l)
        memory_table.append(sumlist)
    return memory_table[n]
In [21]:
for s in sums_d(n):
    print(" + ".join(str(x) for x in s))
4
1 + 3
1 + 1 + 2
1 + 1 + 1 + 1
1 + 2 + 1
2 + 2
2 + 1 + 1
3 + 1
In [22]:
print(memory_table)
[[[]], [[1]], [[2], [1, 1]], [[3], [1, 2], [1, 1, 1], [2, 1]], [[4], [1, 3], [1, 1, 2], [1, 1, 1, 1], [1, 2, 1], [2, 2], [2, 1, 1], [3, 1]]]

Finite-state machines

Escaped characters

How does Python parse a string like this?

In [23]:
print("\"Hello\"\nbackslash: \\not a new line")
"Hello"
backslash: \not a new line

When the parser meets a backslash character, it goes into a listening state, which means that the next character is treated differently.

Let's say that you read the string character-by-character and have a state variable s=0.

  • If you encounter a backslash and s=0: set s=1
  • If you encounter one of the characters \ n t " ' and s=1 then
    • you have a special character, increase the counter and
    • reset s=0
  • If you encounter any other character then just set s=0
In [24]:
def escape(sample):
    s = 0
    counter = 0    
    for i in sample:
        if i == "\\":
            if s == 0:
                s = 1
            else:
                print("\\", end=" ")
                counter += 1
                s = 0
        elif i in 'nt"\'':
            if s == 1:
                s = 0
                print(i, end=" ")
                counter += 1
        else:
            s = 0
    print()
    return counter
In [25]:
sample = r"\"Hello\"\nbackslash: \\not a new line"
print((escape(sample)))
" " n \ 
4

Digraphs

A digraph is a pair of characters used to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined. In English, there are several digraphs, like 'sc' that represents /s/ (e.g. in scene) or /ʃ/ (e.g. in conscious), or like 'ng' that represents /ŋ/ (e.g. in thing). In Hungarian, the digraph 'ly' has a doubled version 'lly'.

Exercise: count both of them in a text. You may use three states here: s=0 is the base state, s=1 after an 'l', and s=2 after 'll'.

In [26]:
def ly(sample):
    s = 0
    counter = [0, 0]
    for i in sample:
        if i == 'l':
            if s <= 1:
                s += 1
        elif i == 'y':
            if s == 1:
                counter[0] += 1
            elif s == 2:
                counter[1] += 1
            s = 0
        else:
            s = 0
    return counter
In [27]:
sample = "gally lyuk alma xylofon folyam"
lys = ly(sample)
print("there are {0} 'ly' and {1} 'lly'".format(lys[0],lys[1]))
there are 2 'ly' and 1 'lly'

Parentheses

Count how many parentheses are in a formula. Let's start with s=0 and increase s whenever you find a "(" and decrease it when you find a ")".

If s becomes negative at any point, then some of the parentheses are wrong. If s is not 0 at the end of the string, then the parentheses are wrong, too.
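
One way to turn this description into code is sketched below. The function name check_parentheses and the convention of returning the number of matched pairs, or None for a wrong formula, are assumptions made for illustration.

```python
def check_parentheses(formula):
    s = 0       # number of currently open parentheses
    pairs = 0   # number of matched "(" ... ")" pairs found so far
    for ch in formula:
        if ch == "(":
            s += 1
        elif ch == ")":
            s -= 1
            if s < 0:
                return None  # a ")" without a matching "("
            pairs += 1
    # a positive s at the end means that some "(" was never closed
    return pairs if s == 0 else None

print(check_parentheses("(a+b)*(c-(d))"))  # 3
print(check_parentheses("(a+b))("))        # None
```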
