{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Programming strategies\n",
"\n",
"## Recursion\n",
"\n",
"Recursive humor:\n",
"* ***recursion***: *see recursion*\n",
"* *To understand recursion first you have to understand recursion.*\n",
"\n",
"Recursive abbreviations:\n",
"* GNU: GNU's not Unix,\n",
"* PHP: PHP Hypertext Preprocessor, \n",
"* WINE: WINE Is Not an Emulator,\n",
"* TikZ: TikZ is kein Zeichenprogramm.\n",
"\n",
"Examples:\n",
"* [tower of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi)\n",
"* Fibonacci numbers\n",
"* Pascal triangle (binomial coefficients)\n",
"\n",
"## Dynamic programming\n",
"\n",
"To solve a problem with\n",
"1. resursion, by reducing a problem to a similar but smaller problems,\n",
"1. but to avoid recursion, store the previously calculated results in a table."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fibonacci numbers\n",
"The Fibonacci numbers are defined with a recursive formula: $F_0=0$, $F_1=1$ and $F_n=F_{n-1}+F_{n-2}$. We will write a function which calculates the Fibonacci numbers, first with recursive function calls. \n",
"\n",
"To measure the runtime of the functions we use the `time` module's `time.time()` function. This gives the current time in seconds.\n",
"\n",
"The `recursive_fib.counter` is a member of the `recursive_fib` function object.\n",
"We use that to count how many times this function is called. As we change the value of it inside the function, we have to use it as a global variable.\n",
"\n",
"#### Recursive solution"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"def recursive_fib(n):\n",
" recursive_fib.counter += 1\n",
" if n <= 1:\n",
" return n\n",
" else:\n",
" return recursive_fib(n-1) + recursive_fib(n-2)\n",
"\n",
"recursive_fib.counter = 0\n",
"start = time.time()\n",
"f = recursive_fib(33)\n",
"print(time.time() - start)\n",
"print(f\"F_33 = {f}, counter = {recursive_fib.counter}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"recursive_fib.counter = 0\n",
"f = recursive_fib(5)\n",
"print(f\"F_5 = {f}, counter = {recursive_fib.counter}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is terribly slow, the function is called extremely many times. Evaluating ```recursive_fib(5)``` needs 15 function calls:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
" fib(5) \n",
" / \\\n",
" fib(4) fib(3) \n",
" / \\ / \\ \n",
" fib(3) fib(2) fib(2) fib(1)\n",
" / \\ / \\ / \\\n",
" fib(2) fib(1) fib(1) fib(0) fib(1) fib(0)\n",
" / \\\n",
"fib(1) fib(0)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the values:\n",
"```\n",
" 5 \n",
" / \\\n",
" 3 2 \n",
" / \\ / \\ \n",
" 2 1 1 1\n",
" / \\ / \\ / \\\n",
" 1 1 1 0 1 0\n",
" / \\\n",
" 1 0\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can avoid the lots of function calls just by memorizing the previous results in a list.\n",
"This is called a memory table.\n",
"\n",
"#### Dynamic (iterative) solution"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"def dynamic_fib1(n):\n",
" f = [0, 1]\n",
" for i in range(n-1):\n",
" f.append(f[-2]+f[-1])\n",
" return f[-1]\n",
"\n",
"start = time.time()\n",
"f = dynamic_fib1(1000)\n",
"print(time.time() - start)\n",
"print(f\"F_1000 = {f}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"start = time.time()\n",
"for i in range(100):\n",
" dynamic_fib1(1000)\n",
"print((time.time() - start)/100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is even more efficient if you store only the last two values."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"def dynamic_fib2(n):\n",
" f0, f1 = 0, 1\n",
" for i in range(n-1):\n",
" f0, f1 = f1, f0 + f1\n",
" return f1\n",
"\n",
"print(dynamic_fib2(1000))\n",
"\n",
"start = time.time()\n",
"for i in range(100):\n",
" dynamic_fib2(1000)\n",
"print((time.time() - start)/100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Euklidean algorithm\n",
"The two programs are very similar: the same step is repeated either with recursive function calls, or in a loop. In the dynamic program two values are enough to save. The parameters in the programs are non negative integers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def r_gcd(a, b):\n",
" if b == 0:\n",
" return a\n",
" else:\n",
" return r_gcd(b, a%b)\n",
"\n",
"r_gcd(110, 242)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def d_gcd(a, b):\n",
" while b: # repeat while b is not equal to 0\n",
" a, b = b, a%b\n",
" return a\n",
"\n",
"d_gcd(110, 242)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tower of Hanoi\n",
"\n",
"**Exercise:** given three rods with disks on them. The disks have an increasing radius and the rules are:\n",
"* you can move only one disk at a time\n",
"* bigger disk cannot be on top of a smaller disk\n",
"\n",
"Move a stack of disks from one rod to an other!\n",
"(the next animation is from [tutorialspoint.com]( https://www.tutorialspoint.com/data_structures_algorithms/tower_of_hanoi.htm))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"Recursive solution: if you want to move $n$ disks from rod $A$ to rod $B$ then\n",
"* first you have to move the top $n-1$ disks from rod $A$ to rod $C$\n",
"* then move the bottom disk to rod $B$\n",
"* finally move the $n-1$ disks from rod $C$ to their final position to rod $B$\n",
"\n",
"This solution is easy to programming but not that efficient.\n",
"#### Recursive solution"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def hanoi(n, from_, to_, third_):\n",
" if n > 0:\n",
" hanoi(n-1, from_, third_, to_)\n",
" print(f\"disk no. {n}: {from_} -> {to_}\")\n",
" hanoi(n-1, third_, to_, from_)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hanoi(3, \"A\", \"B\", \"C\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hanoi(2, \"A\", \"B\", \"C\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hanoi(1, 1, 2, 3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hanoi(4, 1, 2, 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The non-recursive solution is more efficient but more difficult. You may try to write such a program. It helps if you notice that with odd $n$ the disks are moved between the nods in a loop of three, namely\n",
"```\n",
"A <-> B\n",
"A <-> C\n",
"B <-> C\n",
"```\n",
"and with even $n$ between the nods\n",
"```\n",
"A <-> C\n",
"A <-> B\n",
"B <-> C\n",
"```\n",
"More info can be found on [Wikipedia](https://en.wikipedia.org/wiki/Tower_of_Hanoi)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exchange\n",
"Let's say that you have a given type of coins: ¢1, ¢2, ¢5, ¢10 and so on.\n",
"\n",
"You want to pay a given amount of money with the least possible number of coins.\n",
"For example paying ¢10 with 10 times a ¢1 coin is not optimal.\n",
"But paying ¢8 is optimal with ¢1 + ¢2 + ¢5.\n",
"\n",
"Suppose that you have enough number of coins from every type to find the optimal number of coins to pay a given amount of money!\n",
"\n",
"The ***greedy algorithm*** works in this case for the coins given above:\n",
"* Find the biggest coin which covers not more than the target amount\n",
"* Subtract that from the target amount\n",
"* Repeat this until the target is ¢0\n",
"\n",
"But this algorithm failes for example if you have ¢1, ¢5, ¢8, ¢10, ¢20 coins and want to pay ¢24. The greedy algorithm gives ¢20 + ¢1 + ¢1 + ¢1 + ¢1, but ¢8 + ¢8 + ¢8 is a better solution.\n",
"\n",
"#### Recursive solution\n",
"Note that `r_exchange.counter` is a global variable. In this way we may count the number of function calls."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"def r_exchange(money):\n",
" coins.sort()\n",
" r_exchange.counter += 1\n",
" min_coins = float('inf')\n",
" if money in coins:\n",
" return 1\n",
" else:\n",
" for coin in coins:\n",
" if coin > money:\n",
" break\n",
" number_of_coins = 1 + r_exchange(money - coin)\n",
" if number_of_coins < min_coins:\n",
" min_coins = number_of_coins\n",
" return min_coins"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"r_exchange.counter = 0\n",
"coins = [1, 2, 5, 10, 20]\n",
"start = time.time()\n",
"print(r_exchange(34))\n",
"print(time.time() - start)\n",
"print(r_exchange.counter)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"r_exchange.counter = 0\n",
"coins = [1, 5, 8, 10, 20]\n",
"start = time.time()\n",
"print(r_exchange(24))\n",
"print(time.time() - start)\n",
"print(r_exchange.counter)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It works OK but there is a faster solution."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Dynamic programming\n",
"\n",
"We write a dynamic program, which calculate the optimal number of coins for every amount of money from 0 up to the target value. \n",
"In this way you don't have to calculate the same thing several times as we did it in the recursive solution.\n",
"\n",
"Let's go through the possible coins and try to pay the amount with that coin.\n",
"$$\\texttt{optimal}[\\texttt{target}] = 1 + \\texttt{optimal}[\\texttt{target} - \\texttt{coin}]$$\n",
"And look for the optimum by selecting the smallest over all coins.\n",
"\n",
"In formula for example with coins ¢1, ¢2, ¢5 and the total amount of ¢24:\n",
"$$\\texttt{optimal}[24]=1+\\min\\left\\{\\begin{array}{l}\\texttt{optimal}[24- 1]\\\\\\texttt{optimal}[24-5]\\\\\\texttt{optimal}[24-10]\\end{array}\\right\\}$$\n",
"With the variable names of the next code:\n",
"$$\\texttt{min_coins}=1+\\min\\left\\{\\begin{array}{l}\\texttt{memory_table}[24 - 1]\\\\\\texttt{memory_table}[24 - 5]\\\\\\texttt{memory_table}[24 - 10]\\end{array}\\right\\}$$\n",
"The `memory_table` will be global, but only because we would like to see the content of it outside the function, otherwise it is not necessary."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def dynamic_exchange(money):\n",
" global memory_table\n",
" memory_table = [0]*(money+1)\n",
" coins.sort()\n",
" for t in range(1, money+1):\n",
" min_coins = float('inf')\n",
" for coin in coins:\n",
" if coin > t:\n",
" break\n",
" if memory_table[t-coin] + 1 < min_coins:\n",
" min_coins = memory_table[t-coin] + 1\n",
" memory_table[t] = min_coins\n",
"\n",
" return memory_table[money]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"coins = [1, 2, 5, 10, 20]\n",
"start = time.time()\n",
"print(dynamic_exchange(34))\n",
"print(time.time() - start)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(memory_table)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"coins = [1, 5, 8, 10, 20]\n",
"start = time.time()\n",
"print(dynamic_exchange(24))\n",
"print(time.time() - start)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(memory_table)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"coins = [5, 10, 20, 50]\n",
"start = time.time()\n",
"print(dynamic_exchange(24))\n",
"print(time.time() - start)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(memory_table)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Partitions\n",
"**Exercise:** How many ways can you decompose an integer into sum of positive integers (order matters)?\n",
"For example you can write $3 = 1+1+1 = 1+2 = 2+1$, there are four ways to decompose.\n",
"We will solve this with recursion and with dynamic programming."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def r_sums(n):\n",
" if n == 0:\n",
" return [[]]\n",
" else:\n",
" sumlist = []\n",
" for i in range(1, n+1):\n",
" li = [i]\n",
" for l in r_sums(n-i):\n",
" sumlist.append(li + l)\n",
" return sumlist\n",
"\n",
"n = 4\n",
"print(f\"The number {n} has {len(r_sums(n))} decompositions.\")\n",
"for s in r_sums(n):\n",
" print((\" + \".join(str(x) for x in s)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def sums_d(n):\n",
" global memory_table\n",
" memory_table = [[[]]]\n",
" if n == 0:\n",
" return memory_table[n]\n",
" for i in range(1, n+1):\n",
" sumlist = [[i]]\n",
" for j in range(1, i):\n",
" for l in memory_table[i - j]:\n",
" sumlist.append([j] + l)\n",
" memory_table.append(sumlist)\n",
" return memory_table[n]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"for s in sums_d(n):\n",
" print(\" + \".join(str(x) for x in s))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(memory_table)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Finite-state machines\n",
"### Escaped characters\n",
"\n",
"How python parses a string like this?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"\\\"Hello\\\"\\nbackslash: \\\\not a new line\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a backslash character the parser goes into a listening state which means that the next character is treated differently.\n",
"\n",
"Let's say that you read the string character-by-character and have a state variable `s=0`.\n",
"\n",
"* If you encounter a backslash and `s=0`: set `s=1`\n",
"* If you encounter `\\ n t \"` or `'` and `s=1` then\n",
" * you have a special character, increase the counter and\n",
" * reset `s=0`\n",
"* if you encounter any other character then just set `s=0`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def escape(sample):\n",
" s = 0\n",
" counter = 0 \n",
" for i in sample:\n",
" if i == \"\\\\\":\n",
" if s == 0:\n",
" s = 1\n",
" else:\n",
" print(\"\\\\\", end=\" \")\n",
" counter += 1\n",
" s = 0\n",
" elif i in 'nt\"\\'':\n",
" if s == 1:\n",
" s = 0\n",
" print(i, end=\" \")\n",
" counter += 1\n",
" else:\n",
" s = 0\n",
" print()\n",
" return counter"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample = r\"\\\"Hello\\\"\\nbackslash: \\\\not a new line\"\n",
"print((escape(sample)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Digraphs\n",
"A digraph is a pair of characters used to write either a single phoneme, or a sequence of phonemes that does not correspond to the normal values of the two characters combined. In English, there are several digraphs, like 'sc' that represents /s/ (e.g. in scene) or /ʃ/ (e.g. in conscious), or like 'ng' that represents /ŋ/ (e.g. in thing). In Hungarian a digraph 'ly' has a double version 'lly'. Count them in a text.\n",
"You may use here three states: `s=0` is the base state, `s=1` after an 'l', and `s=2` after 'll'."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def ly(sample):\n",
" s = 0\n",
" counter = [0, 0]\n",
" for i in sample:\n",
" if i == 'l':\n",
" if s <= 1:\n",
" s += 1\n",
" elif i == 'y':\n",
" if s == 1:\n",
" counter[0] += 1\n",
" elif s == 2:\n",
" counter[1] += 1\n",
" s = 0\n",
" else:\n",
" s = 0\n",
" return counter"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample = \"gally lyuk alma xylofon folyam\"\n",
"lys = ly(sample)\n",
"print(\"there are {0} 'ly' and {1} 'lly'\".format(lys[0],lys[1]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Parenthesis\n",
"Count how many parentheis are in a formula.\n",
"Let's start with `s=0` and increase `s` whenever you find a `\"(\"` and decrease it when you find a `\")\"`.\n",
"\n",
"If the counter becomes negative or if it is not 0 at the end of the string the formula is not valid."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
},
"pygments_lexer": "ipython3",
"version": "3.6"
},
"nbformat": 4,
"nbformat_minor": 1
}