a = "That's"
b = "fine"
Operator | Description | Input | Output |
---|---|---|---|
+ | Concatenation | a + b | That'sfine |
* | Repetition | 2 * b | finefine |
[] | Slice | a[1] | h |
[:] | Range Slice | a[1:4] | hat |
in | Membership | 'a' in a | True |
not in | Membership | 'a' not in b | True |
r/R | Raw String suppresses escape chars | r'\n' | \n |
% | Format: %s %d %o (octal) %x (hex) %f %e %E %g %G |
x = 12.345
y = 12.3e10
print("f: %f, e: %e, g: %g" % (x, x, x) )
print("f: %f, e: %e, g: %g" % (y, y, y) )
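The other conversion codes from the table work the same way. A minimal sketch; the names `n`, `name`, `line` and `padded` are made up for illustration:

```python
n = 42
name = "fine"

# %s inserts a string, %d a decimal integer, %o octal, %x hexadecimal
line = "%s: dec %d, oct %o, hex %x" % (name, n, n, n)
print(line)    # fine: dec 42, oct 52, hex 2a

# A width (and a precision for floats) can be placed between % and the letter;
# a minus sign left-justifies the value inside the width
padded = "%8.2f|%-6d|" % (3.14159, n)
print(padded)  #     3.14|42    |
```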
hi = "how ARE you"
Method | Description | Input | Output |
---|---|---|---|
.capitalize() | first letter upper case, the rest lower case | hi.capitalize() | How are you |
.lower() | lower case | hi.lower() | how are you |
.upper() | upper case | hi.upper() | HOW ARE YOU |
.title() | every word capitalized | hi.title() | How Are You |
hi = "how ARE you"
hi.title()
hi.upper()
Method | Boolean value |
---|---|
.isalnum() | alphanumeric characters (no symbols)? |
.isalpha() | alphabetic characters (no symbols)? |
.islower() | lower case? |
.isnumeric() | numeric characters? |
.isspace() | whitespace characters? |
.istitle() | is in title case? |
.isupper() | upper case? |
alnum = "23 apples"
num = "-1234"
title = "Big Apple"
print(alnum.isalnum(), num.isnumeric(), title.istitle())
alnum = "23apples"
num = "1234"
white = " \t \n\n "
print(alnum.isalnum(), num.isnumeric(), white.isspace())
hi = "how ARE you"
seq = ['h', 'o', 'w'] #OR seq = 'h', 'o', 'w'
wh = "\t this \n "
Method | Description | Input | Output |
---|---|---|---|
.join() | concatenates with separator string | " < ".join(seq) | h < o < w |
.lstrip() | removes leading whitespace | wh.lstrip() | "this \n " |
.rstrip() | removes trailing whitespace | wh.rstrip() | "\t this" |
.strip() | removes leading and trailing whitespace | wh.strip() | "this" |
.replace(old, new [, m]) | replaces old with new at most m times | hi.replace("o", "O") | hOw ARE yOu |
.split(s[,m]) | splits at s max m times, returns list | hi.split() | [ "how", "ARE", "you" ] |
s = " < "
seq = "a", "b", "c" # a sequence of strings (tuple, list)
print (s.join( seq ))
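Because `join()` expects a sequence of strings, other values have to be converted first. A small sketch, with the hypothetical list `values` as input:

```python
values = [3, 1, 4]

# join() raises a TypeError for non-string items, so convert each one with str()
joined = ", ".join(str(v) for v in values)
print(joined)  # 3, 1, 4
```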
hi = "how ARE you today"
hi.replace("o", "O", 2)
hi.split() # if no argument is given, separate at white spaces
hi.split("o", 2) # separate at the first two occurrences
s = '.,.dots or commas.,.,...'
print(s.strip('.,'))
print(s.rstrip('.,'))
print(s.lstrip('.,'))
s = "where"
print('0123456789'*3)
print(s.center(30))
print(s.rjust(30))
print(s.ljust(30))
The argument of each method is the final width of the resulting string.
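These methods also accept an optional second argument, the padding character. A short sketch of that, and of what happens when the width is smaller than the string (the variable names are illustrative):

```python
s = "where"

centered = s.center(11, "*")  # pad both sides with '*'
right = s.rjust(8, ".")       # pad on the left
left = s.ljust(8, ".")        # pad on the right
unchanged = s.center(3)       # width smaller than len(s): string is returned as is

print(centered)   # ***where***
print(right)      # ...where
print(left)       # where...
print(unchanged)  # where
```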
You can print a table nicely:
tabular = [["First row", -2, -310], ["Second row", 3, 1], ["Third row",-321, 11]]
tabular_string = ""
for row in tabular:
    tabular_string += row[0].ljust(13)
    for i in range(1, len(row)):
        tabular_string += str(row[i]).rjust(7)
    tabular_string += "\n"
print(tabular_string)
The format method
The string object is the format template and the arguments are the values to substitute.
The numbers in the braces are the indices of the arguments.
'{0}-{1}-{2} {0}, {1}, {2}, {0}{0}{0}'.format('X', 'Y', 'Z')
The format marker "{ }" can have optional formatting instructions: {number:optional}
optional | Meaning |
---|---|
d | decimal |
b | binary |
o | octal |
x, X | hex, capital HEX |
f, F | float |
e, E | exponential form: something times 10 to some power |
< | left justified |
> | right justified |
^ | centered |
c^ | centered but with a character 'c' as padding |
print("01234 01234 01234 0123456789")
print('{0:5} {1:5d} {2:>5} {3:*^10}'.format('0123', 1234, '|', 'center'))
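The entries of the table can be combined in one marker: padding character, alignment, width, and type, in that order. A minimal sketch with made-up values:

```python
# fill character, alignment, width and type code combined in one marker
hex_right = '{0:>8x}'.format(255)        # right-justified hex, width 8
bin_center = '{0:_^9b}'.format(5)        # centered binary, padded with '_'
dec_left = '{0:<6d}|'.format(42)         # left-justified decimal, width 6
float_pad = '{0:*>10.2f}'.format(3.14159)  # two decimals, padded with '*'

print(hex_right)   #       ff
print(bin_center)  # ___101___
print(dec_left)    # 42    |
print(float_pad)   # ******3.14
```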
"int {0:d}, hex {0:x} {0:X}, oct {0:o}, bin {0:b}".format(42)
"{0}, {0:e}, {0:f}, {0:8.4f}, {0:15.1f}".format(-12.345)
You can also name the parameters, which is more convenient than using indices.
'The center is: ({x}, {y})'.format(x=3, y=5)
x1 = 3; y1 = 4
print('The center is: ({x}, {y})'.format(x=x1, y=y1))
tabular = [["First row", -2, -310], ["Second row", 3, 1], ["Third row",-321, 11]]
table_string = ""
for row in tabular:
    table_string += "{0:_<13}".format(row[0])
    for i in range(1, len(row)):
        table_string += "{0:7d}".format(row[i])
    table_string += "\n"
print(table_string)
A regular expression (regex, regexp) is a sequence of characters that defines a search pattern. Regular expressions are available in many programming languages and text editors. The aim is to recognize strings with given properties (such as an email address, a date, a Roman numeral, or an IP address).
The following characters have a special meaning: . ^ $ * + ? { } [ ] ( ) \ |
Character | Description | Example | Fits to |
---|---|---|---|
[] | set of characters | "[abcd]" | a, b,... |
[a-z] | a range of characters | "[0-9a-fA-F]" | B, 5,... |
[^chars] | not the listed chars | "[^qx]" | a, b, c,... |
\ | to escape special characters | "\s" | space, tab,... |
. | any character (except newline) | "Wh..." | Where, Whose,... |
^ | beginning of the string | "^Once" | Once..... |
$ | end of the string | "finished.$" | .....finished. |
? | zero or one occurrences | "colou?r" | color, colour |
* | zero or more occurrences (greedy) | "woo*w" | woooooow |
+ | one or more occurrences (greedy) | "wo+w" | wow, woow |
*? | zero or more (lazy) | "w.*?w" | |
+? | one or more (lazy) | "w.+?w" | |
{n} | exactly n occurrences | "al{2}e{2}" | allee |
{n,} | at least n occurrences | "oh{3,}" | ohhhhhh |
{,n} | at most n occurrences | "woo{,3}w" | woooow, wooow, woow, wow |
{n,m} | at least n at most m occurrences | "wo{1,3}w" | wow,... |
| | either or | "H(a|ae|ä)ndel" | Handel, Haendel, Händel |
() | capture and group |
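A few rows of the table can be checked directly. The sketch below uses `re.fullmatch` from Python's `re` module (which must be imported first) to test whether a whole candidate string fits a pattern; the list `checks` is made up for illustration:

```python
import re

# (pattern, candidate string) pairs based on the table above
checks = [
    (r"colou?r", "color"),          # ? allows zero 'u'
    (r"colou?r", "colouur"),        # ? allows at most one 'u'
    (r"wo{1,3}w", "wooow"),         # three 'o' are allowed
    (r"wo{1,3}w", "woooow"),        # four 'o' are too many
    (r"H(a|ae|ä)ndel", "Haendel"),  # one of the alternatives
    (r"[0-9a-fA-F]", "g"),          # 'g' is not a hex digit
]

# fullmatch returns a match object only if the entire string fits the pattern
results = [re.fullmatch(pattern, text) is not None for pattern, text in checks]
for (pattern, text), ok in zip(checks, results):
    print(pattern, text, ok)
```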
Character | Description | Examples |
---|---|---|
\b | beginning or end of a word | r"\bis" r"st\b" |
\B | NOT the beginning or the end of a word | r"\Bis" r"st\B" |
\d | digits (0-9) | r"\d\d-\d\d" |
\D | NOT a digit | r"\d\d-\D" |
\s | white space character | r"for\sever" |
\S | NOT a white space | r"\S" |
\w | any word character (a-z, A-Z, 0-9, and _) | r"\s\w\w\w\s" |
\W | NOT a word character | r"\Wword\W" |
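The special sequences can be tried with `re.findall` from the `re` module. A small sketch on a made-up sentence:

```python
import re

text = "This is his list, dated 12-05 and 07-11."

whole_word = re.findall(r"\bis\b", text)  # 'is' as a separate word only
inside = re.findall(r"\Bis", text)        # 'is' NOT at the start of a word
dates = re.findall(r"\d\d-\d\d", text)    # two digits, hyphen, two digits
ends_st = re.findall(r"\w+st\b", text)    # words ending in 'st'

print(whole_word)  # ['is']
print(inside)      # ['is', 'is', 'is']  (from This, his, list)
print(dates)       # ['12-05', '07-11']
print(ends_st)     # ['list']
```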
Online regex testers with explanations: https://regex101.com/#python or https://extendsclass.com/regex-tester.html#python or https://www.regextester.com/ or with cheatsheet: https://pythex.org/.
Examples can be found in https://www.programiz.com/python-programming/regex
You have to import the module re, because its functions are not built in. Put this line at the beginning of your code.
import re
The functions in the module re:
Function | Description |
---|---|
findall(p, s) | returns a list containing all matches of p in s |
search(p, s) | returns a "match object" if there is a match of p in s |
split(p, s) | split at each match of p in s and returns a list |
sub(p, n, s[, m]) | replaces all or m matches of p with a new string n in s |
finditer(p, s) | returns an iterable object on the match objects of p in s |
A match object describes where the pattern was found in the string and what exactly was matched. This information can be read with the .span() and .group() methods.
s = "confirmation"
p = ".i"
print(re.findall(p, s))
print(re.search(p, s))
x = re.search(p, s)
print(x.span())
print(x.group())
print(re.split(p, s))
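The `sub` function and the optional limits were not used above. A short sketch with the same string and pattern; the replacement text "--" is arbitrary:

```python
import re

s = "confirmation"
p = ".i"  # any character followed by 'i': matches 'fi' and 'ti'

all_replaced = re.sub(p, "--", s)             # replace every match
first_replaced = re.sub(p, "--", s, count=1)  # replace only the first match
split_once = re.split(p, s, maxsplit=1)       # split only at the first match

print(all_replaced)    # con--rma--on
print(first_replaced)  # con--rmation
print(split_once)      # ['con', 'rmation']
```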
Since the backslash and other special characters can be used in a regex pattern, you have to be careful with them.
It is best to use a so-called raw string as the pattern. In this format one backslash actually means one backslash, so you don't have to escape it.
If you put an r in front of the string, it is a raw string.
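The difference shows up with `\b`: in an ordinary string it is the backspace character, while in a raw string it reaches the regex engine as the word-boundary marker. A minimal sketch with a made-up test string:

```python
import re

text = "is this his"

# "\b" is one character (backspace); r"\b" is two: a backslash and the letter b
lengths = (len("\b"), len(r"\b"))

plain = re.findall("\bis", text)     # searches for backspace + 'is': nothing found
raw = re.findall(r"\bis", text)      # 'is' at the beginning of a word
escaped = re.findall("\\bis", text)  # same pattern with a manually escaped backslash

print(lengths)  # (1, 2)
print(plain)    # []
print(raw)      # ['is']
print(escaped)  # ['is']
```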
x = re.finditer(p, s)
for y in x:
    print(y.group())
Exercise: Triple every quotation mark in a string, whether it is single or double.
st = """This "word" is an 'other' one."""
p = """(['"])"""
print(re.sub(p, r"\1\1\1", st))
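The substitution above works because the parentheses capture the matched quotation mark and `\1` in the replacement refers to that captured text. The sketch below shows the same mechanism with two groups on a made-up string; the groups can also be read from a match object:

```python
import re

text = "dated 12-05"
pattern = r"(\d\d)-(\d\d)"  # two capture groups separated by a hyphen

m = re.search(pattern, text)
whole = m.group(0)  # the complete match
first = m.group(1)  # content of the first pair of parentheses
second = m.group(2) # content of the second pair
both = m.groups()   # all captured groups as a tuple

# \1 and \2 in the replacement refer to the captured groups
swapped = re.sub(pattern, r"\2-\1", text)

print(whole, first, second, both)  # 12-05 12 05 ('12', '05')
print(swapped)                     # dated 05-12
```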
s = "This 'string' has two 'quoted' words"
p1 = "'.*'" # greedy
p2 = "'.*?'" # lazy
print(re.findall(p1, s), " --> greedy")
print(re.findall(p2, s), " --> lazy")