Python Data Structures
The following are the data structures of Python:
- String
- List
- Dictionary
- Tuple
- These are collections
- Collection: more than one value in a single variable
- A collection can hold many values, of different types, in a single variable (except a String, which holds only characters)
1) STRING
String = sequence of characters
>>> fruit = "banana"
for String variable 'fruit'
length --> 6
array --> b a n a n a
index --> 0 1 2 3 4 5
#) Length Function
>>> len(fruit) o/p => 6
#) Index Operator / Sub Operator / Look-up Operator
- Used to access a single character of the String by its position (index)
>>> print(fruit[0]) o/p => b
>>> print(fruit[4]) o/p => n
>>> print(fruit[5]) o/p => a
>>> print(fruit[6]) o/p => IndexError Traceback
>>> print(fruit[-1]) o/p => a (a negative index counts from the end of the String)
#) Strings are not mutable
- It is not possible to change the content of a String; to make any change, a new String must be created.
>>> fruit = 'Banana'
>>> fruit[0] = 'b' o/p => TypeError Traceback
So,
>>> temp = fruit.lower()
>>> print(temp) o/p => banana
#) Looping through String
Indefinite Loop ('while' loop)
- if need to know position/index
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(index, letter)
    index = index + 1
Definite Loop ('for' loop)
- if no need to know position/index
- Convenient way to loop through a String
- The 'iteration variable' moves through all the values 'in' the 'sequence' (ordered set)
- The block/body of code is executed once for each value in the 'sequence'
for letter in fruit:    <= 'letter' is the iteration variable. 'fruit' is the sequence / ordered set.
    print(letter)       <= block/body of loop
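As a small sketch of the two loop styles above, the following counts how often a letter occurs in a String. The variable names and the choice of the letter 'a' are only illustrative.

```python
fruit = 'banana'

# 'while' loop: we manage the index ourselves.
index = 0
count_while = 0
while index < len(fruit):
    if fruit[index] == 'a':
        count_while = count_while + 1
    index = index + 1

# 'for' loop: the iteration variable visits each character in turn.
count_for = 0
for letter in fruit:
    if letter == 'a':
        count_for = count_for + 1

print(count_while, count_for)  # 3 3
```

Both loops give the same count; the 'for' version is shorter because no index bookkeeping is needed.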
#) Slicing Strings
- Extracts a continuous section of a String
- Uses colon operator, with index numbers on each side of colon
>>> sss = 'Monty Python'
char --> M o n t y (space) P y t h o n
index --> 0 1 2 3 4 5 6 7 8 9 10 11
- Second number is one beyond the end of the slice, i.e. "up to but not including"
>>> print(sss[0:4]) o/p => Mont
>>> print(sss[6:7]) o/p => P
- If second number is beyond the end of string, it stops at the end (without any Traceback)
>>> print(sss[6:20]) o/p => Python
- If the first or last number of the slice is left off, it is assumed to be the beginning or end of the String respectively
>>> print(sss[:2]) o/p => Mo
>>> print(sss[8:]) o/p => thon
>>> print(sss[:]) o/p => Monty Python
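The slicing rules above can be checked with a short sketch. It reuses the 'Monty Python' String from the notes; the variable names are only illustrative.

```python
sss = 'Monty Python'

first_word = sss[0:5]   # characters at index 0..4 (index 5 is not included)
second_word = sss[6:]   # from index 6 to the end of the String
clipped = sss[6:20]     # an end index past the String is clipped, no Traceback
whole = sss[:]          # both numbers left off: the whole String

print(first_word)   # Monty
print(second_word)  # Python
print(clipped)      # Python
print(whole)        # Monty Python
```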
#) Manipulating Strings
- Strings are immutable
String concatenation using '+' operator
>>> a = 'Hello'
>>> b = a + 'There'
>>> print(b) o/p => HelloThere
>>> b = a + ' ' + 'There'
>>> print(b) o/p => Hello There
#) String Operators
'in' as a Logical Operator
- Used to check if one String is (substring) in another String
- Returns True / False, and can be used in an 'if' statement
>>> fruit = 'banana'
>>> 'n' in fruit o/p => True
>>> 'm' in fruit o/p => False
>>> 'nan' in fruit o/p => True
>>> if 'a' in fruit :
...     print('Found it!') o/p => Found it!
Comparison Operators : == , < , >
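- Work on Strings much as on numbers, but the order is alphabetical (by character code, so uppercase letters come before lowercase)
- Compares characters in sequence until the first difference decides the result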
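A few comparisons make the character-by-character rule concrete. This is a minimal sketch; the sample words are made up for illustration.

```python
results = [
    'apple' < 'banana',   # decided at the first character: 'a' < 'b'
    'apple' < 'apricot',  # 'a' == 'a', 'p' == 'p', then 'p' < 'r'
    'Zebra' < 'apple',    # uppercase 'Z' (code 90) sorts before lowercase 'a' (code 97)
    'app' < 'apple',      # a String sorts before a longer String that it begins
]
print(results)             # [True, True, True, True]
print(ord('Z'), ord('a'))  # 90 97
```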
#) String Library (for class str)
>>> greet = 'Hello Bob'
>>> greet.lower() --> returns lowercase copy of String variable, without changing String variable
>>> 'Hi There'.lower() --> returns lowercase copy of String constant too
>>> type(greet) o/p => <class 'str'>
>>> dir(greet) o/p => capabilities/methods of type/class
#) Searching a String
- Looks for one String (substring) in another String
- Finds first occurrence of substring
- Returns the index position where the substring is found; returns -1 if it is not found
- It is different from the 'in' operator, which returns True/False for existence
>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print(pos) o/p => 2
>>> pos = fruit.find('z')
>>> print(pos) o/p => -1
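find() combines naturally with slicing. The sketch below pulls the host name out of a made-up mail header line; the sample line is an assumption, and it uses the optional second argument of find() (the start position), which the notes above do not cover.

```python
line = 'From alice@example.com Sat Jan 5 09:14:16 2008'  # made-up sample line

at_pos = line.find('@')              # index of the '@' sign
space_pos = line.find(' ', at_pos)   # first space at or after at_pos
host = line[at_pos + 1:space_pos]    # slice between '@' and that space

print(at_pos, space_pos)  # 10 22
print(host)               # example.com
```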
#) Search and Replace
- replace() returns a copy of the main String in which every occurrence of passed String #1 is replaced by passed String #2
>>> greet = 'Hello Bob'
>>> greet2 = greet.replace('Bob', 'Alice')
>>> print(greet2) o/p => Hello Alice
>>> greet2 = greet.replace('o', 'X')
>>> print(greet2) o/p => HellX BXb
#) Stripping/Trimming Whitespace
[where the variable is declared as greet = ' Hello Bob ']
- Remove whitespace at the left/beginning
>>> greet.lstrip() o/p => 'Hello Bob '
- Remove whitespace at the right/ending
>>> greet.rstrip() o/p => ' Hello Bob'
- Remove whitespace at both left & right
>>> greet.strip() o/p => 'Hello Bob'
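A short sketch of the three strip methods, using repr() so the remaining whitespace stays visible. The sample String (with a trailing Newline) is illustrative only.

```python
raw = '  Hello Bob \n'

left = raw.lstrip()    # leading whitespace removed
right = raw.rstrip()   # trailing whitespace (including the Newline) removed
both = raw.strip()     # both ends trimmed

print(repr(left))   # 'Hello Bob \n'
print(repr(right))  # '  Hello Bob'
print(repr(both))   # 'Hello Bob'
print(repr(raw))    # '  Hello Bob \n'  (the original String is unchanged)
```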
In Python 3, all Strings are Unicode Strings.
In Python 2, Unicode Strings are marked with the prefix 'u' and needed some conversion before use.
1.1) File Handling & Reading
#) Opening a File
- open() returns a handle used to manipulate the File
- The File handle is not the data or actual content, but a handle for operations on the File: open, read, write, close
>>> file_handle = open(file_path, open_mode)
file_path : either full or relative path of the File
open_mode is optional
open_mode = 'r' : to read the file (default)
open_mode = 'w' : to write the file
#) Newline character
- It is represented as a single character '\n' in String
- It is a special character
>>> lines = 'Hello\nWorld!'
>>> lines o/p => 'Hello\nWorld!'
>>> print(lines) o/p => Hello
World!
>>> lines = 'X\nY'
>>> print(lines) o/p => X
Y
>>> len(lines) o/p => 3
- Because of the Newline character, File content is not really a collection/sequence of lines; it is one long String of characters punctuated by Newline characters.
- The Newline character also counts as a 'Whitespace Character', so it can be trimmed using the strip functions
NOTE: print() function also adds ONE newline to its printed content
#) File Handle as Sequence of lines
xfile = open('abc.txt')    <-- Opens in read mode by default. Fails if the filepath is wrong or the File does not exist.
for each_line in xfile:    <-- 'for' loop reads the File handle like a sequence of lines
    print(each_line)       <-- Prints the Newline character from the File, plus another Newline added by print() by default
#) Reading entire File content
xfile = open('abc.txt')
inp = xfile.read() <-- read whole File (including newline characters) into a single long string
print(len(inp)) <-- length includes count of Newline characters too
print(inp[:20])
NOTE: quit() function used to end the program
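The two ways of reading a File can be tried together in one sketch. To stay self-contained it first writes a small throwaway file in a temporary directory; the file content, the use of tempfile, and the variable names are assumptions made for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), 'abc.txt')
out = open(path, 'w')
out.write('first line\nsecond line\n')
out.close()

# Read line by line; rstrip() removes the Newline at the end of each line.
lines = []
xfile = open(path)
for each_line in xfile:
    lines.append(each_line.rstrip())
xfile.close()

# Read the whole File into one long String (Newlines included).
xfile = open(path)
inp = xfile.read()
xfile.close()

print(lines)     # ['first line', 'second line']
print(len(inp))  # 23 (21 visible characters + 2 Newline characters)
```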
2) LIST
- A linear collection of values that stay in the order provided (ordered collection)
#) List Constants
- surrounded by square brackets and elements are separated by commas
>>> print([1, 24, 76]) o/p => [1, 24, 76]
>>> print(['red', 'yellow', 'blue']) o/p => ['red', 'yellow', 'blue']
- list element can be any object, even another list
>>> print(['red', 24, 98.6]) o/p => ['red', 24, 98.6]
>>> print([1, [5, 6], 7]) o/p => [1, [5, 6], 7] ... there are 3 elements in primary list
- list can be empty
>>> print([]) o/p => []
#) Looking inside the Lists ... using SUB-operator
- any single element in a list can be retrieved using an index specified in square brackets
>>> friends = [ 'Alice' , 'Bob' , 'Charlie']
>>> print(friends[1]) o/p => Bob
#) Lists are mutable
- It is possible to change an element of a List using index operator
>>> my_list = [1, 2, 4, 16, 256]
>>> print(my_list) o/p => [1, 2, 4, 16, 256]
>>> my_list[2] = 8
>>> print(my_list) o/p => [1, 2, 8, 16, 256]
#) Length of Lists
- len() returns the number of elements of any collection or sequence
>>> x = [1, 2, 'alice', [1.1, 2.2], 3]
>>> print(len(x)) o/p => 5
#) range() function
- In Python 3, returns a range object (a sequence of numbers produced on demand); in Python 2 it returned a List
- The numbers range from 0 to one less than the parameter
>>> print(range(4)) o/p => range(0, 4)
>>> print(list(range(4))) o/p => [0, 1, 2, 3]
>>> friends = [ 'Alice' , 'Bob' , 'Charlie']
>>> print(list(range(len(friends)))) o/p => [0, 1, 2] ... len(friends) = 3
Counted variable loop using range()
for i in range(len(friends)):
    friend = friends[i]
    print('Hello', friend)
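A counted loop is mainly useful when the position is needed along with the value. A minimal sketch, with illustrative variable names; list() is applied to range() only to display its numbers under Python 3.

```python
friends = ['Alice', 'Bob', 'Charlie']

positions = list(range(len(friends)))   # the index numbers the loop will visit

greetings = []
for i in range(len(friends)):
    # i is the position, friends[i] is the value at that position
    greetings.append(str(i) + ': Hello ' + friends[i])

print(positions)  # [0, 1, 2]
print(greetings)  # ['0: Hello Alice', '1: Hello Bob', '2: Hello Charlie']
```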
#) Operations on List
Concatenating List using '+'
- doesn't change the operands
>>> a = [11, 12, 13]
>>> b = [14, 15, 16]
>>> c = a + b
>>> print(c) o/p => [11, 12, 13, 14, 15, 16]
Slicing using ':'
- Slicing of a List works the same as slicing of a String: the second number is "up to but not including"
>>> t = [10, 11, 12, 13, 14, 15, 16]
>>> t[1:3] o/p => [11, 12]
>>> t[:4] o/p => [10, 11, 12, 13]
>>> t[4:] o/p => [14, 15, 16]
>>> t[:] o/p => [10, 11, 12, 13, 14, 15, 16]
List Methods
>>> x = list()
>>> type(x) o/p => <class 'list'>
>>> dir(x) o/p => Lists all methods of list
List constructor
>>> stuff = list() Create an empty List
>>> print(stuff) o/p => []
>>> stuff.append('book') List stays in same order & new element gets added at the end of List (as Lists are mutable)
>>> stuff.append(99)
>>> print(stuff) o/p => ['book', 99]
>>> stuff.append(14.5)
>>> print(stuff) o/p => ['book', 99, 14.5]
stuff = stuff.append('x') *** messes up 'stuff': append() changes the List in place and returns None, so 'stuff' would become None
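The pitfall is easy to see by capturing what append() returns. A minimal sketch; 'result' is just an illustrative name.

```python
stuff = list()
result = stuff.append('book')   # append() modifies 'stuff' in place

print(result)  # None
print(stuff)   # ['book']
```

So the call should be written on its own line, without assigning its result back to the List variable.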
'in' and 'not in'
- Logical operators that check for an item's presence in the List, without modifying it
- Returns True or False
>>> item in list o/p => True/False
>>> item not in list o/p => True/False
#) Sorting of List
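- sort() changes the List in place, putting the elements in ascending order
- In Python 3, sorting works only when the elements can be compared with each other (e.g. all numbers or all strings); mixing types such as int and str gives a TypeError Traceback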
>>> friends = ['PQR', 'XYZ', 'ABC']
>>> friends.sort()
>>> print(friends) o/p => ['ABC', 'PQR', 'XYZ']
#) Numeric functions
>>> nums = [1, 3, 6, 2, 9]
>>> print(len(nums)) o/p => 5
>>> print(max(nums)) o/p => 9
>>> print(min(nums)) o/p => 1
>>> print(sum(nums)) o/p => 21
#) Splitting String to List
- Method: string.split(delimiter)
- Breaks a string into parts and produces a List of strings
>>> abc = "Hello! My name is Bob"
>>> stuff = abc.split()
>>> print(stuff) o/p => ['Hello!', 'My', 'name', 'is', 'Bob']
- When the delimiter is not specified, the default delimiter is whitespace: space, \n, etc.
- When the delimiter is not specified, multiple consecutive spaces are treated like one delimiter
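The difference between the default delimiter and an explicit one shows up when there are repeated spaces. A small sketch with made-up sample Strings:

```python
line = 'a  b   c'               # two spaces, then three spaces

default_parts = line.split()    # runs of whitespace count as one delimiter
space_parts = line.split(' ')   # every single space is a delimiter

record = 'first;second;third'
fields = record.split(';')      # explicit delimiter character

print(default_parts)  # ['a', 'b', 'c']
print(space_parts)    # ['a', '', 'b', '', '', 'c']
print(fields)         # ['first', 'second', 'third']
```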
3) DICTIONARY
- It is Python's most powerful data collection.
- Allows fast database-like operations in Python
- Similar to Properties/Map/HashMap in JAVA, Property Bag in C#/.Net, Associative Arrays in Perl/PHP
- A "bag" of values, each with its own label
- It holds key-value (label-value) pairs: dictionary[key] = value
- Keys are unique: the same key can't be stored more than once, but the same value can appear under several keys.
- Since Python 3.7, a Dictionary keeps its keys in insertion order; older versions had unpredictable order (Dictionaries were like bags - no order)
>>> purse = dict() Make an empty Dictionary
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print(purse) o/p => {'money': 12, 'candy': 3, 'tissues': 75}
>>> print(purse['candy']) o/p => 3
- Dictionaries are like Lists, except that they use keys instead of numbers to look up values
- Lists have positions, while Dictionaries have labels
- Dictionaries are mutable
>>> purse['candy'] = 23
>>> print(purse['candy']) o/p => 23
#) Dictionary Literals (Constants)
- Use curly braces and have a list of key:value pairs jjj = {'chuck': 1, 'fred': 42, 'jan': 100}
- Empty dictionary using empty curly braces ooo = { }
- Keys and Values can be of other types too (not only string:int)
#) Dictionary Traceback
- It is an error to reference a key which is not in the Dictionary (similar to indexing a List out of bounds)
>>> purse = dict()
>>> print(purse['Bob']) o/p => KeyError Traceback
>>> purse['Alice'] = 29
>>> print(purse['Bob']) o/p => KeyError Traceback
'in' & 'not in' operator
- To check whether a key is in the Dictionary (same operators as for a List, but they check keys, not values)
>>> 'Bob' in purse o/p => False
get() Method for Dictionaries
- The pattern of getting the value of a key if it is present in the Dictionary, else getting a default value
if key_name in dict_values:
    ret = dict_values[key_name] # VALUE FROM KEY
else:
    ret = 0 # DEFAULT VALUE
This can be replaced by
ret = dict_values.get(key_name, 0) # VALUE FROM KEY or DEFAULT VALUE
- It doesn't cause a Traceback; it works whether the key exists or not
- Useful for the retrieve/create/update counter pattern: dict_values[key_name] = dict_values.get(key_name, 0) + 1
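The counter idiom built on get() can be sketched as follows. The list of names is made-up sample data.

```python
names = ['csev', 'cwen', 'csev', 'zqian', 'cwen']  # made-up sample data

counts = dict()
for name in names:
    # First time: get() returns the default 0, so the entry is created as 1.
    # Later times: get() returns the current count, which is then updated.
    counts[name] = counts.get(name, 0) + 1

print(counts)  # {'csev': 2, 'cwen': 2, 'zqian': 1}
```

One line inside the loop covers both the "create" and the "update" case, with no 'if' needed.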
#) Definite loops and Dictionaries
- A definite loop ('for' loop) goes through all entries in a Dictionary (in insertion order since Python 3.7)
- The loop goes through all the 'keys' of the Dictionary, and the respective values are looked up by key (different from a List, where the loop goes through the values)
for key_name in dict_values:
    print(key_name, dict_values[key_name])
#) Retrieving list of Keys & Values
>>> jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }
Conversion of Dictionary to List. This gives a List of keys (in insertion order since Python 3.7)
>>> print(list(jjj)) o/p => ['chuck', 'fred', 'jan']
To get the keys (a view object in Python 3; wrap it in list() to get a List)
>>> print(jjj.keys()) o/p => dict_keys(['chuck', 'fred', 'jan'])
To get the values. Gives values in the order corresponding to keys()
>>> print(jjj.values()) o/p => dict_values([1, 42, 100])
To get the (key, value) pairs, each pair being a Tuple
>>> print(jjj.items()) o/p => dict_items([('chuck', 1), ('fred', 42), ('jan', 100)])
#) Two Iteration Variables
- It is a very succinct and convenient way to loop through the key-value pairs in a Dictionary
- Both variables advance together: on each iteration, 'key' and 'val' take the next pair
jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }
for key, val in jjj.items():
    print(key, val)
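Two iteration variables make it easy to search a Dictionary by value. The sketch below finds the entry with the largest value; 'big_key' and 'big_val' are illustrative names.

```python
jjj = {'chuck': 1, 'fred': 42, 'jan': 100}

big_key = None
big_val = None
for key, val in jjj.items():
    # Keep the pair seen so far that has the largest value.
    if big_val is None or val > big_val:
        big_key = key
        big_val = val

print(big_key, big_val)  # jan 100
```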
4) TUPLE
- It is another kind of sequence that functions much like a List (looks like a List, but uses parentheses)
- It is a limited version of List
- It is a more efficient version of List in terms of memory use and performance (no extra memory is allocated to allow for changes)
#) Tuples are Immutable, similar to Strings, unlike Lists
- It is an unmodifiable List: once a Tuple is created, its contents can't be altered, similar to a String
NOTE: List allocates extra memory for being mutable
>>> tpl = (12, 4, 31)
>>> print(tpl) o/p => (12, 4, 31)
>>> tpl[0] = 13 o/p => TypeError Traceback
>>> s = 'ABC'
>>> s[0] = 'a' o/p => TypeError Traceback
>>> lst = [12, 4, 31]
>>> lst[0] = 13
>>> print(lst) o/p => [13, 4, 31]
- So Tuples have no sort(), append(), extend(), reverse(), etc.
>>> tpl = tuple()
>>> dir(tpl) o/p => [..., 'count', 'index'] (the only regular methods, besides the special __xxx__ names)
- It is preferable to make temporary variables using Tuples over Lists
#) Tuple and Assignment
- A Tuple can also be put on the left-hand side of an assignment statement
>>> (x, y) = (4, 'fred') # same as x = 4 and y = 'fred'
>>> print(x) o/p => 4
>>> print(y) o/p => fred
>>> (a, b) = (44, 55) # same as a = 44 and b = 55
>>> print(a) o/p => 44
>>> print(b) o/p => 55
>>> (p, q) = fun_ret_tuple() # function returning 2 values Tuple
>>> (p, q) = 99 o/p => TypeError Traceback (99 is not a sequence that can be unpacked)
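Tuple assignment is handy for functions that return two values and for swapping variables. In this sketch, min_max() is a hypothetical helper written only for illustration.

```python
def min_max(values):
    # Hypothetical helper: returns a 2-value Tuple (smallest, largest).
    return (min(values), max(values))

(lo, hi) = min_max([3, 9, 1, 7])   # unpack the returned Tuple
print(lo, hi)                      # 1 9

a, b = 44, 55    # the parentheses are optional
a, b = b, a      # swap without a temporary variable
print(a, b)      # 55 44
```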
#) Tuples and Dictionaries
- items() method of a Dictionary returns the (key, value) pairs as Tuples
>>> jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }
>>> print(jjj.items()) o/p => dict_items([('chuck', 1), ('fred', 42), ('jan', 100)])
#) Tuples are Comparable, like Strings
- Tuples are compared item by item in sequence: the next item is compared only if the current items are equal, and the first differing item decides the result
>>> (0, 11, 12) < (5, 1, 2) o/p => True
>>> (0, 1, 12) < (0, 11, 2) o/p => True
>>> (10, 1, 2) < (5, 1, 2) o/p => False
>>> ('Abc', 'Pqr') < ('Abc', 'Pz') o/p => True
>>> ('Abc', 'Xyz') < ('Mbc', 'Pqr') o/p => True
#) sorted() Function
- Python has a built-in sorted() function for Lists & Tuples
- This function takes a sequence and returns a new List that is the sorted version of it
ascending order: sorted(Tuple or List, reverse=False) DEFAULT
descending order: sorted(Tuple or List, reverse=True)
>>> tpl_lst = [('a', 10), ('c', 22), ('b', 1)]
>>> sorted(tpl_lst) o/p => [('a', 10), ('b', 1), ('c', 22)]
>>> tpl_lst = [(1, 10), (3, 22), (2, 1), (1, 1)]
>>> sorted(tpl_lst) o/p => [(1, 1), (1, 10), (2, 1), (3, 22)]
>>> tpl_lst = [(1, 11, 22), (33, 2, 1), (2, 3, 4), (1, 11, 0)]
>>> sorted(tpl_lst) o/p => [(1, 11, 0), (1, 11, 22), (2, 3, 4), (33, 2, 1)]
Key order sorting of Dictionary
- Sorting of a Dictionary by key: use its items() method to get Tuples, and use the sorted() function over those Tuples
>>> d = {'a': 10, 'c': 22, 'b': 1}
>>> print(d) o/p => {'a': 10, 'c': 22, 'b': 1}
>>> d.items() o/p => dict_items([('a', 10), ('c', 22), ('b', 1)])
>>> tpl = sorted(d.items())
>>> print(tpl) o/p => [('a', 10), ('b', 1), ('c', 22)]
- For sorting a Dictionary by value, create a temporary List of Tuples in (value, key) format, using a loop that append()s each Tuple to the temporary List. Then pass that List to sorted( , reverse=True) for descending order
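The sort-by-value steps described above can be sketched as follows; 'd' and 'tmp' are illustrative names.

```python
d = {'a': 10, 'b': 1, 'c': 22}

# Build a temporary List of (value, key) Tuples.
tmp = list()
for k, v in d.items():
    tmp.append((v, k))

# Tuples compare by their first item, so this sorts by value, largest first.
tmp = sorted(tmp, reverse=True)
print(tmp)  # [(22, 'c'), (10, 'a'), (1, 'b')]
```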
#) List Comprehension
- It is used to create a List dynamically
- It is a way of writing a List as an expression, instead of building it with appends or as a constant with commas
[ (v,k) for k,v in d.items() ] # It makes a List of reversed (value, key) Tuples from Dictionary d
- Combined with sorted(), it takes a Dictionary and provides Tuples sorted in value order
>>> d = {'a':10, 'b':1, 'c':22}
>>> print( sorted ( [ (v,k) for k,v in d.items() ] ) ) o/p => [(1, 'b'), (10, 'a'), (22, 'c')]
- This makes a List of reversed Tuples and then sorts it, which is similar to
lst = list()
for k, v in d.items():
    new_tpl = (v,k)
    lst.append(new_tpl)
print(sorted(lst))
References:
1) https://www.coursera.org/learn/python ... Python Data Structures