Python Data Structures

The following are Python's built-in data structures:
  1. String
  2. List
  3. Dictionary
  4. Tuple

  • These are collections
  • Collection: more than one value held in a single variable
  • A collection can hold many values of different types in a single variable (except a String, which holds only characters)

1) STRING
String = Character Set

>>> fruit = "banana"
for String variable 'fruit'
length --> 6
array  --> b a n a n a
index  --> 0 1 2 3 4 5

#) Length Function
>>> len(fruit) o/p => 6

#) Index Operator / Sub Operator / Look-up Operator
  • Used to access a single element of the array / character set
>>> print(fruit[0]) o/p => b
>>> print(fruit[4]) o/p => n
>>> print(fruit[5]) o/p => a

>>> print(fruit[6]) o/p => IndexError Traceback (index is beyond the end of the String)
>>> print(fruit[-1]) o/p => a (a negative index counts from the end of the String)

#) Strings are not mutable
  • It is not possible to change the content of a String; to make any change, a new String must be made.
>>> fruit = 'Banana'
>>> fruit[0] = 'b' o/p => TypeError Traceback
So,
>>> temp = fruit.lower()
>>> print(temp) o/p => banana
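
A minimal sketch of "make a new String" when only one character should change. The variable name new_fruit is an assumption for illustration; the new String is built from a replacement character plus a slice of the original.

```python
fruit = 'Banana'

# Strings cannot be changed in place, so build a new String
# from a replacement character plus the rest of the original.
new_fruit = 'b' + fruit[1:]

print(fruit)      # the original String is untouched
print(new_fruit)
```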

#) Looping through String

Indefinite Loop ('while' loop)
  • use it if the position/index is needed

index = 0
while index < len(fruit):
    letter = fruit[index]
    print(index, letter)
    index = index + 1

Definite Loop ('for' loop)
  • use it if the position/index is not needed
  • Convenient way to loop through a String
  • the 'iteration variable' iterates through the 'sequence' (ordered set)
  • the 'iteration variable' moves through all the values 'in' the 'sequence'
  • the block/body of code is executed once for each value in the 'sequence'

for letter in fruit: <= 'letter' is the iteration variable. 'fruit' is the sequence / ordered set.
    print(letter) <= block/body of loop
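
When both the position and the character are needed, the built-in enumerate() function is one possible alternative to the 'while' loop. It is not covered in these notes, so treat this as an optional sketch; the list name pairs is only there for illustration.

```python
fruit = 'banana'

# enumerate() yields (index, value) pairs, so no manual counter is needed.
pairs = []
for index, letter in enumerate(fruit):
    print(index, letter)
    pairs.append((index, letter))
```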

#) Slicing Strings
  • Selects a continuous section of a String
  • Uses colon operator, with index numbers on each side of colon

>>> sss = 'Monty Python'
M  o  n  t  y     P  y  t  h  o  n
0  1  2  3  4  5  6  7  8  9  10 11

  • Second number <-- one beyond the end of the slice = "up to but not including"
>>> print(sss[0:4]) o/p => Mont
>>> print(sss[6:7]) o/p => P

  • If second number is beyond the end of string, it stops at the end (without any Traceback)
>>> print(sss[6:20]) o/p => Python

  • If the first or last number of the slice is left off, it is assumed to be the beginning or end of the String respectively
>>> print(sss[:2]) o/p => Mo
>>> print(sss[8:]) o/p => thon
>>> print(sss[:]) o/p => Monty Python

#) Manipulating Strings
  • Strings are immutable, so every manipulation produces a new String

String concatenation using '+' operator
>>> a = 'Hello'
>>> b = a + 'There'
>>> print(b) o/p => HelloThere
>>> b = a + ' ' + 'There'
>>> print(b) o/p => Hello There

 #) String Operators

'in' as a Logical Operator
  • Used to check if one String is (substring) in another String
  • Returns True / False, and can be used in an 'if' statement
>>> fruit = 'banana'
>>> 'n' in fruit o/p => True
>>> 'm' in fruit o/p => False
>>> 'nan' in fruit o/p => True
>>> if 'a' in fruit :
...     print('Found it!') o/p => Found it!

Comparison Operators : == , < , >
  • Used the same way as for numbers
  • Compares the Strings character by character, in sequence, until the first difference decides the result
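
A short sketch of how these comparisons behave. The sample words are assumptions chosen for illustration; note that uppercase letters compare as less than lowercase letters.

```python
same = ('apple' == 'apple')        # identical Strings are equal
first_char = ('apple' < 'banana')  # decided at the first character: 'a' < 'b'
later_char = ('apple' < 'apply')   # first four characters match, then 'e' < 'y'
upper_first = ('Zebra' < 'apple')  # uppercase 'Z' sorts before lowercase 'a'

print(same, first_char, later_char, upper_first)
```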

#) String Library (for class str)
>>> greet = 'Hello Bob'
>>> greet.lower() --> returns a lowercase copy of the String variable, without changing the variable itself
>>> 'Hi There'.lower() --> works on a String constant too
>>> type(greet) o/p => <class 'str'>
>>> dir(greet) o/p => lists the capabilities/methods of the type/class


#) Searching a String
  • Looks for one String (substring) in another String
  • Finds first occurrence of substring
  • Returns found index position, otherwise return -1 if not found
  • It is different from the 'in' operator, which only returns True/False for existence

>>> fruit = 'banana'
>>> pos = fruit.find('na')
>>> print(pos) o/p => 2
>>> pos = fruit.find('z')
>>> print(pos) o/p => -1
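
find() combines well with slicing to pull a piece out of a String. A minimal sketch, assuming a made-up line of text (the address below is hypothetical); the optional second argument of find() is the position to start searching from.

```python
data = 'From alice@example.com Sat Jan  5 09:14:16 2008'

at_pos = data.find('@')              # position of the '@' sign
space_pos = data.find(' ', at_pos)   # first space AFTER the '@' sign
host = data[at_pos + 1:space_pos]    # slice between the two positions

print(at_pos, space_pos)
print(host)
```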

#) Search and Replace
  • Replaces every occurrence of passed String #1 with passed String #2 and returns the result as a new String
>>> greet = 'Hello Bob'
>>> greet2 = greet.replace('Bob', 'Alice')
>>> print(greet2) o/p => Hello Alice
>>> greet2 = greet.replace('o', 'X')
>>> print(greet2) o/p => HellX BXb

#) Stripping/Trimming Whitespace
>>> greet = '    Hello Bob '
  • Remove whitespace at the left/beginning
>>> greet.lstrip() o/p => 'Hello Bob '
  • Remove whitespace at the right/ending
>>> greet.rstrip() o/p => '    Hello Bob'
  • Remove whitespace at both left & right
>>> greet.strip() o/p => 'Hello Bob'

In Python 3, all Strings are Unicode Strings.
In Python 2, Unicode Strings were marked with the prefix 'u' and were a separate type from regular Strings, so conversion between the two was sometimes needed.


1.1) File Handling & Reading
#) Opening a File
  • returns a handle used to manipulate the File
  • The File handle is not the data or actual content, but a handle for operations on the File: open, read, write, close
>>> file_handle = open(file_path, open_mode)
file_path : either full or relative path of the File
open_mode : optional
open_mode = 'r' : to read the file (default)
open_mode = 'w' : to write the file

#) Newline character
  • It is represented as a single character '\n' in String
  • It is a special character
>>> lines = 'Hello\nWorld!'
>>> lines o/p => 'Hello\nWorld!'
>>> print(lines) o/p => Hello
                        World!
>>> lines = 'X\nY'
>>> print(lines) o/p => X
                        Y
>>> len(lines) o/p => 3

  • File content is not really a collection/sequence of lines; it is one long String of characters which is punctuated with Newline characters.
  • The Newline character also counts as a 'Whitespace Character', so it can be trimmed using the strip functions

NOTE: print() function also adds ONE newline to its printed content

#) File Handle as Sequence of lines
xfile = open('abc.txt') <-- Opens in read mode by default. Fails with a Traceback if the file path is wrong or the File does not exist.
for each_line in xfile: <-- 'for' loop reads the File handle like a sequence of lines
    print(each_line) <-- Each line still ends with the Newline character from the File, and print() adds another one, so the output is double-spaced

#) Reading entire File content
xfile = open('abc.txt') 
inp = xfile.read() <-- read whole File (including newline characters) into a single long string
print(len(inp)) <-- length includes count of Newline characters too
print(inp[:20])
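
A self-contained sketch of reading a File line by line while trimming the trailing Newline with rstrip(). It first writes a small throwaway file so that it can run anywhere; the temporary path, the file content and the 'with' statement (which closes the handle automatically) are assumptions added for illustration.

```python
import os
import tempfile

# Create a small throwaway file so the sketch does not depend on an existing 'abc.txt'.
path = os.path.join(tempfile.mkdtemp(), 'abc.txt')
with open(path, 'w') as out:
    out.write('first line\nsecond line\n')

# Read it back line by line; 'with' closes the File handle when the block ends.
lines = []
with open(path) as xfile:
    for each_line in xfile:
        each_line = each_line.rstrip()   # drop the trailing Newline character
        print(each_line)                 # print() adds exactly one Newline
        lines.append(each_line)
```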

NOTE: the quit() function is used to end the program


2) LIST
  • A linear collection of values that stay in the order provided (ordered collection)

#) List Constants

  • surrounded by square brackets and elements are separated by commas
>>> print([1, 24, 76]) o/p => [1, 24, 76]
>>> print(['red', 'yellow', 'blue']) o/p => ['red', 'yellow', 'blue']

  • list element can be any object, even another list
>>> print(['red', 24, 98.6]) o/p => ['red', 24, 98.6]
>>> print([1, [5, 6], 7]) o/p => [1, [5, 6], 7] ... there are 3 elements in primary list

  • list can be empty
>>> print([]) o/p => []

#) Looking inside the Lists ... using the index (SUB) operator
  • any single element in a list can be retrieved using an index specified in square brackets
>>> friends = [ 'Alice' , 'Bob' , 'Charlie']
>>> print(friends[1]) o/p => Bob

#) Lists are mutable
  • It is possible to change an element of a List using index operator
>>> my_list = [1, 2, 4, 16, 256]
>>> print(my_list) o/p => [1, 2, 4, 16, 256]
>>> my_list[2] = 8
>>> print(my_list) o/p => [1, 2, 8, 16, 256]

#) Length of Lists
  • len() gives the number of elements of any collection or sequence
>>> x = [1, 2, 'alice', [1.1, 2.2], 3]
>>> print(len(x)) o/p => 5

#) range() function
  • returns a sequence of numbers from 0 to one less than the parameter
  • in Python 3 it returns a range object (in Python 2 it returned a List); use list() to see the numbers
>>> print(list(range(4))) o/p => [0, 1, 2, 3]

>>> friends = [ 'Alice' , 'Bob' , 'Charlie']
>>> print(list(range(len(friends)))) o/p => [0, 1, 2] ... len(friends) = 3

Counted variable loop using range()
for i in range(len(friends)):
    friend = friends[i]
    print('Hello', friend)

 #) Operations on List

Concatenating List using '+'
  • creates a new List and doesn't change the operands
>>> a = [11, 12, 13]
>>> b = [14, 15, 16]
>>> c = a + b
>>> print(c) o/p => [11, 12, 13, 14, 15, 16]

Slicing using ':'
  • Slicing of a List is the same as slicing of a String: the second number is "up to but not including"
>>> t = [10, 11, 12, 13, 14, 15, 16]
>>> t[1:3] o/p => [11, 12]
>>> t[:4] o/p => [10, 11, 12, 13]
>>> t[4:] o/p => [14, 15, 16]
>>> t[:] o/p => [10, 11, 12, 13, 14, 15, 16]

List Methods
>>> x = list()
>>> type(x) o/p => <class 'list'>
>>> dir(x) o/p => Lists all methods of list

List constructor
>>> stuff = list() <-- Create an empty List
>>> print(stuff) o/p => []
>>> stuff.append('book') <-- List stays in the same order & the new element gets added at the end of the List (as Lists are mutable)
>>> stuff.append(99)
>>> print(stuff) o/p => ['book', 99]
>>> stuff.append(14.5)
>>> print(stuff) o/p => ['book', 99, 14.5]

stuff = stuff.append(x) *** messes up the 'stuff' data: append() changes the List in place and returns None, so 'stuff' becomes None
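
A small sketch that makes the return value of append() visible. The variable name result is an assumption for illustration.

```python
stuff = ['book', 99]

result = stuff.append(14.5)   # append() changes the List in place...
print(result)                 # ...and returns None
print(stuff)                  # the List itself has the new element at the end
```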


'in' and 'not in'
  • Logical operators that check for an item's presence in the List, without modifying it
  • Return True or False
>>> item in list o/p => True/False
>>> item not in list o/p => True/False


#) Sorting of List
  • sort() changes the List itself, putting its elements in ascending order
  • sorting works only when the elements can be compared with each other (e.g. all Strings or all numbers)
>>> friends = ['PQR', 'XYZ', 'ABC']
>>> friends.sort()
>>> print(friends) o/p => ['ABC', 'PQR', 'XYZ']

#) Numeric functions
>>> nums = [1, 3, 6, 2, 9]
>>> print(len(nums)) o/p => 5
>>> print(max(nums)) o/p => 9
>>> print(min(nums)) o/p => 1
>>> print(sum(nums)) o/p => 21

#) Splitting String to List
  • Function: some_string.split(delimiter)
  • Breaks a String into parts and produces a List of Strings
>>> abc = "Hello! My name is Bob"
>>> stuff = abc.split()
>>> print(stuff) o/p => ['Hello!', 'My', 'name', 'is', 'Bob']
  • When the delimiter is not specified, the String is split on whitespace (space, \n, etc.)
  • When the delimiter is not specified, multiple consecutive spaces are treated as one delimiter
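
A sketch of how split() behaves with and without a delimiter. The sample Strings and variable names are assumptions for illustration.

```python
line = 'first;second;third'
spaced = 'a lot   of   spaces'

no_delim = line.split()        # no whitespace in 'line', so one single element
semi = line.split(';')         # split on the given delimiter
collapsed = spaced.split()     # runs of spaces count as one delimiter
explicit = spaced.split(' ')   # an explicit ' ' delimiter keeps empty Strings

print(no_delim)
print(semi)
print(collapsed)
print(explicit)
```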


3) DICTIONARY

  • It is Python's most powerful data collection.
  • Allows fast database-like operations in Python
  • Similar to Properties/Map/HashMap in JAVA, Property Bag in C#/.Net, Associative Arrays in Perl/PHP
  • A "bag" of values, each with its own label
  • It stores key-value (label-value) pairs: dictionary[key] = value
  • Keys are unique: the same key can't be stored more than once, but the same value can appear more than once.
  • Entries are looked up by key, not by position. Since Python 3.7 a Dictionary keeps its insertion order; older versions gave no predictable order (Dictionaries were like bags - no order)

>>> purse = dict() <-- Make an empty Dictionary
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print(purse) o/p => {'money': 12, 'candy': 3, 'tissues': 75}
>>> print(purse['candy']) o/p => 3

  • Dictionaries are like Lists, except that they use keys instead of numbers to look up values
  • Lists have positions, while Dictionaries have labels
  • Dictionaries are mutable
>>> purse['candy'] = 23
>>> print(purse['candy']) o/p => 23

#) Dictionary Literals (Constants)
  • Use curly braces and have a list of key:value pairs jjj = {'chuck': 1, 'fred': 42, 'jan': 100}
  • Empty dictionary using empty curly braces ooo = { }
  • Keys and values can be of other types too (not only string:int)

#) Dictionary Traceback
  • It is an error to reference a key which is not in the Dictionary (the same as looking beyond the boundary of a List)
>>> purse = dict()
>>> print(purse['Bob']) o/p => KeyError Traceback
>>> purse['Alice'] = 29
>>> print(purse['Bob']) o/p => KeyError Traceback


'in' & 'not in' operator
  • To check if a key is in the Dictionary (same as the List operators)
>>> 'Bob' in purse o/p => False

get() Method for Dictionaries
  • A common pattern: get the value for a key if it is present in the Dictionary, otherwise get a default value

if key_name in dict_values:
    ret = dict_values[key_name] # VALUE FROM KEY
else:
    ret = 0 # DEFAULT VALUE

This can be replaced with
ret = dict_values.get(key_name, 0) # VALUE FROM KEY or DEFAULT VALUE

  • It doesn't cause a Traceback; it works whether the key exists or not
  • Useful for the retrieve/create/update pattern of a counter
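
A minimal sketch of the counter pattern with get(). The list of names is made up for illustration: the first time a name is seen, get() returns the default 0; afterwards it returns the current count.

```python
counts = dict()
names = ['alice', 'bob', 'alice', 'charlie', 'alice']

for name in names:
    # Retrieve the current count (or 0 if the key is new), add 1, store it back.
    counts[name] = counts.get(name, 0) + 1

print(counts)
```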

#) Definite loops and Dictionaries
  • A definite loop ('for' loop) over a Dictionary goes through all of its entries
  • The loop goes through the 'keys' of the Dictionary; the respective value is then looked up by key (different from a List, where the loop goes through the values)

for key_name in dict_values:
    print(key_name, dict_values[key_name])

#) Retrieving list of Keys & Values
>>> jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }

Conversion of Dictionary to List. This gives a List of keys
>>> print(list(jjj)) o/p => ['chuck', 'fred', 'jan']

To get the keys
>>> print(jjj.keys()) o/p => dict_keys(['chuck', 'fred', 'jan'])

To get the values. Gives values in the order corresponding to keys()
>>> print(jjj.values()) o/p => dict_values([1, 42, 100])

To get the (key, value) pairs, as Tuples
>>> print(jjj.items()) o/p => dict_items([('chuck', 1), ('fred', 42), ('jan', 100)])

NOTE: in Python 3 these methods return view objects; wrap them in list() to get a real List

#) Two Iteration Variables
  • It is a very succinct and convenient way to loop through the key-value pairs in a Dictionary
  • Both variables move together: on each iteration, one holds the key and the other the matching value
jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }
for key,val in jjj.items():
    print(key, val)
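
One possible use of two iteration variables: finding the entry with the largest value. The names big_key and big_val are assumptions for illustration.

```python
jjj = {'chuck': 1, 'fred': 42, 'jan': 100}

big_key = None
big_val = None
for key, val in jjj.items():
    # Keep the pair seen so far with the largest value.
    if big_val is None or val > big_val:
        big_key = key
        big_val = val

print(big_key, big_val)
```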


4) TUPLE

  • It is another kind of sequence that functions much like a List (it looks just like a List, but uses parentheses instead of square brackets)
  • It is a limited version of List
  • It is a more efficient version of List in terms of memory use and performance (no extra memory is allocated for making changes later)

#) Tuples are Immutable, similar to Strings, unlike Lists
  • It is an unmodifiable List: once a Tuple is created, its contents can't be altered, similar to a String
NOTE: A List allocates extra memory so that it can be mutable
>>> tpl = (12, 4, 31)
>>> print(tpl) o/p => (12, 4, 31)
>>> tpl[0] = 13 o/p => TypeError Traceback

>>> s = 'ABC'
>>> s[0] = 'a' o/p => TypeError Traceback

>>> lst = [12, 4, 31]
>>> lst[0] = 13
>>> print(lst) o/p => [13, 4, 31]

  • So Tuples have no sort(), append(), extend(), reverse(), etc.
>>> tpl = tuple()
>>> dir(tpl) o/p => [..., 'count', 'index'] (only two regular methods, besides the special __xxx__ ones)

  • For temporary variables, Tuples are preferable to Lists

 #) Tuple and Assignment
  • It can also be put at left-hand side of assignment statement
>>> (x, y) = (4, 'fred') # same as x = 4 and y = 'fred'
>>> print(x) o/p => 4
>>> print(y) o/p => fred
>>> (a, b) = (44, 55) # same as a = 44 and b = 55
>>> print(a) o/p => 44
>>> print(b) o/p => 55
>>> (p, q) = fun_ret_tuple() # a function that returns a 2-value Tuple
>>> (p, q) = 99 o/p => TypeError Traceback (a single int can't be unpacked into two variables)
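
A sketch of a function that returns a 2-value Tuple and of Tuple assignment on the left-hand side. The helper min_and_max() is hypothetical and only stands in for something like fun_ret_tuple() above.

```python
def min_and_max(values):
    """Hypothetical helper: return the smallest and largest value as a 2-value Tuple."""
    return (min(values), max(values))

# The returned Tuple is unpacked into two variables in one statement.
(low, high) = min_and_max([3, 9, 1, 6])
print(low, high)

# Tuple assignment also allows swapping two variables without a temporary one.
a, b = 44, 55
a, b = b, a
print(a, b)
```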

#) Tuples and Dictionaries
  • The items() method of a Dictionary returns its (key, value) pairs as Tuples
>>> jjj = { 'chuck' : 1 , 'fred' : 42 , 'jan' : 100 }
>>> print(jjj.items()) o/p => dict_items([('chuck', 1), ('fred', 42), ('jan', 100)])

#) Tuples are Comparable, like Strings
  • Items are compared in sequence: the comparison moves on to the next item only while the current items are equal, and the first unequal pair decides the result
>>> (0, 11, 12) < (5, 1, 2) o/p => True
>>> (0, 1, 12) < (0, 11, 2) o/p => True
>>> (10, 1, 2) < (5, 1, 2) o/p => False
>>> ('Abc', 'Pqr') < ('Abc', 'Pz') o/p => True
>>> ('Abc', 'Xyz') < ('Mbc', 'Pqr') o/p => True

#) sorted() Function
  • Python has a built-in sorted() function for Lists & Tuples
  • It takes a sequence and returns a sorted version of it as a new List

ascending order:   sorted(Tuple or List, reverse=False) DEFAULT
descending order:  sorted(Tuple or List, reverse=True)

>>> tpl_lst = [('a', 10), ('c', 22), ('b', 1)]
>>> sorted(tpl_lst) o/p => [('a', 10), ('b', 1), ('c', 22)]

>>> tpl_lst = [(1, 10), (3, 22), (2, 1), (1, 1)]
>>> sorted(tpl_lst) o/p => [(1, 1), (1, 10), (2, 1), (3, 22)]

>>> tpl_lst = [(1, 11, 22), (33, 2, 1), (2, 3, 4), (1, 11, 0)]
>>> sorted(tpl_lst) o/p => [(1, 11, 0), (1, 11, 22), (2, 3, 4), (33, 2, 1)]

Key order sorting of Dictionary
  • Sort a Dictionary by key by calling its items() method to get Tuples, then using the sorted() function on those Tuples
>>> d = {'a': 10, 'c': 22, 'b': 1}
>>> print(d) o/p => {'a': 10, 'c': 22, 'b': 1}
>>> d.items() o/p => dict_items([('a', 10), ('c', 22), ('b', 1)])
>>> tpl = sorted(d.items())
>>> print(tpl) o/p => [('a', 10), ('b', 1), ('c', 22)]

  • To sort a Dictionary by value, create a temporary List of (value, key) Tuples, using a loop to append() each Tuple to that List. Then pass the List to sorted(), with reverse=True for descending order.
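
A minimal sketch of the value-order sort described above, largest value first. The Dictionary content and the names counts and tmp are assumptions for illustration.

```python
counts = {'a': 10, 'b': 1, 'c': 22}

# Build a temporary List of (value, key) Tuples so that sorting compares values first.
tmp = list()
for key, val in counts.items():
    tmp.append((val, key))

# reverse=True gives descending order.
tmp = sorted(tmp, reverse=True)

# Show the two entries with the largest values.
for val, key in tmp[:2]:
    print(key, val)
```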

#) List Comprehension
  • It creates a List dynamically
  • It is a way of writing a List as an expression, instead of building it with appends or writing it as a constant with commas
>>> d = {'a': 10, 'b': 1, 'c': 22}
>>> [ (v, k) for k, v in d.items() ] o/p => [(10, 'a'), (1, 'b'), (22, 'c')] ... List of reversed Tuples of the Dictionary

  • Combined with sorted(), it takes a Dictionary and provides Tuples in value order
>>> print( sorted( [ (v, k) for k, v in d.items() ] ) ) o/p => [(1, 'b'), (10, 'a'), (22, 'c')]
  • This makes a List of reversed Tuples and then sorts it, which is similar to
lst = list()
for k, v in d.items():
    new_tpl = (v, k)
    lst.append(new_tpl)
lst = sorted(lst)

References:
1) https://www.coursera.org/learn/python   ... Python Data Structures