Hash Tables and Trees
Hash Tables and Hashing
Hashing
Hashing is a technique used to store and allocate data (in a hash table array for example) by obtaining a key from the inputted data and processing it to its respective index. This hashing process through a Hash Function which has many different forms and each will spit out different results even from one key.
For example, we may have a series of string such as "apple", "dog" and "cat" and we want to place this is an array of strings. Sure, we can just insert them using a For-loop and the resulting array will be :
{"apple", "dog","cat"}.
Now assume we have hundreds of different strings and we one to search for the string "cat" in the array. We now have to do a linear search through the array which has the time complexity of O(n). This wouldn't be an effective method of storing and searching data if we have a large amount of data stored. A way to make the time complexity O(1) or usually close to it, we need to use hashing and hash tables.
Let's say our hash function for this particular problem is simply return the first of the index of the string minus the ascii code of 'a'. That means "apple" will be allocated to index 0 of the array, "dog" for index 3, and "cat" will be on index 2. Now our array will be:
{"apple", "", "cat", "dog"}
If we now want to search for "cat" in the array, we process the input "cat" which will directly lead us to index 2. Therefore we have a time complexity of O(1) which is far better than a linear time complexity of O(n).
Hash Functions
There are many forms of hash function, but here are 5 which are included in the BINUS curriculum.
1. Mid Square
This method requires to square the key and take the n-digits from the middle of the squared key. n is the number of digits of the key (before it is squared)
2. Division
This method is basically using the mod(%) operator to the key we have.
3. Folding
Divide the key into parts where each part has the same number of digits (except maybe for the last one). Add these parts together to produce the hashed key.
Key: 5678
Folding: 56 + 78 = 134
*note that we can also mod the obtained hashed key if our array size is not enough (for example 134 % 100 = 34 if our max array size is 100)*
4. Digit Extraction
We can take, for example, the first, third and fifth digit of the key. Consider the key 14568, then from this digit extraction we will obtain the hashed key 158.
5. Rotating Hash
We can use one of the previous 4 hash functions and that rotate the hash key. For example if we rotate the hashed key 158 we will obtain a new hash key of 851.
Collisions
Collisions happen when 2 or more different data has the same hash key and is therefore allocated to the same block of array. Lets use the first hash table and hash function as our example. We currently have the array: {"apple", "", "cat", "dog"} . What happens if we add the string "car" into the hash table. Well, the hash function will process the string "car" and return the hash key which has the value of 2. However "cat" is already index 2 of the array. So how do we deal with this?
1. Linear Probing
This method basically instructs us to do a linear search for the next empty block and then place the string "cat" on that block. So now our array is {"apple", "", "cat", "dog", "car"}. Linear probing is not very effective when there are large amounts of collision in the hash table and searching will be far slower than O(1) and may be closer to O(n) if we are unlucky.
2. Chaining
Chaining will require us to use an array of linked list. When we input the string of the same hash key such as "car" with "cat", the node "cat" will now point to "car". This will result in the hash table being much more organised and eventhough searching "car" is not O(1) it is still better to use chaining than linear probing in most cases.
Here is a program that implements a hash table, hashing and chaining:
Binary Trees
Types of Binary Trees
- Perfect Binary Tree - a binary tree which has the same depth on every level, or there are not "missing nodes" on the tree
- Complete Binary Tree - "missing nodes" can occur only on the last level
- Skewed Binary Tree - each node only has one child
- Balanced Binary Tree - no leaf is much farther away from the root than any other leaf
Properties
- Maximum number of nodes on any level is 2 to the power of the level
- maximun number of nodes on a binary tree is 2 to the power of height subtracted by 1.
- Minimun height of a binary tree for n nodes is 2log(n)
- Maximum height of a binary tree of n nodes is n -1
Representation
- Array
- Index 0 is the Root node
- Left child is 2p+1, where p is the parent index
- Right child is 2p+2, where p is the parent index
- Index parent is (p-1)/2
- Linked List
- We need a left pointer
- We need a right pointer
- A parent pointer can also be added if you want
Expression Trees
A program that creates an expression tree from a prefix string:
A program that implements prefix traversal:
A segment of a program that implements postfix traversal:
A segment of a program that implements infix traversal:

Comments
Post a Comment