Collision Resolution Lesson Plan - 9. Contents Evocation Objective Introduction Collision Resolution Separate Chaining Open addressing Linked list Bucket.

Презентация:



Advertisements
Похожие презентации
Searching Lesson Plan - 6. Contents Evocation Objective Introduction Sequential Search Algorithm Variations on sequential search Mind map Summary.
Advertisements

Hashing Lesson Plan - 8. Contents Evocation Objective Introduction General idea Hash Tables Types of Hashing Hash function Hashing methods Hashing algorithm.
UNIT-III SORTING Lesson Plan-1 Insertion Sort, Selection Sort.
SPLAY TREE The basic idea of the splay tree is that every time a node is accessed, it is pushed to the root by a series of tree rotations. This series.
1/27 Chapter 9: Template Functions And Template Classes.
Loader Design Options Linkage Editors Dynamic Linking Bootstrap Loaders.
© 2005 Cisco Systems, Inc. All rights reserved. BGP v Route Selection Using Policy Controls Applying Route-Maps as BGP Filters.
Lesson 3 - HTML Formatting. Text Formatting Tags TagDescription Defines bold text Defines big text Defines emphasized text Defines italic text Defines.
Sequences Sequences are patterns. Each pattern or number in a sequence is called a term. The number at the start is called the first term. The term-to-term.
AVL-Trees COMP171 Fall AVL Trees / Slide 2 Balanced binary tree * The disadvantage of a binary search tree is that its height can be as large as.
HPC Pipelining Parallelism is achieved by starting to execute one instruction before the previous one is finished. The simplest kind overlaps the execution.
© 2005 Cisco Systems, Inc. All rights reserved. BGP v Customer-to-Provider Connectivity with BGP Connecting a Multihomed Customer to Multiple Service.
Trees, Binary Tree and Binary Search Tree Lesson Plan - 1.
What to expect? How to prepare? What to do? How to win and find a good job? BUSINESS ENGLISH COURSE NOVA KAKHOVKA GUMNASUIM 2012.
Benford Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading.
PERT/CPM PROJECT SCHEDULING Allocation of resources. Includes assigning the starting and completion dates to each part (or activity) in such a manner that.
© 2009 Avaya Inc. All rights reserved.1 Chapter Two, Voic Pro Components Module Two – Actions, Variables & Conditions.
DRAFTING TECHNIQUES I 136. Here is a basic shape. From here, we will do some advanced drafting once we put this shape on a sheet as a drawing. Select.
BREADTH FIRST TRAVERSAL Lesson Plan -3. Evocation.
REFERENCE ELEMENTS 64. If your REFERENCE ELEMENTS toolbar is not in view and not hidden, you can retrieve it from the toolbars menu seen here. 65.
Транксрипт:

Collision Resolution Lesson Plan - 9

Contents Evocation Objective Introduction Collision Resolution Separate Chaining Open addressing Linked list Bucket Hashing

ANNEXURE-I Evocation

Objective To learn the basic knowledge about collision resolution methods

ANNEXURE-II Introduction-Collision resolution Hash a new key to address may create a collision When an element is inserted, it hashes to the same value as an already inserted element, then we have a collision and need to resolve it Two categories of collision resolution techniques Open addressing (closed hashing) Separate Chaining (open hashing)

Collision resolution concept In hashing algorithm, there must be some empty element in list all times Full list is defined as a list in which all elements except one contains data Load factor is defined as number of elements in list divided by number of physical elements allocated for list expressed in percentage α= (k/n) × 100 k - Number of filled elements in list n- Total number of elements allocated to list

Clustering Tendency of data to build up across a hashed list is known as clustering Two types of clustering Primary clustering Secondary clustering Primary Clustering Primary clustering occurs when data cluster around a home address Easy to identify Secondary Clustering Secondary clustering occurs when data become grouped along a collision path throughout a list Difficult to identify

Collision Resolution Methods Linear Probe Quadratic Probe Pseudorandom Key offset Collision Resolution Open Addressing Linked listBuckets

Separate Chaining The idea is to keep a list of all elements that hash to the same value The array elements are pointers to the first nodes of the lists A new item is inserted to the front of the list Two list operations are insertion and deletion Advantages: Better space utilization for large items Simple collision handling: searching linked list Overflow: we can store more items than the hash table size Deletion is quick and easy: deletion from the linked list

Example Keys: 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 hash(key) = key %

Example

Insertion: insert = 4 x mod 11 =

Analyzing of Hashing with separate chaining Worst case All keys hash into the same bucket a single linked list insert, delete, find take O(n) time Average case Keys are uniformly distributed into buckets O(1+N/B): N is the number of elements in a hash table, B is the number of buckets If N = O(B), then O(1) time per operation N/B is called the load factor of the hash table

Open Addressing Separate chaining has the disadvantage of using linked lists Requires the implementation of a second data structure Open addressing resolves collisions in prime area, the area that contains all of home address In an open addressing hashing system, all the data go inside the table A bigger table is needed Generally the load factor should be below 0.5 If a collision occurs, alternative cells are tried until an empty cell is found

Open Addressing More formally: Cells h 0 (x), h 1 (x), h 2 (x), …are tried in succession where h i (x) = (hash(x) + f(i)) mod Table Size, with f(0) = 0 The function f is the collision resolution strategy There are three common collision resolution strategies: Linear Probing Quadratic probing Pseudorandom Collision resolution Key offset Double hashing

Evocation

Linear Probing Data cannot be stored in home address then resolve the collision by adding 1 to current address In linear probing, collisions are resolved by sequentially scanning an array (with wraparound) until an empty cell is found i.e. f is a linear function of i, typically f(i)= i Example: Insert items with keys: 89, 18, 49, 58, 9 into an empty hash table Table size is 10 Hash function is hash(x) = x mod 10 f(i) = i Advantages Simple to implement Data tend to remain near home address

Linear Probing Example

Linear Probing – Analysis Example What is the average number of probes for a successful search and an unsuccessful search for this hash table? Hash Function: h(x) = x mod 11 Successful Search: 20: : : : 2, : 3,4 24: 2,3,4, : : 9,10, 0 Avg. Probe for SS = ( )/8=15/8 Unsuccessful Search: We assume that the hash function uniformly distributes the keys 0: 0,1 -- 1: : 2,3,4,5,6 -- 3: 3,4,5,6 4: 4,5,6 -- 5: 5,6 -- 6: : : 8,9,10,0,1 9: 9,10,0, : 10,0,1 Avg. Probe for US = ( )/11=31/11

Linear Probing (insert 12) 12 = 1 x mod 11 = 1

Search with linear probing (Search 15) 15 = 1 x mod 11 = 4 NOT FOUND !

Deletion with linear probing: LAZY (Delete 9) 9 = 0 x mod 11 = 9 FOUND ! D

Linear Probe collision resolution Hash First insert: no collision Second insert: no collision Marry Dodd Sarah Trapp Bryan Devaux Harry Eagle Patrick Linn Tuan Ngo Feldman [000] [001] [002] [003] [004] [005] [006] [007] [008] [305] [306] Probe 1 Probe 2

Quadratic Probe Quadratic Probing eliminates primary clustering problem of linear probing Secondary clustering can be eliminated by adding a value other than 1 to the current address In quadratic probing, the increment is the collision probe number squared Collision function is quadratic The popular choice is f(i) = i 2 If the hash function evaluates to h and a search in cell h is inconclusive, we try cells h + 1 2, h+2 2, … h + i 2 i.e. It examines cells 1,4,9 and so on away from the original probe Remember that subsequent probe points are a quadratic number of positions from the original probe point

Quadratic Probe Disadvantage Time required to square the probe number Eliminate the multiplication factor by using increment factor that increases by 2 with each probe Add the increment factor to the previous increment gives the next increment Limitation Impossible to generate a new address for every element in list

Quadratic Collision Resolution Increments Probe numberCollision location Probe 2 and increment New address = 11+1 = = 42+4= = 96+9= = = = = = = = = = = = = = = 86

Quadratic Probe

Quadratic Probing (insert 12)

ANNEXURE-III Deep Breathing Meditation Sit comfortably with your back straight. Put one hand on your chest and the other on your stomach Breathe in through your nose. The hand on your stomach should rise. The hand on your chest should move very little Exhale through your mouth, pushing out as much air as you can while contracting your abdominal muscles. The hand on your stomach should move in as you exhale, but your other hand should move very little Continue to breathe in through your nose and out through your mouth. Try to inhale enough so that your lower abdomen rises and falls. Count slowly as you exhale

Optical Illusion What do you see in the image below?

Logo Identification

Quiz The technologically advanced humanoid robot ASIMO is made by which car company? Along with whom did Bill Gates found Microsoft? What science fiction writer wrote the three laws of robotics? Nano, Shuffle, Classic and Touch are variations of what?

Double Hashing When collision occurs use a second hash function Hash 2 (x) = R – (x mod R) R: greatest prime number smaller than table-size Inserting 12 H 2 (x) = 7 – (x mod 7) = 7 – (12 mod 7) = 2 Check H(x) If collision occurs check H(x) + 2 If collision occurs check H(x) + 4 If collision occurs check H(x) + 6 If collision occurs check H(x) + 8 H(x) + i * H 2 (x)

Double Hashing (insert 12)

Pseudorandom Collision Resolution Use pseudorandom number to resolve collision Instead of using key as a factor in random number calculation, use the collision address Resolve the collision using pseudorandom number generator, where a is 3 and c is 5 Y=(ax+c) modulo listsize = (3*1 + 5) Modulo 307 = 8 Limitation All keys follow only one collision path through the list

Pseudorandom Collision Resolution First insert: no collision Hash Second insert: no collision [000] [001] [002] [003] [004] [005] [006] [007] [008] [305] [306] Marry Dodd Sarah Trapp Bryan Devaux Patrick Linn Harry Eagle Tuan Ngo Feldman Probe1 Pseudorandom Y=3X+5

Key Offset Double hashing method that produces different collision paths for different keys Pseudorandom number generator produces a new address as a function of previous address, key offset calculates the new address as function of old address and key offset = [ key/listsize] address = ((offset + old address) modulo listSize) Example When key is and list size is 307 using modulo division hashing method generates address of 1 offset = [166702/307] = 543 address = (( ) modulo 307) =237

Key Offset If 237 were a collision, repeat the process to locate the next address offset = [166702/307] = 543 address = (( ) modulo 307) =166 KeyHome address Key offset Probe 1Probe

Evocation

Linked list Collision Resolution Major disadvantage to open addressing is that each collision resolution increases the probability of future collisions Eliminated by linked list approach Linked list is ordered collection of data in which element contains the location of next element

Linked List Collision Resolution [000] [001] [002] [003] [004] [005] [006] [007] [008] [305] [306] Marry Dodd Sarah Trapp Bryan Devaux Patrick Linn Tuan Ngo Feldman Harry Eagle ChrisWalljasper

Linked list Collision Resolution Use separate area to store collisions and chains together in linked list Two storage areas: prime area and overflow area Each element in prime area contains additional field a link header pointer to a linked list of overflow data in overflow area When collision occurs, one element is stored in prime area and chained to corresponding linked list in over flow area overflow area is typically implemented as linked list in dynamic memory

Linked list Collision Resolution Linked list is stored in any order, but LIFO sequence or key sequence LIFO sequence is fastest for insert because the linked list need not be scanned to insert data Element being inserted into overflow is placed at beginning of linked list and linked to node in prime area In key sequenced lists, key in prime area is smallest to provide for faster search retrieval

Evocation

Bucket Hashing Keys are hashed to bucket nodes that accommodate multiple data occurrences Bucket hold multiple data, collisions are postponed until bucket is full Example Each address is large enough to hold data for three employees Collision will not occur until tried to add fourth employee to address Two problems Use more space because many of bucket are empty or partially empty at any time It will not completely resolves collision problem

Bucket Hashing Marry Dodd Sarah Trapp Harry Eagle Ann Giorgis Byan Devaux Chris jasper Feldman [000] Bucket 0 [001] Bucket 1 [002] Bucket 2 [003] Bucket 307

ANNEXURE-IV Mind Map Collision Resolution Clustering Separate Chaining Open Addressing Bucket hashing Linked List Linear probe Quadratic probe Pseudo random Key offset

ANNEXURE -V Summary Hash a new key to address may create a collision Clustering is tendency of data to build up across a hashed list Three methods used t resolve collisions are open addressing, linked list and buckets Open addressing method can be divided into linear probe, quadratic probe, Pseudorandom rehashing and key offset rehashing In linear probe method, when collision occurs new data are stored in next available address

Summary In quadratic method, increment is the collision probe number squared In pseudorandom method, a random number generator is used to rehash address In key offset method, an offset is used to rehash address In linked list technique, separate areas store collisions and chain all synonyms together in linked list In bucket hashing, bucket accommodates multiple data occurrence