Hash Algorithms - wqfhenanxc - ChinaUnix Blog


Hash Algorithms

Category: Project Management

2009-11-30 21:06:28

Although searching for an element in a hash table can take as long as searching for an element in a linked list (Θ(n) time in the worst case), in practice hashing performs extremely well. Under reasonable assumptions, the expected time to search for an element in a hash table is O(1).

Direct addressing is a simple technique that works well when the universe U of keys is reasonably small. It is applicable when we can afford to allocate an array that has one position for every possible key.
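To make direct addressing concrete, here is a minimal sketch. It assumes the keys are integers in range(universe_size); the class name DirectAddressTable and its method names are made up for illustration and do not come from the original text.

```python
class DirectAddressTable:
    """Direct-address table for integer keys in range(universe_size)."""

    def __init__(self, universe_size):
        # One slot per possible key; None marks an empty slot.
        self.slots = [None] * universe_size

    def insert(self, key, value):
        # The key itself is the array index, so no hashing is needed.
        self.slots[key] = value

    def search(self, key):
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(universe_size=10)
table.insert(3, "three")
table.insert(7, "seven")
table.delete(7)
print(table.search(3), table.search(7))
```

Every operation is a single array access, but the array needs universe_size slots no matter how few keys are actually stored.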

When the number of keys actually stored is small relative to the total number of possible keys, hash tables become an effective alternative to directly addressing an array, since a hash table typically uses an array of size proportional to the number of keys actually stored.

The point of the hash function is to reduce the range of array indices that need to be handled.
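As a rough illustration of how a hash function shrinks the index range, the sketch below uses the division method, h(k) = k mod m. The division method is only one common choice and is an assumption here; the text above does not name a specific hash function.

```python
def h(key, m):
    """Division-method hash: map an integer key to a slot in range(m)."""
    return key % m


m = 8  # number of slots (illustrative value)
keys = [15, 1_000_003, 2**40 + 5]
slots = [h(k, m) for k in keys]
print(slots)
```

However large the keys are, every index falls in 0..m-1, so the table needs only m slots.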

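We might hope to avoid collisions altogether by choosing a suitable hash function h. Because the universe of keys is larger than the number of slots, however, at least two keys must hash to the same slot, so avoiding collisions entirely is impossible. While a well-designed, "random"-looking hash function can minimize the number of collisions, we still need a method for resolving the collisions that do occur.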

Collisions can be resolved by chaining. How well does hashing with chaining perform?

Given a hash table T with m slots that stores n elements, we define the load factor α for T as n/m, that is, the average number of elements stored in a chain. Our analysis will be in terms of α, which can be less than, equal to, or greater than 1.
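The following is a minimal sketch of chaining together with the load factor α = n/m. It makes several simplifying assumptions that are not in the text: each chain is a plain Python list rather than a linked list, Python's built-in hash() reduced modulo m serves as the hash function, and the class and method names are hypothetical.

```python
class ChainedHashTable:
    """Hash table with m slots; collisions are resolved by chaining."""

    def __init__(self, m):
        self.m = m  # number of slots
        self.n = 0  # number of stored elements
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        # Assumption: built-in hash() reduced modulo m.
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:  # key already present: overwrite its value
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def search(self, key):
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False

    def load_factor(self):
        return self.n / self.m


t = ChainedHashTable(m=4)
for k in range(6):
    t.insert(k, k * k)
alpha = t.load_factor()
print(alpha, t.search(5), t.search(99))
```

With 6 elements in 4 slots, α is 1.5, a case where the load factor exceeds 1. Note that this insert scans the chain to avoid duplicate keys, so unlike an insert that blindly prepends to a linked list, it is not O(1) in the worst case.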

The worst-case behavior of hashing with chaining is terrible: all n keys hash to the same slot, creating a list of length n. The worst-case time for searching is thus Θ(n) plus the time to compute the hash function, which is no better than if we used one linked list for all the elements. Clearly, hash tables are not used for their worst-case performance.
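The worst case is easy to provoke on purpose. The sketch below assumes a division-method hash, k mod m, which is an illustrative choice rather than something the text specifies, and feeds it keys that are all multiples of m.

```python
m = 8                               # number of slots (illustrative)
keys = [i * m for i in range(100)]  # every key is a multiple of m

slots = [[] for _ in range(m)]
for k in keys:
    slots[k % m].append(k)          # k % m is 0 for every key

chain_lengths = [len(chain) for chain in slots]
print(chain_lengths)
```

All 100 keys land in slot 0, so a search degenerates into a linear scan of a single list.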

The average performance of hashing depends on how well the hash function h distributes the set of keys to be stored among the m slots, on the average.

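Simple uniform hashing is the assumption that any given element is equally likely to hash into any of the m slots, independently of where any other element has hashed to.

Suppose there are m slots, numbered 0, 1, 2, ..., m-1, and let n_j denote the length of the list T[j], so that n = n_0 + n_1 + n_2 + ... + n_(m-1). The expected value of n_j is E[n_j] = α = n/m.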
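The value of E[n_j] can be derived with indicator random variables. In the sketch below, the names k_1, ..., k_n for the stored keys and X_{ij} for the indicators are notation introduced here for illustration.

```latex
\begin{align*}
X_{ij} &= \mathrm{I}\{h(k_i) = j\},
  & \mathrm{E}[X_{ij}] &= \Pr\{h(k_i) = j\} = \frac{1}{m}, \\
n_j &= \sum_{i=1}^{n} X_{ij},
  & \mathrm{E}[n_j] &= \sum_{i=1}^{n} \mathrm{E}[X_{ij}] = \frac{n}{m} = \alpha .
\end{align*}
```

This step uses only linearity of expectation and the fact that each key is equally likely to land in any slot; the independence part of the assumption is not needed here.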

In a hash table in which collisions are resolved by chaining, an unsuccessful search and a successful search each take expected time Θ(1 + α), under the assumption of simple uniform hashing.
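A small simulation can make the Θ(1 + α) bounds plausible. This is a sketch under stated assumptions: a seeded random slot choice stands in for simple uniform hashing, cost is counted as the number of elements examined (the hash computation is ignored), and the function name average_search_costs is made up.

```python
import random


def average_search_costs(n, m, seed=0):
    """Return (unsuccessful, successful) average elements examined."""
    rng = random.Random(seed)
    chain_lengths = [0] * m
    for _ in range(n):
        # Stand-in for simple uniform hashing: a uniformly random slot.
        chain_lengths[rng.randrange(m)] += 1

    # An unsuccessful search scans the whole chain of a uniformly
    # chosen slot, so its average cost is the mean chain length.
    unsuccessful = sum(chain_lengths) / m

    # A successful search for the element at position p in its chain
    # examines p elements; positions 1..L sum to L * (L + 1) / 2.
    successful = sum(L * (L + 1) // 2 for L in chain_lengths) / n
    return unsuccessful, successful


unsuccessful, successful = average_search_costs(n=2000, m=1000)
print(unsuccessful, successful)
```

With n = 2000 and m = 1000, the unsuccessful cost is exactly α = 2, and the successful cost should come out near 1 + (n - 1)/(2m), which is about 2. Both are consistent with Θ(1 + α).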

What does this analysis mean? If the number of hash-table slots is at least proportional to the number of elements in the table, we have n = O(m) and, consequently, α = n/m = O(m)/m = O(1). Thus, searching takes constant time on average. Since insertion takes O(1) worst-case time and deletion takes O(1) worst-case time when the lists are doubly linked, all dictionary operations can be supported in O(1) time on average.

A good hash function satisfies (approximately) the assumption of simple uniform hashing.
