算法Bloom filter-luoxingao-ChinaUnix博客

Network security

首页　| 　博文目录　| 　关于我

luoxingao

博客访问： 48171
博文数量： 12
博客积分： 1410
博客等级：上尉
技术积分： 150
用户组：普通用户
注册时间： 2008-07-24 19:54

文章分类

全部博文（12）

网络安全（1）
好玩的（1）
未分配的博文（10）

文章存档

2009年（2）

2008年（10）

我的朋友

最近访客

推荐博文

算法Bloom filter

分类：

2008-08-07 11:07:28

今天看了一篇英文论文，名Defending Against TCP SYN Flooding Attacks Under Different Types of IP Spoofing.pdf，这里讲了两个经典算法，一是Bloom filter，二是change-point，前者是一种astorage-efficient data structure，高效存储的数据结构算法，后者是一种统计算法，本文介绍前一种。

原文：Bloom filter was first proposed by Bloom [3] in 1970.
Recently, it has been adapted for use in some methods
for defending against DDoS attacks [2, 5, 11]. A
Bloom filter is composed of a vector v of m bits, initially
all set to 0. We have k independent hash functions,
h1, h2, . . . , hk, each with a range 0, 1, . . . , m − 1.
The vector v can showthe existence of an element from
some address space A. Given an element a ∈ A, the
bits at positions hi(a), 1 ≤ i ≤ k, in v are set to 1. Note
that a particular bit may be set to 1 multiple times and
hence may potentially lead to inaccurate results. Given
a query on the existence of b in A, we check the bits at
positions hi(b), 1 ≤ i ≤ k. If any one of them is 0, then
b is certainly not in A. Otherwise, we conjecture that b
is in it. However, there is a certain probability that the
Bloom filter will give a false result. This probability is
referred to as the false positive rate.
A variant of the original Bloom filter, called counting
Bloom filter, uses a table of counters to replace the
n bits. Each counter represents the number of times
the corresponding location has been hit. When a key
a (such as an IP address) is inserted or deleted, the
value of the corresponding counter in each row is increased
or decreased by 1, respectively, according to
hi(a) for all k rows. If an IP address b is already stored
in the modified bloom filter, the counters at locations
hi(b), 1 ≤ i ≤ k, in the table should all be nonzero.

大概意思是说一个向量V（n维的），初始化为全零，有k个hash函数，每个hash函数是独立的而且hash后的值在0——n-1之内，对每个要存的a变量，用这k个独立的hash函数对其进行hash，hash后的值对应在V向量中置为1.在删除一个变量a时，把相应的1改为零。现在要查看有b没有，只要hash后每个位置都为1就可说明b在这个表里，否则只要有一个是0就说明b不在这个结构中。

这种结构的应用，比如说在网络编程中我们可能要存很多ip，有时要存源ip，目的ip等，如果我们直接存储，那将花费很大的内存和时间，而当我们要查找时更是花费时间。如果用上面的结构可以解决这个问题，查找也很方便。棕色部分是一种更有效且更准确的方法，用tables代替向量Vector。

文件:	Defending Against TCP SYN Flooding Attacks Under Different Types of IP Spoofing.pdf
大小:	356KB
下载:	下载

阅读(1418) | 评论(0) | 转发(0) |

上一篇：snort_inline中与防火墙联动的实现的体会

下一篇：一段使网页上的图片动起来的javascript代码

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6