[Translation] Cassandra Data Reads, Writes, and Compaction - laoliulaoliu - ChinaUnix Blog

[Translation] Cassandra Data Reads, Writes, and Compaction

Category: Big Data

2014-05-19 17:23:30

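Source:

This translation is based mainly on the DataStax documentation for Cassandra 1.2: http://www.datastax.com/documentation/cassandra/1.2/index.html . Some material also comes from related official blog posts.

It was written as part of the study material for our lab's big data group and is aimed at readers who already have some familiarity with Cassandra.

Do not repost without the author's permission.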

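Overview

1. Not SQL (no transactions, no joins), but more than just a key-value store.

2. Inspired by Google BigTable.

3. Based on column families.

Example:

There are also secondary indexes, distributed counters, composite columns, and more.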

Cassandra Storage Engine

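Goal: minimize random I/O.

The flow of a single write:

Characteristics of a write:

No reads and no seeks

Sequential I/O only

sstables never change once written, which makes them easy to back up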
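The write characteristics above can be illustrated with a toy model. This is only a sketch under assumptions: the `ToyStore` class, its `memtable_limit` parameter, and the in-memory lists are invented for illustration, and the commit log and memtable come from Cassandra's general design rather than from this passage. Clearing the log after a flush is a simplification.

```python
class ToyStore:
    """Toy append-only store (hypothetical, not Cassandra's real code)."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only log: stands in for sequential disk writes
        self.memtable = {}     # in-memory buffer holding the most recent writes
        self.sstables = []     # flushed data; each entry is an immutable sorted tuple
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        # A write never reads existing data and never seeks:
        # it appends to the log and updates the in-memory table.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Dump the memtable in key order as one immutable sstable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # simplification: the flushed writes are now durable


store = ToyStore(memtable_limit=2)
for key, value in [("b", 1), ("a", 2), ("c", 3), ("a", 4)]:
    store.write(key, value)

for table in store.sstables:
    print(table)
```

Note that key `a` ends up in both flushed sstables with different values. Nothing is rewritten in place, which is exactly why the later sections need compaction.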

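The flow of a single read:

Compaction

Purpose: reduce the number of sstables

Merges multiple sstables while preserving their sort order

Sequential I/O

What an sstable looks like:

More on compaction:

In Cassandra, newly written columns go into new sstables, so compaction exists to merge multiple sstables into one.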

Figure 1: adding sstables with size tiered compaction

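As a result, after a while many versions of a row may exist across different sstables, and each of these versions may contain a different set of columns. If sstables were simply left to accumulate, reading one row would require seeks into many files.

That is why merging is needed. Merging is also efficient and needs no random I/O, because rows are already stored in sorted order (by primary key) within each sstable.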
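Why sorted inputs make merging cheap can be sketched with a streaming k-way merge. This is a minimal illustration, not Cassandra's implementation: the cell layout `(row_key, column, timestamp, value)` and the "newest timestamp wins" rule are assumptions made for the example.

```python
import heapq
from itertools import groupby


def compact(sstables):
    """Merge sorted sstables into one, keeping the newest version of each column.

    Each sstable is a list of (row_key, column, timestamp, value) cells,
    sorted by (row_key, column). This cell layout is hypothetical.
    """
    # heapq.merge streams through every input front to back, so each
    # sstable is read sequentially and never revisited.
    merged = heapq.merge(*sstables)
    result = []
    for _, versions in groupby(merged, key=lambda cell: cell[:2]):
        result.append(max(versions, key=lambda cell: cell[2]))
    return result


old = [("alice", "email", 1, "a@old"), ("alice", "name", 1, "Alice")]
new = [("alice", "email", 2, "a@new"), ("bob", "name", 2, "Bob")]

for cell in compact([old, new]):
    print(cell)
```

The row `alice` had a different set of columns in each input. The merged output holds one sorted copy of the row with the newer `email` value.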

Figure 2: sstables under size-tiered compaction after many inserts

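Cassandra's size-tiered compaction strategy is very similar to the one described in the BigTable paper: once enough sstables have accumulated (four by default), they are merged.

In figure 1, each green box represents an sstable and each row represents one compaction. As soon as there are four sstables, they are merged together. Figure 2 shows the hierarchy after some time has passed: first-tier sstables are merged into the second tier, second-tier sstables are merged into the third tier, and so on.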
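The cascade described for figures 1 and 2 can be simulated with a few lines. This is a simplified sketch: it tracks only how many sstables sit in each tier, and the `add_sstable` function and fixed-tier model are assumptions for illustration (the real strategy groups sstables of similar size rather than using numbered tiers).

```python
def add_sstable(tiers, threshold=4):
    """Add one freshly flushed sstable to tier 0 and cascade merges upward.

    tiers[i] is the number of sstables currently in tier i (hypothetical model).
    """
    tiers = list(tiers) or [0]
    tiers[0] += 1
    level = 0
    while level < len(tiers) and tiers[level] >= threshold:
        # `threshold` sstables of this tier are merged into one larger
        # sstable that belongs to the next tier.
        tiers[level] -= threshold
        if level + 1 == len(tiers):
            tiers.append(0)
        tiers[level + 1] += 1
        level += 1
    return tiers


tiers = []
for flush in range(1, 17):
    tiers = add_sstable(tiers)
    print(flush, tiers)
```

After 16 flushes the model holds a single third-tier sstable, which matches the shape of the hierarchy in figure 2.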

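With update-heavy workloads, three problems arise:

1. Performance can be inconsistent, because there is no guarantee about how many sstables a row is spread across. In the worst case, every sstable could hold some columns of a given row.

2. Because there is no guarantee about how quickly obsolete columns will be merged away, a lot of space can be wasted, especially when there are many deletes.

3. Space can also be a problem as sstables grow larger from repeated compactions, since an obsolete sstable cannot be removed until the merged sstable is completely written. In the worst case of a single set of large sstables with no obsolete rows to remove, Cassandra would need 100% as much free space as is used by the sstables being compacted, into which to write the merged one.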

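Starting with Cassandra 1.0, a leveled compaction strategy is available, based on LevelDB from the Chromium team.

Leveled Compaction (translator's note: I did not fully understand this part while translating it)

Leveled compaction creates fixed-size sstables (5 MB by default) that are grouped into "levels". Within each level, sstables are guaranteed not to overlap. Each level is ten times as large as the previous one.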

Figure 3: adding sstables under leveled compaction

In figure 3, new sstables are first added to the first level, L0, and are then immediately compacted into sstables in L1 (blue). When L1 fills up, sstables are compacted into L2 (purple). Subsequent sstables generated in L1 will be compacted with the sstables in L2 with which they overlap. As more data is added, leveled compaction results in a situation like the one shown in figure 4.

Figure 4: sstables under leveled compaction after many inserts
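The phrase "compacted with the sstables in L2 with which they overlap" can be made concrete with a small sketch. It assumes each sstable is described only by its first and last key. The `overlapping` helper and the sample ranges are invented for illustration.

```python
def overlapping(candidate, next_level):
    """Return the sstables in next_level whose key range overlaps candidate.

    Every sstable is represented as a (first_key, last_key) pair (hypothetical).
    """
    lo, hi = candidate
    return [(first, last) for first, last in next_level
            if first <= hi and lo <= last]


# Within one level the ranges do not overlap each other.
l2 = [("a", "f"), ("g", "m"), ("n", "s"), ("t", "z")]

# An L1 sstable covering keys "e".."h" only has to be merged with
# the two L2 sstables it touches, not with the whole level.
print(overlapping(("e", "h"), l2))
```

Because ranges inside a level never overlap, a compaction touches only a bounded slice of the next level, which keeps every compaction set roughly the same size.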

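This approach solves the problems described above:

1. Leveled compaction guarantees that 90% of reads can be satisfied from a single sstable (assuming roughly uniform row sizes). In the worst case, a read touches as many sstables as there are levels. For example, 10 TB of data requires reading 7.

2. At most 10% of space is wasted on obsolete rows.

3. Only 10 times the sstable size needs to be available as temporary space during a compaction.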
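The figures of 7 sstables and 90% can be checked with a little arithmetic. The sketch below assumes that level n holds at most `sstable_mb * fanout**n` of data. That sizing rule and both helper names are assumptions for illustration, and treating 90% as the bottom level's share of the data is only one plausible reading of the claim.

```python
def levels_needed(total_mb, sstable_mb=5, fanout=10):
    """Count the levels needed to hold total_mb of data.

    Assumes level n is capped at sstable_mb * fanout**n (hypothetical rule).
    """
    level, capacity = 0, 0
    while capacity < total_mb:
        level += 1
        capacity += sstable_mb * fanout ** level
    return level


def bottom_level_share(levels, fanout=10):
    """Fraction of all data that sits in the last level when every level is full."""
    sizes = [fanout ** n for n in range(1, levels + 1)]
    return sizes[-1] / sum(sizes)


ten_tb_in_mb = 10 * 1024 * 1024
print(levels_needed(ten_tb_in_mb))        # 7
print(round(bottom_level_share(7), 3))    # 0.9
```

With 5 MB sstables and a tenfold growth per level, six levels hold only about 5.5 TB, so 10 TB needs a seventh. The last level then carries about nine tenths of the data, which fits the idea that most reads are served from a single sstable.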

Usage: when creating or updating a table's schema, set the compaction_strategy option to LeveledCompactionStrategy. (The update also runs in the background, so changing the compaction type of an existing table does not affect reads and writes.)

Because leveled compaction has to maintain the guarantees above, it costs roughly twice as much I/O as size-tiered compaction. For write-mostly workloads this extra I/O does not pay off, since few old versions of rows are involved and the benefits above bring little gain.

Some configuration details: Leveled compaction ignores the concurrent_compactors setting. Concurrent compaction is designed to avoid tiered compaction's problem of a backlog of small compaction sets becoming blocked temporarily while the compaction system is busy with a large set. Leveled compaction does not have this problem, since all compaction sets are roughly the same size. Leveled compaction does honor the multithreaded_compaction setting, which allows using one thread per sstable to speed up compaction. However, most compaction tuning will still involve using compaction_throughput_mb_per_sec (default: 16) to throttle compaction back.

When should you use leveled compaction? See the English version and the Chinese version.

Data Management

To manage and access data, you need to understand how Cassandra reads and writes data, the hinted handoff feature, and where Cassandra conforms to and departs from ACID. In Cassandra, consistency refers to how up-to-date and synchronized a row of data is on all of its replicas.

To be continued…
