Chinaunix首页 | 论坛 | 博客
  • 博客访问: 1077721
  • 博文数量: 104
  • 博客积分: 3715
  • 博客等级: 中校
  • 技术积分: 1868
  • 用 户 组: 普通用户
  • 注册时间: 2006-04-30 08:38
文章分类

全部博文(104)

文章存档

2013年(1)

2012年(9)

2011年(41)

2010年(3)

2009年(3)

2008年(47)

分类:

2008-09-02 18:43:12

2006年的一篇文章,原来的题目是“What really make transactions faster ?”
这篇文章对目前已有的一些TM的实现进行了评测,并且提出了一个Transactional Locking
的概念。这篇文章指出了一些很有意思的结论。例如,决定STM性能的是它的相对代价,机械地用STM生成的
代码的scalability比手写的细粒度的锁的代码要好。

Every shared location has an versioned-write-lock(a single word, with one bit
indicating locking and other bits as version numbers), every transaction has a
write set (contain address of the variables and values to be written to the
variables) and a read set (contain address of locks and the observed versions of
the locks associated with readed variables).

Commit mode
    1. A trans load first check to see if the loaded address appears in write
    set, if so, return the value in the write set. Otherwise, fetches the lock
    value associated with the variable. If it is locked, load operation can
    either spin until the lock is release or abort. If it is not locked, saving
    the version in the read set and then fetches the actual variable.
    (There needs a safe load operation, to avoid bad memory accesses)

    2. A trans store is performed by saving the value and address in the trans's
    write set.

    3. During the execution of a transaction, we need periodically validate the
    read set, if the read set is invalid, abort the transaction.

    4. Commit. Get locks of locations to be written. If a lock(location) appears
    in the write set also appears in the read set, then the getting operation
    must atomically finish:
    a. get the lock
    b. validate that the current version field of the lock agrees with that in
    the read set.
    A CAS can finish both a and b.
    Get the locks in convenient order, using bounded spin to avoid deadlocks.
    Reread the locks of all read-only locations to make sure version numbers
    haven't changed. If a version does not match, release the write locks, abort
    the trans.
    Now, the trans is commited. Write back all value in write set. Release the
    write locks using a simple store. (incream the version, clean the lock bit)

Encounter Mode
    This mode must assumes a type-stable closed memory pool.
    In this mode, if a trans T wants to write a shared location, it locks the
    location, saves the old value(and location address, and lock addredd) into
    T's write set. Then, write the new value in the location.
    A trans T wants to load a shared location, it check if the lock is free or
    held by T, if so, read the value from the location, otherwise, spin or
    abort.
    Also need periodical validation.
    When to commit, re-read the locks of all read-only locations to make sure
    version numbers haven't changed. If some changes, abort, otherwise,
    release all write locks(increame version, clear the lock bit).
    When abort, restore all values in the write-set.
    In this mode, locks are held longer than commit mode. Avoid check write
    set on every read.

CM
    To avoid live-lock. Use a back-off delay at abort time similar to CSMA-CD.
    The delay interval is a function of (a) a random number generated at abort
    time;(b) the length of the prior write-set; (c) the number of prior aborts

Problem: Page 5 left-bottom part. This method cannot provent modify-after-free
happening.

Design Space
Locking VS. Non-blocking
In skiplist, Locking STMs have two times throughput of Non-blocking, 5 times of
single locks. When contention is light, TL/Encounter/PO has the best
performance, but when contention is heavy, it has worse performance.

PO VS. PS
PO has better locality, which is very important. PO may introduct more than 50%
abort rate than PS, but still, PO has better scalability and performance than PS
, so, the locality instead of the abort rate dominates scalability. Aborts rates
is shadowed by better locality.

Encounter VS. Commit
When contention is heavy, Encounter will suffer. Commit with PO can do almost as
well as Encounter. Commit scale well under extreamly heavy contention. Commit's
scalability is even better than hand write non-blocking algorithm using CAS.

What makes the Trans faster?
It is the relative overhead, as can be seen from the single thread performance,
that determine which algorithm performs better.

阅读(1840) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~