My thinking of debelopment of Datarace Detection-wuweidan-ChinaUnix博客

wuweidan的ChinaUnix博客

首页　| 　博文目录　| 　关于我

wuweidan

博客访问： 42320
博文数量： 47
博客积分： 1000
博客等级：少尉
技术积分： 490
用户组：普通用户
注册时间： 2009-12-02 21:19

文章分类

全部博文（47）

DRAM Technology（3）
Others（8）
Engineer Jobs（3）
Undergraduate wo（3）
CPU microarchite（18）

Rigel（2）

interconnection （3）
Concurrency Bug（10）
未分配的博文（2）

文章存档

2010年（1）

2009年（46）

我的朋友

最近访客

cynthia

推荐博文

My thinking of debelopment of Datarace Detection

分类：

2009-12-04 21:52:14

Datarace detection will continue to be critical in the future. Past reserch[1,2] in this field generally divided into 3 groups: static, dynamic and postmotern.

Static analysis has higher coverage, it does not requires code execution (idle for OS); most of them based on type system, thus they are computational expensive and thorough test also incur scalability issue. They are also too conservative and would generate too many false alarm.

Dynamic and postmotern method can only detect race on the running path, but they have many runtime information which allow them to make more aggressive decisions. Deterministic Replay [3-9] is one type of postmotern method and can be used to detect data race. This type of method requires a whole tool chain from recorder to replayer and certain support to record and replay. It is not easy to make it right. FDR[4] model an one way issue 1G processor using simulator, with log size in dozens of Mb for 1s (34mb/1s on average, 6.25% for 4MB cache). Strata[6] record the global state, futher reduce the log size; it is attractive if it is scalable. RDR[5] reduce the log size of FDR to 4% by regulated the race edges and set/LRU timestamp. Rerun[7] and DeLorean[8] explore the fact that threads do not communicate all the time. Rerun has similar log size of RDR but with much lower hardware cost (0.2kb vs. 24kb). DeLorean has impressive low log size (2.1bits per kilo instructions vs. 3bytes of Rerun), but DeLorean runs on Bulk processor, a specially design platform; its performance is not as good as Rerun.

Dynamic test or on the fly test is the most worthy way. The advantage is that they detect datarace on the fly, thus can use most intimate run time information. Generally there are two types of algorithms in this field: happen before and lockset.

But there are several drawbacks in this type.

First, it require access history to performance the check. Although postmotern methods also require that, but their log could be saved in other places and only use when replay; in dynamic detection, the access history is needed immediately. So although theoratically the method can detect any races, but in real application they are limited by the access history. [18] increases the granularity of datarace detection in order to lower the cost. [19] proposed "weaker-than" relation to reduce the amount of history; it also incoporates static analysis to reduce detection candidate.[3] proposed a method to reduce the history in run time. [10] theoratically proved that the history can be limited to the last write and all the concurrent read after that write, on a new write the history could be discarded. [11] observes that 4 recent conflicting accesses are enough to exposed a datarace. This problem may not be so complicated as it sounds.

It relies on instrumentation, which makes it not practical in OS detection. [12] use profile technique to reduce the overhead of datarace detection. It is based on the notion that frequently tested code are less likely to have datarace. This rule applies to OS which is a thoroughly tested bulk of code. Furthermore, My current work using only instruction count to decide execution order and detect races; it can cheat OS and user space the same, so it may reduce the severness of this problem.

On the fly detection can only detect races on current execution path. Yes, it is right! But computers are invented for repeated work, so run the test as many times as possible can solve this problem for most cases. [13] use datarace detection to help compiler detect cycles in program execution; according to the author (UIUC), their method is to run the program repeatedly until the error report become stable.

In recent years, there are novel techniques proposed in this field. These techniques either do not rely on tranditional methods or incoporate new conception to significantly lower the overhead of them. See recent datarace detection.

    In my view, as the hardware resource become abundant, more and more software functionality can be offloaded to the hardware, bringing higher speed. But the hardware space can not be as unlimited as software, so careful tuning is still necessary. Hardware implementation data race detection requires careful study of the generated instruction information characteristic and design the hardware accordingly. For example, the access history can be dynamically discarded during run time, so we can allocate necessary storing space for them, but the space does not need to be too large. On the other hand, to select dynamical discarded information we need more computation which will be speedup by hardware.
    The other result of abundant hardware resource is that CPU core number is increasing continually, thus brings scalability issue for many aspects of system design. In my view, hardware data race detection also encounters such a problem. A global checking module is not practical if the core number is large. This is a general issue not only applicable for data race. Within future architecture, interconnection network would play an important role in scalable inter CPU core communication; its central idea is to decentralize the communication, thus allowing substantial overlap and distribution. Software reliability issues like data race detection can shift to this new platform without influencing the critical path operations. That is my current work on implementing data race on interconnection network.
    There are even more aggressive techniques beyond datarace detection or deterministic replay which are surviving technique; there are avoiding technique.

[1] Aoun Raza. "A Review of Race Detection Mechanisms". 2006

[2] Nels E. Beckman. "A Survey of Methods for Preventing Race Conditions".

[3] M. Ronse and K. De Bosschere. RecPlay : A Fully Integrated Practical Record/Replay System. In ACM Transactions on Computer Systems,1999

[4] Min Xu, Rastislav Bodik, Mark D. Hill1. "A “Flight Data Recorder” for Enabling Full-system Multiprocessor Deterministic Replay". ISCA'03

[5] Min Xu，Rastislav Bodík，Mark D. Hill. "A Regulated Transitive Reduction (RTR) for Longer Memory Race Recording". ASPLOS XII,2006

[6] Satish Narayanasamyy, Cristiano Pereiray, Brad Calder "Recording Shared Memory Dependencies Using Strata" ASPLOS'06

[7] Derek R. Hower, Mark D. Hill. "Rerun: Exploiting Episodes for Lightweight Memory Race Recording" ISCA'08

[8] P. Montesinos, L. Ceze, and J.Torrellas.

"DeLorean: Recording and Deterministically Replaying Shared-Memory Multiprocessor Execution Efficiently", ISCA'08

[9] R. H. B. Netzer. Optimal Tracing and Replay for Debugging Shared-Memory Parallel Programs. In Proceedings of the Workshop on Parallel and Distributed Debugging (PADD),1993.

[10] Utpal Banerjee, Brian Bliss, Zhiqiang Ma, Paul Petersen. "A Theory of Data Race Detection". PADTADIV,July 17, 2006

[11] Shan Lu, Soyeon Park, Eunsoo Seo and Yuanyuan Zhou. "Learning from Mistakes— A Comprehensive Study on Real World Concurrency Bug Characteristics". ASPLOS'08

[12]

[13] Yuelu Duan, Xiaobing Feng, Lei Wang, Chao Zhang, Pen-Chung Yew. "Detecting and Eliminating Potential Violation of Sequential Consistency for Concurrent C/C++ Programs".

[14] Adrian Nistor,Darko Marinov,Josep Torrellas.

"Light64: Lightweight Hardware Support for Data RaceDetection during Systematic Testing of Parallel Programs"

[15] Tayfun Elmas, Shaz Qadeer, and Serdar Tasiran. "Goldilocks: Eciently Computing the Happens-Before Relation Using Locksets". PLDI'06

[17] Shantanu Gupta, Florin Sultan, Srihari Cadambi, Franjo Ivancic, Martin Rotteler, "Using hardware transactional memory for data race detection," ISPDP'09

[18] Christoph von Praun and Thomas R. Gross. "Object Race Detection". OOPSLA'01

[19] Jong-Deok Choi, Keunwoo Lee, Alexey Loginov, Robert O’Callahan, Vivek Sarkar, Manu Sridharan. "Efficient and Precise Datarace Detection for Multithreaded Object-Oriented Programs". PLDI’02.

阅读(1227) | 评论(0) | 转发(0) |

上一篇：GPU A closer look

下一篇：Recent Datarace Detection

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6