Category: LINUX

2012-12-24 20:01:49

1. Introduction

Our key idea is thus simple: measure the number of cycles spent by threads waiting for each bottleneck and accelerate the bottlenecks responsible for the highest thread waiting cycles.

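Our main idea: measure how many cycles threads spend waiting on each bottleneck, then accelerate the bottlenecks that account for the most thread waiting cycles.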
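A minimal sketch of the measurement idea, written as a software simulation rather than the hardware the paper describes. The class name `WaitingCycleTable`, its methods, and the bottleneck ids are hypothetical; the only thing taken from the text is that waiting cycles are accumulated per bottleneck and the largest total marks the bottleneck to accelerate.

```python
from collections import defaultdict


class WaitingCycleTable:
    """Hypothetical per-bottleneck counter of thread waiting cycles."""

    def __init__(self):
        self.waiters = defaultdict(int)  # bottleneck id -> threads waiting now
        self.cycles = defaultdict(int)   # bottleneck id -> accumulated waiting cycles

    def start_wait(self, bid):
        # A thread begins waiting on bottleneck `bid`.
        self.waiters[bid] += 1

    def end_wait(self, bid):
        # A thread stops waiting on bottleneck `bid`.
        if self.waiters[bid] > 0:
            self.waiters[bid] -= 1

    def tick(self, n=1):
        # Advance time by n cycles: each waiting thread contributes n cycles.
        for bid, count in self.waiters.items():
            self.cycles[bid] += count * n

    def most_critical(self):
        # Bottleneck with the most accumulated waiting cycles, or None.
        if not self.cycles:
            return None
        return max(self.cycles, key=self.cycles.get)


table = WaitingCycleTable()
table.start_wait("lock_A")
table.start_wait("lock_A")
table.start_wait("barrier_B")
table.tick(10)            # lock_A gains 20 cycles, barrier_B gains 10
table.end_wait("lock_A")
table.tick(5)             # lock_A gains 5 more, barrier_B gains 5 more
print(table.most_critical())
```

Counting one cycle per waiting thread per tick means a bottleneck that blocks many threads for a short time can outrank one that blocks a single thread for longer, which is the sense of "thread waiting cycles" used above.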

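This solution is too costly because (a) writing correct parallel programs is already a daunting task, and (b) serializing bottlenecks change with machine configuration, program input set, and program phase (as we show in Section 2.2), thus, what may seem like a bottleneck to the programmer may not be a bottleneck in the field and vice versa.

This approach is expensive because (a) writing parallel programs is already a hard task for the programmer, and (b) serializing bottlenecks change with the machine configuration, the program input set, and the program phase, so something that looks like a bottleneck to the programmer may not be one in practice, and the reverse also holds.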

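The programmer, compiler or library delimits potential bottlenecks using BottleneckCall and BottleneckReturn instructions, and replaces the code that waits for bottlenecks with a BottleneckWait instruction.

The programmer (or the compiler or a library) marks the boundaries of potential bottlenecks with the BottleneckCall and BottleneckReturn instructions, and replaces the code that waits for a bottleneck with a BottleneckWait instruction.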
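To make the placement of the three instructions concrete, here is a rough software analogy for a lock-protected critical section. It is only a sketch under stated assumptions: `bottleneck_call`, `bottleneck_return`, and `bottleneck_wait` are hypothetical Python stand-ins that merely log events, and the exact operands and placement of the real instructions are not given in these notes.

```python
import threading

trace = []  # hypothetical event log standing in for hardware side effects


def bottleneck_call(bid):
    # Stand-in for BottleneckCall: marks the start of a potential bottleneck.
    trace.append(("call", bid))


def bottleneck_return(bid):
    # Stand-in for BottleneckReturn: marks the end of the bottleneck.
    trace.append(("return", bid))


def bottleneck_wait(bid):
    # Stand-in for BottleneckWait: replaces the code that waits for the bottleneck.
    trace.append(("wait", bid))


def critical_section(lock, bid, body):
    """Run body() under lock, delimited as potential bottleneck `bid`."""
    bottleneck_call(bid)
    try:
        # The usual spin-wait loop body is replaced by bottleneck_wait.
        while not lock.acquire(blocking=False):
            bottleneck_wait(bid)
        try:
            return body()
        finally:
            lock.release()
    finally:
        bottleneck_return(bid)


lock = threading.Lock()
result = critical_section(lock, 7, lambda: sum(range(10)))
print(result, trace)
```

In this analogy the wait events are what a hardware mechanism would count as thread waiting cycles, while the call and return events tell it which code belongs to which bottleneck.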

The bottlenecks with the highest number of thread waiting cycles are selected for acceleration on one or more large cores. On executing a BottleneckCall instruction, the small core checks if the bottleneck has been selected for acceleration.

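The bottlenecks with the most thread waiting cycles are selected for acceleration on one or more large cores. When a small core executes a BottleneckCall (BC) instruction, it checks whether that bottleneck has been selected for acceleration.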
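The selection and the check performed on BottleneckCall can be sketched as follows. This is an illustrative simulation, not the paper's hardware design: the function names, the cycle counts, and the rule of selecting one bottleneck per large core are all assumptions made for the example.

```python
def select_for_acceleration(waiting_cycles, num_large_cores):
    """Pick the bottlenecks with the most thread waiting cycles.

    Assumption: at most one selected bottleneck per large core.
    """
    ranked = sorted(waiting_cycles, key=waiting_cycles.get, reverse=True)
    return set(ranked[:num_large_cores])


def on_bottleneck_call(bid, accelerated):
    """What a small core decides when it executes BottleneckCall for `bid`."""
    if bid in accelerated:
        return "large_core"   # bottleneck was selected: run it on a large core
    return "small_core"       # not selected: keep executing locally


# Hypothetical waiting-cycle totals per bottleneck id.
cycles = {"lock_A": 900, "barrier_B": 150, "lock_C": 400}
accelerated = select_for_acceleration(cycles, num_large_cores=2)

for bid in cycles:
    print(bid, on_bottleneck_call(bid, accelerated))
```

Because the selection is recomputed from measured waiting cycles, the accelerated set can change as the program moves between phases, which matches the earlier point that bottlenecks depend on input set and program phase.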

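However, it only applies to barriers in statically scheduled workloads, where the work to be performed by each thread is known before runtime.

It applies only to barriers in statically scheduled workloads, that is, workloads where each thread's work is known before the program runs.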

 3.3 Accelerating Bottlenecks

BIS consists of two parts: identification of critical bottlenecks and acceleration of those bottlenecks.

BIS has two parts: identifying the critical bottlenecks and accelerating them.

Identification of critical bottlenecks is done in hardware based on information provided by the software.

Critical bottlenecks are identified in hardware, using information supplied by the software.

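There are multiple ways to accelerate a bottleneck, e.g. increasing core frequency, giving a thread higher priority in shared hardware resources, or migrating the bottleneck to a faster core with a more aggressive microarchitecture or higher frequency.

There are many ways to accelerate a bottleneck: raising the core frequency, giving a thread higher priority for shared hardware resources, or moving the bottleneck to a core with a more aggressive microarchitecture or a higher frequency.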

 

Questions:

1. In "However, these proposals lack generality and fine-grained adaptivity.", how should "fine-grained" be understood? (fine granularity)
