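cpu family: the family the processor belongs to. A value of 6 means family 6, which mainly covers the Pentium Pro, Pentium II, Pentium II Xeon, Pentium III, and Pentium III Xeon processors (later Intel processors such as the Core series also report family 6).
model: the model number. It identifies the manufacturing technology and which design generation (or core) within the family the processor belongs to, and is usually read together with cpu family.
stepping: the stepping number identifies the design or manufacturing revision of the processor. It helps control and track changes to the processor, and lets end users identify more precisely which processor version is installed in their system and determine its internal design or manufacturing characteristics.
siblings: the number of logical processors in the same physical package; here it is 2. Read it together with cpu cores: only when siblings is larger than cpu cores is each core split into several logical cores (Hyper-Threading).
core id: the ID of the core within its physical package, which is what we usually call the CPU ID.
cpu cores: the number of cores on this processor.
apicid: the ID of the Advanced Programmable Interrupt Controller.
cpuid level: I have not really figured this one out; if anyone understands it, please leave me a comment.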
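To make the relationship between siblings and cpu cores concrete, here is a minimal sketch that parses /proc/cpuinfo-style text and derives the number of logical processors per core. The sample text and the helper names `parse_cpuinfo` and `threads_per_core` are illustrative assumptions, not output from the machine described above; on a real Linux system you would pass in the contents of /proc/cpuinfo instead.

```python
# Hypothetical sample in /proc/cpuinfo format (values are made up for illustration).
SAMPLE_CPUINFO = """\
processor : 0
cpu family : 6
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0

processor : 1
cpu family : 6
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
"""


def parse_cpuinfo(text):
    """Split cpuinfo-style text into one dict per logical processor."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if fields:
            cpus.append(fields)
    return cpus


def threads_per_core(cpu):
    """Logical processors per core: siblings divided by cpu cores."""
    return int(cpu["siblings"]) // int(cpu["cpu cores"])


if __name__ == "__main__":
    for cpu in parse_cpuinfo(SAMPLE_CPUINFO):
        print(cpu["processor"], cpu["core id"], threads_per_core(cpu))
```

A result of 1 means each core runs a single logical processor; a result of 2 would indicate two logical processors per core.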
II. oprofile
1 Below are the hardware events my machine supports, each with a rough explanation.
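1 CPU_CLK_UNHALTED: (counter: all)
    Clock cycles when not halted (min count: 6000)
    Unit masks (default 0x0)
    ----------
    0x00: Unhalted core cycles
    0x01: Unhalted bus cycles
    0x02: Unhalted bus cycles of this core while the other core is halted
The number of clock cycles during the sampling period in which the processor is not in the halted (suspended or stopped) state. There are three cases:
(1) core cycles in which the CPU is not halted
(2) bus cycles in which the CPU is not halted
(3) bus cycles in which this core is not halted while the other core is halted

2 INST_RETIRED.ANY_P: (counter: all)
    number of instructions retired (min count: 6000)
The number of retired instructions. According to many references, an instruction is said to retire only once it has finished executing and has been removed from the instruction queue.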
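These two events are often combined into instructions per cycle (IPC). The sketch below assumes that each oprofile sample stands for roughly `count` occurrences of the event (the sampling count you configured, which must be at least the min count shown above). The function names and the sample numbers are hypothetical.

```python
def estimated_events(samples, count):
    """Approximate event total: each sample represents about `count` events."""
    return samples * count


def ipc(inst_samples, clk_samples, inst_count=6000, clk_count=6000):
    """Estimate instructions per cycle from INST_RETIRED.ANY_P and
    CPU_CLK_UNHALTED sample counts."""
    cycles = estimated_events(clk_samples, clk_count)
    if cycles == 0:
        raise ValueError("no CPU_CLK_UNHALTED samples were collected")
    return estimated_events(inst_samples, inst_count) / cycles


if __name__ == "__main__":
    # Hypothetical sample counts, for illustration only.
    print(ipc(inst_samples=1500, clk_samples=1000))
```

Because both events are sampled statistically, the result is an estimate, and it becomes more reliable as the number of samples grows.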
3 L2_RQSTS: (counter: all)
    number of L2 cache requests (min count: 500)
    Unit masks (default 0x7f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
The number of L2 cache requests, broken down as follows:
(1) L2 cache requests from all cores
(2) L2 cache requests from the current core
(3) L2 cache requests including all prefetches
(4) L2 cache requests from hardware prefetch only
(5) L2 cache requests excluding hardware prefetches
(6) L2 cache requests that access a line in the M state
(7) L2 cache requests that access a line in the E state
The remaining states are analogous.

4 LLC_MISSES: (counter: all)
    L2 cache demand requests from this core that missed the L2 (min count: 6000)
    Unit masks (default 0x41)
    ----------
    0x41: No unit mask
The number of L2 demand requests from the current core that missed.

5 LLC_REFS: (counter: all)
    L2 cache demand requests from this core (min count: 6000)
    Unit masks (default 0x4f)
    ----------
    0x4f: No unit mask
The number of L2 demand requests from the current core.
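The L2_RQSTS unit mask looks like three bit fields (core, prefetch, MESI state) OR-ed together, which is consistent with the default 0x7f = 0x40 | 0x30 | 0x0f. The following sketch decodes a mask under that assumption; the field layout is inferred from the listing above rather than taken from the processor manual, and `decode_l2_rqsts` is a hypothetical helper name.

```python
CORE = {0xC0: "all cores", 0x40: "this core"}
PREFETCH = {
    0x30: "all inclusive",
    0x10: "hardware prefetch only",
    0x00: "exclude hardware prefetch",
}
MESI = {0x08: "Modified", 0x04: "Exclusive", 0x02: "Shared", 0x01: "Invalid"}


def decode_l2_rqsts(mask):
    """Split an L2_RQSTS unit mask into its core, prefetch and MESI parts."""
    core = CORE.get(mask & 0xC0, "unknown")
    prefetch = PREFETCH.get(mask & 0x30, "unknown")
    states = [name for bit, name in MESI.items() if mask & bit]
    return core, prefetch, states


if __name__ == "__main__":
    print(decode_l2_rqsts(0x7F))                # the default mask
    print(decode_l2_rqsts(0xC0 | 0x00 | 0x08))  # all cores, no HW prefetch, M state
```

Read this way, the default mask counts requests from this core, including all prefetches, for lines in any of the four MESI states.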
6 MISALIGN_MEM_REF: (counter: all)
    number of misaligned data memory references (min count: 500)
The number of misaligned data memory references.
7 SEGMENT_REG_LOADS: (counter: all)
    number of segment register loads (min count: 500)
The number of segment register loads.
8 DTLB_MISSES: (counter: all)
    DTLB miss events (min count: 500)
    Unit masks (default 0xf)
    ----------
    0x01: ANY Memory accesses that missed the DTLB.
    0x02: MISS_LD DTLB misses due to load operations.
    0x04: L0_MISS_LD L0 DTLB misses due to load operations.
    0x08: MISS_ST TLB misses due to store operations.
The number of DTLB misses, broken down as follows:
(1) DTLB misses for all memory accesses
(2) DTLB misses caused by load operations
(3) L0 DTLB misses caused by load operations
(4) TLB misses caused by store operations
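When one of these events is handed to oprofile, the event name, sampling count and unit mask are written as a single specification of the form name:count:unitmask:kernel:user, which is the format opcontrol accepts for its --event option. The sketch below builds such a string and rejects counts below the listed min count. `MIN_COUNT` and `event_spec` are hypothetical names, and the table contains only two values copied from the listing above.

```python
# Minimum sampling counts copied from the event listing (partial, for illustration).
MIN_COUNT = {"CPU_CLK_UNHALTED": 6000, "DTLB_MISSES": 500}


def event_spec(name, count, unit_mask, kernel=True, user=True):
    """Build a 'name:count:unitmask:kernel:user' event specification."""
    minimum = MIN_COUNT[name]
    if count < minimum:
        raise ValueError(f"{name}: count {count} is below the min count {minimum}")
    return f"{name}:{count}:{unit_mask:#04x}:{int(kernel)}:{int(user)}"


if __name__ == "__main__":
    # Sample DTLB misses caused by loads only (unit mask 0x02).
    print(event_spec("DTLB_MISSES", 10000, 0x02))
```

A smaller count yields more samples and finer detail at the price of higher profiling overhead, which is why each event has a lower bound.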
9 PAGE_WALKS: (counter: all)
    Page table walk events (min count: 500)
    Unit masks (default 0x2)
    ----------
    0x01: COUNT Number of page-walks executed.
    0x02: CYCLES Duration of page-walks in core cycles.
(1) the number of page table walks
(2) the time spent in page table walks, measured in core cycles
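The two PAGE_WALKS unit masks are most useful together: dividing CYCLES by COUNT gives the average cost of one page walk, and dividing CYCLES by the CPU_CLK_UNHALTED core cycles gives the share of time spent walking page tables. This is a minimal sketch with hypothetical totals; `page_walk_stats` is not part of oprofile.

```python
def page_walk_stats(walk_count, walk_cycles, unhalted_cycles):
    """Return (average cycles per page walk, share of unhalted cycles in page walks)."""
    average = walk_cycles / walk_count if walk_count else 0.0
    share = walk_cycles / unhalted_cycles if unhalted_cycles else 0.0
    return average, share


if __name__ == "__main__":
    # Hypothetical event totals, for illustration only.
    average, share = page_walk_stats(
        walk_count=2000, walk_cycles=50000, unhalted_cycles=1000000
    )
    print(f"{average:.1f} cycles per walk, {share:.1%} of unhalted cycles")
```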
10 MUL: (counter: all)
    number of multiplies (min count: 1000)
DIV: (counter: all)
    number of divides (min count: 500)
CYCLES_DIV_BUSY: (counter: all)
    cycles divider is busy (min count: 1000)
IDLE_DURING_DIV: (counter: all)
    cycles divider is busy and all other execution units are idle. (min count: 1000)
11 L2_ADS: (counter: all)
    Cycles the L2 address bus is in use. (min count: 500)
    Unit masks (default 0x40)
    ----------
    0xc0: All cores
    0x40: This core
The number of cycles during which the L2 address bus is in use.
12 L2_DBUS_BUSY_RD: (counter: all)
    Cycles the L2 transfers data to the core. (min count: 500)
    Unit masks (default 0x40)
    ----------
    0xc0: All cores
    0x40: This core
The number of cycles during which the L2 data bus transfers data to the core.
13 L2_LINES_IN: (counter: all)
    number of allocated lines in L2 (min count: 500)
    Unit masks (default 0x70)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
The number of L2 line allocations (presumably under a write-allocate policy).
14 L2_M_LINES_IN: (counter: all)
    number of modified lines allocated in L2 (min count: 500)
    Unit masks (default 0x40)
    ----------
    0xc0: All cores
    0x40: This core
The number of L2 lines allocated in the M state.
15 L2_LINES_OUT: (counter: all)
    number of recovered lines from L2 (min count: 500)
    Unit masks (default 0x70)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
The number of times an L2 line is evicted (overwritten).
L2_M_LINES_OUT: (counter: all)
    number of modified lines removed from L2 (min count: 500)
    Unit masks (default 0x70)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
The number of times an L2 line in the M state is evicted. Depending on the cache coherence protocol, a modified line that is evicted is either written back to memory or moved to another state.
16 L2_IFETCH: (counter: all)
    number of L2 cacheable instruction fetches (min count: 500)
    Unit masks (default 0x4f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
17 L2_LD: (counter: all)
    number of L2 data loads (min count: 500)
    Unit masks (default 0x7f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid

18 L2_ST: (counter: all)
    number of L2 data stores (min count: 500)
    Unit masks (default 0x4f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
L2_LOCK: (counter: all)
    number of locked L2 data accesses (min count: 500)
    Unit masks (default 0x4f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid

19 L2_REJECT_BUSQ: (counter: all)
    Rejected L2 cache requests (min count: 500)
    Unit masks (default 0x7f)
    ----------
    0xc0: core: all cores
    0x40: core: this core
    0x30: prefetch: all inclusive
    0x10: prefetch: Hardware prefetch only
    0x00: prefetch: exclude hardware prefetch
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
L2_NO_REQ: (counter: all)
    Cycles no L2 cache requests are pending (min count: 500)
    Unit masks (default 0x40)
    ----------
    0xc0: All cores
    0x40: This core

20 EIST_TRANS_ALL: (counter: all)
    Intel(tm) Enhanced SpeedStep(r) Technology transitions (min count: 500)
THERMAL_TRIP: (counter: all)
    Number of thermal trips (min count: 500)
    Unit masks (default 0xc0)
    ----------
    0xc0: No unit mask
The number of thermal trip events. The event is triggered when the CPU temperature exceeds a certain threshold.
21 L1 cache operations
L1D_CACHE_LD: (counter: all)
    L1 cacheable data read operations (min count: 500)
    Unit masks (default 0xf)
    ----------
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
L1D_CACHE_ST: (counter: all)
    L1 cacheable data write operations (min count: 500)
    Unit masks (default 0xf)
    ----------
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
L1D_CACHE_LOCK: (counter: all)
    L1 cacheable lock read operations (min count: 500)
    Unit masks (default 0xf)
    ----------
    0x08: (M)ESI: Modified
    0x04: M(E)SI: Exclusive
    0x02: ME(S)I: Shared
    0x01: MES(I): Invalid
L1D_CACHE_LOCK_DURATION: (counter: all)
    Duration of L1 data cacheable locked operations (min count: 500)
    Unit masks (default 0x10)
    ----------
    0x10: No unit mask
L1D_ALL_REF: (counter: all)
    All references to the L1 data cache (min count: 500)
    Unit masks (default 0x10)
    ----------
    0x10: No unit mask
L1D_ALL_CACHE_REF: (counter: all)
    L1 data cacheable reads and writes (min count: 500)
    Unit masks (default 0x2)
    ----------
    0x02: No unit mask
L1D_REPL: (counter: all)
    Cache lines allocated in the L1 data cache (min count: 500)
    Unit masks (default 0xf)
    ----------
    0x0f: No unit mask
L1D_M_REPL: (counter: all)
    Modified cache lines allocated in the L1 data cache (min count: 500)
L1D_M_EVICT: (counter: all)
    Modified cache lines evicted from the L1 data cache (min count: 500)
L1D_PEND_MISS: (counter: all)
    Total number of outstanding L1 data cache misses at any cycle (min count: 500)
L1D_SPLIT: (counter: all)
    Cache line split load/stores (min count: 500)
    Unit masks (default 0x1)
    ----------
    0x01: split loads
    0x02: split stores

I will stop here for now; there are simply too many hardware monitoring events! What matters in practice is how to use these events, or combinations of them, flexibly to achieve our goals. In later articles I will show how to use oprofile through concrete examples!