Performance Optimization Principles
The "principle" referred to here is the well-known Amdahl's Law. I will not bother restating it formally; an example instead. Suppose a system has n modules whose shares of the main-path processing time are p1, p2, ..., pn. If module 1's share is p1 = 0.1 and we optimize it to run 100 times faster, the overall speedup is:

S = 1 / ((1 - p1) + p1/100) = 1 / (0.9 + 0.001) ≈ 1.11

![](/attachment/201305/24/20556902_136938049177QH.png)

If module 2's share is p2 = 0.8 and we optimize it to run 2 times faster, the overall speedup is:

S = 1 / ((1 - p2) + p2/2) = 1 / (0.2 + 0.4) ≈ 1.67

![](/attachment/201305/24/20556902_1369380603ppKe.png)

From this point of view, only by greatly speeding up the parts that dominate the system's time can we achieve a large overall improvement.
The Optimization Mechanism
Generally speaking, a gateway device's performance depends largely on its hardware. As long as the security software on top is not done too badly and the system is stable, when the product needs a performance boost to stay competitive, swapping in a stronger hardware platform is usually enough. But sometimes the software really is done badly and significantly limits what the hardware could deliver. We then need a good mechanism for analyzing software performance: which places consume more time than expected and deserve code analysis and optimization, and which places already cost so little that further optimization would barely help. What I will unfold step by step below is exactly such a mechanism.
Consider the performance of Linux layer-3 (IP) forwarding. The processing path looks roughly like this:
![](/attachment/201305/24/20556902_13693832904iCa.png)
Suppose we want to know how much CPU time is spent between ip_rcv and ip_finish_output. That is simple: take a timestamp in ip_rcv, take another in ip_finish_output, and the difference between the two gives the total time spent. For example:
/* illustrative only: t1/t2 and the time_stamp_* helpers stand in
 * for whatever clock source is actually used */
int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt, struct net_device *orig_dev)
{
t1 = time_stamp_start(); /* stamp on entry */
……
}
int ip_finish_output(struct sk_buff *skb)
{
……
t2 = time_stamp_end(); /* stamp on exit */
}
t2 - t1 gives the time spent on the path. The crudest approach is to print these differences and see how much time is consumed. But sometimes we need finer-grained statistics and have to insert a great many timestamp points along the code path. If every point printed its result, and the printing happened during high-speed forwarding, the instrumentation itself would introduce a serious performance problem; besides, a screen full of printk output is never a good method. So we need a way to collect these timestamps and do some statistical analysis on them.
Given how C function calls nest, the actual processing path may end up in one of the following forms:
ip_rcv
|->ip_rcv_finish
|    ->ip_forward
|        ->NF_INET_FORWARDING
|            ->ip_forward_finish                 Path 1
|                ->ip_output
|                    ->ip_output_finish
|->NF_INET_PRE_ROUTING
|    ->ip_rcv_finish
|        ->ip_forward
|            ->NF_INET_FORWARDING
|                ->ip_forward_finish             Path 2
|                    ->ip_output
|                        ->NF_INET_POST_ROUTING
|                            ->ip_output_finish
|->ip_forward_finish
     ->ip_output                                 Path 3
         ->NF_INET_POST_ROUTING
             ->ip_output_finish
For the processing paths above we need to collect per-path timing. Observing them, although the paths taken differ, one thing is certain: in program order NF_INET_PRE_ROUTING is reached before ip_rcv_finish. We can therefore assign each point a unique number, increasing in that order, to identify the timing points:
![](/attachment/201305/24/20556902_1369386844L6N0.png)
Path 1:
1 -> 3 -> 4 -> 5 -> 6 -> 7 -> 9
Path 2:
1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9
Path 3:
1 -> 2 -> 3 -> 4 -> 6 -> 7 -> 8 -> 9
Let us walk through Path 1 once more:
1. ip_rcv: record timestamp t1, next node <- 1
2. ip_rcv_finish: record timestamp t2, set the previous node's next node <- 3
3. ip_forward: record timestamp t3, set the previous node's next node <- 4
4. NF_INET_FORWARDING: record timestamp t4, set the previous node's next node <- 5
5. ip_forward_finish: record timestamp t5, set the previous node's next node <- 6
6. ip_output: record timestamp t6, set the previous node's next node <- 7
7. ip_output_finish: record timestamp t7, set the previous node's next node <- 9
An example timing snapshot for Path 1 (nodes 2 and 8 are not visited, so their cells stay empty):

node index   node name              next node   time stamp
1            ip_rcv                 3           12323
2            NF_INET_PRE_ROUTING    -           -
3            ip_rcv_finish          4           12433
4            ip_forward             5           12453
5            NF_INET_FORWARDING     6           12485
6            ip_forward_finish      7           13251
7            ip_output              9           13294
8            NF_INET_POST_ROUTING   -           -
9            ip_output_finish       9           13400
Next, we accumulate this timing snapshot onto a tree:
ip_rcv (cost: 12323 - 12323 = 0)
|->ip_rcv_finish (cost: 12433 - 12323 = 110)
|    ->ip_forward (cost: 12453 - 12433 = 20)
|        ->NF_INET_FORWARDING (cost: 12485 - 12453 = 32)
|            ->ip_forward_finish (cost: 13251 - 12485 = 766)   Path 1
|                ->ip_output (cost: 13294 - 13251 = 43)
|                    ->ip_output_finish (cost: 13400 - 13294 = 106)
|->……
Which parameters do we need for evaluating performance? The ones I can think of so far:
1. The average total time from the start to the current node, i.e. total.
2. The average time from the previous node to the current node, i.e. avg; the goal of optimization is to drive this average down.
3. The maximum time from the previous node to the current node, i.e. max; the worst case, usually seen only when cache behavior is at its worst.
4. The minimum time from the previous node to the current node, i.e. min; the best processing time observed during the run, and what optimization aspires to.
5. The drift of the average time, reflecting volatility, i.e. diff; the larger diff is, the more the processing time of the same stretch of code fluctuates, which may indicate room for optimization.
6. The number of times the node is passed, reflecting path load, i.e. pass; it counts processed packets, and the larger it is, the heavier the path's load and the more that path deserves optimization effort.
How should the average be computed? We cannot simply accumulate every sample and divide, because performance tuning cares most about how the system behaves around the current moment. So we use an exponentially decaying update to approximate the average:

avg_new = (1 - w) * avg_old + w * current

with w = 1/8, where current is the newest sample and avg_old is the running average. In other words, the new average keeps 7/8 of the old average and takes 1/8 of the newest sample.
Closing Words
Please forgive me for not describing the mechanism more simply and clearly. After all these years as a programmer I have always spoken through code, and my documentation skills really are poor, so in the end let me share the code itself:
Code follows:
/*
* Title: vsp_time_slice.h
* Created by Lijl at 2011/06/07
* modification history
*/
#define VSP_PERFORMANCE_MONITOR
#ifndef _VSP_PERFORMANCE_MONITOR_H_
#define _VSP_PERFORMANCE_MONITOR_H_
#if defined(VSP_PERFORMANCE_MONITOR)
#include <linux/types.h>    /* assumed headers; the original include names were lost in the post */
#include <linux/spinlock.h>
#include <linux/smp.h>
#define INITIAL_NODE_ID 0
#define MAX_CHILD_NODE 48
#define DEFAULT_MONITOR_SCALE 0
#define SLICE_EXPIRE_TIME 100
#define MAX_MONITOR_INFO_BUFFLEN (1024 * 32)
/*the high 32 bits of each record index the next slice node*/
#define TIME_SLICE_SHITE 32
#define TIME_SLICE_BITS 32
#define TIME_SLICE_MASK (((1ULL << TIME_SLICE_BITS) - 1) << TIME_SLICE_SHITE)
/*the low 32 bits of each record hold the time stamp*/
#define TIME_STAMP_SHITE 0
#define TIME_STAMP_BITS 32
#define TIME_STAMP_MASK (((1ULL << TIME_STAMP_BITS) - 1) << TIME_STAMP_SHITE)
#define MAX_SLICE_NODE (1024 * 4) /*maximum number of slice nodes; must not be exceeded*/
#define SLICE_MAX_NUMBER (1<<8) /*the high 8 bits of the counter register serve as the index*/
#define END_SLICE_ID (SLICE_MAX_NUMBER - 1)
#define TIME_SCALE 1000
#define SSP_ERR -1
#define SSP_OK 0
typedef struct performance_config
{
unsigned char m_enable;
unsigned char m_mode;
unsigned char m_trim_enable;
unsigned char m_debug;
unsigned int m_scale;
unsigned int m_expire;
}PERFORMANCE_CONFIG_S;
typedef enum {
/*****NPF_PRE_ROUTING********/
___netif_receive_skb = 0,
_fastpath_ipv4_precheck,
_precheck,
_precheck_1,
_precheck_2,
_precheck_3,
_fastpath_check,
_fastpath_check_1,
_fastpath_check_2,
_fastpath_check_3,
_fastpath_check_4,
_precheck_4,
_software_fastpath,
_software_fastpath_1,
_software_fastpath_2,
_software_fastpath_3,
_software_fastpath_4,
_br_handle_frame_finish,
_ip_route_state_backpacket,
_ip_route_state_backpacket_1,
_ip_route_state_backpacket_2,
_ip_route_state_backpacket_3,
_ip_route_state_backpacket_4,
/*****NPF_FORWARD************/
_ip_forward,
_fastpath_ipv4_register_ctinfo,
_register_ctinfo,
_register_ctinfo_1,
_register_ctinfo_2,
_register_ctinfo_3,
_register_ctinfo_4,
_record_ctinfo,
_record_ctinfo_1,
_record_ctinfo_2,
_record_ctinfo_3,
_record_ctinfo_4,
__end__ = END_SLICE_ID,
/*****NPF_LOCAL_IN*********/
/**************************/
/*****NPF_LOCAL_OUT**********/
/*****************************/
/*****NPF_POST_ROUTING******/
/****************************/
}PERFORMANCE_MONITOR_TMSID_EN;
typedef struct time_slice {
unsigned int ts_id; /*time slice id; numbers all of the tree nodes*/
unsigned int max_ts; /*max time this slice consumed over the whole run*/
unsigned int min_ts; /*likewise, the min time*/
unsigned int avg_ts; /*average time this slice consumes over the whole run*/
unsigned int avg_from_start; /*average time since receipt of the pkb*/
unsigned int diff_ts; /*measures the slice's variability*/
unsigned int last_ts; /*remembers the last sample*/
unsigned int pass; /*how many times a pkb passed by; identifies the most frequent path*/
unsigned int access_time; /*last access time, used to release the tree path*/
struct time_slice *child_ts[MAX_CHILD_NODE]; /*the child time slices*/
struct time_slice *end_ts; /*the final time slice, used at ssp_pkbfree*/
struct time_slice *next_ts; /*next node of the singly linked free list*/
}TIME_SLICE_S;
typedef struct slice_id2name_table {
PERFORMANCE_MONITOR_TMSID_EN enSliceId;
char slice_name[44];
}SLICE_ID2NAME_TABLE_S;
typedef enum {
M_DEFAULT,
M_PER_BOARD,
M_PER_CORE,
M_PER_THREAD,
}PERFORMANCE_MONITOR_MODE;
typedef struct time_stamp_set {
__u32 cur_index; /*index current slice node, 0 means no information in this stamp set*/
__u64 time_stamp[SLICE_MAX_NUMBER]; /*log the time when pkb pass the node*/
}TIME_STAMP_SET_S;
typedef struct time_slice_tree {
unsigned int ts_id;
struct time_slice *child_ts[MAX_CHILD_NODE]; /*the child time slice*/
struct time_slice *end_ts; /*the last one time slice, use for ssp_pkbfree*/
rwlock_t tst_lock;
}TIME_SLICE_TREE_S;
void ts_update(unsigned int thread);
int vsp_pm_init(void);
void pm_timestamp(unsigned int stamp, unsigned int ts_id, unsigned int thread);
extern __u64 pkb_count;
extern PERFORMANCE_CONFIG_S pm_config;
extern TIME_STAMP_SET_S *ppts_percpu[NR_CPUS];
extern unsigned long monitor_percpu_flag[(NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG];
#define MonitorDebugPrint(ulUserId, fmt, args...) do \
{\
if(pm_config.m_debug)\
{\
printk(fmt, ##args);\
}\
}while(0)
#define pm_time_stamp_1(TIME_SLICE_NODE_ID) \
do\
{\
int thread = smp_processor_id();\
if(test_bit(thread, monitor_percpu_flag)) \
{\
pm_timestamp((unsigned int)(TIME_STAMP_MASK & native_sched_clock()), TIME_SLICE_NODE_ID, thread);\
}\
}while(0)
//a measurement point inserted along the path; TIME_SLICE_NODE_ID is the index value
#define pm_time_stamp(TIME_SLICE_NODE_ID) pm_time_stamp_1(TIME_SLICE_NODE_ID)
//inserted at the start of the path being measured
#define pm_time_stamp_start() \
do\
{ \
if((0 != pm_config.m_enable) && (0 == ((pkb_count++) & pm_config.m_scale))) \
{\
TIME_STAMP_SET_S *ts;\
int thread = smp_processor_id();\
ts = ppts_percpu[thread];\
ts->cur_index = 0; \
set_bit(thread, monitor_percpu_flag); \
pm_timestamp((unsigned int)(TIME_STAMP_MASK & native_sched_clock()), 0, thread); \
}\
}while(0)
//inserted at the end of the path being measured
#define pm_statistics_update() \
do\
{\
int thread = smp_processor_id();\
if(test_bit(thread, monitor_percpu_flag)) \
{\
ts_update(thread); \
clear_bit(thread, monitor_percpu_flag);\
}\
}while(0)
#else
#define pm_time_stamp(TIME_SLICE_NODE_ID) do{;}while(0)
#define pm_time_stamp_start() do{;}while(0)
#define pm_statistics_update() do{;}while(0)
#endif
#endif
----------------------------------------------------------------------------------------------------------------------------------------------------------
/**
* The performance monitor is a tool used to find the bottlenecks of the main routine, e.g. the
* slow path versus the fast path: the former reflects the performance of new-session creation,
* the latter the performance of forwarding.
* author: created by lijl at 2012.4.12
* modify history:
*
*/
#include <linux/kernel.h>   /* assumed headers; the original include names were lost in the post */
#include <linux/slab.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/uaccess.h>
#include "vsp_time_slice.h"
#if defined(VSP_PERFORMANCE_MONITOR)
unsigned long monitor_percpu_flag[(NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG];
EXPORT_SYMBOL(monitor_percpu_flag);
TIME_STAMP_SET_S *ppts_percpu[NR_CPUS];
TIME_SLICE_S *ts_pool;
TIME_SLICE_TREE_S ts_percpu_tree[NR_CPUS];
char *print_buffer;
__u32 pkb_count = 1;
PERFORMANCE_CONFIG_S pm_config =
{
.m_enable = 1,
.m_scale = ((1 << DEFAULT_MONITOR_SCALE) -1),
.m_expire = SLICE_EXPIRE_TIME,
.m_mode = M_PER_THREAD,
.m_trim_enable = 1,
.m_debug = 0,
};
SLICE_ID2NAME_TABLE_S *slice_name_table;
DEFINE_SPINLOCK(ts_lock);
static TIME_SLICE_S *get_ts(void)
{
TIME_SLICE_S *ts = NULL;
spin_lock(&ts_lock);
if(ts_pool)
{
ts = ts_pool;
ts_pool = ts_pool->next_ts;
}
spin_unlock(&ts_lock);
return ts;
}
static void put_ts(TIME_SLICE_S *ts)
{
memset((char *)ts, 0, sizeof(TIME_SLICE_S));
ts->min_ts = (unsigned int)(-1);
spin_lock(&ts_lock);
ts->next_ts = ts_pool;
ts_pool = ts;
spin_unlock(&ts_lock);
return;
}
static void ts_pool_init(TIME_SLICE_S *ts, unsigned int size)
{
int i;
for(i = 0; i < size - 1; i++)
{
memset(ts, 0, sizeof(TIME_SLICE_S));
ts->min_ts = (unsigned int)(-1);
ts->next_ts = ts + 1;
ts++;
}
memset(ts, 0, sizeof(TIME_SLICE_S));
ts->min_ts = (unsigned int)(-1); /* the last node needs this too */
ts->next_ts = NULL;
}
static void ts_tree_init(TIME_SLICE_TREE_S *ts_tree, unsigned int size)
{
int i;
for(i = 0; i < size; i++)
{
memset(ts_tree, 0, sizeof(TIME_SLICE_TREE_S));
rwlock_init(&ts_tree->tst_lock);
ts_tree->ts_id= INITIAL_NODE_ID;
ts_tree++;
}
}
#define slice_to_name(SLICE_ID, FUNCTION) \
do{ \
strncpy((*ppname_table + SLICE_ID)->slice_name, FUNCTION, strlen(FUNCTION)); \
(*ppname_table + SLICE_ID)->enSliceId = SLICE_ID; \
}while(0)
static int slice_name_table_init(SLICE_ID2NAME_TABLE_S **ppname_table, unsigned int size)
{
*ppname_table = (SLICE_ID2NAME_TABLE_S *)kmalloc((size * sizeof(SLICE_ID2NAME_TABLE_S)), GFP_ATOMIC);
if(NULL == *ppname_table)
{
printk("%s %d: mem alloc failed for slice name table!\r\n", __FUNCTION__, __LINE__);
return SSP_ERR;
}
memset((char *)(*ppname_table), 0, (size * sizeof(SLICE_ID2NAME_TABLE_S)));
slice_to_name(___netif_receive_skb, "__netif_receive_skb");
slice_to_name(_fastpath_ipv4_precheck, "fastpath_ipv4_precheck");
slice_to_name(_precheck, "precheck");
slice_to_name(_precheck_1, "precheck_1");
slice_to_name(_precheck_2, "precheck_2");
slice_to_name(_precheck_3, "precheck_3");
slice_to_name(_fastpath_check, "fastpath_check");
slice_to_name(_fastpath_check_1, "fastpath_check_1");
slice_to_name(_fastpath_check_2, "fastpath_check_2");
slice_to_name(_fastpath_check_3, "fastpath_check_3");
slice_to_name(_fastpath_check_4, "fastpath_check_4");
slice_to_name(_precheck_4, "precheck_4");
slice_to_name(_br_handle_frame_finish, "br_handle_frame_finish");
slice_to_name(_software_fastpath, "software_fastpath");
slice_to_name(_software_fastpath_1, "software_fastpath_1");
slice_to_name(_software_fastpath_2, "software_fastpath_2");
slice_to_name(_software_fastpath_3, "software_fastpath_3");
slice_to_name(_software_fastpath_4, "software_fastpath_4");
slice_to_name(_ip_forward, "ip_forward");
slice_to_name(_fastpath_ipv4_register_ctinfo, "fastpath_ipv4_register_ctinfo");
slice_to_name(_register_ctinfo, "register_ctinfo");
slice_to_name(_register_ctinfo_1, "register_ctinfo_1");
slice_to_name(_register_ctinfo_2, "register_ctinfo_2");
slice_to_name(_register_ctinfo_3, "register_ctinfo_3");
slice_to_name(_register_ctinfo_4, "register_ctinfo_4");
slice_to_name(_record_ctinfo, "record_ctinfo");
slice_to_name(_record_ctinfo_1, "record_ctinfo_1");
slice_to_name(_record_ctinfo_2, "record_ctinfo_2");
slice_to_name(_record_ctinfo_3, "record_ctinfo_3");
slice_to_name(_record_ctinfo_4, "record_ctinfo_4");
slice_to_name(_ip_route_state_backpacket, "ip_route_state_backpacket");
slice_to_name(_ip_route_state_backpacket_1, "ip_route_state_backpacket_1");
slice_to_name(_ip_route_state_backpacket_2, "ip_route_state_backpacket_2");
slice_to_name(_ip_route_state_backpacket_3, "ip_route_state_backpacket_3");
slice_to_name(_ip_route_state_backpacket_4, "ip_route_state_backpacket_4");
/***************NPF_PRE_ROUTING****************/
/***********************************************/
/*****NPF_FORWARD************/
/*****************************/
/*****NPF_LOCAL_IN*********/
/**************************/
/*****NPF_LOCAL_OUT**********/
/*****************************/
/*****NPF_POST_ROUTING******/
/****************************/
return SSP_OK;
}
static int ts_init(TIME_STAMP_SET_S **ppts, unsigned int size)
{
int i;
TIME_STAMP_SET_S *pts;
pts = (TIME_STAMP_SET_S *)kmalloc((size * sizeof(TIME_STAMP_SET_S)), GFP_ATOMIC);
if(NULL == pts)
{
printk("%s %d: mem alloc failed for time stamp set!\r\n", __FUNCTION__, __LINE__);
return SSP_ERR;
}
memset((char *)pts, 0, (size * sizeof(TIME_STAMP_SET_S)));
for(i = 0; i < size; i++)
{
*ppts = pts;
ppts++;
pts++;
}
return SSP_OK;
}
int vsp_pm_init(void)
{
TIME_SLICE_S *ts_set;
if(SSP_ERR == slice_name_table_init(&slice_name_table, SLICE_MAX_NUMBER))
{
return SSP_ERR;
}
if(SSP_ERR == ts_init(ppts_percpu, NR_CPUS))
{
return SSP_ERR;
}
ts_set = (TIME_SLICE_S *)kmalloc((MAX_SLICE_NODE * sizeof(TIME_SLICE_S)), GFP_ATOMIC);
if(NULL == ts_set)
{
printk("%s %d: mem alloc failed for time slice set!\r\n", __FUNCTION__, __LINE__);
return SSP_ERR;
}
ts_pool_init(ts_set, MAX_SLICE_NODE);
ts_pool = ts_set;
ts_tree_init(ts_percpu_tree, NR_CPUS);
print_buffer = kmalloc(MAX_MONITOR_INFO_BUFFLEN, GFP_ATOMIC);
if(NULL == print_buffer)
{
printk("%s %d: mem alloc failed for monitor info buffer!\r\n", __FUNCTION__, __LINE__);
return SSP_ERR;
}
print_buffer[0] = '\0';
memset(monitor_percpu_flag, 0, sizeof(monitor_percpu_flag));
pm_config.m_enable = 1;
return SSP_OK;
}
void vsp_ts_scan(int thread);
static int pm_switch_show(struct seq_file *s, void *v)
{
int cpu;
for(cpu = 0; cpu < 8; cpu++) {
vsp_ts_scan(cpu);
seq_printf(s, "cpu = %d %s\n", cpu, (pm_config.m_enable ? print_buffer: " "));
}
return 0;
}
static ssize_t pm_switch_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
static char proc_number[12];
unsigned int x;
if (!count)
return -EINVAL;
if (count > 11)
return -EINVAL;
memset(proc_number, 0, 12);
if (copy_from_user(proc_number, buffer, count))
return -EFAULT;
x = simple_strtoul(proc_number, NULL, 0);
if(x)
pm_config.m_enable = 1;
else
pm_config.m_enable = 0;
return count;
}
static int pm_switch_open(struct inode *inode, struct file *file)
{
return single_open(file, pm_switch_show, NULL);
}
static const struct file_operations pm_dbg_switch_file_ops = {
.owner = THIS_MODULE,
.open = pm_switch_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
.write = pm_switch_proc_write,
};
static ssize_t pm_scale_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
static char proc_number[12];
unsigned int x;
if (!count)
return -EINVAL;
memset(proc_number, 0, 12);
if (copy_from_user(proc_number, buffer, (count < 12 ? count : 11))) /* cap at 11 to keep the NUL terminator */
return -EFAULT;
x = simple_strtoul(proc_number, NULL, 0);
pm_config.m_scale = ((1 << x) -1);
return count;
}
static int pm_scale_show(struct seq_file *s, void *v)
{
seq_printf(s, "%u\n", pm_config.m_scale);
return 0;
}
static int pm_scale_open(struct inode *inode, struct file *file)
{
return single_open(file, pm_scale_show, NULL);
}
static const struct file_operations pm_dbg_scale_file_ops = {
.owner = THIS_MODULE,
.open = pm_scale_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
.write = pm_scale_proc_write,
};
static ssize_t pm_trim_proc_write(struct file *file, const char __user *buffer,
size_t count, loff_t *pos)
{
static char proc_number[12];
unsigned int x;
if (!count)
return -EINVAL;
if (count > 2)
return -EINVAL;
memset(proc_number, 0, 12);
if (copy_from_user(proc_number, buffer, count))
return -EFAULT;
x = simple_strtoul(proc_number, NULL, 0);
if(x)
pm_config.m_trim_enable = 1;
else
pm_config.m_trim_enable = 0;
return count;
}
static int pm_trim_show(struct seq_file *s, void *v)
{
seq_printf(s, "%u\n", pm_config.m_trim_enable);
return 0;
}
static int pm_trim_open(struct inode *inode, struct file *file)
{
return single_open(file, pm_trim_show, NULL);
}
static const struct file_operations pm_dbg_trim_file_ops = {
.owner = THIS_MODULE,
.open = pm_trim_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
.write = pm_trim_proc_write,
};
void pm_dbg_init(struct proc_dir_entry *ldsc_proc)
{
vsp_pm_init();
proc_create_data("pm_debug_switch", 0600, ldsc_proc,
&pm_dbg_switch_file_ops, NULL);
proc_create_data("pm_debug_scale", 0600, ldsc_proc,
&pm_dbg_scale_file_ops, NULL);
proc_create_data("pm_debug_trim", 0600, ldsc_proc,
&pm_dbg_trim_file_ops, NULL);
}
static inline void ts_cal(TIME_SLICE_S *ts, unsigned int delta_stamp, unsigned int delta_stamp_start)
{
/*diff_ts: EWMA of |previous delta - current delta|, i.e. the drift*/
ts->last_ts = ((ts->last_ts > delta_stamp)?(ts->last_ts - delta_stamp):(delta_stamp - ts->last_ts));
ts->diff_ts += (int)(ts->last_ts - ts->diff_ts) >> 3;
/*EWMA with w = 1/8: avg += (current - avg) / 8*/
ts->avg_ts += (int)(delta_stamp - ts->avg_ts) >> 3;
ts->avg_from_start += (int)(delta_stamp_start - ts->avg_from_start) >> 3;
ts->last_ts = delta_stamp;
if(delta_stamp > ts->max_ts)
{
ts->max_ts = delta_stamp;
}
if(delta_stamp < ts->min_ts)
{
ts->min_ts = delta_stamp;
}
ts->pass++;
ts->access_time = jiffies;
}
static void ts_tree_update(TIME_SLICE_TREE_S *ts_tree, TIME_STAMP_SET_S *pstamp)
{
unsigned int n_slice;
unsigned int c_slice;
unsigned int n_stamp;
unsigned int start_stamp;
unsigned int c_stamp;
unsigned int delta_stamp;
unsigned int delta_stamp_start;
TIME_SLICE_S **p_cur_ts;
TIME_SLICE_S **p_child_ts;
TIME_SLICE_S **p_end_ts;
if(((__u32)(pstamp->time_stamp[pstamp->cur_index] >> TIME_SLICE_SHITE)) != pstamp->cur_index)
{
MonitorDebugPrint(0, "%s %d cur_index = %u cur_index= %u\r\n", __FUNCTION__, __LINE__, ((__u32)(pstamp->time_stamp[pstamp->cur_index] >> TIME_SLICE_SHITE)), pstamp->cur_index);
}
/*First need to check the stamps, return if invalid*/
c_slice = 0;
n_slice = (__u32)(pstamp->time_stamp[c_slice] >> TIME_SLICE_SHITE);
while(c_slice != n_slice)
{
if((n_slice - c_slice) >= MAX_CHILD_NODE && n_slice != END_SLICE_ID)
{
MonitorDebugPrint(0, "%s %d, ulCurSlice = %u, ulNextSlice = %u, Must be something wrong here!\r\n", __FUNCTION__, __LINE__,c_slice, n_slice);
c_slice = 0;
while(c_slice <= 8)
{
MonitorDebugPrint(0, "slice id = %u, next slice id = %u, time stamp = %u\r\n", c_slice, (__u32)(pstamp->time_stamp[c_slice] >> TIME_SLICE_SHITE), (__u32)(pstamp->time_stamp[c_slice] & TIME_STAMP_MASK));
c_slice++;
}
MonitorDebugPrint(0, "slice id = %u, next slice id = %u, time stamp = %u\r\n", 255, (__u32)(pstamp->time_stamp[255] >> TIME_SLICE_SHITE), (__u32)(pstamp->time_stamp[255] & TIME_STAMP_MASK));
MonitorDebugPrint(0, "ulCurSlice = %u, nextstamp = %u\r\n", pstamp->cur_index, ((__u32)(pstamp->time_stamp[pstamp->cur_index] >> TIME_SLICE_SHITE)));
#if 0
ulCurSlice = 0;
while(ulCurSlice < pstTmsStamp->ulSliceNumber)
{
IC_KTerminalOutString(0, "ulSliceNumber = %lu, slice function = %s\r\n", ulCurSlice, pstTmsStamp->ppcSliceName[ulCurSlice]);
ulCurSlice++;
}
#endif
return;
}
c_slice = n_slice;
n_slice = (__u32)(pstamp->time_stamp[c_slice] >> TIME_SLICE_SHITE);
}
c_slice = 0;
n_slice = (__u32)(pstamp->time_stamp[c_slice] >> TIME_SLICE_SHITE);
start_stamp = (__u32)(pstamp->time_stamp[c_slice] & TIME_STAMP_MASK);
p_child_ts = ts_tree->child_ts;
p_end_ts = &ts_tree->end_ts;
while(c_slice != n_slice)
{
c_stamp = (__u32)(pstamp->time_stamp[c_slice] & TIME_STAMP_MASK);
n_stamp = (__u32)(pstamp->time_stamp[n_slice] & TIME_STAMP_MASK);
/*May expand the new path on the tree*/
write_lock(&ts_tree->tst_lock);
p_cur_ts = ((n_slice != END_SLICE_ID) ? (p_child_ts + n_slice - c_slice - 1):p_end_ts);
if(NULL == *p_cur_ts)
{
*p_cur_ts = get_ts();
if(NULL == *p_cur_ts)
{
write_unlock(&ts_tree->tst_lock);
MonitorDebugPrint(0, "%s %d, Must be something wrong here!\r\n", __FUNCTION__, __LINE__);
return;
}
(*p_cur_ts)->ts_id = n_slice;
}
delta_stamp = n_stamp - c_stamp;
delta_stamp_start = n_stamp - start_stamp;
ts_cal(*p_cur_ts, delta_stamp, delta_stamp_start);
p_child_ts = (*p_cur_ts)->child_ts;
p_end_ts = &((*p_cur_ts)->end_ts);
write_unlock(&ts_tree->tst_lock);
c_slice = n_slice;
n_slice = (__u32)(pstamp->time_stamp[c_slice] >> TIME_SLICE_SHITE);
}
}
void ts_update(unsigned int thread)
{
TIME_STAMP_SET_S *ts;
TIME_SLICE_TREE_S *ts_tree;
ts = ppts_percpu[thread];
/*no slice node during pkb process*/
if(0 == test_bit(thread, monitor_percpu_flag) || 0 == ts->cur_index)
{
return;
}
switch(pm_config.m_mode)
{
case M_PER_THREAD:
ts_tree = &ts_percpu_tree[thread];
break;
case M_PER_CORE:
ts_tree = &ts_percpu_tree[thread /4];
break;
case M_DEFAULT:
case M_PER_BOARD:
default:
ts_tree = &ts_percpu_tree[0];
break;
}
ts_tree_update(ts_tree, ts);
ts->cur_index = 0;
return;
}
static void del_ts_branch(TIME_SLICE_S *ts)
{
unsigned int c_child = 0;
while(c_child < MAX_CHILD_NODE)
{
if(NULL == ts->child_ts[c_child])
{
c_child++;
continue;
}
del_ts_branch(ts->child_ts[c_child]);
c_child++;
}
if(ts->end_ts)
{
del_ts_branch(ts->end_ts);
}
put_ts(ts);
}
static void ts_branch_trim(TIME_SLICE_S *ts)
{
unsigned int c_child = 0;
while(c_child < MAX_CHILD_NODE)
{
if(NULL == ts->child_ts[c_child])
{
c_child++;
continue;
}
if((jiffies - ts->child_ts[c_child]->access_time) > pm_config.m_expire)
{
del_ts_branch(ts->child_ts[c_child]);
ts->child_ts[c_child] = NULL;
}
else
{
ts_branch_trim(ts->child_ts[c_child]);
}
c_child++;
}
if(ts->end_ts)
{
/* the original post is truncated here; the end_ts case below is
 * completed to mirror the child-branch handling above */
if((jiffies - ts->end_ts->access_time) > pm_config.m_expire)
{
del_ts_branch(ts->end_ts);
ts->end_ts = NULL;
}
else
{
ts_branch_trim(ts->end_ts);
}
}
}