PHP Function Example: HTML String Filtering Code - baoluowanxiang - ChinaUnix Blog

2010-11-18 22:39:57

<?php
/********************************************************************
* Original file name: Filter1.php
* Description: filters an HTML string
* Author:
* How it works:
*   When valid arguments are passed to filter(), the function first
*   uses preg_match_all() to collect every occurrence of the tag $tag
*   in the string. It then loops over the matches and uses preg_split()
*   to break each tag into its name = value attribute pairs. Next it
*   loops over the attributes to keep, picks out the matching pairs,
*   and assembles the replacement tag. Finally, str_replace() swaps
*   the replacement in for the original tag.
* Functions:
*   function filter(&$str,$tag,$keep_attribute)
*   function match_attributes($reg,$str,$arr)
*   function show($str,$title='',$debug = true)
* Usage example:
*   // Fetch the Sohu news home page
*   $str = @file_get_contents("");
*   // Filter
*   filter($str,'a','href,target,alt');
*   filter($str,'p','align');
*   show($str,'Filtered content');
********************************************************************/
$start_time = microtime(true);
$str = <<< HTML
site a
site b
site c
site d
site e

adasdfasdf

asdfasdfasdfasdf

asdfasdfasdf

asdfadsfasdf
asdfasdfadf
asdfasdf
HTML;
// Show the original string
show($str, 'Html');
// Filter
filter($str, 'a', 'href,target,alt');
filter($str, 'p', 'align');
filter($str, 'font', 'color,alt');
// Show the filtered content
show($str, 'Result');
// Script run time
$run_time = microtime(true) - $start_time;
echo('Script Run Time: '.$run_time."\n");

/**
* Filters an HTML string.
* Parameters:
*   $str : the HTML string to filter (modified in place)
*   $tag : the tag type to filter
*   $keep_attribute :
*     the attributes to keep, given in any of these forms:
*       href
*       href,target,alt
*       array('href','target','alt')
*/
function filter(&$str, $tag, $keep_attribute) {
    // Check how the attributes to keep were passed in
    if (!is_array($keep_attribute)) {
        // Not an array: see whether the argument contains a comma.
        // Compare with false explicitly, because strpos() returns 0
        // when the comma is the first character.
        if (strpos($keep_attribute, ',') !== false) {
            // Comma-separated list: split it into an array
            $keep_attribute = explode(',', $keep_attribute);
        } else {
            // Plain string: wrap it in an array
            $keep_attribute = array($keep_attribute);
        }
    }
    echo("· Filtering [$tag] tags, keeping attributes: ".implode(',', $keep_attribute)."\n");
    // Collect every tag to process. The match is non-greedy so that
    // several tags on one line are handled one at a time, and \b stops
    // a tag such as "a" from also matching "abbr".
    $pattern = "/<$tag\b(.*?)<\/$tag>/i";
    preg_match_all($pattern, $str, $out);
    // Process each tag in turn
    foreach ($out[1] as $key => $val) {
        // Count the "=" signs in the tag, one per attribute
        $cnt = count(preg_split('/ *=/', $val)) - 1;
        // Build one capture group per attribute
        $pattern = str_repeat('( .*=.*)', $cnt);
        // Complete the expression, e.g. /(<a)( .*=.*)(>.*<\/a>)/i
        $pattern = "/(<$tag)$pattern(>.*<\/$tag>)/i";
        // Rebuild the tag with only the attributes to keep
        $replacement = match_attributes($pattern, $out[0][$key], $keep_attribute);
        // Replace
        $str = str_replace($out[0][$key], $replacement, $str);
    }
}

/**
* Rebuilds a tag, keeping only the wanted attributes.
* (Not named match(): that is a reserved keyword as of PHP 8.0.)
* Parameters:
*   $reg : pattern, the expression passed to preg_match()
*   $str : string, the HTML of a single tag
*   $arr : array, the attributes to keep
* Returns:
*   the tag with only the kept attributes, or the tag unchanged
*   if it does not fit the pattern
*/
function match_attributes($reg, $str, $arr) {
    // If the tag does not fit the generated pattern (for example when
    // an attribute value itself contains "="), leave it untouched
    if (!preg_match($reg, $str, $out)) {
        return $str;
    }
    $last = count($out) - 1;
    // Collect the attributes to keep
    $keep_attribute = '';
    foreach ($arr as $v1) {
        // Groups 2 .. $last-1 each hold one name=value pair
        for ($i = 2; $i < $last; $i++) {
            $v2 = $out[$i];
            // The attribute name is the part before the "="
            $attribute = trim(substr($v2, 0, strpos($v2, '=')));
            if (strcasecmp(trim($v1), $attribute) == 0) {
                // The name matches an attribute to keep: save the pair
                $keep_attribute .= $v2;
            }
        }
    }
    // Assemble the result: opening tag + kept attributes + rest of the tag
    return $out[1].$keep_attribute.$out[$last];
}

/**
* Displays the content of a string.
*/
function show($str, $title = '', $debug = true) {
    if ($debug) {
        if (is_array($str)) {
            $str = print_r($str, true);
        }
        // Print the title followed by the content itself
        echo($title.":\n".$str."\n\n");
    }
}
?>
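
The regular-expression approach above depends on counting "=" signs, so a tag whose attribute values contain "=" or spaces may not be matched as intended. As a point of comparison, here is a minimal sketch of the same idea (keep only a whitelist of attributes on a given tag) built on PHP's DOM extension. The function name filter_dom() and the wrapper div are assumptions made for illustration and are not part of the original script. The sketch also assumes the DOM extension is enabled and does no character-encoding handling, so non-ASCII input may need extra care.

```php
<?php
// Hypothetical helper, not part of the original script: removes every
// attribute that is not in $keep from each <$tag> element in $html.
function filter_dom($html, $tag, array $keep)
{
    $keep = array_map('strtolower', array_map('trim', $keep));

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a div so it can be read back as one subtree.
    $doc->loadHTML('<div>'.$html.'</div>');
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName($tag) as $node) {
        // Copy the names first: the attribute map changes as we remove.
        $names = array();
        foreach ($node->attributes as $attr) {
            $names[] = $attr->nodeName;
        }
        foreach ($names as $name) {
            if (!in_array(strtolower($name), $keep, true)) {
                $node->removeAttribute($name);
            }
        }
    }

    // Serialise only the children of the wrapper div (the first div in
    // document order), not the html/body elements the parser added.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $result = '';
    foreach ($wrapper->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return $result;
}

// Usage: keep only href and target on <a> tags.
echo filter_dom('<a href="http://a.com" onclick="x()">site a</a>', 'a', array('href', 'target'));
```

Unlike filter(), this sketch returns the filtered string instead of modifying its argument, and it lets the parser decide where attributes begin and end rather than inferring that from the text.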

This article is reposted from ☆★ 包罗万象网 (Baoluowanxiang) ★☆. Please credit the source when reposting; infringement will be pursued.
