Python Applications: Text Processing - zghover - ChinaUnix Blog

Python Applications: Text Processing

Category: Python/Ruby

2012-03-04 11:20:56

1. Basic string-handling methods
.Extracting a substring

# Method 1: use slicing

>>> 'hello world'[2:8]
'llo wo'
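
Slicing also accepts omitted bounds, negative indices and a step. A minimal sketch of those variants; the variable names are only for illustration:

```python
s = 'hello world'

head = s[:5]           # from the start up to (not including) index 5
tail = s[-5:]          # the last five characters
every_other = s[::2]   # every second character
reversed_s = s[::-1]   # a negative step walks the string backwards

print(head, tail, every_other, reversed_s)
```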

.Concatenating strings

# Method 1: use the + operator

>>> 'hello ' + 'world'
'hello world'

# Method 2: use the * operator to repeat a string

>>> 'hello '*2
'hello hello '

# Use join to combine several strings

>>> '--'.join(['a','b','c'])
'a--b--c'
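
One thing to watch for: join only accepts strings, so other types have to be converted first. A small sketch, with `numbers` as a made-up sample list:

```python
numbers = [1, 2, 3]

# join raises TypeError on non-string items, so convert each one first
joined = ', '.join(str(n) for n in numbers)
print(joined)
```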

.Replacing part of a string

# Method 1: use the replace method

>>> 'hello world'.replace('world', 'tom')
'hello tom'
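
replace takes an optional third argument that limits how many matches are replaced, and it always returns a new string because strings are immutable. A short sketch with an illustrative sample string:

```python
original = 'a-b-c-d'

first_only = original.replace('-', '+', 1)  # replace only the first match
all_of_them = original.replace('-', '+')    # replace every match

# the original string is left untouched
print(first_only, all_of_them, original)
```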

.Formatting strings

# Method 1: use a format string

>>> 'hello %s %s' % ('world', 'haha')
'hello world haha'

# Method 2: use a template

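>>> # '<start>' and '<end>' are stand-in markers; the post's own placeholder text did not survive
>>> template = '--<start>--<end>'
>>> template = template.replace('<start>', 'start')
>>> template = template.replace('<end>', 'end')
>>> print(template)
--start--end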
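
The standard library also ships `string.Template`, which does this kind of placeholder substitution without chained replace calls. A minimal sketch, assuming `$start` and `$end` as placeholder names (they are illustrative, not taken from the post):

```python
from string import Template

# $start and $end are illustrative placeholder names
template = Template('--$start--$end')
result = template.substitute(start='start', end='end')
print(result)

# safe_substitute leaves unknown placeholders in place instead of raising KeyError
partial = template.safe_substitute(start='start')
print(partial)
```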

# Template, variant 2: named keys filled in from a dictionary

>>> template = "hello %(key1)s %(key2)s"
>>> vals={'key1':'value1', 'key2':'value2'}
>>> print(template % vals)
hello value1 value2
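
For comparison, the same dictionary can be fed to `str.format` or an f-string, which are the more common choices in newer Python code. A sketch that assumes Python 3.6 or later for the f-string line:

```python
vals = {'key1': 'value1', 'key2': 'value2'}

with_percent = 'hello %(key1)s %(key2)s' % vals
with_format = 'hello {key1} {key2}'.format(**vals)       # unpack the dict as keyword arguments
with_fstring = f"hello {vals['key1']} {vals['key2']}"    # Python 3.6+

print(with_percent, with_format, with_fstring, sep='\n')
```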

.Splitting a string

>>> 'a b c d'.split()  # basic split on whitespace
['a', 'b', 'c', 'd']

# Specify the separator

>>> 'a+b+c+d'.split('+')
['a', 'b', 'c', 'd']
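
Two details of split are easy to miss: the optional second argument caps the number of splits, and `split()` with no argument behaves differently from `split(' ')`. A small sketch; the passwd-style sample line is made up for illustration:

```python
line = 'root:x:0:0:root:/root:/bin/bash'

# split at most once: the first field, then everything else
name, rest = line.split(':', 1)

spaced = 'a  b'                       # two spaces between a and b
by_whitespace = spaced.split()        # runs of whitespace count as one separator
by_single_space = spaced.split(' ')   # every single space is a separator

print(name, rest, by_whitespace, by_single_space)
```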

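# Combining split and join: replace every tab on stdin with four dots

from sys import stdin, stdout  # import only the names used, rather than *
stdout.write(('.' * 4).join(stdin.read().split('\t')))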

In practice (the input fields are separated by tab characters):

>>> stdout.write(('.' * 4).join(stdin.read().split('\t')))
aa bb cc dd
aa....bb....cc....dd
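
Because the one-liner reads from standard input, it is awkward to try out interactively. A minimal sketch of the same split-then-join idea wrapped in a function; `replace_tabs` and its `filler` parameter are hypothetical names, not part of the original post:

```python
def replace_tabs(text, filler='.' * 4):
    'replace every tab in @text with @filler (hypothetical helper)'
    return filler.join(text.split('\t'))

converted = replace_tabs('aa\tbb\tcc\tdd')
print(converted)
```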

Applications

# Text filtering

.Basic approach

# The condition function

def isCond(astr):
    'return True if the string @astr contains the substring "root"'
    return astr.find('root') != -1

An anonymous function works as well:

isCond = lambda astr: astr.find('root') != -1
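
The condition above is hard-wired to 'root'. One possible generalization is a small factory that builds a condition for any substring, using the `in` operator instead of find. This is only a sketch; `make_cond` and `is_root` are hypothetical names:

```python
def make_cond(substr):
    'build a condition function that tests for @substr (hypothetical helper)'
    def cond(astr):
        # "in" is the idiomatic substring test; find() != -1 is equivalent
        return substr in astr
    return cond

is_root = make_cond('root')
print(is_root('root:x:0:0'), is_root('daemon:x:1:1'))
```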

# Text filter, version 1

def filter1(filename):
    'filter every line read from filename'
    selected = []
    try:
        fp = open(filename)
    except IOError as e:
        print('could not open file:', e)
        return  # nothing to filter if the file cannot be opened

    with fp:  # close the file when done
        for line in fp:
            if isCond(line):
                selected.append(line)
    print(selected)

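# Text filter, version 2, using the built-in filter function

def filter2(filename):
    'filter version 2'
    with open(filename) as fp:
        # list() is needed on Python 3, where filter returns a lazy iterator
        selected = list(filter(isCond, fp))
    print(selected)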
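
A list comprehension is a common alternative to filter, and returning the result rather than printing it makes the function easier to reuse. A minimal sketch that runs against an in-memory file so no real file is needed; `filter_lines` and the sample data are hypothetical:

```python
import io

def filter_lines(lines, substr='root'):
    'return the lines that contain @substr (hypothetical helper)'
    return [line for line in lines if substr in line]

# io.StringIO stands in for an open text file
sample = io.StringIO('root:x:0:0\ndaemon:x:1:1\nchroot:x:2:2\n')
selected = filter_lines(sample)
print(selected)
```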

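.Using the map function

# map applies isCond to each line and yields one True/False per line;
# readlines() supplies the list of lines, list() materializes the result on Python 3
list(map(isCond, open(filename).readlines()))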
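
The difference between the two built-ins is worth spelling out: map transforms every item, while filter keeps a subset of the items unchanged. A short sketch on made-up sample lines; `has_root` is an illustrative stand-in for the condition function:

```python
lines = ['root:x:0:0\n', 'daemon:x:1:1\n']

def has_root(line):
    'illustrative condition: does the line contain "root"?'
    return 'root' in line

flags = list(map(has_root, lines))       # one boolean per input line
stripped = list(map(str.strip, lines))   # transform each line (drop the newline)
kept = list(filter(has_root, lines))     # keep only the matching lines

print(flags, stripped, kept)
```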

To be continued...
