Category: Python/Ruby

2011-04-20 09:19:24

    >>> s = 'str1 | str2 & str3 &&|& str4'
    >>> import re
    >>> re.compile(r'[\s|&]+')
    <_sre.SRE_Pattern object at 0x100461db0>
    >>> regex = re.compile(r'[\s|&]+')
    >>> regex.split(s)
    ['str1', 'str2', 'str3', 'str4']
Special characters:
'.' In the default mode, this matches any character except a newline.
'^' Matches the start of the string.
'*' Causes the resulting RE to match 0 or more repetitions of the preceding RE.
    'ab*' will match 'a', 'ab', or 'a' followed by any number of 'b's.
'+' Causes the resulting RE to match 1 or more repetitions of the preceding RE.
    'ab+' will match 'a' followed by any non-zero number of 'b's; it will not match just 'a'.
'?' Causes the resulting RE to match 0 or 1 repetitions of the preceding RE.
    'ab?' will match either 'a' or 'ab'.
'$' Matches the end of the string or just before the newline at the end of the string.
{m} Specifies that exactly m copies of the previous RE should be matched.
    a{6} will match exactly six 'a' characters.
{m,n} Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as many
    repetitions as possible.
{m,n}? Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as few
    repetitions as possible. This is the non-greedy version of the previous qualifier.
    For example, on the 6-character string 'aaaaaa', a{3,5} will match 5 'a' characters, while a{3,5}? will
    only match 3 characters.
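The entries above can be checked with a short script. This is only a sketch: the helper name `matches` and the sample strings are made up for illustration, and `re.fullmatch` assumes Python 3.4 or later.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

samples = ('a', 'ab', 'abbb')

# '*' allows zero or more 'b's, '+' needs at least one, '?' allows at most one.
star = [matches(r'ab*', t) for t in samples]
plus = [matches(r'ab+', t) for t in samples]
opt = [matches(r'ab?', t) for t in samples]

# '.' does not match a newline in the default mode.
dot_skips_newline = not matches(r'a.c', 'a\nc')

# '$' also matches just before a newline at the end of the string.
dollar_before_newline = re.search(r'b$', 'ab\n') is not None

# Greedy {3,5} takes as many 'a's as it can; non-greedy {3,5}? takes as few.
greedy = re.match(r'a{3,5}', 'aaaaaa').group()
lazy = re.match(r'a{3,5}?', 'aaaaaa').group()

print(star, plus, opt)
print(dot_skips_newline, dollar_before_newline)
print(greedy, lazy)
```

Note that the quantifier is written without a space after the comma: a{3,5}, not a{3, 5}.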
[] Used to indicate a set of characters. Characters can be listed individually, or a range of characters can be
    indicated by giving two characters and separating them by a '-'. Special characters are not active inside sets.
    For example, [akm$] will match any of the characters 'a', 'k', 'm', or '$'; [a-z] will match any lowercase
    letter, and [a-zA-Z0-9] matches any letter or digit.
    Character classes such as \w or \S are also acceptable inside a set, although the characters they match depend
    on whether LOCALE or UNICODE mode is in force.
    If you want to include a ']' or a '-' inside a set, precede it with a backslash, or place it as the first
    character. The pattern []] will match ']', for example.
    You can match the characters not within a range by complementing the set. This is indicated by including a '^'
    as the first character of the set; '^' elsewhere will simply match the '^' character. For example, [^5] will
    match any character except '5', and [^^] will match any character except '^'.
    Note that inside [] the special forms and special characters lose their meanings and only the syntaxes
    described here are valid. For example, +, *, (, ), and so on are treated as literals inside [], and
    backreferences cannot be used inside [].
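The set rules can be tried out in the same way. A minimal sketch; the sample strings are invented for illustration, and the last pattern is the one from the split example at the top of this post.

```python
import re

# Inside a set, '$' is an ordinary character.
found = re.findall(r'[akm$]', 'make $5')

# A ']' placed first in the set is taken literally.
bracket = re.findall(r'[]]', 'x[1]')

# A leading '^' complements the set.
not_five = re.findall(r'[^5]', '1525')
not_caret = re.findall(r'[^^]', 'a^b')

# '+', '*', '(' and ')' are plain literals inside a set.
ops = re.findall(r'[+*()]', '(1+2)*3')

# A class such as \s can sit in a set next to the literals '|' and '&'.
parts = re.split(r'[\s|&]+', 'str1 | str2 & str3 &&|& str4')

print(found, bracket, not_five, not_caret, ops)
print(parts)
```

Because '|' and '&' lose any special meaning inside [], the split pattern needs no escaping.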
