Chinaunix首页 | 论坛 | 博客
  • 博客访问: 197335
  • 博文数量: 77
  • 博客积分: 1749
  • 博客等级: 上尉
  • 技术积分: 810
  • 用 户 组: 普通用户
  • 注册时间: 2011-03-28 18:27
文章分类
文章存档

2012年(28)

2011年(49)

分类: C/C++

2012-10-16 16:32:08

A collection of symbols is known as an alphabet. We use  an upper case letter such as X and Y to denote alphabets. 

Though alphabets can be infinite,  we are concerned only with finite alphabets. For example, X={0, 1} is an alphabet consisting of two symbols 0 and 1. Another alphabet is Y={dog, cat, horse, lion}that consists of four symbols ``dog", ``cat", ``horse", and ``lion". 

A string   over an alphabet X is any sequence of zero or more symbols that belong to X. For example, 0110 is a string over the alphabet {0, 1}.  Also, dog cat dog dog lion is a string over the alphabet {dog, cat, horse, lion}. 

We will use lower case letters such as p, q, r to denote strings.  The length of a string is the number of symbols in that string. Given a string s, we denote its length by |s|. Thus |1011|=4 and |dog cat dog|=3. A string of length 0, also known as an empty string, is denoted by e.

Let s1 and s2 be two strings over alphabet X. We write s1.s2 to denote the concatenation of strings s1 and s2.

 A set L of strings over an alphabet X is known as a language. A language can be finite or infinite.
{00, 11, 0101}: A language containing three strings

Given a finite alphabet X, the following are regular expressions over X:
If  a belongs to X, then a  is a regular expression that denotes the set {a}.
Let  r1 and r2 be two  regular expressions over the alphabet X that  denote, respectively, sets L1 and L2.  Then r1.r2 is a regular expression that denotes the  set L1.L2.

If r is a regular expression that denotes the set L then r+  is a  regular expression that denotes  the set obtained by concatenating L with itself one or more times also written as L+ Also, r* known as the Kleene closure  of r, is a regular expression. If r denotes the set L then r* denotes the set {} L+. 

If r1 and r2 are regular expressions that denote, respectively, sets L1 and L2, then r1r2 is also a regular expression that denotes the set L1  L2.




阅读(1069) | 评论(0) | 转发(0) |
给主人留下些什么吧!~~