lex and yacc

Building a parser for script and GUI design

Level: Introductory

Sam Lantinga, Lead Programmer, Loki Entertainment Software

May 1, 2000

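In this installment we discuss two utilities that belong in every Linux programmer's toolbox: lex and yacc. These tools let us easily build the scripting language and GUI framework used in Pirates Ho!, our SDL-based Linux game.

While designing Pirates Ho!, we needed an easy way to describe the interface and the dialog options presented to the player. We wanted a simple, consistent, and flexible language for these descriptions, so we looked for tools that could help us build a scripting language.

Ever since school, the word "yacc" had filled me with dread. It brought to mind disheveled, pale students muttering about compilers and symbol tables, so I carefully avoided the compiler classes. While developing the game, though, I worked up the courage to try yacc, hoping it would make writing the scripting language a little easier. In the end, yacc not only made the job easier, it made it fun.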


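yacc is actually very easy to use. Give it a set of rules describing a grammar, and it parses the tokens and takes actions based on what it sees. For our scripting language we wanted to start simple and build up, so at first we only handled numbers and logical operations on them: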

eval.y

%{
/* This first section contains C code which will be included in the output
   file.
*/
#include <stdio.h>
#include <stdlib.h>
/* Since we are using C++, we need to specify the prototypes for some 
   internal yacc functions so that they can be found at link time.
*/
extern int yylex(void);
extern void yyerror(char *msg);
%}
/* This is a union of the different types of values that a token can
   take on.  In our case we'll just handle "numbers", which are of
   C int type.
*/
%union {
 int number;
}
/* These are untyped tokens which are recognized as part of the grammar */
%token AND OR EQUALS
/* Here we are, any NUMBER token is stored in the number member of the
   union above.
*/
%token <number> NUMBER
/* These rules all return a numeric value */
%type <number> expression
%type <number> logical_expression and or equals
%%
/* Our language consists either of a single statement or of a list of statements.
   Notice the recursivity of the rule, this allows us to have any
   number of statements in a statement list.
*/
statement_list: statement | statement_list statement
 ;
/* A statement is simply an expression.  When the parser sees an expression
   we print out its value for debugging purposes.  Later on we'll
   have more than just expressions in our statements.
*/
statement: expression
 { printf("Expression = %d\n", $1); }
 ;
/* An expression can be a number or a logical expression. */
expression: NUMBER
 |   logical_expression
 ;
/* We have a few different types of logical expressions */
logical_expression: and
 |           or
 |           equals
 ;
/* When the parser sees two expressions surrounded by parenthesis and
   connected by the AND token, it will actually perform a C logical
   expression and store the result into
   this statement.
*/
and: '(' expression AND expression ')'
 { if ( $2 && $4 ) { $$ = 1; } else { $$ = 0; } }
 ;
or: '(' expression OR expression ')'
 { if ( $2 || $4 ) { $$ = 1; } else { $$ = 0; } }
 ;
equals: '(' expression EQUALS expression ')'
 { if ( $2 == $4 ) { $$ = 1; } else { $$ = 0; } }
 ;
%%
/* This is a sample main() function that just parses standard input
   using our yacc grammar.  It allows us to feed sample scripts in
   and see if they are parsed correctly.
*/
int main(int argc, char *argv[])
{
 return yyparse();
}
/* This is an error function used by yacc, and must be defined */
void yyerror(char *message)
{
 fprintf(stderr, "%s\n", message);
}


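Now that we have a simple grammar that recognizes sequences of tokens, we need a way to feed those tokens to the parser. lex is a tool that takes input, converts it into tokens, and passes those tokens to yacc. Below we describe the expressions that lex turns into tokens: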

eval.l

%{
/* Again, this is C code that is inserted into the beginning of the output */
#include <stdlib.h>
#include "y.tab.h"  /* Include the token definitions generated by yacc */
%}
/* Prevent the need for linking with -lfl */
%option noyywrap
/* This next section is a set of regular expressions that describe input
   tokens that are passed back to yacc.  The tokens are defined in y.tab.h,
   which is generated by yacc.
 */
%%
\/\/.*  /* ignore comments */
-[0-9]+|[0-9]+ { yylval.number=atoi(yytext); return NUMBER; }
[ \t\n]  /* ignore whitespace */
&&  { return AND; }
\|\|  { return OR; }
==  { return EQUALS; }
.  return yytext[0];
%%
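
To make the effect of these rules concrete, here is a rough hand-written C++ equivalent of the scanner. It is only a sketch of what the lex rules above do, not the code lex generates. The function name next_token and the numeric token values are assumptions made for this illustration; in the real build the token codes come from y.tab.h.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Stand-ins for the token codes yacc would write to y.tab.h.
// The numeric values are chosen for this sketch only.
enum Token { END = 0, NUMBER = 258, AND, OR, EQUALS };

// Return the next token starting at text[pos] and advance pos past it.
// For NUMBER tokens the value is stored in 'number', playing the role
// of yylval.number in the real scanner.
int next_token(const std::string &text, std::size_t &pos, int &number)
{
    while (pos < text.size()) {
        char c = text[pos];
        // "//" starts a comment that runs to the end of the line
        if (text.compare(pos, 2, "//") == 0) {
            while (pos < text.size() && text[pos] != '\n') ++pos;
            continue;
        }
        // Whitespace is skipped
        if (c == ' ' || c == '\t' || c == '\n') { ++pos; continue; }
        // An optional leading minus followed by at least one digit
        std::size_t digits = (c == '-') ? pos + 1 : pos;
        if (digits < text.size() &&
            std::isdigit(static_cast<unsigned char>(text[digits]))) {
            std::size_t end = digits;
            while (end < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[end]))) ++end;
            number = std::atoi(text.substr(pos, end - pos).c_str());
            pos = end;
            return NUMBER;
        }
        if (text.compare(pos, 2, "&&") == 0) { pos += 2; return AND; }
        if (text.compare(pos, 2, "||") == 0) { pos += 2; return OR; }
        if (text.compare(pos, 2, "==") == 0) { pos += 2; return EQUALS; }
        // Any other character is passed through as its own token
        ++pos;
        return static_cast<unsigned char>(c);
    }
    return END;
}
```

Calling next_token() repeatedly on a line such as "(1 && -2)" yields the same sequence the grammar expects: '(', NUMBER, AND, NUMBER, ')'.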

Now that the parse sources are in the current directory, we need a Makefile to build them:

Makefile

all: eval
y.tab.c: eval.y
	yacc -d $<
lex.yy.c: eval.l
	lex $<
eval: y.tab.o lex.yy.o
	$(CC) -o $@ $^

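By default, yacc writes its output to y.tab.c and lex writes to lex.yy.c, so we use those names as our source files. The Makefile contains rules for building the sources from the parse description files. Once everything is in place, type "make" to build the parser. You can then run the parser and type in a script to check the logic.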

expression: NUMBER | plus
 ;
plus: expression '+' expression
 ;


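A few words of advice on using lex and yacc with C++. lex and yacc produce C files, so for C++ we use their GNU equivalents, flex and bison. These tools let you specify the name of the output file. We also added generic rules to the Makefile so that GNU Make automatically builds C++ source files from the lex and yacc sources. This required renaming the lex and yacc sources to "lex_eval.l" and "yacc_eval.y" so that Make generates a distinct C++ source file for each. We also had to change the file that lex includes for the yacc token definitions. bison names its header by appending .h to the output file name, which in our example gives "yacc_eval.cpp.h" (recent bison releases write "yacc_eval.hpp" instead). Here is the new Makefile: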

Makefile

all: eval
%.cpp: %.y
	bison -d -o $@ $<
%.cpp: %.l
	flex -o$@ $<
yacc_eval.o: yacc_eval.cpp
lex_eval.o: lex_eval.cpp
eval: yacc_eval.o lex_eval.o
	$(CXX) -o $@ $^


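By default, lex code reads its input from standard input, but we wanted the game to parse strings held in memory. This is easy with flex: redefine the YY_INPUT macro at the top of the lex source file: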

extern int eval_getinput(char *buf, int maxlen);
#undef YY_INPUT
#define YY_INPUT(buf, retval, maxlen) (retval = eval_getinput(buf, maxlen))

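We put the actual code for eval_getinput() in a separate file and made it flexible enough to take input from either a file pointer or a string in memory. To use it, we first set up a global data source variable and then call the yacc function yyparse(), which calls the input function and parses what it returns.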
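
The game's own eval_getinput() is not shown in the article, so the following is only a minimal sketch of the in-memory case. The helper eval_setinput() and the two global variables are hypothetical names introduced for this illustration. The sketch relies on the flex convention that a result of 0 from YY_INPUT signals end of input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical global data source: the script text still waiting to be
// handed to the scanner. These names do not appear in the article.
static const char *eval_source = nullptr;
static std::size_t eval_remaining = 0;

// Point the scanner at a NUL-terminated string held in memory.
void eval_setinput(const char *script)
{
    eval_source = script;
    eval_remaining = std::strlen(script);
}

// Called through YY_INPUT: copy at most maxlen bytes into buf and return
// the number of bytes copied. Returning 0 tells flex the input is exhausted.
int eval_getinput(char *buf, int maxlen)
{
    assert(buf != nullptr && maxlen > 0);
    std::size_t count = eval_remaining;
    if (count > static_cast<std::size_t>(maxlen)) {
        count = static_cast<std::size_t>(maxlen);
    }
    if (count == 0) {
        return 0;  // nothing left (or no source was ever set)
    }
    std::memcpy(buf, eval_source, count);
    eval_source += count;
    eval_remaining -= count;
    return static_cast<int>(count);
}
```

With this in place, parsing a string would amount to calling eval_setinput(script) and then yyparse(). Supporting a file pointer as well could be done by adding a second, FILE-based branch to eval_getinput().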


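We wanted separate parsers in the game for the scripting language and the GUI descriptions, since they use different grammar rules. This is possible, but it takes a few tricks with flex and bison. First, change the parser prefix from "yy" to something unique to avoid name clashes. Both tools can rename the parser from the command line: use -P for flex and -p for bison. Then change every use of the "yy" prefix in your code to the prefix you chose. This includes the references to yylval in the lex source and the definition of yyerror(), which we placed in a separate file in the final game. The final Makefile looks like this: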

Makefile

all: eval
YY_PREFIX = eval_
%.cpp: %.y
	bison -p$(YY_PREFIX) -d -o $@ $<
%.cpp: %.l
	flex -P$(YY_PREFIX) -o$@ $<
yacc_eval.o: yacc_eval.cpp
lex_eval.o: lex_eval.cpp
eval: yacc_eval.o lex_eval.o
	$(CXX) -o $@ $^
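
As an illustration of the renaming, the error handlers for two parsers might live together in one separate file, roughly as sketched below. With the eval_ prefix, bison calls eval_error() where it would otherwise call yyerror(). The gui_ prefix for the second parser, the last_parse_error variable, and the message tags are assumptions for this example, and the sketch assumes each grammar file declares its error function as taking a const char *.

```cpp
#include <cstdio>
#include <string>

// Hypothetical: remember the most recent message so the game can show it.
static std::string last_parse_error;

// Called by the script parser built with "bison -peval_".
void eval_error(const char *message)
{
    last_parse_error = std::string("script: ") + message;
    std::fprintf(stderr, "%s\n", last_parse_error.c_str());
}

// Called by a second parser built with a different prefix; "gui_" is an
// assumed name, not one taken from the article.
void gui_error(const char *message)
{
    last_parse_error = std::string("gui: ") + message;
    std::fprintf(stderr, "%s\n", last_parse_error.c_str());
}
```

Because the two functions have different names, both parsers can be linked into the same program without clashing.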


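Starting from the code shown above (see Resources for the download location), we went on to add support for functions, variables, and simple flow control, and ended up with a fairly complete interpreted language for the game. Here is a sample of a possible script: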

example.txt

function whitewash
{
        if ( $1 == "Blackbeard" ) {
                print("Pouring whitewash on Blackbeard!")
                if ( $rum >= 3 ) {
                        print("Pouring whitewash on Blackbeard!")
                        mood = "happy"
                } else {
                        print($1, "says Grr....")
                        mood = "angry"
                        print("Have some more rum?")
                        ++rum
                }
        }
}
pirate = "Blackbeard"
rum = 0
mood = "angry"
print($pirate, "is walking by...")
while ( $mood == "angry" ) {
        whitewash($pirate)
}
return "there was much rejoicing"


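We built the GUI as a set of widgets that all inherit properties from a base class. This maps very nicely onto the way yacc parses its input. We defined a set of rules corresponding to the base class properties, and then defined rules for each individual widget that also include the base class rules. When the parser matches a widget's rules, we can safely cast the widget pointer to the appropriate class and set the desired properties. Here is an example for a simple button widget: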

yacc_gui.y

%{
#include <stdio.h>
#include <stdlib.h>
#include "widget.h"
#include "widget_button.h"
#define PARSE_DEBUG(X) (printf X)
#define MAX_WIDGET_DEPTH 32
static int widget_depth = -1;
static Widget *widget_stack[MAX_WIDGET_DEPTH];
static Widget *widget;
static void StartWidget(Widget *the_widget)
{
 widget_stack[++widget_depth] = widget = the_widget;
}
static void FinishWidget(void)
{
 Widget *child;
 --widget_depth;
 if ( widget_depth >= 0 ) {
  child = widget;
  widget = widget_stack[widget_depth];
  widget->AddChild(child);
 }
}
%}
[tokens and types skipped for brevity]
%%
widget: button
 { FinishWidget();
   PARSE_DEBUG(("Completed widget\n")); }
 ;
widget_attribute:
 widget_area
 ;
/* Widget area: x, y, width, height */
widget_area:
 AREA '{' number ',' number ',' number ',' number '}'
 { widget->SetArea($3, $5, $7, $9);
   PARSE_DEBUG(("Area: %dx%d at (%d,%d)\n", $7, $9, $3, $5)); }
 ;
/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
/* The button widget */
button:
 button_tag '{' button_attributes '}'
 { PARSE_DEBUG(("Completed button\n")); }
 ;
button_tag:
 BUTTON name
 { StartWidget(new WidgetButton($2));
   PARSE_DEBUG(("Starting a button: %s\n", $2));
   free($2); }
 ;
/* The button widget attributes */
button_attributes:
 button_attribute
| button_attributes button_attribute
 ;
button_attribute:
 widget
| widget_attribute
| button_normal_image
| button_hover_image
 ;
button_normal_image:
 IMAGE file
 { ((WidgetButton *)widget)->LoadNormalImage($2);
   PARSE_DEBUG(("Button normal image: %s\n", $2));
   free($2); }
 ;
button_hover_image:
 HOVERIMAGE file
 { ((WidgetButton *)widget)->LoadHoverImage($2);
   PARSE_DEBUG(("Button hover image: %s\n", $2));
   free($2); }
 ;
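
The widget stack in the prologue of this grammar can be tried out on its own. The sketch below pairs it with a minimal stand-in for the Widget base class; that class, including its Name() and ChildCount() members, is hypothetical, since the game's real widget.h is not shown. Note that the depth starts at -1, so it has to be incremented before it is used as an index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in for the game's Widget base class (hypothetical).
class Widget {
public:
    explicit Widget(const std::string &name) : name_(name) {}
    void AddChild(Widget *child) { children_.push_back(child); }
    const std::string &Name() const { return name_; }
    std::size_t ChildCount() const { return children_.size(); }
private:
    std::string name_;
    std::vector<Widget *> children_;
};

const int MAX_WIDGET_DEPTH = 32;
static int widget_depth = -1;                 // -1 means the stack is empty
static Widget *widget_stack[MAX_WIDGET_DEPTH];
static Widget *widget = nullptr;              // widget currently being parsed

// Called when the parser sees the opening tag of a widget.
void StartWidget(Widget *the_widget)
{
    assert(widget_depth + 1 < MAX_WIDGET_DEPTH);
    widget_stack[++widget_depth] = widget = the_widget;
}

// Called when a widget's closing brace has been parsed: pop it and
// attach it to the widget that encloses it, if there is one.
void FinishWidget()
{
    --widget_depth;
    if (widget_depth >= 0) {
        Widget *child = widget;
        widget = widget_stack[widget_depth];
        widget->AddChild(child);
    }
}
```

Starting a background widget, starting and finishing a button inside it, and then finishing the background leaves the button registered as the background's only child and the stack empty again.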


Here is our main menu, as an example of a GUI built with this technique:

main_menu.gui

background "main_menu" {
 image "main_menu"
 button "new_game" {
  area { 32, 80, 370, 64 }
  image "main_menu-new"
  hover_image "main_menu-new_hi"
  #onclick [ new_gui("new_game") ]
  onclick [ new_gui("character_screen") ]
 }
 button "load_game" {
  area { 32, 152, 370, 64 }
  image "main_menu-load"
  hover_image "main_menu-load_hi"
  onclick [ new_gui("load_game") ]
 }
 button "save_game" {
  area { 32, 224, 370, 64 }
  image "main_menu-save"
  hover_image "main_menu-save_hi"
  onclick [ new_gui("save_game") ]
 }
 button "preferences" {
  area { 32, 296, 370, 64 }
  image "main_menu-prefs"
  hover_image "main_menu-prefs_hi"
  onclick [ new_gui("preferences") ]
 }
 button "quit_game" {
  area { 32, 472, 370, 64 }
  image "main_menu-quit"
  hover_image "main_menu-quit_hi"
  onclick [ quit_game() ]
 }
}

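In this screen description, the widgets and their attributes are parsed by the GUI parser, and the button callbacks are interpreted by the script parser. new_gui() and quit_game() are internal functions exported to the scripting mechanism.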
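
The article does not show how functions are exported to the scripting mechanism. One plausible arrangement, sketched here purely as an assumption, is a table that maps script-visible names to C++ callables. ExportFunction, CallExported, and the argument representation are all hypothetical names invented for this example.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: script arguments arrive as strings.
using ScriptArgs = std::vector<std::string>;
using ScriptFunction = std::function<void(const ScriptArgs &)>;

// Table of internal functions that scripts are allowed to call.
static std::map<std::string, ScriptFunction> exported_functions;

// Make an internal function visible to scripts under the given name.
void ExportFunction(const std::string &name, ScriptFunction fn)
{
    exported_functions[name] = std::move(fn);
}

// Invoked by the interpreter when a script calls name(args).
// Returns false if no function was exported under that name.
bool CallExported(const std::string &name, const ScriptArgs &args)
{
    auto it = exported_functions.find(name);
    if (it == exported_functions.end()) {
        return false;
    }
    it->second(args);
    return true;
}
```

Under this scheme, an onclick body such as new_gui("load_game") would end up as a CallExported("new_gui", ...) lookup once the script parser has evaluated the argument list.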


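lex and yacc were essential tools in designing our scripting and GUI description languages. They may seem intimidating at first, but after using them for a while you will find them both convenient and comfortable to work with. Join us again next month, when we start putting it all together and take you into the world of Pirates Ho!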

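You can read the original English version of this article on the developerWorks worldwide site.
Visit the Pirates Ho! website.
Source code for the samples is available for download:
- eval.tar.gz (script example source)
- snapshot-043000.tar.gz (game source snapshot)
, O'Reilly, 1992
The Pirates Ho! series on developerWorks: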

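Sam Lantinga is the author of the Simple DirectMedia Layer (SDL) library and is now lead programmer at Loki Entertainment Software, a company dedicated to producing best-selling Linux games. He started working with Linux and games in 1995, porting various DOOM! tools and bringing the Macintosh game Maelstrom to Linux.

Lauren MacDonell is a technical writer at SkilledNursing.com and a co-developer of "Pirates Ho!". When she is not working, writing books, or dancing, she looks after her tropical fish.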
