2007-12-01 17:48:28
Tutorial 2: MessageBox
In this tutorial, we will create a fully functional Windows program that displays a message box saying "Win32 assembly is great!".
Download the example file here.
Theory:
Windows provides a wealth of resources for Windows programs. Central to this is the Windows API (Application Programming Interface). The Windows API is a huge collection of very useful functions that reside in Windows itself, ready for use by any Windows program. These functions are stored in several dynamic-link libraries (DLLs) such as kernel32.dll, user32.dll and gdi32.dll. Kernel32.dll contains API functions that deal with memory and process management. User32.dll controls the user interface aspects of your program. Gdi32.dll is responsible for graphics operations. Besides "the main three", there are other DLLs that your program can use, provided you have enough information about the desired API functions.
Windows programs dynamically link to these DLLs, i.e. the code of the API functions is not included in the Windows program's executable file. In order for your program to know where to find the desired API functions at runtime, you have to embed that information into the executable file. The information comes from import libraries. You must link your programs with the correct import libraries or they will not be able to locate the API functions.
When a Windows program is loaded into memory, Windows reads the information stored in the program. That information includes the names of the functions the program uses and the DLLs those functions reside in. When Windows finds such information in the program, it loads the DLLs and performs function address fixups in the program so that the calls transfer control to the right functions.
There are two categories of API functions: one for ANSI and the other for Unicode. The names of API functions for ANSI are postfixed with "A", e.g. MessageBoxA. Those for Unicode are postfixed with "W" (for Wide Char, I think). Windows 95 natively supports ANSI and Windows NT natively supports Unicode.
We are usually familiar with ANSI strings, which are arrays of characters terminated by NULL. An ANSI character is 1 byte in size. While the ANSI code is sufficient for European languages, it cannot handle several oriental languages which have several thousand unique characters. That's where Unicode comes in. A Unicode character is 2 bytes in size, making it possible to have 65536 unique characters in the strings.
But most of the time, you will use an include file which can determine and select the appropriate API functions for your platform. Just refer to API function names without the postfix.
{
Translator's note: in the C .h header files, preprocessor directives tell the compiler which category of API function to select, for example:
#ifdef UNICODE
#define foo() fooW()
#else
#define foo() fooA()
#endif
}
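In assembly the same selection can be made with a text macro. The following is only a sketch of how an include file might alias the unsuffixed name to the ANSI version. The prototypes are typed by hand here rather than copied from the real include files, the labels AliasText and AliasCaption are made up for the illustration, and the library paths assume the default MASM32 location. It uses directives (invoke, includelib) that are explained later in this tutorial.

```masm
.386
.model flat, stdcall
option casemap:none
includelib \masm32\lib\kernel32.lib   ; assumed MASM32 default path
includelib \masm32\lib\user32.lib

ExitProcess PROTO :DWORD
MessageBoxA PROTO :DWORD, :DWORD, :DWORD, :DWORD
MessageBox  equ <MessageBoxA>         ; text macro: MessageBox now means MessageBoxA

.data
AliasCaption db "ANSI version",0      ; hypothetical labels
AliasText    db "Called through the MessageBox alias",0

.code
start:
    ; the last 0 is the value of MB_OK
    invoke MessageBox, 0, addr AliasText, addr AliasCaption, 0
    invoke ExitProcess, 0
end start
```

Because the alias is resolved at assembly time, the executable imports MessageBoxA only; the unsuffixed name never reaches the linker.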
Example:
I'll present the bare program skeleton below. We will flesh it out later.
.386
.model flat, stdcall
.data
.code
start:
end start
Execution starts from the first instruction immediately below the label specified after the end directive. In the above skeleton, execution will start at the first instruction immediately below the start label. Execution proceeds instruction by instruction until a flow-control instruction such as jmp, jne, je or ret is found. Those instructions redirect the flow of execution to some other instruction. When the program needs to exit to Windows, it should call an API function, ExitProcess.
ExitProcess proto uExitCode:DWORD
The above line is called a function prototype. A function prototype describes the attributes of a function to the assembler so that it can do type-checking for you. The format of a function prototype is like this:
FunctionName PROTO [ParameterName]:DataType,[ParameterName]:DataType,...
In short, the name of the function is followed by the keyword PROTO and then by the list of data types of the parameters, separated by commas.
In the ExitProcess example above, it defines ExitProcess as a function which takes only one parameter of type DWORD. Function prototypes are very useful when you use the high-level call syntax, invoke. You can think of invoke as a simple call with type-checking. For example, if you do:
call ExitProcess
without pushing a dword onto the stack, the assembler will not be able to catch that error for you. You'll notice it later when your program crashes. But if you use:
invoke ExitProcess
The assembler will inform you that you forgot to push a dword onto the stack, thus avoiding the error. I recommend you use invoke instead of a simple call. The syntax of invoke is as follows:
INVOKE expression [,arguments]
expression can be the name of a function or it can be a function pointer. The function arguments are separated by commas.
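To make the relationship between invoke and a simple call concrete, here is a minimal sketch of what the assembler generates for a one-argument invoke. It assumes the MASM32 include and library files are in their default folders and uses the include and includelib directives, which are explained later in this tutorial.

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\kernel32.inc    ; supplies the ExitProcess prototype
includelib \masm32\lib\kernel32.lib

.code
start:
    ; "invoke ExitProcess, 0" is assembled roughly as the two lines below
    push 0              ; uExitCode, the only argument
    call ExitProcess    ; stdcall: the callee removes the argument from the stack
end start
```

With more than one argument, invoke pushes them from right to left, so the first argument ends up on top of the stack when the call is made.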
Most function prototypes for API functions are kept in include files. If you use hutch's MASM32, they are in the MASM32/include folder. The include files have the .inc extension, and the function prototypes for the functions in a DLL are stored in the .inc file with the same name as the DLL. For example, ExitProcess is exported by kernel32.dll, so the function prototype for ExitProcess is stored in kernel32.inc.
You can also create function prototypes for your own functions.
Throughout my examples, I'll use hutch's windows.inc, which is available for download.
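As an illustration of a prototype for your own function, the sketch below declares and invokes a made-up function, AddTwo, whose name and parameters are purely hypothetical. It assumes the MASM32 files are in their default folders.

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

AddTwo PROTO :DWORD, :DWORD         ; prototype for our own (hypothetical) function

.code
start:
    invoke AddTwo, 2, 3             ; the sum comes back in eax
    invoke ExitProcess, eax         ; use the sum (5) as the exit code

AddTwo proc valA:DWORD, valB:DWORD
    mov eax, valA
    add eax, valB
    ret                             ; assembler emits ret 8 under stdcall
AddTwo endp
end start
```

The prototype has to appear before the first invoke of AddTwo because the procedure itself is defined further down in the file.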
Now back to ExitProcess. The uExitCode parameter is the value you want the program to return to Windows after the program terminates. You can call ExitProcess like this:
invoke ExitProcess, 0
Put that line immediately below the start label and you will get a win32 program which immediately exits to Windows, but it's a valid program nonetheless.
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
.data
.code
start:
invoke ExitProcess,0
end start
option casemap:none tells MASM to make labels case-sensitive, so ExitProcess and exitprocess are different. Note a new directive, include.
This directive is followed by the name of a file you want to insert at the place where the directive is. In the above example, when MASM processes the line include \masm32\include\windows.inc, it opens windows.inc in the \MASM32\include folder and processes its content as if you had pasted the content of windows.inc there. hutch's windows.inc contains definitions of constants and structures you need in win32 programming. It doesn't contain any function prototypes. windows.inc is by no means comprehensive. hutch and I try to put as many constants and structures into it as possible, but there are still many left to be included. It will be constantly updated. Check out hutch's and my homepages for updates.
From windows.inc, your program gets constant and structure definitions. For function prototypes, you need to include other include files. They are all stored in the \masm32\include folder.
In our example above, we call a function exported by kernel32.dll, so we need to include the function prototypes for kernel32.dll. That file is kernel32.inc. If you open it with a text editor, you will see that it's full of function prototypes for kernel32.dll. If you don't include kernel32.inc, you can still call ExitProcess, but only with the simple call syntax. You won't be able to invoke the function. The point here is that in order to invoke a function, you have to put its function prototype somewhere in the source code. In the above example, if you don't include kernel32.inc, you can define the function prototype for ExitProcess anywhere in the source code above the invoke command and it will work. The include files are there to save you the work of typing out the prototypes yourself, so use them whenever you can.
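The case described above, where kernel32.inc is left out and the prototype is typed by hand, might look like the following sketch. The only assumption is that kernel32.lib sits in the default MASM32 lib folder.

```masm
.386
.model flat, stdcall
option casemap:none
includelib \masm32\lib\kernel32.lib   ; still needed so the linker can resolve the import

ExitProcess PROTO :DWORD              ; hand-written prototype, placed above the invoke

.code
start:
    invoke ExitProcess, 0
end start
```

Note that no parameter name is given in the prototype. Parameter names are optional in PROTO; only the data types matter for type-checking.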
Now we encounter a new directive, includelib. includelib doesn't work like include. It's only a way to tell the assembler what import library your program uses. When the assembler sees an includelib directive, it puts a linker command into the object file so that the linker knows what import libraries your program needs to link with. You're not forced to use includelib, though. You can specify the names of the import libraries on the command line of the linker, but believe me, it's tedious and the command line can hold only 128 characters.
Now save the example under the name msgbox.asm. Assuming that ml.exe is in your path, assemble msgbox.asm with
ml /c /coff /Cp msgbox.asm
/c tells MASM to assemble only, without invoking link.exe. Most of the time you will not want link.exe to be called automatically, because you may have to perform some other tasks before linking.
/coff tells MASM to create the .obj file in COFF format. MASM uses a variation of COFF (Common Object File Format), the format used under Unix, as its own object and executable file format.
/Cp tells MASM to preserve the case of user identifiers. If you use hutch's MASM32 package, you may put "option casemap:none" at the head of your source code, just below the .model directive, to achieve the same effect.
After you successfully assemble msgbox.asm, you will get msgbox.obj. msgbox.obj is an object file. An object file is only one step away from an executable file. It contains the instructions and data in binary form. What is lacking is some fixups of addresses by the linker.
Then go on with link:
link /SUBSYSTEM:WINDOWS /LIBPATH:c:\masm32\lib msgbox.obj
/SUBSYSTEM:WINDOWS informs Link what sort of executable this program is.
/LIBPATH:<path to import library> tells Link where the import libraries are. If you use MASM32, they will be in the MASM32\lib folder.
Link reads in the object file and fixes it up with addresses from the import libraries. When the process is finished you get msgbox.exe.
Now you have msgbox.exe. Go on, run it. You'll find that it does nothing. Well, we haven't put anything interesting into it yet. But it's a Windows program nonetheless. And look at its size! On my PC, it is 1,536 bytes.
(Translator's note: Windows executables come in two kinds, GUI programs and console programs; /SUBSYSTEM:WINDOWS selects the former.)
Next we're going to put in a message box. Its function prototype is:
MessageBox PROTO hwnd:DWORD, lpText:DWORD, lpCaption:DWORD, uType:DWORD
hwnd is the handle to the parent window. You can think of a handle as a number that represents the window you're referring to. Its value is not important to you. You only need to remember that it represents the window. When you want to do anything with the window, you must refer to it by its handle.
lpText is a pointer to the text you want to display in the client area of the message box. A pointer is really the address of something. A pointer to a text string == the address of that string.
lpCaption is a pointer to the caption of the message box.
uType specifies the icon and the number and type of buttons on the message box.
Let's modify msgbox.asm to include the message box.
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib
.data ; data section (the counterpart of the data segment on the 8086)
MsgBoxCaption db "Iczelion Tutorial No.2",0
MsgBoxText db "Win32 Assembly is Great!",0
.code
start:
invoke MessageBox, NULL, addr MsgBoxText, addr MsgBoxCaption, MB_OK
invoke ExitProcess, NULL
end start
Assemble and run it. You will see a message box displaying the text "Win32 Assembly is Great!".
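As a variation that shows what uType does, the sketch below asks a yes/no question with an icon and shows the original message only if the user picks Yes. The labels AskCaption, AskText, YesText and quit are made up for this illustration; MB_YESNO, MB_ICONQUESTION and IDYES are taken to be defined in windows.inc, and the paths assume the default MASM32 layout.

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

.data
AskCaption db "Iczelion Tutorial No.2",0
AskText    db "Is Win32 assembly great?",0
YesText    db "Win32 Assembly is Great!",0

.code
start:
    ; combine a button set and an icon with a bitwise or
    invoke MessageBox, NULL, addr AskText, addr AskCaption, MB_YESNO or MB_ICONQUESTION
    cmp eax, IDYES              ; MessageBox returns the ID of the pressed button in eax
    jne quit
    invoke MessageBox, NULL, addr YesText, addr AskCaption, MB_OK
quit:
    invoke ExitProcess, 0
end start
```

The return value is worth checking only when the box has more than one button; with MB_OK alone there is nothing to choose between.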
Let's look again at the source code.
We define two zero-terminated strings in the .data section.
Remember that every ANSI string in Windows must be terminated by NULL (a zero byte).
We use two constants, NULL and MB_OK. Those constants are defined in windows.inc, so you can refer to them by name instead of by value. This improves the readability of your source code.
The addr operator is used to pass the address of a label to the function. It's valid only in the context of the invoke directive. You can't use it to assign the address of a label to a register or variable, for example. You can use offset instead of addr in the above example. However, there are some differences between the two:
1. addr cannot handle forward references while offset can. For example, if the label is defined later in the source code than the invoke line, addr will not work.
invoke MessageBox,NULL, addr MsgBoxText,addr MsgBoxCaption,MB_OK
......
MsgBoxCaption db "Iczelion Tutorial No.2",0
MsgBoxText db "Win32 Assembly is Great!",0
MASM will report an error. If you use offset instead of addr in the above code snippet, MASM will assemble it happily.
2. addr can handle local variables while offset cannot. A local variable is only some reserved space on the stack. You will only know its address at runtime. offset is interpreted at assembly time by the assembler.
So it's natural that offset won't work for local variables. addr is able to handle local variables because the assembler first checks whether the variable referred to by addr is a global or a local one. If it's a global variable, it puts the address of that variable into the object file. In this regard, it works like offset. If it's a local variable, it generates an instruction sequence like this before it actually calls the function:
lea eax, LocalVar
push eax
Since lea can determine the address of a label at runtime, this works fine.
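The local-variable case can be tried out with a sketch like the one below. The procedure name ShowFromLocal and its buffer are made up for this illustration; lstrcpy is assumed to be prototyped in kernel32.inc, and the paths assume the default MASM32 layout.

```masm
.386
.model flat, stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

.data
MsgBoxCaption db "Iczelion Tutorial No.2",0
MsgBoxText    db "Win32 Assembly is Great!",0

.code
ShowFromLocal proc
    LOCAL buffer[32]:BYTE       ; space reserved on the stack at runtime
    ; addr buffer becomes lea eax, buffer / push eax
    ; addr MsgBoxText is a global, so its address is pushed directly
    invoke lstrcpy, addr buffer, addr MsgBoxText
    invoke MessageBox, NULL, addr buffer, addr MsgBoxCaption, MB_OK
    ret
ShowFromLocal endp

start:
    invoke ShowFromLocal
    invoke ExitProcess, 0
end start
```

Replacing addr buffer with offset buffer should make the assembler reject the line, since the address of buffer is not known at assembly time. Because addr uses eax for the lea, it is safest not to pass eax itself as another argument in the same invoke.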
Translated by 风向改变 on the afternoon of December 1, 2007.