Category: LINUX

2009-04-04 13:28:30

Brief overview of the Dalvik virtual machine and its insights.


Introduction

The Dalvik virtual machine is a register-based virtual machine, designed and written by Dan Bornstein with contributions from other Google engineers as part of the Android platform.

It is optimized for low memory requirements and is designed to allow multiple VM instances to run at once, relying on the underlying operating system for process isolation, memory management and threading support. Dalvik is often referred to as a Java Virtual Machine, but this is not strictly accurate, because the bytecode it operates on is not Java bytecode. Instead, a tool named dx, included in the Android SDK, transforms the Java class files produced by a regular Java compiler into another class file format (the .dex format).

Dex File Format

Android programs are compiled into .dex (Dalvik Executable) files, which are in turn zipped into a single .apk (Android Package) file on the device. .dex files can be created by automatically translating compiled applications written in the Java programming language.

Notes: Initial reverse engineering of the Dex file format was conducted and published by Michael Pavone at his site www.retrodev.com. After some time his site went down, and the information has been reproduced here to keep it publicly available.

File Header

Dex files start with a simple header containing some checksums and offsets to other structures.

Offset Size Description
0x0 8 'Magic' value: "dex\n009\0"
0x8 4 Checksum
0xC 20 SHA-1 Signature
0x20 4 Length of file in bytes
0x24 4 Length of header in bytes (currently always 0x5C)
0x28 8 Padding (reserved for future use?)
0x30 4 Number of strings in the string table
0x34 4 Absolute offset of the string table
0x38 4 Not sure. String related
0x3C 4 Number of classes in the class list
0x40 4 Absolute offset of the class list
0x44 4 Number of fields in the field table
0x48 4 Absolute offset of the field table
0x4C 4 Number of methods in the method table
0x50 4 Absolute offset of the method table
0x54 4 Number of class definitions in the class definition table
0x58 4 Absolute offset of the class definition table

Notes: All non-string fields are stored in little-endian format. It would appear that the checksum and signature fields are assumed to be zero when calculating the checksum and signature.
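As a rough illustration of the header layout described above, here is a minimal Python sketch that unpacks the 0x5C-byte header with the standard `struct` module. The field names (`string_count`, `unknown_0x38`, and so on) are labels invented for this example and are not taken from any official specification; the layout itself is assumed to be exactly as listed in the table.

```python
import struct
from collections import namedtuple

# Little-endian, no alignment padding: magic, checksum, SHA-1 signature,
# file length, header length, 8 bytes of padding, then eleven 32-bit fields.
HEADER_STRUCT = struct.Struct("<8sI20sII8s11I")

# Field names are illustrative labels chosen for this sketch.
DexHeader = namedtuple("DexHeader", [
    "magic", "checksum", "signature", "file_size", "header_size", "padding",
    "string_count", "string_table_off", "unknown_0x38",
    "class_count", "class_list_off",
    "field_count", "field_table_off",
    "method_count", "method_table_off",
    "class_def_count", "class_def_table_off",
])

def parse_header(data):
    """Unpack the header from the start of a bytes-like object."""
    if len(data) < HEADER_STRUCT.size:
        raise ValueError("buffer too small for a dex header")
    header = DexHeader._make(HEADER_STRUCT.unpack_from(data, 0))
    # Only the format version described in the table is accepted here.
    if header.magic != b"dex\n009\0":
        raise ValueError("unexpected magic value: %r" % (header.magic,))
    return header
```

The struct size works out to 0x5C bytes, which matches the header length stated in the table.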

String Table

This table stores the length and offsets for every string in the Dex file including string constants, class names, variable names and more. Each entry has the following format:

Offset Size Description
0x0 4 Absolute offset of the string data
0x4 4 Length of the string (not including the null-terminator)

Notes: Although the length of each string is stored in this table, all strings also have C-style null-terminators.
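The string table entries can be read along the following lines. This is only a sketch under the layout given above: the function name `read_strings` is hypothetical, and the strings are returned as raw bytes because the text does not say which character encoding is used.

```python
import struct

# One entry: absolute offset of the string data, then its length
# (not counting the null terminator).
STRING_ENTRY = struct.Struct("<II")

def read_strings(data, table_off, count):
    """Return the raw bytes of every string listed in the string table."""
    strings = []
    for i in range(count):
        entry_off = table_off + i * STRING_ENTRY.size
        data_off, length = STRING_ENTRY.unpack_from(data, entry_off)
        # Slice by the stored length; the trailing null byte is left out.
        strings.append(bytes(data[data_off:data_off + length]))
    return strings
```

The `table_off` and `count` arguments would come from the header fields at 0x34 and 0x30 respectively.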

 

Class List

A list of all classes referenced or contained in this dex file. Each entry has the following format:

Offset Size Description
0x0 4 String index of the name of the class

 

Field Table

A table of fields of all classes defined in this dex file. Each entry has the following format:

Offset Size Description
0x0 4 Class index of the class this field belongs to
0x4 4 String index of the field name
0x8 4 String index of the field type descriptor
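To show how the indices chain together, the sketch below resolves each field table entry to a (class name, field name, type descriptor) triple by going through the class list and an already-loaded list of strings. The function name and argument names are made up for this example, and the layouts are assumed to be exactly those in the tables above.

```python
import struct

CLASS_ENTRY = struct.Struct("<I")    # string index of the class name
FIELD_ENTRY = struct.Struct("<III")  # class index, name index, type index

def read_fields(data, field_off, field_count, class_off, strings):
    """Resolve field table entries to (class, name, type) triples."""
    fields = []
    for i in range(field_count):
        class_idx, name_idx, type_idx = FIELD_ENTRY.unpack_from(
            data, field_off + i * FIELD_ENTRY.size)
        # The class index points into the class list, which in turn
        # holds a string index for the class name.
        (class_name_idx,) = CLASS_ENTRY.unpack_from(
            data, class_off + class_idx * CLASS_ENTRY.size)
        fields.append((strings[class_name_idx],
                       strings[name_idx],
                       strings[type_idx]))
    return fields
```

Here `strings` is assumed to be a list indexed by string index, for example one built from the string table.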

 

Method Table

A table of methods of all classes defined in this dex file. Each entry has the following format:

Offset Size Description
0x0 4 Class index of the class this method belongs to
0x4 4 String index of the method name
0x8 4 String index of the method type descriptor

 

Class Definition Table

A table of class definitions for all classes that are either defined in this dex file or have a method or field accessed by code in this dex file. Each entry has the following format:

Offset Size Description
0x0 4 Class index
0x4 4 Access Flags (not 100% sure what this is for, I think it has to do with private/protected/public status)
0x8 4 Index of superclass
0xC 4 Absolute offset of interface list
0x10 4 Absolute offset of static field list
0x14 4 Absolute offset of instance field list
0x18 4 Absolute offset of direct method list
0x1C 4 Absolute offset of virtual method list

Notes: Any of the list offset fields can be NULL in which case the class doesn't have any elements of that type. Not every class in the class list will necessarily have an entry in the class definition table.
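A possible way to read class definition entries is sketched below. It assumes that "NULL" in the note above means an offset value of 0, and it maps such offsets to `None` so that callers can tell a missing list from a real one. The names `ClassDef` and `read_class_defs` are illustrative only.

```python
import struct
from collections import namedtuple

CLASS_DEF = struct.Struct("<8I")  # eight 32-bit fields, 0x20 bytes per entry

ClassDef = namedtuple("ClassDef", [
    "class_idx", "access_flags", "superclass_idx",
    "interfaces_off", "static_fields_off", "instance_fields_off",
    "direct_methods_off", "virtual_methods_off",
])

def read_class_defs(data, table_off, count):
    """Read class definition entries; absent lists are reported as None."""
    defs = []
    for i in range(count):
        values = list(CLASS_DEF.unpack_from(data, table_off + i * CLASS_DEF.size))
        # Positions 3..7 are the list offsets. Assumption: 0 means NULL.
        for pos in range(3, 8):
            if values[pos] == 0:
                values[pos] = None
        defs.append(ClassDef(*values))
    return defs
```

The class and superclass indices are left untouched, since 0 is a valid index there.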

 

Field List

Stores data for pre-initialized fields in a class. The list is formed of a 32-bit integer containing the number of entries followed by the entries themselves. Each field has an entry with the following format:

Offset Size Description
0x0 8 Index of string or object constant or literal "primitive" constant

Notes: If the field does not have a pre-initialized value it will be filled with 0 for primitive types and -1 for object types.

 

Method List

A list of methods for a particular class. Begins with a 32-bit integer that contains the number of items in the list followed by entries in the following format.

Offset Size Description
0x0 4 Method index
0x4 4 Access flags (not 100% sure what this is for, I think it has to do with private/protected/public status)
0x8 4 Throws list off (no idea what this is)
0xC 4 Absolute offset of header for code that implements the method
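The count-prefixed method list could be read roughly as follows. This is a sketch under the layout above; the function name is hypothetical, and the third field is passed through as-is because its meaning is not known.

```python
import struct

# Method index, access flags, "throws list" value, code header offset.
METHOD_ITEM = struct.Struct("<IIII")

def read_method_list(data, list_off):
    """Return a list of 4-tuples, one per method list entry."""
    # The list starts with a 32-bit entry count.
    (count,) = struct.unpack_from("<I", data, list_off)
    first_item = list_off + 4
    return [
        METHOD_ITEM.unpack_from(data, first_item + i * METHOD_ITEM.size)
        for i in range(count)
    ]
```

The `list_off` argument would be one of the direct or virtual method list offsets from a class definition entry.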

 

Code Header

This header contains information about the code that implements a method.

Offset Size Description
0x0 2 Number of registers used by this method
0x2 2 Number of inputs this method takes (includes "this" pointer for non-static methods)
0x4 2 Output size? (presumably the size of whatever object the method returns)
0x6 2 Padding
0x8 4 String index of the source file name this method is implemented in
0xC 4 Absolute offset of the actual code that implements this method
0x10 4 Absolute offset of the list of exceptions this method can throw (not 100% sure)
0x14 4 Absolute offset of the list of address and line number pairs for debugging purposes
0x1C 4 Absolute offset of the local variable list of this method (includes arguments to the method and "this")

Notes: The code offset field actually points to a 32-bit integer that contains the number of 16-bit words in the instruction stream. The actual VM instructions follow this integer.
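The sketch below reads the first part of a code header and then follows the code offset to the instruction stream, as described in the note above. It deliberately stops at the code offset field (0xC): the offsets listed in the table jump from 0x14 to 0x1C, so the exact position of the later fields is unclear. The function names and dictionary keys are invented for this example.

```python
import struct

def read_code_header(data, header_off):
    """Read the fields of a code header up to and including the code offset."""
    registers, inputs, outputs = struct.unpack_from("<HHH", data, header_off)
    # 0x6 is padding; the next two 32-bit fields start at 0x8.
    source_file_idx, code_off = struct.unpack_from("<II", data, header_off + 0x8)
    return {
        "registers": registers,
        "inputs": inputs,
        "outputs": outputs,
        "source_file_idx": source_file_idx,
        "code_off": code_off,
    }

def read_instructions(data, code_off):
    """Return the instruction stream as a list of 16-bit words."""
    # The code offset points at a 32-bit count of 16-bit words.
    (word_count,) = struct.unpack_from("<I", data, code_off)
    return list(struct.unpack_from("<%dH" % word_count, data, code_off + 4))
```

Decoding the 16-bit words into individual VM instructions is outside the scope of this sketch.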

 

Local Variable List

A list of local variables for a particular method. Begins with a 32-bit integer that contains the number of items in the list. Each entry has the following format:

Offset Size Description
0x0 4 Start (not a clue)
0x4 4 End (not a clue)
0x8 4 String index of variable name
0xC 4 String index of variable type descriptor
0x10 4 Register number this variable will be stored in (not 100% sure)

Notes: This list will include local variables that are arguments to the method as well as the "this" variable for non-static methods.
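For completeness, a local variable list could be read and resolved against the strings like this. It is a sketch only: `LocalVar` and `read_locals` are hypothetical names, the start and end values are passed through without interpretation, and the register field is taken at face value even though the text is not certain about it.

```python
import struct
from collections import namedtuple

LOCAL_ENTRY = struct.Struct("<5I")  # start, end, name index, type index, register

LocalVar = namedtuple("LocalVar", "start end name type_descriptor register")

def read_locals(data, list_off, strings):
    """Read a count-prefixed local variable list and resolve its string indices."""
    (count,) = struct.unpack_from("<I", data, list_off)
    result = []
    for i in range(count):
        start, end, name_idx, type_idx, register = LOCAL_ENTRY.unpack_from(
            data, list_off + 4 + i * LOCAL_ENTRY.size)
        result.append(LocalVar(start, end, strings[name_idx],
                               strings[type_idx], register))
    return result
```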
