Category: C/C++
2009-02-20 11:52:03
Keywords: marshal, demarshal, alignment, byte order, CDR, TAO.
Abstract: CORBA client and server applications can be distributed across different hardware platforms and operating systems, and can even be developed with different CORBA implementations. Marshal and demarshal play an important role in achieving this kind of interoperability. This paper examines the marshal and demarshal mechanism in TAO, a popular open source CORBA implementation.
1. What are marshal and demarshal
CORBA provides platform-independent programming interfaces and models for portable distributed object-oriented computing applications. Its independence from programming languages, computing platforms, and networking protocols makes it highly suitable for the development of new applications and their integration into existing distributed systems.
A CORBA application usually consists of a client and a server that communicate across a heterogeneous computer network. Exchanging data between a distributed client and server raises several issues, such as byte ordering, data alignment, and efficiency. To address them, CORBA defines CDR as a common data format that client and server can exchange across different machines and networks. To marshal is to convert data from its native format to CDR format; to demarshal is to convert CDR format data back to its native format. CDR is central to both operations, so the next section looks at it in detail.
2. CDR concept
CDR (Common Data Representation) determines the binary layout of IDL types for transmission. CDR has the following main characteristics.
CDR supports both big-endian and little-endian representation.
CDR-encoded data is tagged to indicate the byte ordering of the data. This means that both big-endian and little-endian machines can send data in their native format. If the sender and receiver use different byte ordering, the receiver is responsible for byte swapping. This model, called "receiver makes it right", has the advantage that if both sender and receiver have the same endianness, they can communicate using the native data representation of their respective machines. This is preferable to encodings such as XDR, which require big-endian encoding on the wire and therefore penalize communication if both sender and receiver use little-endian machines.
CDR aligns primitive types on natural boundaries.
CDR aligns primitive data types on byte boundaries that are natural for most machine architectures. For example, short values are aligned on a 2-byte boundary, long values are aligned on a 4-byte boundary, and double values are aligned on an 8-byte boundary. Encoding data according to these alignments wastes some bandwidth because part of a CDR-encoded byte stream consists of padding bytes. However, despite the padding, CDR is more efficient than a more compact encoding because, in many cases, data can be marshaled and demarshaled simply by pointing at a value that is stored in memory in its natural binary representation. This approach avoids expensive data copying during marshaling.
CDR-encoded data is not self-identifying.
CDR is a binary encoding that is not self-identifying. For example, if an operation requires two in parameters, a long followed by a double, the marshaled data consists of 16 bytes. The first 4 bytes contain the long value, the next 4 bytes are padding with undefined contents to maintain alignment, and the final 8 bytes contain the double value. The receiver simply sees 16 bytes of data and must know in advance that these 16 bytes contain a long followed by a double in order to correctly demarshal the parameters. This means that CDR encoding requires an agreement between sender and receiver about the types of data that are to be exchanged. This agreement is established by the IDL definitions that are used to define the interface between sender and receiver. The receiver has no way to prevent misinterpretation of data if the agreement is violated. For example, if the sender sends two double values instead of a long followed by a double, the receiver still gets 16 bytes of data but will silently misinterpret the first 4 bytes of the first double value as a long value.
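The 16-byte layout described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `align_up`, `put`, and `encode_long_double` are hypothetical helpers (not ACE or TAO functions), the IDL long is modeled as `std::int32_t`, `double` is assumed to be 8 bytes, and the padding is zero-filled here even though CDR leaves its contents undefined.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: round an offset up to a power-of-two alignment.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Append a value at the next offset aligned to its own size.
// Any gap is zero-filled in this sketch (CDR itself leaves padding undefined).
template <typename T>
void put(std::vector<unsigned char>& buf, const T& value) {
    const std::size_t start = align_up(buf.size(), sizeof(T));
    buf.resize(start + sizeof(T), 0);
    std::memcpy(buf.data() + start, &value, sizeof(T));
}

// Encode the two "in" parameters from the text: a long followed by a double.
inline std::vector<unsigned char> encode_long_double(std::int32_t l, double d) {
    std::vector<unsigned char> buf;
    put(buf, l);  // bytes 0..3 hold the long
    put(buf, d);  // bytes 4..7 are padding, bytes 8..15 hold the double
    return buf;
}
```

Calling `encode_long_double(7, 1.5)` yields a 16-byte buffer. Nothing in those bytes says "long, then double"; a receiver that expected two doubles would read the same 16 bytes without any error.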
3. Implementation detail
Let’s take TAO (The ACE ORB) as an example to study how marshal and demarshal are implemented. TAO is a well-known open source CORBA implementation that is used in many industries and companies. See reference [2] to learn more about TAO.
TAO provides two classes, TAO_OutputCDR and TAO_InputCDR, to perform the marshal and demarshal work. They are derived from ACE_OutputCDR and ACE_InputCDR. Because the derived classes are almost the same as their base classes (they add no new public methods), let’s dig into ACE_OutputCDR and ACE_InputCDR.
The ACE_OutputCDR and ACE_InputCDR classes provide a highly optimized, portable, and convenient means to marshal and demarshal data using the standard CORBA Common Data Representation (CDR) format. ACE_OutputCDR creates a CDR buffer from a data structure (marshaling) and ACE_InputCDR extracts data from a CDR buffer (demarshaling).
The ACE_OutputCDR and ACE_InputCDR classes support the following features:
• They provide operations to (de)marshal the following types:
- Primitive types, for example, booleans; 16-, 32-, and 64-bit integers; 8-bit octets; single and double precision floating point numbers; characters; and strings
- Arrays of primitive types
• The insertion (<<) and extraction (>>) operators can be used to marshal and demarshal primitive types, using the same syntax as the C++ iostream components.
• They take advantage of CORBA CDR alignment and byte-ordering rules to avoid expensive memory copying and byte-swapping operations, respectively.
• They provide optimized byte swapping code that uses inline assembly language instructions for common hardware platforms, such as Intel x86, and the standard htons(), htonl(), ntohs(), and ntohl() macros/functions on other platforms.
• They support zero copy marshaling and demarshaling of octet buffers.
• Users can define custom character set translators for platforms that do not use ASCII or UNICODE as their native character sets.
Using ACE_OutputCDR and ACE_InputCDR is straightforward. First, to marshal data into a CDR stream, use the insertion (<<) operator for primitive types and the write_*_array functions for arrays.
Then, to demarshal data from the CDR stream, use the extraction (>>) operator for primitive types and the read_*_array functions for arrays.
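To make the usage pattern concrete without depending on an ACE installation, here is a minimal self-contained sketch. `MiniOutputCDR` and `MiniInputCDR` are hypothetical stand-ins that only mimic the shape of the interface described above (operators for primitives, array functions for arrays); they are not the ACE classes, they ignore byte order, and returning the stream from the operators is a choice of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical writer: aligns each value to its own size, then copies it in.
class MiniOutputCDR {
public:
    template <typename T>
    void write(const T& v) {
        const std::size_t start = (buf_.size() + sizeof(T) - 1) & ~(sizeof(T) - 1);
        buf_.resize(start + sizeof(T), 0);  // zero-fill any padding
        std::memcpy(buf_.data() + start, &v, sizeof(T));
    }
    template <typename T>
    void write_array(const T* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) write(data[i]);
    }
    const std::vector<unsigned char>& buffer() const { return buf_; }
private:
    std::vector<unsigned char> buf_;
};

// Hypothetical reader: skips the same padding the writer inserted.
class MiniInputCDR {
public:
    explicit MiniInputCDR(std::vector<unsigned char> buf) : buf_(std::move(buf)) {}
    template <typename T>
    bool read(T& v) {
        const std::size_t start = (pos_ + sizeof(T) - 1) & ~(sizeof(T) - 1);
        if (start + sizeof(T) > buf_.size()) { good_ = false; return false; }
        std::memcpy(&v, buf_.data() + start, sizeof(T));
        pos_ = start + sizeof(T);
        return true;
    }
    template <typename T>
    bool read_array(T* data, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            if (!read(data[i])) return false;
        return true;
    }
    bool good() const { return good_; }
private:
    std::vector<unsigned char> buf_;
    std::size_t pos_ = 0;
    bool good_ = true;
};

// Insertion operators forward to write(); extraction operators forward to read().
inline MiniOutputCDR& operator<<(MiniOutputCDR& os, std::int16_t v) { os.write(v); return os; }
inline MiniOutputCDR& operator<<(MiniOutputCDR& os, std::int32_t v) { os.write(v); return os; }
inline MiniOutputCDR& operator<<(MiniOutputCDR& os, double v) { os.write(v); return os; }
inline MiniInputCDR& operator>>(MiniInputCDR& is, std::int16_t& v) { is.read(v); return is; }
inline MiniInputCDR& operator>>(MiniInputCDR& is, std::int32_t& v) { is.read(v); return is; }
inline MiniInputCDR& operator>>(MiniInputCDR& is, double& v) { is.read(v); return is; }
```

A sender would write `out << s << l << d;` followed by `out.write_array(arr, 2);`, and a receiver constructed from `out.buffer()` would read the values back in the same order with `>>` and `read_array`. The order and types on both sides must match, since the stream carries no type information.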
3.1 Data alignment
Two problems must be solved during marshal and demarshal: memory alignment and byte order. Let’s first focus on memory alignment. The insertion (<<) operator is overloaded for each primitive type, and each overload calls the corresponding member function of ACE_OutputCDR; for example, the overload for a short calls write_short.
When writing data such as char, short, long, or float to a CDR stream, each value must be written on an aligned boundary. CDR defines the alignment for each data type: a char is one byte and can be written at any address, a short must be written on a 2-byte boundary, long and float on a 4-byte boundary, and double on an 8-byte boundary. How do we get a memory address that meets a specific alignment requirement? There is a general solution that efficiently aligns "value" up to "alignment", provided that all such boundaries are binary powers and that we're using two's complement arithmetic.
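Since the alignment is a power of two, its binary representation is:
alignment = 0...010...0
hence
alignment - 1 = 0...001...1 = T1
so the complement is:
~(alignment - 1) = 1...110...0 = T2
Notice that there is a multiple of alignment in the range [value, value + T1]. Also notice that if
X = (value + T1) & T2
then
value <= X <= value + T1
because the & operator only changes the last bits, and since X is a multiple of alignment (its last bits are zero), we have found the multiple we wanted.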
ACE wraps this computation in a macro, ACE_align_binary, that returns the next integer aligned to a required boundary.
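The macro below is reconstructed from the expansion quoted in this article, so treat it as a sketch: the definition in a particular ACE release may differ in detail, and the `#ifndef` guard is only there to avoid clashing with real ACE headers. `align_ptr` is a hypothetical helper added for illustration that applies the macro to an actual pointer.

```cpp
#include <stddef.h>   // ptrdiff_t, size_t
#include <stdint.h>   // uintptr_t

// Reconstructed from the expansion shown in the text (assumption, see lead-in).
// (ptr + T1) & T2, where T1 = alignment - 1 and T2 = ~T1.
#ifndef ACE_align_binary
#define ACE_align_binary(ptr, alignment) \
    ((ptr + ((ptrdiff_t)((alignment)-1))) & (~((ptrdiff_t)((alignment)-1))))
#endif

// Hypothetical helper: align a real pointer by going through an integer type,
// because the macro does arithmetic that is only valid on integers.
inline char* align_ptr(char* p, size_t alignment) {
    const uintptr_t addr = reinterpret_cast<uintptr_t>(p);
    return reinterpret_cast<char*>(ACE_align_binary(addr, alignment));
}
```

The alignment argument must be a power of two; for any other value the mask T2 does not clear a contiguous run of low bits and the result is not meaningful.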
With ACE_align_binary, the address aligned to a required boundary is easy to compute. For instance, with an alignment of 1, ACE_align_binary(ptr, alignment) expands to ((ptr + ((ptrdiff_t)(1-1))) & (~((ptrdiff_t)(1-1)))), which is just ptr. That means one-byte data (for example, a char) can be placed at any address. With an alignment of 2 and ptr equal to 21, ACE_align_binary(ptr, alignment) is ((21 + ((ptrdiff_t)(2-1))) & (~((ptrdiff_t)(2-1)))), which is 22 & ~1, so the result is 22.
3.2 Byte order
The second problem is byte order. It has to be solved during demarshaling, that is, when the receiver reads data from the CDR stream. The extraction (>>) operator is overloaded for each primitive type, and each overload calls the corresponding member function of ACE_InputCDR; for example, the overload for a short calls read_short.
The byte order problem does not exist for char, because it has only one byte. When reading data larger than one byte (2 bytes, 4 bytes, and so on), the receiver must check whether the byte order of the CDR stream data is the same as its native byte order. If it is not, the data bytes must be swapped.
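The decision the receiver makes can be sketched as follows. This is an illustration only: `native_is_little_endian` and `read_ulong` are hypothetical names, the `stream_is_little_endian` parameter stands for the byte-order tag that accompanies CDR data, and a generic byte reversal is used here instead of ACE's optimized swap routines.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// True when this machine stores the least significant byte first.
inline bool native_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical reader for a 4-byte unsigned value.
// The bytes are reversed only when the stream's order differs from ours.
inline std::uint32_t read_ulong(const unsigned char* src, bool stream_is_little_endian) {
    unsigned char bytes[4];
    std::memcpy(bytes, src, 4);
    if (stream_is_little_endian != native_is_little_endian()) {
        std::reverse(bytes, bytes + 4);  // receiver makes it right
    }
    std::uint32_t value = 0;
    std::memcpy(&value, bytes, 4);
    return value;
}
```

When sender and receiver share a byte order, the branch is never taken and the value is copied as-is, which is the efficiency argument made for CDR earlier.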
First, let’s see how to swap 2-byte data, such as short and unsigned short. In swap_2, orig points to the CDR stream buffer to be read, and target points to the place that holds the result after the swap. The logic is straightforward. Taking the 16-bit value 0x1234 as an example, usrc << 8 gives 0x3400 (once truncated to 16 bits) and usrc >> 8 gives 0x0012, so (usrc << 8) | (usrc >> 8) gives 0x3412, with the two bytes swapped.
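A minimal sketch of the 2-byte swap just described, keeping the names orig, target, and usrc from the text. It is not copied from the ACE sources: `std::memcpy` is used for the loads and stores (an assumption of this sketch) so that the function does not rely on the buffers being suitably aligned.

```cpp
#include <cstdint>
#include <cstring>

// Swap the two bytes at orig and store the result at target.
inline void swap_2(const char* orig, char* target) {
    std::uint16_t usrc = 0;
    std::memcpy(&usrc, orig, sizeof usrc);
    // The shifts are done in int; the cast truncates back to 16 bits.
    const std::uint16_t swapped =
        static_cast<std::uint16_t>((usrc << 8) | (usrc >> 8));
    std::memcpy(target, &swapped, sizeof swapped);
}
```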
Swapping 4-byte data, such as long and float, works the same way. Taking 0x12345678 as an example, x << 24 gives 0x78000000, (x & 0xff00) << 8 gives 0x00560000, (x & 0xff0000) >> 8 gives 0x00003400, and x >> 24 gives 0x00000012, so ORing the four parts together gives the final result 0x78563412.
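The four shifted parts from the walkthrough combine into a sketch like the one below. As before, this is an illustration rather than the ACE source, and the `std::memcpy` loads and stores are an assumption made to keep the function safe on unaligned buffers.

```cpp
#include <cstdint>
#include <cstring>

// Reverse the four bytes at orig and store the result at target.
inline void swap_4(const char* orig, char* target) {
    std::uint32_t x = 0;
    std::memcpy(&x, orig, sizeof x);
    x = (x << 24)                // lowest byte moves to the top
      | ((x & 0xff00u) << 8)     // second byte moves up one place
      | ((x & 0xff0000u) >> 8)   // third byte moves down one place
      | (x >> 24);               // highest byte moves to the bottom
    std::memcpy(target, &x, sizeof x);
}
```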
Swapping 8-byte data, such as long long and double, depends on the platform. A 64-bit architecture has a native 64-bit integer, while on a 32-bit architecture two 32-bit values are combined to form the 64-bit value; the swap_8 code distinguishes these two cases. We’ll focus on the 64-bit case, because the 32-bit case is similar to swap_4 above. Taking 0x12345678 AABBCCDD as an example, (x & 0x000000ff000000ffUL) << 24 gives 0x78000000 DD000000, (x & 0x0000ff000000ff00UL) << 8 gives 0x00560000 00CC0000, (x & 0x00ff000000ff0000UL) >> 8 gives 0x00003400 0000BB00, and (x & 0xff000000ff000000UL) >> 24 gives 0x00000012 000000AA. The bitwise OR of the four parts is 0x78563412 DDCCBBAA, in which each 32-bit half has been byte-swapped. Exchanging the two 32-bit halves then gives the final result, 0xDDCCBBAA 78563412.
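The 64-bit case can be sketched as below, following the masks used in the walkthrough. The variable names x84, x73, x62, and x51 are illustrative (each names the pair of byte positions it handles), `std::uint64_t` with `ULL` literals is used so the sketch does not depend on the width of `unsigned long`, and the 32-bit fallback path is omitted.

```cpp
#include <cstdint>
#include <cstring>

// Reverse the eight bytes at orig and store the result at target.
inline void swap_8(const char* orig, char* target) {
    std::uint64_t x = 0;
    std::memcpy(&x, orig, sizeof x);
    // Step 1: byte-swap each 32-bit half in place.
    const std::uint64_t x84 = (x & 0x000000ff000000ffULL) << 24;
    const std::uint64_t x73 = (x & 0x0000ff000000ff00ULL) << 8;
    const std::uint64_t x62 = (x & 0x00ff000000ff0000ULL) >> 8;
    const std::uint64_t x51 = (x & 0xff000000ff000000ULL) >> 24;
    x = x84 | x73 | x62 | x51;
    // Step 2: exchange the two 32-bit halves.
    x = (x << 32) | (x >> 32);
    std::memcpy(target, &x, sizeof x);
}
```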
Finally, swapping 16-byte data is easy to build on swap_8: each 8-byte half is byte-swapped, and the two halves are exchanged.
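A sketch of the 16-byte case built from two 8-byte swaps. To keep the block self-contained, swap_8 is given here as a simple byte-by-byte reversal rather than the mask-and-shift version; both produce the same result. The sketch assumes orig and target do not overlap.

```cpp
// Simple stand-in for swap_8: reverse eight bytes from orig into target.
inline void swap_8(const char* orig, char* target) {
    for (int i = 0; i < 8; ++i) {
        target[i] = orig[7 - i];
    }
}

// Reverse sixteen bytes: the swapped upper half of orig becomes the lower
// half of target, and the swapped lower half becomes the upper half.
inline void swap_16(const char* orig, char* target) {
    swap_8(orig + 8, target);
    swap_8(orig, target + 8);
}
```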
4. Summary
Marshal and demarshal play an important role in CORBA interoperability. Understanding how data is aligned and how the byte order problem is solved is the key to understanding the marshal and demarshal mechanism. Once these two points are clear, CORBA interoperability is no longer mysterious.
References.
[1] Michi Henning and Steve Vinoski, Advanced CORBA Programming with C++, Addison-Wesley.
[2] ~schmidt/TAO.html