Category: C/C++

2010-07-28 17:41:52

What is two-phase construction?
When is it used?

1. Introduction to two-phase initialization and why it is necessary

Excerpted from http://blogs.microsoft.co.il/blogs/sasha/archive/2008/08/19/two-phase-initialization.aspx

Two-phase initialization is an architectural pattern for artificially breaking and managing coupling between strongly coupled components.  The motivation and implementation of this pattern are not always obvious, so I will give a couple of examples to demonstrate.

Let’s take an operating system as an example.  Some of the components involved in the initialization of the operating system are the I/O manager, the memory manager, the object manager and many others.  At runtime, the strong coupling between the various components is obvious and beneficial – they tend to use each other, all the time.

However, during system startup, these dependencies (especially if startup is performed synchronously) can lead to a dead end.  For example:

  1. The memory manager initializes.  It needs to create shared memory objects (section objects) to represent binary images being loaded during system startup.  This requires a trip to the object manager.
  2. The object manager initializes.  It needs to allocate memory for the system handle table and for the actual resources being created.  This requires a trip to the memory manager.

Another example can be taken from an ESB infrastructure I have been implementing lately.  The infrastructure services include a configuration service, a publish/subscribe service and a “DNS”-style service.  These services are typically used by other system components, but they also need each other:

  1. The configuration service initializes.  It needs to register itself in the “DNS” service to be accessible by other system components.
  2. The “DNS” service initializes.  It needs to obtain its configuration and use the pub/sub service to register for configuration change notifications.
  3. The pub/sub service initializes.  It needs to obtain its configuration and register itself in the “DNS” service to be accessible by other system components.

Disentangling these dependencies can be done in various ways.  For example, we could say that the infrastructure services are not allowed to use each other – the pub/sub service will use local configuration, the “DNS” service will have a predefined list of registered endpoints, etc.

However, in an operating system we can’t resort to a solution in which the object manager manages its own memory, and the memory manager manages its own objects.

The only feasible alternative is two-phase initialization.

When using two-phase initialization, infrastructure components initialize in two phases.  In the first phase, they do not rely on any other components to reach a stable state in which they are able to provide basic services to the rest of the system.  In the second phase, they transition to a fully-functional state in which they rely on other components (which have not necessarily reached the second phase yet).

Using this model in our example, the “DNS” service can start with a predefined list of endpoints that will be used to communicate with the infrastructure services while they are in the first phase.  In the second phase, these predefined endpoints will be replaced by the actual endpoints for the actual services.  The pub/sub service can start with a local configuration during the first phase, and retrieve its configuration when the configuration service becomes available (enters the first phase), and so on.
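
The ordering described above can be sketched in C++. This is only an illustration under stated assumptions: the class names (ConfigService, PubSubService, System), the initPhase1/initPhase2 methods, the "pubsub.endpoint" key and the endpoint strings are all hypothetical and do not come from the ESB infrastructure the article describes.

```cpp
#include <cassert>
#include <string>

class ConfigService;  // forward declaration: the two services refer to each other

// Hypothetical pub/sub service.
class PubSubService {
public:
    // First phase: relies on no other component and uses a local default.
    void initPhase1() { endpoint_ = "local-default"; ready_ = true; }
    // Second phase: may now ask the configuration service for real settings.
    void initPhase2(const ConfigService& config);
    const std::string& endpoint() const { return endpoint_; }
    bool ready() const { return ready_; }
private:
    std::string endpoint_;
    bool ready_ = false;
};

// Hypothetical configuration service.
class ConfigService {
public:
    // First phase: able to answer lookups, but not yet subscribed to anything.
    void initPhase1() { ready_ = true; }
    // Second phase: registers for change notifications through pub/sub.
    void initPhase2(const PubSubService& pubsub) {
        assert(pubsub.ready());  // pub/sub must have completed its first phase
        subscribed_ = true;
    }
    std::string lookup(const std::string& key) const {
        assert(ready_);
        return key == "pubsub.endpoint" ? "tcp://broker:5555" : "";
    }
    bool subscribed() const { return subscribed_; }
private:
    bool ready_ = false;
    bool subscribed_ = false;
};

inline void PubSubService::initPhase2(const ConfigService& config) {
    endpoint_ = config.lookup("pubsub.endpoint");
}

// Startup driver: every component finishes the first phase before any
// component enters the second one, so the circular dependency never blocks.
struct System {
    ConfigService config;
    PubSubService pubsub;
    void start() {
        config.initPhase1();
        pubsub.initPhase1();
        pubsub.initPhase2(config);
        config.initPhase2(pubsub);
    }
};
```

After System::start() returns, the pub/sub service has replaced its local default with the value served by the configuration service, and the configuration service has subscribed through pub/sub. The sketch is single-threaded and ignores the synchronization issues the article lists further on.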

Providing a generic implementation for all infrastructure and non-infrastructure services to account for two-phase initialization is exceptionally difficult, but achievable if the proper metadata is in place.  Components must provide metadata regarding their explicit dependencies and ways to make forward progress while these dependent components are not yet available.

This sounds simple, but in reality it really isn’t.  Multiple issues plague the two-phase initialization pattern, but do not undermine its principal validity:

  • Transitioning between initialization phases might require a significant amount of work.  For example, the pub/sub service might use a database to store the subscription information, and when transitioning to the second phase (by talking to the configuration service) the connection string to the database might have changed.
  • Deadlocks can be introduced into the startup sequence if initialization is not carefully asynchronous.
  • Terrible race conditions can be introduced into the startup sequence if it is not carefully synchronized for multiple threads of execution.
  • Lots of noise is generated in the system while it’s restarting or when some components are being reinitialized.

The two-phase initialization approach is used by Windows.  In the first phase (called phase 0), initialization proceeds in a single thread and brings up only the minimal services required for the second phase.  In the second phase (called phase 1), system components can rely on other components being present to start transitioning into their fully-functional state.

To summarize, two-phase initialization is difficult to manage and implement, but in the real world where components circularly depend on each other there is rarely a better alternative.


2. An example of applying the technique in C++ constructors: two-phase construction
Excerpted from Pete Isensee

An object with one-phase construction is fully "built" with the constructor. An object with two-phase construction is minimally initialized in the constructor and fully "built" using a class method. Frequently copied objects with expensive constructors and destructors can be serious bottlenecks and are great candidates for two-phase construction. Designing your classes to support two-phase construction, even if internally they use one-phase, will make future optimizations easy.

The following code shows two different objects, OnePhase and TwoPhase, based on a Bitmap class. They both have the same external interface. Their internals are quite different. The OnePhase object is fully initialized in the constructor. The code for OnePhase is very simple. The code for TwoPhase, on the other hand, is more complicated. The TwoPhase constructor simply initializes a pointer. The TwoPhase methods have to check the pointer and allocate the Bitmap object if necessary.

        
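class OnePhase
{
private:
    Bitmap m_bMap; // Bitmap is a "one-phase" constructed object
public:
    bool Create(int nWidth, int nHeight)
    {
        return (m_bMap.Create(nWidth, nHeight));
    }
    int GetWidth() const
    {
        return (m_bMap.GetWidth());
    }
};

class TwoPhase
{
private:
    Bitmap* m_pbMap; // Ptr lends itself to two-phase construction
public:
    // Phase one: only the pointer is initialized; no Bitmap is built yet.
    TwoPhase()
    {
        m_pbMap = NULL;
    }
    // NOTE: the implicit copy operations would copy the raw pointer and
    // lead to a double delete; a real class must define its own copy
    // constructor and assignment operator (not shown here).
    ~TwoPhase()
    {
        delete m_pbMap; // deleting a null pointer is a no-op
    }
    // Phase two: the Bitmap is allocated on first use.
    bool Create(int nWidth, int nHeight)
    {
        if (m_pbMap == NULL)
            m_pbMap = new Bitmap;
        return (m_pbMap->Create(nWidth, nHeight));
    }
    int GetWidth() const
    {
        return (m_pbMap == NULL ? 0 : m_pbMap->GetWidth());
    }
};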

What kind of savings can you expect? It depends. If you copy many objects, especially "empty" objects, the savings can be significant. If you don't do a lot of copying, two-phase construction can have a negative impact, because it adds a new level of indirection.
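
To make the "empty objects" case concrete, here is a minimal sketch that counts Bitmap constructions. It rests on assumptions: the Bitmap below is a hypothetical stand-in for the article's class, and the copy constructor and assignment operator on TwoPhase are additions that the excerpt does not show.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the article's Bitmap; it counts constructions.
struct Bitmap {
    static int constructed;
    std::vector<unsigned char> pixels;
    int width = 0;
    Bitmap() { ++constructed; }
    Bitmap(const Bitmap& other) : pixels(other.pixels), width(other.width) {
        ++constructed;
    }
    bool Create(int w, int h) {
        if (w <= 0 || h <= 0) return false;
        width = w;
        pixels.assign(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0);
        return true;
    }
    int GetWidth() const { return width; }
};
int Bitmap::constructed = 0;

class TwoPhase {
public:
    TwoPhase() : m_pbMap(nullptr) {}
    // Assumed addition: deep copy, which stays cheap when the source is empty.
    TwoPhase(const TwoPhase& other)
        : m_pbMap(other.m_pbMap ? new Bitmap(*other.m_pbMap) : nullptr) {}
    // Copy-and-swap assignment keeps ownership of the pointer consistent.
    TwoPhase& operator=(TwoPhase other) {
        std::swap(m_pbMap, other.m_pbMap);
        return *this;
    }
    ~TwoPhase() { delete m_pbMap; }
    bool Create(int w, int h) {
        if (m_pbMap == nullptr) m_pbMap = new Bitmap;
        return m_pbMap->Create(w, h);
    }
    int GetWidth() const { return m_pbMap ? m_pbMap->GetWidth() : 0; }
private:
    Bitmap* m_pbMap;
};

// Returns how many Bitmaps were constructed while making n copies of an
// empty TwoPhase object.
int bitmapsBuiltByCopyingEmpty(int n) {
    const int before = Bitmap::constructed;
    TwoPhase empty;
    std::vector<TwoPhase> copies(n, empty);
    return Bitmap::constructed - before;
}
```

Copying an empty TwoPhase only copies a null pointer, so no Bitmap is constructed at all. A OnePhase object would construct one Bitmap per copy. Once Create() has been called, every access pays for the extra indirection, which is the cost the paragraph above warns about.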


3. Using two-phase construction to solve the problem of calling virtuals during initialization

References:

1.%2B%2B_Idioms/Calling_Virtuals_During_Initialization

2.

This section covers the Dynamic Binding During Initialization idiom (also known as Calling Virtuals During Initialization).

To clarify, we're talking about this situation:

 class Base {
 public:
   Base();
   ...
   virtual void foo(int n) const;  // often pure virtual
   virtual double bar() const;     // often pure virtual
   // if you don't want outsiders calling these, make them protected
 };

 Base::Base()
 {
   ... foo(42) ... bar() ...
   // these will NOT use dynamic binding
   // goal: simulate dynamic binding in those calls
 }

 class Derived : public Base {
 public:
   ...
   virtual void foo(int n) const;
   virtual double bar() const;
 };

This FAQ shows some ways to simulate dynamic binding as if the calls made in Base's constructor dynamically bound to the this object's derived class. The ways we'll show have tradeoffs, so choose the one that best fits your needs, or make up another.
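
The underlying problem can be reproduced in a few lines. This is a minimal sketch; the name() function and the seenInConstructor() accessor are hypothetical and exist only to record which override ran.

```cpp
#include <string>

class Base {
public:
    // While Base's constructor runs, the object is still only a Base,
    // so this call resolves to Base::name() even when a Derived is being built.
    Base() { seenInConstructor_ = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
    const std::string& seenInConstructor() const { return seenInConstructor_; }
private:
    std::string seenInConstructor_;
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};
```

Constructing a Derived and reading seenInConstructor() yields "Base", while calling name() on the finished object yields "Derived". If name() were pure virtual in Base, the call from the constructor would be undefined behavior rather than a harmless surprise.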

The first approach is a two-phase initialization. In Phase I, someone calls the actual constructor; in Phase II, someone calls an "init" method on the object. Dynamic binding on the this object works fine during Phase II, and Phase II is conceptually part of construction, so we simply move some code from the original Base::Base() into Base::init().

 class Base {
 public:
   void init();  // may or may not be virtual
   ...
   virtual void foo(int n) const;  // often pure virtual
   virtual double bar() const;     // often pure virtual
 };

 void Base::init()
 {
   ... foo(42) ... bar() ...
   // most of this is copied from the original Base::Base()
 }

 class Derived : public Base {
 public:
   ...
   virtual void foo(int n) const;
   virtual double bar() const;
 };
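
A runnable version of the same idea, as a minimal sketch: name() and description() are hypothetical members standing in for the elided foo() and bar() calls.

```cpp
#include <string>

class Base {
public:
    Base() {}                 // Phase I: no virtual calls here
    virtual ~Base() {}
    // Phase II: the object is fully constructed, so name() binds dynamically.
    void init() { description_ = "initialized as " + name(); }
    virtual std::string name() const { return "Base"; }
    const std::string& description() const { return description_; }
private:
    std::string description_;
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};
```

Calling init() on a Derived object stores "initialized as Derived". Until init() has been called, description() is empty, which is exactly the window that the variations discussed next try to close.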

The only remaining issues are determining where to call Phase I and where to call Phase II. There are many variations on where these calls can live; we will consider two.

The first variation is simplest initially, though the code that actually wants to create objects requires a tiny bit of programmer self-discipline, which in practice means you're doomed. Seriously, if there are only one or two places that actually create objects of this hierarchy, the programmer self-discipline is quite localized and shouldn't cause problems.

In this variation, the code that is creating the object explicitly executes both phases. When executing Phase I, the code creating the object either knows the object's exact class (e.g., new Derived() or perhaps a local Derived object), or doesn't know the object's exact class (e.g., it goes through some kind of factory). The "doesn't know" case is strongly preferred when you want to make it easy to plug in new derived classes.

Note: Phase I often, but not always, allocates the object from the heap. When it does, you should store the pointer in some sort of managed pointer, such as a std::auto_ptr, or some other object whose destructor deletes the allocation. This is the best way to prevent memory leaks when Phase II might throw an exception. The following example assumes Phase I allocates the object from the heap.

 #include <memory>

 void joe_user()
 {
   // std::auto_ptr was removed in C++17; std::unique_ptr is its modern replacement
   std::auto_ptr<Base> p(/*...somehow create a Derived object via new...*/);
   p->init();
   ...
 }

The second variation is to combine the first two lines of the joe_user function into some create function. That's almost always the right thing to do when there are lots of joe_user-like functions. For example, if you're using some kind of factory (such as a registry), you could move those two lines into a static method called Base::Create():

 #include <memory>

 class Base {
 public:
   ...
   typedef std::auto_ptr<Base> Ptr;  // typedefs simplify the code

   // Runs Phase I (the constructor of D) and Phase II (init) together
   template <class D, class Parameter>
   static Ptr Create(Parameter p)
   {
     std::auto_ptr<Base> ptr(new D(p));
     ptr->init();
     return ptr;
   }
   ...
 };
 

This simplifies all the joe_user-like functions (a little), but more importantly, it reduces the chance that any of them will create a Derived object without also calling init() on it.

 void joe_user()
 {
   Base::Ptr b = Base::Create<Derived>("para");
 }

If you're sufficiently clever and motivated, you can even eliminate the chance that someone could create a Derived object without also calling init() on it. An important step in achieving that goal is to make Derived's constructors, including its copy constructor, protected or private.
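
One possible way to enforce this is sketched below. It is not the FAQ's own code and rests on assumptions: Derived's constructors are made private, Base is declared a friend so that the factory can still reach them, and std::unique_ptr (C++11) is swapped in for std::auto_ptr. The describe() and label() members are hypothetical.

```cpp
#include <memory>
#include <string>
#include <utility>

class Base {
public:
    virtual ~Base() {}

    // The only way to obtain an object: runs Phase I (constructor) and
    // Phase II (init) together, so callers cannot skip init().
    template <class D, class... Args>
    static std::unique_ptr<Base> Create(Args&&... args) {
        std::unique_ptr<Base> ptr(new D(std::forward<Args>(args)...));
        ptr->init();
        return ptr;
    }

    const std::string& label() const { return label_; }

protected:
    Base() {}
    virtual std::string describe() const = 0;

private:
    // Phase II: safe to call virtuals, the object is fully constructed.
    void init() { label_ = describe(); }
    std::string label_;
};

class Derived : public Base {
    friend class Base;  // lets Base::Create reach the private constructor

    explicit Derived(const std::string& tag) : tag_(tag) {}
    Derived(const Derived&) = delete;             // no copies that bypass Create
    Derived& operator=(const Derived&) = delete;

    std::string describe() const override { return "Derived(" + tag_ + ")"; }
    std::string tag_;
};
```

With this layout, Base::Create<Derived>("para") returns an object whose label() is "Derived(para)", while a plain `Derived d("para");` or `new Derived("para")` outside the hierarchy fails to compile. The cost is that every derived class has to grant Base friendship (or expose its constructors as protected to a suitable factory).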


