dlib人脸对齐源码详解-mingfei10-ChinaUnix博客

mingfei10的ChinaUnix博客

首页　| 　博文目录　| 　关于我

mingfei10

博客访问： 1535297
博文数量： 230
博客积分： 474
博客等级：下士
技术积分： 1955
用户组：普通用户
注册时间： 2010-03-19 18:40

文章分类

全部博文（230）

python（21）

人工智能（10）
java（1）
SDN&NFV（2）
linux Manag（8）
linux Dev（8）
cloud（10）
Storage（10）
未分配的博文（170）

文章存档

2020年（3）

2019年（3）

2018年（12）

2017年（13）

2016年（11）

2015年（55）

2014年（74）

2013年（39）

2012年（2）

2011年（18）

我的朋友

cj83226

相关博文

dlib人脸对齐源码详解

分类： C/C++

2018-08-30 23:09:27

转自：https://blog.csdn.net/qq_29573053/article/details/78517167

一般的人脸识别应用通常都包括三个过程：

1 人脸detect，这一步主要是定位人脸在图像中的位置，输出人脸的位置矩形框

2 人脸shape predictor，这一步主要是找出眼睛眉毛鼻子嘴巴的68个定位点

3 人脸对齐alignment，这一步主要是通过投影几何变换出一张标准脸

4 人脸识别，这一步主要在对齐的人脸图像上提取128维的特征向量，根据特征向量间的距离来进行判断识别。

本编文章主要想解读一下dlib中关于人脸对齐的源码。人脸对齐主要是一个affline transform即仿射变换，我们在detect到人脸后会得到一个矩形位置框，需要将这个矩形里面的人脸变换到150*150大小的标准人脸，英文叫做chip，首先看一下一个结构体，它包含变换的一些基本信息chip_detail:


	
	
		
		
			
			
				
			

		

		
			
			
				struct chip_details
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details() : angle(0), rows(0), cols(0) {}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const rectangle& rect_) : rect(rect_),angle(0), rows(rect_.height()), cols(rect_.width()) {}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const drectangle& rect_) : rect(rect_),angle(0),
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rows((unsigned long)(rect_.height()+0.5)), cols((unsigned long)(rect_.width()+0.5)) {}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const drectangle& rect_, unsigned long size) : rect(rect_),angle(0)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{ compute_dims_from_size(size); }
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const drectangle& rect_, unsigned long size, double angle_) : rect(rect_),angle(angle_)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{ compute_dims_from_size(size); }
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const drectangle& rect_, const chip_dims& dims) :
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rect(rect_),angle(0),rows(dims.rows), cols(dims.cols) {}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(const drectangle& rect_, const chip_dims& dims, double angle_) :
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rect(rect_),angle(angle_),rows(dims.rows), cols(dims.cols) {}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				template <typename T>
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const std::vectorvector2> >& chip_points,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const std::vectorvector2> >& img_points,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const chip_dims& dims
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				) :
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rows(dims.rows), cols(dims.cols)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				DLIB_CASSERT( chip_points.size() == img_points.size() && chip_points.size() >= 2,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				"\t chip_details::chip_details(chip_points,img_points,dims)"
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t Invalid inputs were given to this function."
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t chip_points.size(): " << chip_points.size()
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t img_points.size():  " << img_points.size()
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const point_transform_affine tform = find_similarity_transform(chip_points,img_points);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				dlib::vector<double,2> p(1,0);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				p = tform.get_m()*p;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// There are only 3 things happening in a similarity transform.  There is a
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// rescaling, a rotation, and a translation.  So here we pick out the scale and
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// rotation parameters.
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				angle = std::atan2(p.y(),p.x());
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// Note that the translation and scale part are represented by the extraction
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// rectangle.  So here we build the appropriate rectangle.
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const double scale = length(p);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rect = centered_drect(tform(point(dims.cols,dims.rows)/2.0),
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				dims.cols*scale,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				dims.rows*scale);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				drectangle rect;//chip在原图中的位置大小
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				double angle;//chip和原图目标间的角度
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				unsigned long rows; //chip的行列大小
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				unsigned long cols;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				inline unsigned long size() const 
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				return rows*cols;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				private:
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				void compute_dims_from_size (
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 unsigned long size
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 ) 
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const double relative_size = std::sqrt(size/(double)rect.area());
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rows = static_cast<unsigned long>(rect.height()*relative_size + 0.5);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				cols  = static_cast<unsigned long>(size/(double)rows + 0.5);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rows = std::max(1ul,rows);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				cols = std::max(1ul,cols);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				};

这个结构体的构造函数，主要传入chip上的特征点和实际人脸的特征点，找到一个相似变换，然后提取角度,平移，缩放三个方面的信息，角度放在angle，平移和缩放放在rect


	
	
		
		
			
			
				
			

		

		
			
			
				template <typename T>
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_details(
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const std::vectorvector2> >& chip_points,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const std::vectorvector2> >& img_points,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const chip_dims& dims
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				)

需要注意的是，这里的变换是从chip-》img的（img指原图），意思给定chip上一个点，可以找到img上面对应点。
接下来就是获取这个人脸的chip_detail方法了:


	
	
		
		
			
			
				
			

		

		
			
			
				inline chip_details get_face_chip_details (
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const full_object_detection& det,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const unsigned long size = 200,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const double padding = 0.2
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				DLIB_CASSERT(det.num_parts() == 68,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				"\t chip_details get_face_chip_details()"
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t You must give a detection with exactly 68 parts in it."
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t det.num_parts(): " << det.num_parts()
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				DLIB_CASSERT(padding >= 0 && size > 0,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				"\t chip_details get_face_chip_details()"
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t Invalid inputs were given to this function."
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t padding: " << padding
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t size:    " << size
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// Average positions of face points 17-67
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const double mean_face_shape_x[] = {
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.000213256, 0.0752622, 0.18113, 0.29077, 0.393397, 0.586856, 0.689483, 0.799124,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.904991, 0.98004, 0.490127, 0.490127, 0.490127, 0.490127, 0.36688, 0.426036,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.490127, 0.554217, 0.613373, 0.121737, 0.187122, 0.265825, 0.334606, 0.260918,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.182743, 0.645647, 0.714428, 0.793132, 0.858516, 0.79751, 0.719335, 0.254149,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.340985, 0.428858, 0.490127, 0.551395, 0.639268, 0.726104, 0.642159, 0.556721,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.490127, 0.423532, 0.338094, 0.290379, 0.428096, 0.490127, 0.552157, 0.689874,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.553364, 0.490127, 0.42689
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				};
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const double mean_face_shape_y[] = {
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.106454, 0.038915, 0.0187482, 0.0344891, 0.0773906, 0.0773906, 0.0344891,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.0187482, 0.038915, 0.106454, 0.203352, 0.307009, 0.409805, 0.515625, 0.587326,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.609345, 0.628106, 0.609345, 0.587326, 0.216423, 0.178758, 0.179852, 0.231733,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.245099, 0.244077, 0.231733, 0.179852, 0.178758, 0.216423, 0.244077, 0.245099,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.780233, 0.745405, 0.727388, 0.742578, 0.727388, 0.745405, 0.780233, 0.864805,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.902192, 0.909281, 0.902192, 0.864805, 0.784792, 0.778746, 0.785343, 0.778746,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				0.784792, 0.824182, 0.831803, 0.824182
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				};
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				COMPILE_TIME_ASSERT(sizeof(mean_face_shape_x)/sizeof(double) == 68-17);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				std::vectorvector<double,2> > from_points, to_points;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				for (unsigned long i = 17; i < det.num_parts(); ++i)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// Ignore the lower lip
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				if ((55 <= i && i <= 59) || (65 <= i && i <= 67))
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				continue;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// Ignore the eyebrows 
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				if (17 <= i && i <= 26)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				continue;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				dlib::vector<double,2> p;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				p.x() = (padding+mean_face_shape_x[i-17])/(2*padding+1);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				p.y() = (padding+mean_face_shape_y[i-17])/(2*padding+1);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				from_points.push_back(p*size);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				to_points.push_back(det.part(i));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				return chip_details(from_points, to_points, chip_dims(size,size));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}

这个首先定义标准脸特征点的位置，再解析出了chip_detail结构体。

再接下来就是提取人脸图片了：


	
	
		
		
			
			
				
			

		

		
			
			
				template <
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				typename image_type1,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				typename image_type2
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				>
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				void extract_image_chips (
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 const image_type1& img,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 const std::vector& chip_locations,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 dlib::array& chips
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				 )
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// make sure requires clause is not broken
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				#ifdef ENABLE_ASSERTS
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				for (unsigned long i = 0; i < chip_locations.size(); ++i)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				DLIB_CASSERT(chip_locations[i].size() != 0 &&
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_locations[i].rect.is_empty() == false,
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				"\t void extract_image_chips()"
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t Invalid inputs were given to this function."
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t chip_locations["<"].size():            " << chip_locations[i].size()
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				<< "\n\t chip_locations["<"].rect.is_empty(): " << chip_locations[i].rect.is_empty()
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				#endif 
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				pyramid_down<2> pyr;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				long max_depth = 0;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// If the chip is supposed to be much smaller than the source subwindow then you
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// can't just extract it using bilinear interpolation since at a high enough
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// downsampling amount it would effectively turn into nearest neighbor
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// interpolation.  So we use an image pyramid to make sure the interpolation is
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// fast but also high quality.  The first thing we do is figure out how deep the
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// image pyramid needs to be.
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rectangle bounding_box;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				for (unsigned long i = 0; i < chip_locations.size(); ++i)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				long depth = 0;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				double grow = 2;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				drectangle rect = pyr.rect_down(chip_locations[i].rect);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				while (rect.area() > chip_locations[i].size())
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rect = pyr.rect_down(rect);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				++depth;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// We drop the image size by a factor of 2 each iteration and then assume a
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// border of 2 pixels is needed to avoid any border effects of the crop.
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				grow = grow*2 + 2;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				drectangle rot_rect;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				const vector<double,2> cent = center(chip_locations[i].rect);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rot_rect += rotate_point<double>(cent,chip_locations[i].rect.tl_corner(),chip_locations[i].angle);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rot_rect += rotate_point<double>(cent,chip_locations[i].rect.tr_corner(),chip_locations[i].angle);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rot_rect += rotate_point<double>(cent,chip_locations[i].rect.bl_corner(),chip_locations[i].angle);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rot_rect += rotate_point<double>(cent,chip_locations[i].rect.br_corner(),chip_locations[i].angle);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				bounding_box += grow_rect(rot_rect, grow).intersect(get_rect(img));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				max_depth = std::max(depth,max_depth);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				//std::cout << "max_depth: " << max_depth << std::endl;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				//std::cout << "crop amount: " << bounding_box.area()/(double)get_rect(img).area() << std::endl;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// now make an image pyramid
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				dlib::arraytypename image_traits::pixel_type> > levels(max_depth);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				if (levels.size() != 0)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				pyr(sub_image(img,bounding_box),levels[0]);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				for (unsigned long i = 1; i < levels.size(); ++i)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				pyr(levels[i-1],levels[i]);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				std::vectorvector<double,2> > from, to;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// now pull out the chips
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chips.resize(chip_locations.size());
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				for (unsigned long i = 0; i < chips.size(); ++i)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// If the chip doesn't have any rotation or scaling then use the basic version
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// of chip extraction that just does a fast copy.
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				if (chip_locations[i].angle == 0 &&
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_locations[i].rows == chip_locations[i].rect.height() &&
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				chip_locations[i].cols == chip_locations[i].rect.width())
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				impl::basic_extract_image_chip(img, chip_locations[i].rect, chips[i]);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				else
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				set_image_size(chips[i], chip_locations[i].rows, chip_locations[i].cols);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// figure out which level in the pyramid to use to extract the chip
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				int level = -1;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				drectangle rect = translate_rect(chip_locations[i].rect, -bounding_box.tl_corner());
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				while (pyr.rect_down(rect).area() > chip_locations[i].size())
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				{
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				++level;
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				rect = pyr.rect_down(rect);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// find the appropriate transformation that maps from the chip to the input
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// image
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				from.clear();
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				to.clear();
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				from.push_back(get_rect(chips[i]).tl_corner());  to.push_back(rotate_point<double>(center(rect),rect.tl_corner(),chip_locations[i].angle));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				from.push_back(get_rect(chips[i]).tr_corner());  to.push_back(rotate_point<double>(center(rect),rect.tr_corner(),chip_locations[i].angle));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				from.push_back(get_rect(chips[i]).bl_corner());  to.push_back(rotate_point<double>(center(rect),rect.bl_corner(),chip_locations[i].angle));
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				point_transform_affine trns = find_affine_transform(from,to);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				// now extract the actual chip
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				if (level == -1)
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				transform_image(sub_image(img,bounding_box),chips[i],interpolate_bilinear(),trns);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				else
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				transform_image(levels[level],chips[i],interpolate_bilinear(),trns);
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}
			

		

	

	
		
		
			
			
				
			

		

		
			
			
				}

上面函数中，首先需要构造一个图像金字塔用于图像缩放，因为如果我们的chip是150*150的，映射到原图上的1000*1000，这个就需要有个缩放的过程，如果我们直接从缩放到150*150得到的图像质量不好，因此采用一级级下采样来缩放，函数中首先寻找到对应的目标下采样深度，进行图像下采样，接着通过find_affine_transform来计算到仿射变换矩阵，得到矩阵后直接transform_image就好了。transform_image里面就不分析了，基本就是一个个像素位置变换填充就可以了。

阅读(5030) | 评论(0) | 转发(0) |

上一篇： dlib人脸检测源码解析

下一篇：深入理解python多进程编程

给主人留下些什么吧！~~

感谢所有关心和支持过ChinaUnix的朋友们

16024965号-6