Category: C/C++

2018-08-30 23:09:27

Reposted from: https://blog.csdn.net/qq_29573053/article/details/78517167

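A typical face recognition application consists of four steps:

1 Face detection: locate the face in the image and output its bounding rectangle.

2 Face shape prediction: find the 68 landmark points on the jawline, eyebrows, eyes, nose, and mouth.

3 Face alignment: warp the face into a standard pose through a geometric transform.

4 Face recognition: extract a 128-dimensional feature vector from the aligned face image and decide identity from the distance between feature vectors.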
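
Step 3 in the list above comes down to applying a 2x2 matrix plus a translation to every point. As a minimal sketch, independent of dlib (the names `Point2`, `Affine2`, `apply`, and `make_similarity` are hypothetical and exist only for illustration):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for illustration only; dlib has its own
// dlib::vector<double,2> and point_transform_affine.
struct Point2 { double x, y; };

struct Affine2 {
    double m00, m01, m10, m11; // 2x2 linear part (rotation/scale/shear)
    double bx, by;             // translation
};

// Apply q = M*p + b.
inline Point2 apply(const Affine2& t, const Point2& p) {
    return { t.m00 * p.x + t.m01 * p.y + t.bx,
             t.m10 * p.x + t.m11 * p.y + t.by };
}

// Build a similarity transform: uniform scale s, rotation by angle (radians),
// followed by a translation (bx, by).
inline Affine2 make_similarity(double s, double angle, double bx, double by) {
    assert(s > 0);
    const double c = s * std::cos(angle);
    const double n = s * std::sin(angle);
    return { c, -n, n, c, bx, by };
}
```

For example, a transform built with scale 2, a 90 degree rotation, and translation (10, 20) sends the point (1, 0) to (10, 22).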


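This article walks through the face alignment source code in dlib. Face alignment is essentially an affine transform. After detection we have a rectangle locating the face, and the face inside that rectangle has to be warped into a standard face of fixed size, for example 150*150; dlib calls this standard image a chip. We start with chip_details, the struct that holds the basic information about the transform: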


    struct chip_details
    {
        chip_details() : angle(0), rows(0), cols(0) {}
        chip_details(const rectangle& rect_) : rect(rect_),angle(0), rows(rect_.height()), cols(rect_.width()) {}
        chip_details(const drectangle& rect_) : rect(rect_),angle(0),
            rows((unsigned long)(rect_.height()+0.5)), cols((unsigned long)(rect_.width()+0.5)) {}
        chip_details(const drectangle& rect_, unsigned long size) : rect(rect_),angle(0)
        { compute_dims_from_size(size); }
        chip_details(const drectangle& rect_, unsigned long size, double angle_) : rect(rect_),angle(angle_)
        { compute_dims_from_size(size); }
        chip_details(const drectangle& rect_, const chip_dims& dims) :
            rect(rect_),angle(0),rows(dims.rows), cols(dims.cols) {}
        chip_details(const drectangle& rect_, const chip_dims& dims, double angle_) :
            rect(rect_),angle(angle_),rows(dims.rows), cols(dims.cols) {}
        template <typename T>
        chip_details(
            const std::vector<dlib::vector<T,2> >& chip_points,
            const std::vector<dlib::vector<T,2> >& img_points,
            const chip_dims& dims
        ) :
            rows(dims.rows), cols(dims.cols)
        {
            DLIB_CASSERT( chip_points.size() == img_points.size() && chip_points.size() >= 2,
                "\t chip_details::chip_details(chip_points,img_points,dims)"
                << "\n\t Invalid inputs were given to this function."
                << "\n\t chip_points.size(): " << chip_points.size()
                << "\n\t img_points.size(): " << img_points.size()
            );
            const point_transform_affine tform = find_similarity_transform(chip_points,img_points);
            dlib::vector<double,2> p(1,0);
            p = tform.get_m()*p;
            // There are only 3 things happening in a similarity transform. There is a
            // rescaling, a rotation, and a translation. So here we pick out the scale and
            // rotation parameters.
            angle = std::atan2(p.y(),p.x());
            // Note that the translation and scale part are represented by the extraction
            // rectangle. So here we build the appropriate rectangle.
            const double scale = length(p);
            rect = centered_drect(tform(point(dims.cols,dims.rows)/2.0),
                                  dims.cols*scale,
                                  dims.rows*scale);
        }
        drectangle rect;     // position and size of the chip in the original image
        double angle;        // rotation between the chip and the target in the original image
        unsigned long rows;  // number of rows and columns in the chip
        unsigned long cols;
        inline unsigned long size() const
        {
            return rows*cols;
        }
    private:
        void compute_dims_from_size (
            unsigned long size
        )
        {
            const double relative_size = std::sqrt(size/(double)rect.area());
            rows = static_cast<unsigned long>(rect.height()*relative_size + 0.5);
            cols = static_cast<unsigned long>(size/(double)rows + 0.5);
            rows = std::max(1ul,rows);
            cols = std::max(1ul,cols);
        }
    };
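The constructor of interest in this struct takes the landmark points on the chip and the corresponding landmark points on the actual face, finds a similarity transform between them, and extracts three pieces of information from it: rotation, translation, and scale. The rotation is stored in angle; the translation and scale are encoded in rect.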


    template <typename T>
    chip_details(
        const std::vector<dlib::vector<T,2> >& chip_points,
        const std::vector<dlib::vector<T,2> >& img_points,
        const chip_dims& dims
    )
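
The angle and scale extraction performed by this constructor can be reproduced without dlib. The sketch below assumes the linear part of the similarity transform is available as a plain 2x2 matrix; `Mat2`, `AngleScale`, and `decompose` are hypothetical names, not dlib types.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2x2 matrix type standing in for the result of tform.get_m().
struct Mat2 { double m00, m01, m10, m11; };

struct AngleScale { double angle, scale; };

// Push the unit vector (1,0) through M, as the dlib constructor does.
// For a similarity transform M = s*R(theta) the image of (1,0) is
// (s*cos(theta), s*sin(theta)), so atan2 recovers theta and the vector
// length recovers s.
inline AngleScale decompose(const Mat2& m) {
    const double px = m.m00; // M * (1,0) is simply the first column of M
    const double py = m.m10;
    const double scale = std::hypot(px, py);
    assert(scale > 0);
    return { std::atan2(py, px), scale };
}
```

For a matrix equal to 3 times a 30 degree rotation, `decompose` returns an angle of pi/6 and a scale of 3. Following the constructor's `dims.cols*scale` and `dims.rows*scale`, a 150x150 chip with that scale would correspond to a 450x450 extraction rectangle in the original image.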
Note that the transform here maps from chip to img (img being the original image): given a point on the chip, it yields the corresponding point in img.
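
Mapping from chip to image is the convenient direction for resampling: every output pixel asks where it comes from, so the chip ends up with no holes. Here is a minimal nearest-neighbor sketch under simplifying assumptions (pure scaling plus offset, no rotation, row-major grayscale buffers). `extract_chip_nn` is a hypothetical name, and dlib itself uses bilinear interpolation rather than nearest neighbor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fill a chip_rows x chip_cols chip by visiting every chip pixel and sampling
// the source image at the location the chip->img transform points to.
// The transform here is a pure scale plus offset:
//   img_x = ox + cx * sx,   img_y = oy + cy * sy
std::vector<unsigned char> extract_chip_nn(
    const std::vector<unsigned char>& img, int img_rows, int img_cols,
    int chip_rows, int chip_cols,
    double ox, double oy, double sx, double sy)
{
    assert(img_rows > 0 && img_cols > 0 && chip_rows > 0 && chip_cols > 0);
    assert(img.size() == static_cast<std::size_t>(img_rows) * img_cols);
    std::vector<unsigned char> chip(static_cast<std::size_t>(chip_rows) * chip_cols, 0);
    for (int cy = 0; cy < chip_rows; ++cy) {
        for (int cx = 0; cx < chip_cols; ++cx) {
            const int ix = static_cast<int>(std::lround(ox + cx * sx));
            const int iy = static_cast<int>(std::lround(oy + cy * sy));
            // Chip pixels that map outside the source image stay 0.
            if (ix < 0 || iy < 0 || ix >= img_cols || iy >= img_rows) continue;
            chip[static_cast<std::size_t>(cy) * chip_cols + cx] =
                img[static_cast<std::size_t>(iy) * img_cols + ix];
        }
    }
    return chip;
}
```

With a 4x4 source holding the values 0 to 15 in row-major order and a 2x2 chip with scale 2 and zero offset, the chip picks up the source pixels at (0,0), (2,0), (0,2), and (2,2).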
Next comes the function that computes the chip_details for a face:


    inline chip_details get_face_chip_details (
        const full_object_detection& det,
        const unsigned long size = 200,
        const double padding = 0.2
    )
    {
        DLIB_CASSERT(det.num_parts() == 68,
            "\t chip_details get_face_chip_details()"
            << "\n\t You must give a detection with exactly 68 parts in it."
            << "\n\t det.num_parts(): " << det.num_parts()
        );
        DLIB_CASSERT(padding >= 0 && size > 0,
            "\t chip_details get_face_chip_details()"
            << "\n\t Invalid inputs were given to this function."
            << "\n\t padding: " << padding
            << "\n\t size: " << size
        );
        // Average positions of face points 17-67
        const double mean_face_shape_x[] = {
            0.000213256, 0.0752622, 0.18113, 0.29077, 0.393397, 0.586856, 0.689483, 0.799124,
            0.904991, 0.98004, 0.490127, 0.490127, 0.490127, 0.490127, 0.36688, 0.426036,
            0.490127, 0.554217, 0.613373, 0.121737, 0.187122, 0.265825, 0.334606, 0.260918,
            0.182743, 0.645647, 0.714428, 0.793132, 0.858516, 0.79751, 0.719335, 0.254149,
            0.340985, 0.428858, 0.490127, 0.551395, 0.639268, 0.726104, 0.642159, 0.556721,
            0.490127, 0.423532, 0.338094, 0.290379, 0.428096, 0.490127, 0.552157, 0.689874,
            0.553364, 0.490127, 0.42689
        };
        const double mean_face_shape_y[] = {
            0.106454, 0.038915, 0.0187482, 0.0344891, 0.0773906, 0.0773906, 0.0344891,
            0.0187482, 0.038915, 0.106454, 0.203352, 0.307009, 0.409805, 0.515625, 0.587326,
            0.609345, 0.628106, 0.609345, 0.587326, 0.216423, 0.178758, 0.179852, 0.231733,
            0.245099, 0.244077, 0.231733, 0.179852, 0.178758, 0.216423, 0.244077, 0.245099,
            0.780233, 0.745405, 0.727388, 0.742578, 0.727388, 0.745405, 0.780233, 0.864805,
            0.902192, 0.909281, 0.902192, 0.864805, 0.784792, 0.778746, 0.785343, 0.778746,
            0.784792, 0.824182, 0.831803, 0.824182
        };
        COMPILE_TIME_ASSERT(sizeof(mean_face_shape_x)/sizeof(double) == 68-17);
        std::vector<dlib::vector<double,2> > from_points, to_points;
        for (unsigned long i = 17; i < det.num_parts(); ++i)
        {
            // Ignore the lower lip
            if ((55 <= i && i <= 59) || (65 <= i && i <= 67))
                continue;
            // Ignore the eyebrows
            if (17 <= i && i <= 26)
                continue;
            dlib::vector<double,2> p;
            p.x() = (padding+mean_face_shape_x[i-17])/(2*padding+1);
            p.y() = (padding+mean_face_shape_y[i-17])/(2*padding+1);
            from_points.push_back(p*size);
            to_points.push_back(det.part(i));
        }
        return chip_details(from_points, to_points, chip_dims(size,size));
    }
This function first defines the landmark positions of the standard (mean) face and then builds the chip_details struct from them.
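
The mapping from a normalized mean-shape coordinate to a chip pixel coordinate can be checked in isolation. The sketch below copies only the formula from the function above; `chip_coord` is a hypothetical helper name and not part of dlib.

```cpp
#include <cassert>

// Map a normalized mean-shape coordinate (roughly in [0,1]) to a chip pixel
// coordinate. padding adds a margin on each side, expressed as a fraction of
// the unpadded face size; dividing by (2*padding + 1) renormalizes the padded
// range back to [0,1] before scaling by the chip size.
inline double chip_coord(double mean_coord, unsigned long size, double padding) {
    assert(size > 0 && padding >= 0);
    return (padding + mean_coord) / (2 * padding + 1) * size;
}
```

With the defaults size = 200 and padding = 0.2, the normalized range [0, 1] lands on roughly [28.6, 171.4], which leaves a margin of about 28.6 pixels on each side of the face. With padding = 0 the landmarks span the full chip.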

Next comes the extraction of the face image itself:


    template <
        typename image_type1,
        typename image_type2
        >
    void extract_image_chips (
        const image_type1& img,
        const std::vector<chip_details>& chip_locations,
        dlib::array<image_type2>& chips
    )
    {
        // make sure requires clause is not broken
    #ifdef ENABLE_ASSERTS
        for (unsigned long i = 0; i < chip_locations.size(); ++i)
        {
            DLIB_CASSERT(chip_locations[i].size() != 0 &&
                         chip_locations[i].rect.is_empty() == false,
                "\t void extract_image_chips()"
                << "\n\t Invalid inputs were given to this function."
                << "\n\t chip_locations["<<i<<"].size(): " << chip_locations[i].size()
                << "\n\t chip_locations["<<i<<"].rect.is_empty(): " << chip_locations[i].rect.is_empty()
            );
        }
    #endif
        pyramid_down<2> pyr;
        long max_depth = 0;
        // If the chip is supposed to be much smaller than the source subwindow then you
        // can't just extract it using bilinear interpolation since at a high enough
        // downsampling amount it would effectively turn into nearest neighbor
        // interpolation. So we use an image pyramid to make sure the interpolation is
        // fast but also high quality. The first thing we do is figure out how deep the
        // image pyramid needs to be.
        rectangle bounding_box;
        for (unsigned long i = 0; i < chip_locations.size(); ++i)
        {
            long depth = 0;
            double grow = 2;
            drectangle rect = pyr.rect_down(chip_locations[i].rect);
            while (rect.area() > chip_locations[i].size())
            {
                rect = pyr.rect_down(rect);
                ++depth;
                // We drop the image size by a factor of 2 each iteration and then assume a
                // border of 2 pixels is needed to avoid any border effects of the crop.
                grow = grow*2 + 2;
            }
            drectangle rot_rect;
            const vector<double,2> cent = center(chip_locations[i].rect);
            rot_rect += rotate_point<double>(cent,chip_locations[i].rect.tl_corner(),chip_locations[i].angle);
            rot_rect += rotate_point<double>(cent,chip_locations[i].rect.tr_corner(),chip_locations[i].angle);
            rot_rect += rotate_point<double>(cent,chip_locations[i].rect.bl_corner(),chip_locations[i].angle);
            rot_rect += rotate_point<double>(cent,chip_locations[i].rect.br_corner(),chip_locations[i].angle);
            bounding_box += grow_rect(rot_rect, grow).intersect(get_rect(img));
            max_depth = std::max(depth,max_depth);
        }
        //std::cout << "max_depth: " << max_depth << std::endl;
        //std::cout << "crop amount: " << bounding_box.area()/(double)get_rect(img).area() << std::endl;
        // now make an image pyramid
        dlib::array<array2d<typename image_traits<image_type1>::pixel_type> > levels(max_depth);
        if (levels.size() != 0)
            pyr(sub_image(img,bounding_box),levels[0]);
        for (unsigned long i = 1; i < levels.size(); ++i)
            pyr(levels[i-1],levels[i]);
        std::vector<dlib::vector<double,2> > from, to;
        // now pull out the chips
        chips.resize(chip_locations.size());
        for (unsigned long i = 0; i < chips.size(); ++i)
        {
            // If the chip doesn't have any rotation or scaling then use the basic version
            // of chip extraction that just does a fast copy.
            if (chip_locations[i].angle == 0 &&
                chip_locations[i].rows == chip_locations[i].rect.height() &&
                chip_locations[i].cols == chip_locations[i].rect.width())
            {
                impl::basic_extract_image_chip(img, chip_locations[i].rect, chips[i]);
            }
            else
            {
                set_image_size(chips[i], chip_locations[i].rows, chip_locations[i].cols);
                // figure out which level in the pyramid to use to extract the chip
                int level = -1;
                drectangle rect = translate_rect(chip_locations[i].rect, -bounding_box.tl_corner());
                while (pyr.rect_down(rect).area() > chip_locations[i].size())
                {
                    ++level;
                    rect = pyr.rect_down(rect);
                }
                // find the appropriate transformation that maps from the chip to the input
                // image
                from.clear();
                to.clear();
                from.push_back(get_rect(chips[i]).tl_corner()); to.push_back(rotate_point<double>(center(rect),rect.tl_corner(),chip_locations[i].angle));
                from.push_back(get_rect(chips[i]).tr_corner()); to.push_back(rotate_point<double>(center(rect),rect.tr_corner(),chip_locations[i].angle));
                from.push_back(get_rect(chips[i]).bl_corner()); to.push_back(rotate_point<double>(center(rect),rect.bl_corner(),chip_locations[i].angle));
                point_transform_affine trns = find_affine_transform(from,to);
                // now extract the actual chip
                if (level == -1)
                    transform_image(sub_image(img,bounding_box),chips[i],interpolate_bilinear(),trns);
                else
                    transform_image(levels[level],chips[i],interpolate_bilinear(),trns);
            }
        }
    }
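The function above first builds an image pyramid for downscaling. If the chip is 150*150 but maps onto a 1000*1000 region of the original image, that region has to be shrunk, and resampling straight from 1000*1000 down to 150*150 gives poor image quality, so the image is downsampled level by level instead. The function first finds the required downsampling depth and downsamples the image, then calls find_affine_transform to compute the affine transform matrix, and finally hands that transform to transform_image. I won't analyze transform_image here; it essentially maps each pixel position through the transform and fills in the output.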
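
The depth search can be sketched on its own. This version assumes each pyramid level exactly halves the width and height, which only approximates what `pyramid_down<2>::rect_down` computes; `pyramid_depth` is a hypothetical name.

```cpp
#include <cassert>

// Count how many pyramid levels are needed, mirroring the first loop of
// extract_image_chips: keep halving while the halved source rectangle still
// covers more area than the chip has pixels.
inline long pyramid_depth(double src_w, double src_h, unsigned long chip_pixels) {
    assert(src_w > 0 && src_h > 0 && chip_pixels > 0);
    long depth = 0;
    double w = src_w / 2;   // rectangle after the first rect_down
    double h = src_h / 2;
    while (w * h > static_cast<double>(chip_pixels)) {
        w /= 2;
        h /= 2;
        ++depth;
    }
    return depth;
}
```

For a 1000x1000 source region and a 150x150 chip this returns 2: the pyramid holds a 500x500 and a 250x250 level, and the final bilinear resampling goes from the 250x250 level to 150x150, a ratio of less than 2 in each dimension.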