Understanding OpenCV LBP implementation

半城伤御伤魂 提交于 2019-11-30 05:21:05
Iwillnotexist Idonotexist

I refer you to my own answer from the past which lightly touches on the topic, but didn't explain the XML cascade format.

Let's look at a fake, modified for clarity example of a cascade with only a single stage, and three features.

<!-- stage 0 -->
<_>
  <maxWeakCount>3</maxWeakCount>
  <stageThreshold>-0.75</stageThreshold>
  <weakClassifiers>
    <!-- tree 0 -->
    <_>
      <internalNodes>
        0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585
        -16385 587145899 -24005</internalNodes>
      <leafValues>
        -0.65 0.88</leafValues></_>
    <!-- tree 1 -->
    <_>
      <internalNodes>
        0 -1 0 -163512766 -769593758 -10027009 -262145 -514457854
        -193593353 -524289 -1</internalNodes>
      <leafValues>
        -0.77 0.72</leafValues></_>
    <!-- tree 2 -->
    <_>
      <internalNodes>
        0 -1 2 -363936790 -893203669 -1337948010 -136907894
        1088782736 -134217726 -741544961 -1590337</internalNodes>
      <leafValues>
        -0.71 0.68</leafValues></_></weakClassifiers></_>

Somewhat later....

<features>
  <_>
    <rect>
      0 0 3 5</rect></_>
  <_>
    <rect>
      0 0 4 2</rect></_>
  <_>
    <rect>
      0 0 6 3</rect></_>
  <_>
    <rect>
      0 1 4 3</rect></_>
  <_>
      <rect>
      0 1 3 3</rect></_>

...

Let us look first at the tags of a stage:

  • The maxWeakCount for a stage is the number of weak classifiers in the stage, what is called in the comments a <!-- tree --> and what I call an LBP feature.
    • In this example, the number of LBP features in stage 0 is 3.
  • The stageThreshold is what the weights of the features must add up to at least for the stage to pass.
    • In this example the stage threshold is -0.75.

Turning to the tags describing an LBP feature:

  • The internalNodes are an array of 11 integers. The first two are meaningless for LBP cascades. The third is the index into the <features> table of <rect>s at the end of the XML file (A <rect> describes the geometry of the feature). The last 8 values are eight 32-bit values which together constitute the 256-bit LUT I spoke of in my earlier answer. This LUT is computed by the training process, which I don't fully understand myself.
    • In this example, the first feature of the stage references rectangle 3, which is described by the four integers 0 1 4 3.
  • The leafValues are the two weights (pass/fail) associated with a feature. Depending on the bit selected from the internalNodes during feature evaluation, one of those two weights is added to a total. This total is compared to the stage's <stageThreshold>. Then, bool stagePassed = (sum >= stageThreshold - EPS);, where EPS is 1e-5, determines whether the stage has passed or failed. The weights are also determined by the training process.
    • In this example the first feature's fail weight is -0.65 and the pass weight is 0.88.

Lastly, the <feature> tag. It consists of an array of <rect> tags which contain 4 integers describing the geometry of the feature. Given a processing window (24x24 in your case), the first two integers describe its x and y integer pixel offset within the processing window, and the next two integers describe the width and height of one subrectangle out of the 9 that are needed for the LBP feature to be evaluated.

In essence then, a tag <rect> ft.x ft.y ft.width ft.height </rect> situated within a processing window pW.widthxpW.height checking whether a face is present at pW.xxpW.y corresponds to...

To evaluate the LBP then, it suffices to read the integral image at points p[0..15] and use p[BR]+p[TL]-p[TR]-p[BL] to compute the integral of the nine subrectangles. The central subrectangle, R4, is compared that of the eight others, clockwise starting from R0, to produce an 8-bit LBP (the bits are packed [msb 01258763 lsb]).

This 8-bit LBP is then used as an index into the feature's (2^8 = 256)-bit LUT (the <internalNodes>), selecting a single bit. If this bit is 1, the feature is inconsistent with a face; if 0, it is consistent with a face. The appropriate weight (<leafNode>) is then returned and added with the weights of all other features to produce an overall stage sum. This is then compared to <stageThreshold> to determine whether the stage passed or failed.

If there's something else I didn't explain well enough I can clarify.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!