Understanding OpenCV LBP implementation

伪装坚强ぢ 2020-12-29 14:26

I need some help on LBP based face detection and that is why I am writing this.

I have the following questions related to face detection implemented on OpenCV:

1 answer
  • 2020-12-29 14:57

    I refer you to my own earlier answer, which lightly touches on the topic but doesn't explain the XML cascade format.

    Let's look at a fake example, modified for clarity, of a cascade with only a single stage and three features.

    <!-- stage 0 -->
    <_>
      <maxWeakCount>3</maxWeakCount>
      <stageThreshold>-0.75</stageThreshold>
      <weakClassifiers>
        <!-- tree 0 -->
        <_>
          <internalNodes>
            0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585
            -16385 587145899 -24005</internalNodes>
          <leafValues>
            -0.65 0.88</leafValues></_>
        <!-- tree 1 -->
        <_>
          <internalNodes>
            0 -1 0 -163512766 -769593758 -10027009 -262145 -514457854
            -193593353 -524289 -1</internalNodes>
          <leafValues>
            -0.77 0.72</leafValues></_>
        <!-- tree 2 -->
        <_>
          <internalNodes>
            0 -1 2 -363936790 -893203669 -1337948010 -136907894
            1088782736 -134217726 -741544961 -1590337</internalNodes>
          <leafValues>
            -0.71 0.68</leafValues></_></weakClassifiers></_>
    

    Somewhat later in the file...

    <features>
      <_>
        <rect>
          0 0 3 5</rect></_>
      <_>
        <rect>
          0 0 4 2</rect></_>
      <_>
        <rect>
          0 0 6 3</rect></_>
      <_>
        <rect>
          0 1 4 3</rect></_>
      <_>
        <rect>
          0 1 3 3</rect></_>
    

    ...

    Let us look first at the tags of a stage:

    • maxWeakCount is the number of weak classifiers in the stage. The XML comments call each one a <!-- tree -->; I call it an LBP feature.
      • In this example, the number of LBP features in stage 0 is 3.
    • stageThreshold is the minimum that the weights of the stage's features must add up to for the stage to pass.
      • In this example the stage threshold is -0.75.
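    As a rough illustration of reading the two stage-level tags, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The <stages> wrapper element and the read_stages helper are assumptions made so the fragment parses on its own; a real cascade file nests its stages inside more structure than is shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the example stage from above, wrapped in a
# <stages> element (an assumption) so that it parses as one document.
STAGE_XML = """
<stages>
  <_>
    <maxWeakCount>3</maxWeakCount>
    <stageThreshold>-0.75</stageThreshold>
    <weakClassifiers>
      <_>
        <internalNodes>
          0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585
          -16385 587145899 -24005</internalNodes>
        <leafValues>
          -0.65 0.88</leafValues></_>
      <_>
        <internalNodes>
          0 -1 0 -163512766 -769593758 -10027009 -262145 -514457854
          -193593353 -524289 -1</internalNodes>
        <leafValues>
          -0.77 0.72</leafValues></_>
      <_>
        <internalNodes>
          0 -1 2 -363936790 -893203669 -1337948010 -136907894
          1088782736 -134217726 -741544961 -1590337</internalNodes>
        <leafValues>
          -0.71 0.68</leafValues></_></weakClassifiers></_>
</stages>
"""


def read_stages(xml_text):
    """Return one dict per stage with its stage-level tags (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    stages = []
    for stage in root.findall("_"):          # every stage is an anonymous <_> element
        weak = stage.find("weakClassifiers").findall("_")
        stages.append({
            "maxWeakCount": int(stage.findtext("maxWeakCount")),
            "stageThreshold": float(stage.findtext("stageThreshold")),
            "numWeak": len(weak),            # weak classifiers actually present
        })
    return stages


stages = read_stages(STAGE_XML)
```

    In this fragment the number of <_> children under <weakClassifiers> agrees with maxWeakCount, which is a cheap sanity check when hand-editing a cascade.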

    Turning to the tags describing an LBP feature:

    • internalNodes is an array of 11 integers. The first two are meaningless for LBP cascades. The third is an index into the <features> table of <rect>s at the end of the XML file (a <rect> describes the geometry of the feature). The last eight are 32-bit values which together constitute the 256-bit LUT I spoke of in my earlier answer. This LUT is computed by the training process, which I don't fully understand myself.
      • In this example, the first feature of the stage references rectangle 3, which is described by the four integers 0 1 4 3.
    • leafValues holds the two weights (fail/pass, in that order) associated with a feature. Depending on the bit selected from the internalNodes LUT during feature evaluation, one of those two weights is added to a running total. Once all of the stage's features have been evaluated, the total is compared to the stage's <stageThreshold>: bool stagePassed = (sum >= stageThreshold - EPS);, where EPS is 1e-5, determines whether the stage has passed or failed. The weights are also determined by the training process.
      • In this example the first feature's fail weight is -0.65 and the pass weight is 0.88.
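    To make the layout of internalNodes concrete, here is a small Python sketch that splits the 11 integers into a feature index and eight LUT words, then resolves the index against the example <features> table. The names decode_internal_nodes and FEATURES are made up for illustration, and reinterpreting the signed values as unsigned 32-bit words is just a convenience for later bit tests.

```python
# The example <features> table from above, one (x, y, width, height) per <rect>.
FEATURES = [
    (0, 0, 3, 5),
    (0, 0, 4, 2),
    (0, 0, 6, 3),
    (0, 1, 4, 3),
    (0, 1, 3, 3),
]


def decode_internal_nodes(text):
    """Split an <internalNodes> string into (feature_index, lut_words)."""
    values = [int(tok) for tok in text.split()]
    if len(values) != 11:
        raise ValueError("expected 11 integers, got %d" % len(values))
    # values[0] and values[1] carry no meaning for LBP cascades.
    feature_index = values[2]
    # Reinterpret each signed 32-bit value as an unsigned 32-bit word.
    lut_words = [v & 0xFFFFFFFF for v in values[3:]]
    return feature_index, lut_words


TREE0 = ("0 -1 3 -67130709 -21569 -1426120013 -1275125205 -21585 "
         "-16385 587145899 -24005")
feature_index, lut_words = decode_internal_nodes(TREE0)
rect = FEATURES[feature_index]
```

    For tree 0 this yields feature index 3 and the rectangle 0 1 4 3, matching the example above.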

    Lastly, the <features> tag. It consists of an array of <rect> tags, each containing 4 integers describing the geometry of a feature. Given a processing window (24x24 in your case), the first two integers are the feature's x and y pixel offsets within the processing window, and the next two are the width and height of one subrectangle out of the 9 that are needed for the LBP feature to be evaluated.

    In essence then, a tag <rect> ft.x ft.y ft.width ft.height </rect> situated within a processing window of size pW.width x pW.height, checking whether a face is present at (pW.x, pW.y), corresponds to...

    http://i.stack.imgur.com/NL0XX.png
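    The geometry can also be spelled out in a few lines of Python. This is only a sketch: lbp_cells and fits_in_window are hypothetical helpers, and the row-major numbering R0..R8 (with R4 in the centre) is inferred from the clockwise bit order given in this answer.

```python
def lbp_cells(ft_x, ft_y, ft_w, ft_h, pw_x=0, pw_y=0):
    """Return the nine subrectangles R0..R8 as (x, y, w, h) in image coordinates.

    (ft_x, ft_y) is the offset of the feature inside the processing window,
    (pw_x, pw_y) is the position of the processing window in the image.
    """
    cells = []
    for row in range(3):
        for col in range(3):
            cells.append((pw_x + ft_x + col * ft_w,
                          pw_y + ft_y + row * ft_h,
                          ft_w, ft_h))
    return cells


def fits_in_window(ft_x, ft_y, ft_w, ft_h, win_w=24, win_h=24):
    """True if the full 3x3 block of subrectangles lies inside the window."""
    return ft_x + 3 * ft_w <= win_w and ft_y + 3 * ft_h <= win_h


# Rectangle 3 from the example: <rect> 0 1 4 3 </rect>
cells = lbp_cells(0, 1, 4, 3)
```

    Note that the width and height in a <rect> describe a single subrectangle, so the whole feature spans three times that in each direction (12x9 pixels for rectangle 3).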

    To evaluate the LBP then, it suffices to read the integral image at points p[0..15] and use p[BR]+p[TL]-p[TR]-p[BL] to compute the integral of each of the nine subrectangles. The central subrectangle, R4, is compared with each of the eight others, clockwise starting from R0, to produce an 8-bit LBP (the bits are packed [msb 01258763 lsb]).
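    A pure-Python sketch of that evaluation follows. It builds a summed-area table with an extra zero row and column, computes the nine subrectangle sums, and packs the comparisons in the order given above. One assumption is labelled in the code: a bit is set when the neighbour's sum is greater than or equal to the centre's, which I believe matches OpenCV but is not stated in this answer. All function names are illustrative.

```python
def integral_image(img):
    """Summed-area table; ii[y][x] is the sum of img rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum over a rectangle: p[BR] + p[TL] - p[TR] - p[BL]."""
    return ii[y + h][x + w] + ii[y][x] - ii[y][x + w] - ii[y + h][x]


# Clockwise from R0, most significant bit first: [msb 0 1 2 5 8 7 6 3 lsb].
NEIGHBOUR_ORDER = (0, 1, 2, 5, 8, 7, 6, 3)


def lbp_code(ii, x, y, w, h):
    """8-bit LBP of the 3x3 block whose top-left subrectangle is (x, y, w, h)."""
    sums = [rect_sum(ii, x + c * w, y + r * h, w, h)
            for r in range(3) for c in range(3)]
    centre = sums[4]
    code = 0
    for idx in NEIGHBOUR_ORDER:
        # Assumption: neighbour >= centre sets the bit.
        code = (code << 1) | (1 if sums[idx] >= centre else 0)
    return code


# Tiny 3x3 image with 1x1 subrectangles: only the top row exceeds the centre.
img = [[9, 9, 9],
       [0, 5, 0],
       [0, 0, 0]]
code = lbp_code(integral_image(img), 0, 0, 1, 1)
```

    Here only R0, R1 and R2 are at least as large as R4, so the three most significant bits are set and the code is 0b11100000.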

    This 8-bit LBP is then used as an index into the feature's (2^8 = 256)-bit LUT (the <internalNodes>), selecting a single bit. If this bit is 1, the feature is inconsistent with a face; if 0, it is consistent with a face. The appropriate weight (from <leafValues>) is then returned and added to the weights of all other features to produce an overall stage sum. This is then compared to <stageThreshold> to determine whether the stage passed or failed.
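    Putting the lookup and the stage test together, a minimal Python sketch for the example stage might look like the following. The LBP codes are passed in directly rather than computed from an image. The word and bit selection (code >> 5 picks the 32-bit word, code & 31 picks the bit within it) is my understanding of how OpenCV indexes the LUT and should be treated as an assumption; STAGE and eval_stage are illustrative names.

```python
EPS = 1e-5

# The example stage: per tree, the eight LUT words and (fail, pass) weights.
STAGE = {
    "threshold": -0.75,
    "trees": [
        ([-67130709, -21569, -1426120013, -1275125205,
          -21585, -16385, 587145899, -24005], (-0.65, 0.88)),
        ([-163512766, -769593758, -10027009, -262145,
          -514457854, -193593353, -524289, -1], (-0.77, 0.72)),
        ([-363936790, -893203669, -1337948010, -136907894,
          1088782736, -134217726, -741544961, -1590337], (-0.71, 0.68)),
    ],
}


def eval_stage(stage, codes):
    """Sum the selected leaf weights for one stage; return (sum, passed)."""
    total = 0.0
    for (lut, (fail_w, pass_w)), code in zip(stage["trees"], codes):
        # Assumption: word = code >> 5, bit within word = code & 31.
        word = lut[code >> 5] & 0xFFFFFFFF
        bit = (word >> (code & 31)) & 1
        # Bit 1: inconsistent with a face (fail weight); bit 0: consistent.
        total += fail_w if bit else pass_w
    return total, total >= stage["threshold"] - EPS


good_sum, good_pass = eval_stage(STAGE, [223, 0, 159])
bad_sum, bad_pass = eval_stage(STAGE, [255, 255, 255])
```

    With codes 223, 0 and 159 every tree selects a 0 bit, so the pass weights add up to 2.28 and the stage passes. With code 255 for all three trees every selected bit is 1, the sum is -2.13, and the stage fails against the -0.75 threshold.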

    If there's something else I didn't explain well enough I can clarify.
