Categorizing Words and Category Values

温柔的废话 2021-01-31 05:49

We were set an algorithm problem in class today, framed as "if you figure out a solution, you don't have to do this subject". So of course, we all thought we would give it a go.

21 Answers
  •  难免孤独
    2021-01-31 06:32

    You have a couple of options here, but if you want accurate data I think you are going to need some outside help. The two options I can think of are a dictionary search and crowdsourcing.

    For a dictionary search, you could go through the words in the database, query the dictionary for each one, and parse the results to see whether one of the category names appears on the page. For example, if you search for "red" you will find "color" on the page, and likewise searching for "fishing" returns "sport".
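
    A minimal sketch of that dictionary search, in Python. The `SAMPLE_DEFINITIONS` table and the `find_category` helper are hypothetical stand-ins: a real version would fetch the definition text from whatever dictionary source you have access to, and the two entries here are made up for illustration.

```python
import re

# Hypothetical stand-in for a real dictionary lookup; entries are illustrative only.
SAMPLE_DEFINITIONS = {
    "red": "A color at the long-wavelength end of the visible spectrum.",
    "fishing": "The sport or business of catching fish.",
}


def find_category(word, categories, definitions=SAMPLE_DEFINITIONS):
    """Return the first category name that appears as a whole word
    in the definition of `word`, or None if there is no match."""
    text = definitions.get(word.lower(), "")
    # Tokenize so that "sport" does not match inside "transport".
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    for category in categories:
        if category.lower() in tokens:
            return category
    return None
```

    Matching on whole words rather than substrings is a design choice to cut down on false hits. Words whose definition never mentions a category name simply come back as `None` and would need another approach.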

    Another, slightly more out-of-the-box option would be crowdsourcing. Consider the following:

    1. Start by more or less randomly assigning name-value pairs.
    2. Output the results.
    3. Load the results up on Amazon Mechanical Turk (AMT) to get feedback from humans on how well the pairs work.
    4. Input the results of the AMT evaluation back into the system along with the random assignments.
    5. If everything was approved, then we are done.
    6. Otherwise, retain the correct hits, process them to see if any pattern can be established, and generate a new set of name-value pairs.
    7. Return to step 3.
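
    The loop above could be sketched roughly like this. The `crowd_assign` function and its `review` callback are hypothetical: `review` stands in for a round of human feedback (for example an AMT batch) and is assumed to return a mapping from each word to `True` or `False`. The only "pattern" this sketch learns in step 6 is not to repeat a pair that was already rejected.

```python
import random


def crowd_assign(words, categories, review, max_rounds=50, seed=0):
    """Guess word -> category pairs and keep only the pairs reviewers approve."""
    rng = random.Random(seed)
    confirmed = {}                         # pairs approved so far
    rejected = {w: set() for w in words}   # categories already ruled out per word
    for _ in range(max_rounds):
        pending = [w for w in words if w not in confirmed]
        if not pending:
            break                          # step 5: everything was approved
        # Steps 1 and 6: propose a category that has not been rejected yet.
        proposal = {}
        for w in pending:
            options = [c for c in categories if c not in rejected[w]]
            if options:
                proposal[w] = rng.choice(options)
        if not proposal:
            break                          # no untried categories remain
        # Steps 3 and 4: collect human verdicts and feed them back in.
        verdicts = review(proposal)
        for w, category in proposal.items():
            if verdicts.get(w):
                confirmed[w] = category
            else:
                rejected[w].add(category)
    return confirmed
```

    To try it without real reviewers, `review` can be simulated from a known answer key, for example `lambda proposal: {w: truth[w] == c for w, c in proposal.items()}`. Each call to `review` corresponds to one paid batch, so the number of rounds is what drives the cost.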

    Granted, this would entail some financial outlay, but it might also be one of the simplest ways to get accurate data with relatively little effort.
