PHP/MySQL better user searching

ε祈祈猫儿з 提交于 2019-12-13 16:00:47

问题


In trying to rejuvinate code I wrote mostly 14+ years ago. I've come to see that the lovely little setup I wrote then was... lacking in certain places, namely handling user inputs.

Lesson: Never underestimate users ability to inject trash, typos, and dupes past your validators.

The old way is reaching critical mass as there are 470 items in a SELECT dropdown now. I want to reinvent this part of the process so I don't have to worry about it hitting a breaking point.

So the idea is to build a fuzzy search method so that after the typist enters the search string, we check against five pieces of data, all of which reside in the same row.

I need to check the name submitted against the Stage Name, two also-known-as names, as well as their legal name and as a final check against a soundex() index based on their Stage Name (this catches a few spelling errors missed otherwise)

I've tried a complicated block of code to check these things (and it doesn't work, mostly because I think I coded the comparisons too strict) as part of a do/while loop.

In the below, var $Rin would contain the user supplied name.

$setr = mysql_query("SELECT ID,StageName,AKA1,AKA2,LegalName,SoundEx FROM performers");
IF ($R = mysql_fetch_array($setr)) {
    do {
        $RT = substr(trim($Rin), 5);
        $RT1 = substr($R[1], 5);
        $RT2 = substr($R[2], 5);
        $RT3 = substr($R[3], 5);
        $RT4 = substr($R[4], 5);
        $RTx = soundex($RT);
        IF ($RT == $RT1) {
            $RHits[] = $R[0];
        }
        IF ($RT == $RT2) {
            $RHits[] = $R[0];
        }
        IF ($RT == $RT3) {
            $RHits[] = $R[0];
        }
        IF ($RT == $RT4) {
            $RHits[] = $R[0];
        }
        IF ($RTx == $R[5]) {
            $RHits[] = $R[0];
        }
    } while ($R = mysql_fetch_array($setr));
}

The idea being that I'll build an array of the ID#'s of the near hits, which I'll populate into a select dropdown that has only hopefully fewer hits that the whole table. Which means querying for a result set from the contents of that array, in order to display the Performer's name in the SELECT dropdown and pass the ID# as the value for those choices.

Thats when I hit the 'I need to use an array in my WHERE clause' problem, and after finding that answer, I am starting to suspect I'm out of luck due to Stipulation #2 below. So I started looking at alternate search methods and I'm not sure I've gotten anywhere but more confused.

So, is there a better way to scan a single table for six fields, checking five against user input and noting the sixth for display in a subset of the original table?

Thought process:

Against the whole table, per record, test $Rin against these tests in this order:

$Rin -> StageName
$Rin -> AKA1
$Rin -> AKA2
$Rin -> LegalName
soundex($Rin) -> SoundEx

where a hit on any of the five operations adds the ID# to a result array that is used to narrow the results from 470 performers down to a reasonable list to choose from.

Stipulations:

1) As written, I know this is vulnerable to an SQL injection attack.

2) Server runs PHP 4.4.9 and MySQL 4.0.27-Standard, I can't upgrade it. I've got to prove it works before money will be spent.

3) This is hobby-level stuff, not my day job.

4) Performers often use non-English names or elements in their names, and this has led to typos and duplication by the data entry typists.

I've found a lot of mysqli and PDO answers for this sort of thing, and I'm seeing a lot of things that only half make sense (like link #4 below). I'm working on getting up to speed on these things as I try and fix whats become broken.

Places already looked:

  1. PHP mysql using an array in WHERE clause
  2. PHP/MySQL small-scale fuzzy search
  3. MySQL SubString Fuzzy Search
  4. Sophisticated Name Lookup

回答1:


I mentioned in the comments that a Javascript typeahead library might be a good choice for you. I've found Twitter's Typeahead library and Bloodhound engine to be pretty robust. Unfortunately, the documentation is a mixed bag: so long as what you need is very similar to their examples, you're golden, but certain details (explanations of the tokenizers, for example) are missing.

In one of the several questions re Typeahead here on Stack Overflow, @JensAKoch says:

To be honest, I think twitter gave up on typeahead.js. We look at 13000 stars, a full bugtracker with no maintainer and a broken software, last release 2015. I think that speaks for itself, or not? ... So, try one of the forks: github.com/corejavascript/typeahead.js

Frankly, in a brief check, the documentation at the fork looks a bit better, if nothing else. You may wish to check it out.

Server-side code:

All of the caveats of using an old version of PHP apply. I highly recommend retooling to use PDO with PHP 5, but this example uses PHP 4 as requested.

Completely untested PHP code. json_encode() would be better, but it doesn't appear until PHP 5. Your endpoint would be something like:

headers("Content-Type: application/json");
$results = mysql_query(
   "SELECT ID,StageName,AKA1,AKA2,LegalName,SoundEx FROM performers"
);

$fields = array("ID","StageName","AKA1","AKA2","LegalName","SoundEx");

echo "[";

$first =  true;
while ($row = mysql_fetch_array($results)) {
    ($first) ? $first = false : echo ',';

    echo "\n\t,{";
    foreach($fields as $f) {
        echo "\n\t\t\"{$f}\": \"".$row[$f]."\"";
    }
    echo "\n\t}";
}
echo "]";

Client-side code:

This example uses a static JSON file as a stub for all of the results. If you anticipate your result set going over 1,000 entries, you should look into the remote option of Bloodhound. This would require you to write some custom PHP code to handle the query, but it would look largely similar to the end point that dumps all (or at least your most common) data.

var actors = new Bloodhound({
  // Each row is an object, not a single string, so we have to modify the
  // default datum tokenizer. Pass in the list of object fields to be
  // searchable.
  datumTokenizer: Bloodhound.tokenizers.obj.nonword(
    'StageName','AKA1','AKA2','LegalName','SoundEx'
  ),
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  // URL points to a json file that contains an array of actor JSON objects
  // Visit the link to see details 
  prefetch: 'https://gist.githubusercontent.com/tag/81e4450de8eca805f436b72e6d7d1274/raw/792b3376f63f89d86e10e78d387109f0ad7903fd/dummy_actors.json'
});

// passing in `null` for the `options` arguments will result in the default
// options being used
$('#prefetch .typeahead').typeahead(
    {
        highlight: true
    },
   {
        name: 'actors',
        source: actors,
        templates: {
        empty: "<div class=\"empty-message\">No matches found.</div>",
        
        // This is simply a function that accepts an object.
        // You may wish to consider Handlebars instead.
        suggestion: function(obj) {
            return '<div class="actorItem">'
                + '<span class="itemStageName">'+obj.StageName+"</span>"
                + ', <em>legally</em> <span class="itemLegalName">'+obj.LegalName+"</span>"
        }
        //suggestion: Handlebars.compile('<div><strong>{{value}}</strong> – {{year}}</div>')
      },
      display: "LegalName" // name of object key to display when selected
      // Instead of display, you can use the 'displayKey' option too:
      // displayKey: function(actor) {
      //     return actor.LegalName;
      // }
});
/* These class names can me specified in the Typeahead options hash. I use the defaults here. */
    .tt-suggestion {
        border: 1px dotted gray;
        padding: 4px;
        min-width: 100px;
    }
    .tt-cursor {
        background-color: rgb(255,253,189);
    }
    
    /* These classes are used in the suggestion template */
    .itemStageName {
        font-size: 110%;
    }
    .itemLegalName {
        font-size: 110%;
        color: rgb(51,42,206);
    }
<script src="https://code.jquery.com/jquery-3.1.1.min.js"></script>
<script src="https://twitter.github.io/typeahead.js/releases/latest/typeahead.bundle.js"></script>


<p>Type something here. A good search term might be 'C'.</p>
<div id="prefetch">
  <input class="typeahead" type="text" placeholder="Name">
</div>

For ease, here is the Gist of the client-side code.



来源:https://stackoverflow.com/questions/41687712/php-mysql-better-user-searching

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!