JavaScript fuzzy search

安稳与你 提交于 2019-12-18 10:52:32

问题


I'm working on this filtering thing where I have about 50-100 list items. And each items have markup like this:

<li>
  <input type="checkbox" name="services[]" value="service_id" />
  <span class="name">Restaurant in NY</span>
  <span class="filters"><!-- hidden area -->
    <span class="city">@city: new york</span>
    <span class="region">@reg: ny</span>
    <span class="date">@start: 02/05/2012</span>
    <span class="price">@price: 100</span>
  </span>
</li>

I created markup like this because I initally used List.js

So, probably you guessed already, what I want is to do searches like this:@region: LA @price: 124 and so on. The problem is that i also want to display more than one item, in order to select more than... one :)

I assume this needs fuzzy search, but the problem is that i didn't find anything functional.

Any idea or starting point?

// edit: because I have a fairly small amount of items, I would like a client side solution.


回答1:


I was looking for "fuzzy search" in javascript but haven't found a solution here, so I wrote my own function that does what I need.

The algorithm is very simple: loop through needle letters and check if they occur in the same order in the haystack:

String.prototype.fuzzy = function (s) {
    var hay = this.toLowerCase(), i = 0, n = -1, l;
    s = s.toLowerCase();
    for (; l = s[i++] ;) if (!~(n = hay.indexOf(l, n + 1))) return false;
    return true;
};

e.g.:

('a haystack with a needle').fuzzy('hay sucks');    // false
('a haystack with a needle').fuzzy('sack hand');    // true



回答2:


One year later, List.js got a nice plugin for fuzzy search that works pretty great.




回答3:


Another (simple) solution. Non case-sensitive and ignores order of the letters.

It performs a check for every letter of the search-term. If the original string contains that letter, it will count up (or down if it doesn't). Based on the ratio of matches / string-length it will return true or false.

String.prototype.fuzzy = function(term, ratio) {
    var string = this.toLowerCase();
    var compare = term.toLowerCase();
    var matches = 0;
    if (string.indexOf(compare) > -1) return true; // covers basic partial matches
    for (var i = 0; i < compare.length; i++) {
        string.indexOf(compare[i]) > -1 ? matches += 1 : matches -=1;
    }
    return (matches/this.length >= ratio || term == "")
};

Examples:

("Test").fuzzy("st", 0.5) // returns true
("Test").fuzzy("tes", 0.8) // returns false cause ratio is too low (0.75)
("Test").fuzzy("stet", 1) // returns true
("Test").fuzzy("zzzzzest", 0.75) // returns false cause too many alien characters ("z")
("Test").fuzzy("es", 1) // returns true cause partial match (despite ratio being only 0.5)



回答4:


I have a little function, searching a string in an array ( at least for me it produces better results than levenshtein ):

function fuzzy(item,arr) {
  function oc(a) {
    var o = {}; for (var i=0; i<a.length; i++) o[a[i]] = ""; return o;
  }
  var test = [];
  for (var n=1; n<=item.length; n++)
    test.push(item.substr(0,n) + "*" + item.substr(n+1,item.length-n));
  var result = [];
  for (var r=0; r<test.length; r++) for (var i=0; i<arr.length; i++) {
    if (arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[0]) != -1)
    if (arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[1]) != -1)
    if (0 < arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[1]) 
          - arr[i].toLowerCase().indexOf(test[r].toLowerCase().split("*")[0] < 2 ) )
    if (!(arr[i] in oc(result)))  result.push(arr[i]);
  }
  return result;
}



回答5:


And I made my own. It uses regex and serves more like a proof-of-concept as it is completely not stress-tested.

enjoy the javascript fuzzy search/fuzzy match http://unamatasanatarai.github.io/FuzzyMatch/test/index.html




回答6:


I wasn't satisfied with list.js, so I created my own. This is probably not exactly fuzzy search, but I don't know what to call it. I simply wanted it to match a query without regard to the order of my words in the query.

Consider the following scenario:

  • a collection of articles in memory exists
  • order of query words appearance doesn't matter (e.g. "hello world" vs "world hello")
  • The code should be easily readable

Here is an example:

var articles = [{
  title: '2014 Javascript MVC Frameworks Comparison',
  author: 'Guybrush Treepwood'
}, {
  title: 'Javascript in the year 2014',
  author: 'Herman Toothrot'
},
{
  title: 'Javascript in the year 2013',
  author: 'Rapp Scallion'
}];

var fuzzy = function(items, key) {
  // Returns a method that you can use to create your own reusable fuzzy search.

  return function(query) {
    var words  = query.toLowerCase().split(' ');

    return items.filter(function(item) {
      var normalizedTerm = item[key].toLowerCase();

      return words.every(function(word) {
        return (normalizedTerm.indexOf(word) > -1);
      });
    });
  };
};


var searchByTitle = fuzzy(articles, 'title');

searchByTitle('javascript 2014') // returns the 1st and 2nd items

Well, I hope this helps someone out there.




回答7:


Solutions provided here returns true/false and no information about what part was matched and what part was not.

In some cases, you might need to know it eg. to make parts of your input bold in the search results

I've created my own solution in typescript (if you want to use it - I've published it here - https://github.com/pie6k/fuzzystring) and demo here https://pie6k.github.io/fuzzystring/

It works like:

fuzzyString('liolor', 'lorem ipsum dolor sit');

// returns
{
  parts: [
    { content: 'l', type: 'input' },
    { content: 'orem ', type: 'fuzzy' },
    { content: 'i', type: 'input' },
    { content: 'psum d', type: 'fuzzy' },
    { content: 'olor', type: 'input' },
    { content: ' sit', type: 'suggestion' },
  ],
  score: 0.87,
}

Here is full implementation (Typescript)

type MatchRoleType = 'input' | 'fuzzy' | 'suggestion';

interface FuzzyMatchPart {
  content: string;
  type: MatchRoleType;
}

interface FuzzyMatchData {
  parts: FuzzyMatchPart[];
  score: number;
}

interface FuzzyMatchOptions {
  truncateTooLongInput?: boolean;
  isCaseSesitive?: boolean;
}

function calculateFuzzyMatchPartsScore(fuzzyMatchParts: FuzzyMatchPart[]) {
  const getRoleLength = (role: MatchRoleType) =>
    fuzzyMatchParts
      .filter((part) => part.type === role)
      .map((part) => part.content)
      .join('').length;

  const fullLength = fuzzyMatchParts.map((part) => part.content).join('')
    .length;
  const fuzzyLength = getRoleLength('fuzzy');
  const inputLength = getRoleLength('input');
  const suggestionLength = getRoleLength('suggestion');

  return (
    (inputLength + fuzzyLength * 0.7 + suggestionLength * 0.9) / fullLength
  );
}

function compareLetters(a: string, b: string, isCaseSensitive = false) {
  if (isCaseSensitive) {
    return a === b;
  }
  return a.toLowerCase() === b.toLowerCase();
}

function fuzzyString(
  input: string,
  stringToBeFound: string,
  { truncateTooLongInput, isCaseSesitive }: FuzzyMatchOptions = {},
): FuzzyMatchData | false {
  // make some validation first

  // if input is longer than string to find, and we dont truncate it - it's incorrect
  if (input.length > stringToBeFound.length && !truncateTooLongInput) {
    return false;
  }

  // if truncate is enabled - do it
  if (input.length > stringToBeFound.length && truncateTooLongInput) {
    input = input.substr(0, stringToBeFound.length);
  }

  // if input is the same as string to be found - we dont need to look for fuzzy match - return it as match
  if (input === stringToBeFound) {
    return {
      parts: [{ content: input, type: 'input' }],
      score: 1,
    };
  }

  const matchParts: FuzzyMatchPart[] = [];

  const remainingInputLetters = input.split('');

  // let's create letters buffers
  // it's because we'll perform matching letter by letter, but if we have few letters matching or not matching in the row
  // we want to add them together as part of match
  let ommitedLettersBuffer: string[] = [];
  let matchedLettersBuffer: string[] = [];

  // helper functions to clear the buffers and add them to match
  function addOmmitedLettersAsFuzzy() {
    if (ommitedLettersBuffer.length > 0) {
      matchParts.push({
        content: ommitedLettersBuffer.join(''),
        type: 'fuzzy',
      });
      ommitedLettersBuffer = [];
    }
  }

  function addMatchedLettersAsInput() {
    if (matchedLettersBuffer.length > 0) {
      matchParts.push({
        content: matchedLettersBuffer.join(''),
        type: 'input',
      });
      matchedLettersBuffer = [];
    }
  }

  for (let anotherStringToBeFoundLetter of stringToBeFound) {
    const inputLetterToMatch = remainingInputLetters[0];

    // no more input - finish fuzzy matching
    if (!inputLetterToMatch) {
      break;
    }

    const isMatching = compareLetters(
      anotherStringToBeFoundLetter,
      inputLetterToMatch,
      isCaseSesitive,
    );

    // if input letter doesnt match - we'll go to the next letter to try again
    if (!isMatching) {
      // add this letter to buffer of ommited letters
      ommitedLettersBuffer.push(anotherStringToBeFoundLetter);
      // in case we had something in matched letters buffer - clear it as matching letters run ended
      addMatchedLettersAsInput();
      // go to the next input letter
      continue;
    }

    // we have input letter matching!

    // remove it from remaining input letters
    remainingInputLetters.shift();

    // add it to matched letters buffer
    matchedLettersBuffer.push(anotherStringToBeFoundLetter);
    // in case we had something in ommited letters buffer - add it to the match now
    addOmmitedLettersAsFuzzy();

    // if there is no more letters in input - add this matched letter to match too
    if (!remainingInputLetters.length) {
      addMatchedLettersAsInput();
    }
  }

  // if we still have letters left in input - means not all input was included in string to find - input was incorrect
  if (remainingInputLetters.length > 0) {
    return false;
  }

  // lets get entire matched part (from start to last letter of input)
  const matchedPart = matchParts.map((match) => match.content).join('');

  // get remaining part of string to be found
  const suggestionPart = stringToBeFound.replace(matchedPart, '');

  // if we have remaining part - add it as suggestion
  if (suggestionPart) {
    matchParts.push({ content: suggestionPart, type: 'suggestion' });
  }
  const score = calculateFuzzyMatchPartsScore(matchParts);

  return {
    score,
    parts: matchParts,
  };
}


来源:https://stackoverflow.com/questions/9206013/javascript-fuzzy-search

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!