How to find a word from arrays of characters?

前端 未结 4 1425
感动是毒
感动是毒 2020-12-14 04:55

What is the best way to solve this:

I have a group of arrays with 3-4 characters inside each like so:

{p,     {a,    {t,    {m,
 q,      b,     u,            


        
相关标签:
4条回答
  • 2020-12-14 05:10

    There are probably many way of solving this.

    What you are interested in is the number of each character you have available to form a word, and how many of each character is required for each dictionary word. The trick is how to efficiently look up this information in the dictionary.

    Perhaps you can use a prefix tree (a trie), some kind of smart hash table, or similar.

    Anyway, you will probably have to try out all your possibilities and check them against the dictionary. I.e., if you have three arrays of three values each, there will be 3^3+3^2+3^1=39 combinations to check out. If this process is too slow, then perhaps you could stick a Bloom filter in front of the dictionary, to quickly check if a word is definitely not in the dictionary.

    EDIT: Anyway, isn't this essentially the same as Scrabble? Perhaps try Googling for "scrabble algorithm" will give you some good clues.

    0 讨论(0)
  • 2020-12-14 05:14

    I had some Scrabble code laying around, so I was able to throw this together. The dictionary I used is sowpods (267751 words). The code below reads the dictionary as a text file with one uppercase word on each line.

    The code is C#:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.IO;
    using System.Diagnostics;
    
    namespace SO_6022848
    {
      public struct Letter
      {
        public const string Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        public static implicit operator Letter(char c)
        {
          return new Letter() { Index = Chars.IndexOf(c) };
        }
        public int Index;
        public char ToChar()
        {
          return Chars[Index];
        }
        public override string ToString()
        {
          return Chars[Index].ToString();
        }
      }
    
      public class Trie
      {
        public class Node
        {
          public string Word;
          public bool IsTerminal { get { return Word != null; } }
          public Dictionary<Letter, Node> Edges = new Dictionary<Letter, Node>();
        }
    
        public Node Root = new Node();
    
        public Trie(string[] words)
        {
          for (int w = 0; w < words.Length; w++)
          {
            var word = words[w];
            var node = Root;
            for (int len = 1; len <= word.Length; len++)
            {
              var letter = word[len - 1];
              Node next;
              if (!node.Edges.TryGetValue(letter, out next))
              {
                next = new Node();
                if (len == word.Length)
                {
                  next.Word = word;
                }
                node.Edges.Add(letter, next);
              }
              node = next;
            }
          }
        }
    
      }
    
      class Program
      {
        static void GenWords(Trie.Node n, HashSet<Letter>[] sets, int currentArrayIndex, List<string> wordsFound)
        {
          if (currentArrayIndex < sets.Length)
          {
            foreach (var edge in n.Edges)
            {
              if (sets[currentArrayIndex].Contains(edge.Key))
              {
                if (edge.Value.IsTerminal)
                {
                  wordsFound.Add(edge.Value.Word);
                }
                GenWords(edge.Value, sets, currentArrayIndex + 1, wordsFound);
              }
            }
          }
        }
    
        static void Main(string[] args)
        {
          const int minArraySize = 3;
          const int maxArraySize = 4;
          const int setCount = 10;
          const bool generateRandomInput = true;
    
          var trie = new Trie(File.ReadAllLines("sowpods.txt"));
          var watch = new Stopwatch();
          var trials = 10000;
          var wordCountSum = 0;
          var rand = new Random(37);
    
          for (int t = 0; t < trials; t++)
          {
            HashSet<Letter>[] sets;
            if (generateRandomInput)
            {
              sets = new HashSet<Letter>[setCount];
              for (int i = 0; i < setCount; i++)
              {
                sets[i] = new HashSet<Letter>();
                var size = minArraySize + rand.Next(maxArraySize - minArraySize + 1);
                while (sets[i].Count < size)
                {
                  sets[i].Add(Letter.Chars[rand.Next(Letter.Chars.Length)]);
                }
              }
            }
            else
            {
              sets = new HashSet<Letter>[] { 
              new HashSet<Letter>(new Letter[] { 'P', 'Q', 'R', 'S' }), 
              new HashSet<Letter>(new Letter[] { 'A', 'B', 'C' }), 
              new HashSet<Letter>(new Letter[] { 'T', 'U', 'V' }), 
              new HashSet<Letter>(new Letter[] { 'M', 'N', 'O' }) };
            }
    
            watch.Start();
            var wordsFound = new List<string>();
            for (int i = 0; i < sets.Length - 1; i++)
            {
              GenWords(trie.Root, sets, i, wordsFound);
            }
            watch.Stop();
            wordCountSum += wordsFound.Count;
            if (!generateRandomInput && t == 0)
            {
              foreach (var word in wordsFound)
              {
                Console.WriteLine(word);
              }
            }
          }
          Console.WriteLine("Elapsed per trial = {0}", new TimeSpan(watch.Elapsed.Ticks / trials));
          Console.WriteLine("Average word count per trial = {0:0.0}", (float)wordCountSum / trials);
        }
    
      }
    
    }
    

    Here is the output when using your test data:

    PA
    PAT
    PAV
    QAT
    RAT
    RATO
    RAUN
    SAT
    SAU
    SAV
    SCUM
    AT
    AVO
    BUM
    BUN
    CUM
    TO
    UM
    UN
    Elapsed per trial = 00:00:00.0000725
    Average word count per trial = 19.0
    

    And the output when using random data (does not print each word):

    Elapsed per trial = 00:00:00.0002910
    Average word count per trial = 62.2
    

    EDIT: I made it much faster with two changes: Storing the word at each terminal node of the trie, so that it doesn't have to be rebuilt. And storing the input letters as an array of hash sets instead of an array of arrays, so that the Contains() call is fast.

    0 讨论(0)
  • 2020-12-14 05:17

    I just made a very large nested for loop like this:

    for(NSString*s1 in [letterList objectAtIndex:0]{
        for(NSString*s2 in [letterList objectAtIndex:1]{
           8 more times...
        }
    }
    

    Then I do a binary search on the combination to see if it is in the dictionary and add it to an array if it is

    0 讨论(0)
  • 2020-12-14 05:31

    The reformulated question can be answered just by generating and testing. Since you have 4 letters and 10 arrays, you've only got about 1 million possible combinations (10 million if you allow a blank character). You'll need an efficient way to look them up, use a BDB or some sort of disk based hash.

    The trie solution previously posted should work as well, you are just restricted more by what characters you can choose at each step of the search. It should be faster as well.

    0 讨论(0)
提交回复
热议问题