RegEx: Compare two strings to find Alliteration and Assonance

我寻月下人不归 2021-01-12 14:15

Would it be possible to compare two strings to find alliteration and assonance?

I mainly use JavaScript or PHP.

2 Answers
  • 2021-01-12 15:12

    To find alliterations in a text you simply iterate over all words, omitting too short and too common words, and collect them as long as their initial letters match.

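    var text = ''
    +'\nAs I looked to the east right into the sun,'
    +'\nI saw a tower on a toft worthily built;'
    +'\nA deep dale beneath a dungeon therein,'
    +'\nWith deep ditches and dark and dreadful of sight'
    +'\nA fair field full of folk found I in between,'
    +'\nOf all manner of men the rich and the poor,'
    +'\nWorking and wandering as the world asketh.'
    
    // Words too common to count toward an alliteration.
    var skipWords = ['the', 'and']
    // Current run of consecutive words sharing an initial letter.
    var curr = []
    
    // Report a run only if it contains at least three words.
    function flush() {
        if (curr.length > 2)
            console.log(curr)
    }
    
    // Consider only words of three or more characters.
    var words = text.toLowerCase().match(/\b\w{3,}\b/g) || []
    words.forEach(function(word) {
        if (skipWords.indexOf(word) >= 0)
            return
        // A different initial letter ends the current run.
        if (curr.length && curr[0].charAt(0) != word.charAt(0)) {
            flush()
            curr = []
        }
        curr.push(word)
    })
    // The last run is not followed by another word, so report it here.
    flush()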
    

    Results:

    ["deep", "ditches", "dark", "dreadful"]
    ["fair", "field", "full", "folk", "found"]
    ["working", "wandering", "world"]
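
Since the question asks about comparing two strings, the same idea can be adapted along these lines. This is only a rough sketch: `initials` and `sharedAlliteration` are hypothetical helper names, and matching on the first letter is a spelling-based approximation (it would wrongly pair "know" with "kite", for instance).

```javascript
// Hypothetical helper: group the words of a text by their initial letter,
// ignoring short words and the given skip words.
function initials(text, skipWords) {
    var groups = {};
    var words = text.toLowerCase().match(/\b[a-z]{3,}\b/g) || [];
    words.forEach(function (word) {
        if (skipWords.indexOf(word) >= 0) return;
        var letter = word.charAt(0);
        if (!groups[letter]) groups[letter] = [];
        groups[letter].push(word);
    });
    return groups;
}

// Initial letters that occur in both strings, with the matching words from each.
function sharedAlliteration(a, b, skipWords) {
    skipWords = skipWords || ['the', 'and'];
    var ga = initials(a, skipWords);
    var gb = initials(b, skipWords);
    return Object.keys(ga).filter(function (letter) {
        return Object.prototype.hasOwnProperty.call(gb, letter);
    }).sort().map(function (letter) {
        return { letter: letter, first: ga[letter], second: gb[letter] };
    });
}

console.log(sharedAlliteration(
    'A deep dale beneath a dungeon therein',
    'With deep ditches and dark and dreadful of sight'
));
```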
    

    For more advanced parsing, and to find assonances and rhymes, you first have to translate the text into a phonetic spelling. You didn't say which language you're targeting. For English, there are some phonetic dictionaries available online, for example from Carnegie Mellon: ftp://ftp.cs.cmu.edu/project/fgdata/dict
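
To illustrate what the phonetic step buys you, here is a minimal sketch. The `lexicon` object is a handful of hand-typed sample entries in the ARPAbet style that the Carnegie Mellon dictionary uses (vowels carry a stress digit); a real program would load the full dictionary file instead. The function names are made up for this example.

```javascript
// Hand-typed sample entries (assumption: ARPAbet-style phonemes, stress digit on vowels).
var lexicon = {
    deep:   ['D', 'IY1', 'P'],
    field:  ['F', 'IY1', 'L', 'D'],
    lake:   ['L', 'EY1', 'K'],
    fate:   ['F', 'EY1', 'T'],
    knight: ['N', 'AY1', 'T'],
    nine:   ['N', 'AY1', 'N']
};

// Look up the phonemes of a word, or an empty list if it is not in the lexicon.
function phonemes(word) {
    var key = word.toLowerCase();
    return Object.prototype.hasOwnProperty.call(lexicon, key) ? lexicon[key] : [];
}

// Return the vowel phoneme carrying primary stress (digit 1), or null.
function stressedVowel(word) {
    var phones = phonemes(word);
    for (var i = 0; i < phones.length; i++) {
        if (/1$/.test(phones[i])) return phones[i];
    }
    return null;
}

// Assonance here means: both words share the same primary-stressed vowel.
function hasAssonance(a, b) {
    var vowel = stressedVowel(a);
    return vowel !== null && vowel === stressedVowel(b);
}

// Alliteration here means: both words start with the same phoneme.
function hasAlliteration(a, b) {
    var first = phonemes(a)[0];
    return first !== undefined && first === phonemes(b)[0];
}

console.log(hasAssonance('lake', 'fate'));      // same stressed vowel EY1
console.log(hasAlliteration('knight', 'nine')); // both start with N despite the spelling
```

Comparing phonemes rather than letters is what lets "knight" and "nine" alliterate even though they start with different letters.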

  • 2021-01-12 15:14

    I'm not sure that a regex would be the best way of building a robust comparison tool. A simple regex might be part of a larger solution that used more sophisticated algorithms for non-exact matching.

    There are a variety of readily-available options for English, some of which could be extended fairly simply to languages that use the Latin alphabet. Most of these algorithms have been around for years or even decades and are well-documented, though they all have limits.

    I imagine that there are similar algorithms for non-Latin alphabets but I can't comment on their availability firsthand.

    Phonetic Algorithms

    The Soundex algorithm is nearly 100 years old and has been implemented in multiple programming languages. It encodes a string as a short code (a letter followed by three digits) based on its pronunciation. It is not precise, but it may be useful for identifying similar-sounding words or syllables. I've experimented with it in MS SQL Server, and it is available in PHP.

    http://php.net/manual/en/function.soundex.php
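
JavaScript has no built-in counterpart, so here is a minimal sketch of the classic American Soundex rules for readers on that side. It is an illustration written for this answer, not PHP's implementation, and edge cases may differ between implementations.

```javascript
// Minimal American Soundex sketch: first letter plus three digits.
function soundex(word) {
    var codes = {
        b: 1, f: 1, p: 1, v: 1,
        c: 2, g: 2, j: 2, k: 2, q: 2, s: 2, x: 2, z: 2,
        d: 3, t: 3,
        l: 4,
        m: 5, n: 5,
        r: 6
    };
    var letters = word.toLowerCase().replace(/[^a-z]/g, '');
    if (!letters) return '';

    var result = letters.charAt(0).toUpperCase();
    var prev = codes[letters.charAt(0)] || 0;

    for (var i = 1; i < letters.length && result.length < 4; i++) {
        var ch = letters.charAt(i);
        // "h" and "w" are skipped and do not separate equal codes.
        if (ch === 'h' || ch === 'w') continue;
        var code = codes[ch] || 0;   // vowels and "y" get 0
        // Append a digit unless it repeats the previous code.
        if (code && code !== prev) result += code;
        prev = code;                 // a vowel resets prev, so a repeat after it counts
    }
    // Pad with zeros to a fixed length of four characters.
    return (result + '000').slice(0, 4);
}

console.log(soundex('Robert'), soundex('Rupert'));
```

Two words with the same code are candidates for sounding alike; since the first character of the code is the word's first letter, it is at best a coarse filter for alliteration.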

    General consensus (including the PHP docs) is that Metaphone is much more accurate than Soundex when dealing with the English language. There are numerous implementations available (Wikipedia has a long list at the end of the article) and it is included in PHP.

    http://www.php.net/manual/en/function.metaphone.php

    Double Metaphone supports a second encoding of a word, corresponding to an alternate pronunciation of that word.

    As with Metaphone, Double Metaphone has been implemented in many programming languages.

    Word Deconstruction

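    The Levenshtein distance (the number of single-character insertions, deletions, and substitutions needed to turn one string into another) can be used to suggest alternate spellings, for example to normalize user input, and might be useful as part of a more granular algorithm for alliteration and assonance.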

    http://www.php.net/manual/en/function.levenshtein.php
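
For JavaScript, which has no built-in equivalent, a minimal sketch of the standard dynamic-programming formulation might look like this. It assumes a cost of 1 for every insertion, deletion, and substitution.

```javascript
// Levenshtein distance using two rows of the dynamic-programming table.
function levenshtein(a, b) {
    var prev = [];
    var i, j;
    // Distance from the empty prefix of a to each prefix of b.
    for (j = 0; j <= b.length; j++) prev[j] = j;

    for (i = 1; i <= a.length; i++) {
        var curr = [i]; // distance from a[0..i) to the empty string
        for (j = 1; j <= b.length; j++) {
            var cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
            curr[j] = Math.min(
                prev[j] + 1,        // deletion
                curr[j - 1] + 1,    // insertion
                prev[j - 1] + cost  // substitution or match
            );
        }
        prev = curr;
    }
    return prev[b.length];
}

console.log(levenshtein('kitten', 'sitting'));
```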

    Logically, it would help to understand the syllabication of the words in the string so that each word could be deconstructed. The syllable break could resolve ambiguity as to how two adjacent letters should be pronounced. This thread has a few links:

    PHP Syllable Detection
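
As a very rough starting point, syllables in English can be estimated by counting vowel groups. The sketch below is a spelling heuristic only (the function name and the silent-"e" rule are assumptions made for this example) and it will miscount many words; proper syllabication needs a dictionary or a hyphenation algorithm.

```javascript
// Rough syllable estimate: count groups of consecutive vowel letters.
function countSyllables(word) {
    var w = word.toLowerCase().replace(/[^a-z]/g, '');
    var groups = w.match(/[aeiouy]+/g) || [];
    var count = groups.length;
    // A final "e" after a consonant is usually silent ("make"),
    // except in consonant + "le" endings ("table").
    if (count > 1 && /[^aeiouy]e$/.test(w) && !/[^aeiouy]le$/.test(w)) {
        count--;
    }
    return count;
}

['make', 'table', 'banana', 'wandering'].forEach(function (word) {
    console.log(word, countSyllables(word));
});
```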
