Programmatically determine whether to describe an object with “a” or “an”?

I have a database of nouns (e.g. "house", "exclamation point", "apple") that I need to output and describe in my application. It's hard to put together a natural-soundi

8 answers

春和景丽

2020-12-01 21:45

Make an array with vowels in it. Check if the first letter of the word you are checking is in the vowel array. Will work except when dealing with acronyms.
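
A rough sketch of that check in JavaScript, in case it helps. The function name `naiveArticle` is made up for illustration. Note that, besides acronyms, a plain first-letter test also seems to miss words such as "hour" and "university", as the last lines show.

```javascript
// Minimal sketch of the vowel-array check described above.
// `naiveArticle` is a hypothetical name, not part of any library.
var VOWELS = ["a", "e", "i", "o", "u"];

function naiveArticle(word) {
    // Look only at the first letter, ignoring case.
    var first = word.charAt(0).toLowerCase();
    return VOWELS.indexOf(first) >= 0 ? "an" : "a";
}

console.log(naiveArticle("apple"));             // an
console.log(naiveArticle("house"));             // a
console.log(naiveArticle("exclamation point")); // an

// Cases the simple rule gets wrong:
console.log(naiveArticle("hour"));       // a  (expected "an hour")
console.log(naiveArticle("university")); // an (expected "a university")
console.log(naiveArticle("FBI"));        // a  (expected "an FBI")
```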

一整个雨季

2020-12-01 21:46
I've written a PHP port of the popular JS a-vs-an code described in this Stack Overflow answer: https://stackoverflow.com/a/1288473/1526020.

GitHub page: https://github.com/UseAllFive/a-vs-an.

For example:
```
$result = $aVsAn->query('0800 number');
print_r($result);
```
Returns
```
Array
(
    [aCount] => 8
    [anCount] => 25
    [prefix] => 08
    [article] => an
)
```

清酒与你

2020-12-01 21:52

I needed this for a C# project so here's the C# port of the Python code mentioned above. Make sure to include using System.Text.RegularExpressions; in your source file.

private string GetIndefiniteArticle(string noun_phrase)
{
    string word = null;
    var m = Regex.Match(noun_phrase, @"\w+");
    if (m.Success)
        word = m.Groups[0].Value;
    else
        return "an";

    var wordi = word.ToLower();
    foreach (string anword in new string[] { "euler", "heir", "honest", "hono" })
        if (wordi.StartsWith(anword))
            return "an";

    if (wordi.StartsWith("hour") && !wordi.StartsWith("houri"))
        return "an";

    // Letters whose spoken names begin with a vowel sound ("eff", "aitch", "el", ...)
    var char_list = new char[] { 'a', 'e', 'f', 'h', 'i', 'l', 'm', 'n', 'o', 'r', 's', 'x' };
    if (wordi.Length == 1)
    {
        if (wordi.IndexOfAny(char_list) == 0)
            return "an";
        else
            return "a";
    }

    // Anchored with ^ so the pattern is tested at the start of the word only, as Python's re.match does
    if (Regex.IsMatch(word, "^(?!FJO|[HLMNS]Y.|RY[EO]|SQU|(F[LR]?|[HL]|MN?|N|RH?|S[CHKLMNPTVW]?|X(YL)?)[AEIOU])[FHLMNRSX][A-Z]"))
        return "an";

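    // Verbatim string (@"...") so that \b reaches the regex engine as a word boundary, not a backspace character
    foreach (string regex in new string[] { "^e[uw]", @"^onc?e\b", "^uni([^nmd]|mo)", "^u[bcfhjkqrst][aeiou]" })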
    {
        if (Regex.IsMatch(wordi, regex))
            return "a";
    }

    if (Regex.IsMatch(word, "^U[NK][AIEO]"))
        return "a";
    else if (word == word.ToUpper())
    {
        if (wordi.IndexOfAny(char_list) == 0)
            return "an";
        else
            return "a";
    }

    if (wordi.IndexOfAny(new char[] { 'a', 'e', 'i', 'o', 'u' }) == 0)
        return "an";

    if (Regex.IsMatch(wordi, "^y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)"))
        return "an";

    return "a";
}

傲寒

2020-12-01 21:53

It should be pretty easy to write from scratch, tbh. If a word starts with a vowel, it gets an 'an'; if it begins with a consonant, it gets an 'a'. Programmatically it's easy to do. If you have any edge cases (e.g. you might use the BBC English-style 'an historic occasion'), you can handle them individually.

Kind of like using an inflector, only with the 'a'/'an' grammar rule instead of plurals. Look into how CakePHP or Rails handle inflection for a more thorough discussion of the concept, including how to handle edge cases - you don't want to inflect 'deer' as 'deers' in the plural, for example, or 'goose' as 'gooses', so they need to be handled individually, just like your own edge cases like 'universe' or aspirated/non-aspirated 'H's.
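
As a rough illustration of the inflector idea, here is a small JavaScript sketch that checks a table of irregular words before falling back to the general rule. The names `IRREGULAR` and `articleFor` and the handful of entries are assumptions for the example, not taken from CakePHP or Rails; you'd fill the table from your own data.

```javascript
// Hypothetical table of irregular words, looked up before the general rule.
var IRREGULAR = {
    hour: "an",      // silent 'h'
    honest: "an",
    heir: "an",
    universe: "a",   // starts with a "you" sound
    unicorn: "a",
    one: "a"         // starts with a "w" sound
};

function articleFor(noun) {
    // Only the first word of the phrase decides the article.
    var match = /[a-z]+/i.exec(noun);
    if (!match) return "a";
    var word = match[0].toLowerCase();

    // Edge cases are handled individually, like irregular plurals in an inflector.
    if (Object.prototype.hasOwnProperty.call(IRREGULAR, word)) {
        return IRREGULAR[word];
    }

    // General rule: vowel gets 'an', consonant gets 'a'.
    return "aeiou".indexOf(word.charAt(0)) >= 0 ? "an" : "a";
}

console.log(articleFor("apple"));    // an
console.log(articleFor("house"));    // a
console.log(articleFor("hour"));     // an
console.log(articleFor("universe")); // a
```

An exact-match table avoids accidental prefix hits (a "one" prefix would also catch "onerous"), at the cost of needing one entry per irregular word.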

别跟我提以往

2020-12-01 21:55

Was looking for just such a solution, so thanks marcog. Here's an attempt to port your friend's Python version to PHP (I don't know Python or Perl, so there are probably some mistakes):

function indefinite_article($word) {
    // Lowercase version of the word
    $word_lower = strtolower($word);

    // An 'an' word (specific start of words that should be preceded by 'an')
    $an_words = array('euler', 'heir', 'honest', 'hono');
    foreach($an_words as $an_word) {
        if(substr($word_lower,0,strlen($an_word)) == $an_word) return "an";
    }
    if(substr($word_lower,0,4) == "hour" and substr($word_lower,0,5) != "houri") return "an";

    // An 'an' letter (single letter word which should be preceded by 'an')
    $an_letters = array('a','e','f','h','i','l','m','n','o','r','s','x');
    if(strlen($word) == 1) {
        if(in_array($word_lower,$an_letters)) return "an";
        else return "a";
    }

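    // Capital words which should likely be preceded by 'an'.
    // Anchored with ^ because preg_match searches the whole string, whereas Python's re.match only tests the start
    if(preg_match('/^(?!FJO|[HLMNS]Y.|RY[EO]|SQU|(F[LR]?|[HL]|MN?|N|RH?|S[CHKLMNPTVW]?|X(YL)?)[AEIOU])[FHLMNRSX][A-Z]/', $word)) return "an";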

    // Special cases where a word that begins with a vowel should be preceded by 'a'
    $regex_array = array('^e[uw]','^onc?e\b','^uni([^nmd]|mo)','^u[bcfhjkqrst][aeiou]');
    foreach($regex_array as $regex) {
        if(preg_match('/'.$regex.'/',$word_lower)) return "a";
    }

    // Special capital words
    if(preg_match('/^U[NK][AIEO]/',$word)) return "a";
    // Not sure what this does
    else if($word == strtoupper($word)) {
        if(in_array($word_lower[0],$an_letters)) return "an";
        else return "a";
    }

    // Basic method of words that begin with a vowel being preceded by 'an'
    $vowels = array('a','e','i','o','u');
    if(in_array($word_lower[0],$vowels)) return "an";

    // Instances where y followed by specific letters is preceded by 'an'
    if(preg_match('/^y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)/', $word_lower)) return "an";

    // Default to 'a'
    return "a";
}

There's one bit (below the comment "// Not sure what this does") that I was unsure of what it did. If anyone can figure it out, I'd be happy to know.
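
For what it's worth, that branch looks like it handles words written entirely in capitals. Those are usually read out letter by letter, so the article depends on how the first letter's name sounds rather than on whether it is a vowel: "an MP" ("em-pee") but "a UFO" ("you-eff-oh"). Here is a small JavaScript sketch of that reading. The names `articleForInitialism` and `isAllCaps` are made up, and the letter list is my own assumption about which letter names start with a vowel sound.

```javascript
// Letters whose spoken names begin with a vowel sound:
// a, e, f (eff), h (aitch), i, l (el), m (em), n (en), o, r (ar), s (ess), x (ex).
var VOWEL_SOUND_LETTERS = "aefhilmnorsx";

// True when the word has letters and all of them are upper case.
function isAllCaps(word) {
    return word === word.toUpperCase() && word !== word.toLowerCase();
}

// Picks the article for a word that is spelled out letter by letter.
function articleForInitialism(word) {
    var first = word.charAt(0).toLowerCase();
    if (!first) return "a"; // empty input: fall back to 'a'
    return VOWEL_SOUND_LETTERS.indexOf(first) >= 0 ? "an" : "a";
}

console.log(isAllCaps("FBI"));            // true
console.log(articleForInitialism("FBI")); // an
console.log(articleForInitialism("MP"));  // an
console.log(articleForInitialism("UFO")); // a
console.log(articleForInitialism("DVD")); // a
```

Acronyms that are pronounced as words ("NASA", "NATO") would still come out wrong under this reading, so they would need their own exceptions.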

迷失自我

2020-12-01 21:56

I was also looking for such a solution, but in JavaScript, so I ported it over to JS. You can check out the actual project on GitHub: https://github.com/rigoneri/indefinite-article.js

Here is the code snippet:

function indefinite_article(phrase) {

    // Getting the first word
    var match = /\w+/.exec(phrase);
    if (!match)
        return "an";
    var word = match[0];

    var l_word = word.toLowerCase();
    // Specific start of words that should be preceded by 'an'
    var alt_cases = ["honest", "hour", "hono"];
    for (var i = 0; i < alt_cases.length; i++) {
        if (l_word.indexOf(alt_cases[i]) == 0)
            return "an";
    }

    // Single letter word which should be preceded by 'an'
    // (letters whose spoken names begin with a vowel sound)
    if (l_word.length == 1) {
        if ("aefhilmnorsx".indexOf(l_word) >= 0)
            return "an";
        else
            return "a";
    }

    // Capital words which should likely be preceded by 'an'
    // (anchored with ^ so only the start of the word is tested)
    if (word.match(/^(?!FJO|[HLMNS]Y.|RY[EO]|SQU|(F[LR]?|[HL]|MN?|N|RH?|S[CHKLMNPTVW]?|X(YL)?)[AEIOU])[FHLMNRSX][A-Z]/)) {
        return "an";
    }

    // Special cases where a word that begins with a vowel should be preceded by 'a'
    var regexes = [/^e[uw]/, /^onc?e\b/, /^uni([^nmd]|mo)/, /^u[bcfhjkqrst][aeiou]/];
    for (var j = 0; j < regexes.length; j++) {
        if (l_word.match(regexes[j]))
            return "a";
    }

    // Special capital words (UK, UN)
    if (word.match(/^U[NK][AIEO]/)) {
        return "a";
    }
    else if (word == word.toUpperCase()) {
        // All-caps words are usually read letter by letter, so check the first letter's name
        if ("aefhilmnorsx".indexOf(l_word[0]) >= 0)
            return "an";
        else
            return "a";
    }

    // Basic method of words that begin with a vowel being preceded by 'an'
    if ("aeiou".indexOf(l_word[0]) >= 0)
        return "an";

    // Instances where y followed by specific letters is preceded by 'an'
    if (l_word.match(/^y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)/))
        return "an";

    return "a";
}
