Is there a way to convert number words to Integers?

北恋 2020-11-22 06:14

I need to convert one into 1, two into 2 and so on.

Is there a way to do this with a library or a class or anything?

16 Answers
  • 2020-11-22 06:42

    I have just released a Python module to PyPI called word2number for exactly this purpose. https://github.com/akshaynagpal/w2n

    Install it using:

    pip install word2number
    

    Make sure your pip is updated to the latest version.

    Usage:

    from word2number import w2n
    
    print(w2n.word_to_num("two million three thousand nine hundred and eighty four"))
    2003984
    
  • 2020-11-22 06:42

    This could easily be hardcoded into a dictionary if there's a limited set of numbers you'd like to parse.

    For slightly more complex cases, you'll probably want to generate this dictionary automatically, based on the relatively simple grammar of number words. Something along these lines (generalized, of course):

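    singleDigitsDict = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine"}
    myDict = {}
    for i in range(1, 10):  # start at 1: there is no "thirty-zero"
        myDict[30 + i] = "thirty-" + singleDigitsDict[i]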
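
    To make the generalization concrete, here is a minimal sketch that builds the lookup for every number from 0 to 99, keyed by word so it can be used directly for parsing. The names UNITS, TENS, build_word_table and WORD_TO_INT are hypothetical and not part of the answer above, and the hyphenated spelling ("thirty-seven") is an assumption about the input format.

```python
# Hypothetical names; a sketch of generating the word -> int dictionary.
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {
    20: "twenty", 30: "thirty", 40: "forty", 50: "fifty",
    60: "sixty", 70: "seventy", 80: "eighty", 90: "ninety",
}


def build_word_table():
    """Return a dict mapping the number words for 0-99 to their values."""
    # 0-19 are irregular, so they are listed explicitly.
    table = {word: value for value, word in enumerate(UNITS)}
    for tens_value, tens_word in TENS.items():
        table[tens_word] = tens_value
        # 21-29, 31-39, ... follow the regular "<tens>-<unit>" pattern.
        for unit_value in range(1, 10):
            table[f"{tens_word}-{UNITS[unit_value]}"] = tens_value + unit_value
    return table


WORD_TO_INT = build_word_table()
print(WORD_TO_INT["thirty-seven"])  # 37
```

    Inverting the mapping (value to word) gives the same direction as the loop above, so one generated table can serve both lookups.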
    

    If you need something more extensive, then it looks like you'll need natural language processing tools. This article might be a good starting point.

  • 2020-11-22 06:42

    This code works for a pandas Series:

    import pandas as pd
    from word2number import w2n  # needed for w2n.word_to_num below
    mylist = pd.Series(['one','two','three'])
    mylist1 = []
    for x in range(len(mylist)):
        mylist1.append(w2n.word_to_num(mylist[x]))
    print(mylist1)
    
  • 2020-11-22 06:47

    Here's the trivial case approach:

    >>> number = {'one':1,
    ...           'two':2,
    ...           'three':3,}
    >>> 
    >>> number['two']
    2
    

    Or are you looking for something that can handle "twelve thousand, one hundred seventy-two"?
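
    If it is the latter, the dictionary idea extends to a small accumulator: add up the words within a group, multiply by "hundred", and close the group on a scale word such as "thousand". The following is only a sketch under those assumptions; words_to_int, UNITS, TENS and SCALES are hypothetical names, and it accepts cardinal numbers only (no ordinals, no checking that the word order makes sense).

```python
# Hypothetical helper; a sketch of parsing multi-word cardinal numbers.
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}

# Words that simply add their value to the current group.
SMALL = {word: value for value, word in enumerate(UNITS)}
SMALL.update({word: idx * 10 for idx, word in enumerate(TENS) if word})


def words_to_int(text):
    """Convert a phrase such as 'one hundred seventy-two' to an int."""
    total = 0  # sum of the groups already closed by a scale word
    group = 0  # value of the group currently being read
    for word in text.lower().replace(",", " ").replace("-", " ").split():
        if word == "and":
            continue
        if word in SMALL:
            group += SMALL[word]
        elif word == "hundred":
            # A bare "hundred" counts as "one hundred".
            group = max(group, 1) * 100
        elif word in SCALES:
            total += max(group, 1) * SCALES[word]
            group = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + group


print(words_to_int("twelve thousand, one hundred seventy-two"))  # 12172
```

    Raising ValueError on unknown words keeps the sketch strict; a sentence-level parser would instead pass such words through unchanged.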

  • 2020-11-22 06:48

    I needed something a bit different since my input is from a speech-to-text conversion and the solution is not always to sum the numbers. For example, "my zipcode is one two three four five" should not convert to "my zipcode is 15".

    I took Andrew's answer and tweaked it to handle a few other cases people highlighted as errors, and also added support for examples like the zipcode one I mentioned above. Some basic test cases are shown below, but I'm sure there is still room for improvement.

    def is_number(x):
        if isinstance(x, str):
            x = x.replace(',', '')
        try:
            float(x)
        except (TypeError, ValueError):
            return False
        return True
    
    def text2int (textnum, numwords={}):
        units = [
            'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight',
            'nine', 'ten', 'eleven', 'twelve', 'thirteen', 'fourteen', 'fifteen',
            'sixteen', 'seventeen', 'eighteen', 'nineteen',
        ]
        tens = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']
        scales = ['hundred', 'thousand', 'million', 'billion', 'trillion']
        ordinal_words = {'first':1, 'second':2, 'third':3, 'fifth':5, 'eighth':8, 'ninth':9, 'twelfth':12}
        ordinal_endings = [('ieth', 'y'), ('th', '')]
    
        if not numwords:
            numwords['and'] = (1, 0)
            for idx, word in enumerate(units): numwords[word] = (1, idx)
            for idx, word in enumerate(tens): numwords[word] = (1, idx * 10)
            for idx, word in enumerate(scales): numwords[word] = (10 ** (idx * 3 or 2), 0)
    
        textnum = textnum.replace('-', ' ')
    
        current = result = 0
        curstring = ''
        onnumber = False
        lastunit = False
        lastscale = False
    
        def is_numword(x):
            if is_number(x):
                return True
            if x in numwords:
                return True
            return False
    
        def from_numword(x):
            if is_number(x):
                scale = 0
                increment = int(x.replace(',', ''))
                return scale, increment
            return numwords[x]
    
        for word in textnum.split():
            if word in ordinal_words:
                scale, increment = (1, ordinal_words[word])
                current = current * scale + increment
                if scale > 100:
                    result += current
                    current = 0
                onnumber = True
                lastunit = False
                lastscale = False
            else:
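                for ending, replacement in ordinal_endings:
                    if word.endswith(ending):
                        # Only strip the ending when the stem is a known number
                        # word, so ordinary words such as "with" are left alone.
                        stem = "%s%s" % (word[:-len(ending)], replacement)
                        if stem in numwords:
                            word = stem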
    
                if (not is_numword(word)) or (word == 'and' and not lastscale):
                    if onnumber:
                        # Flush the current number we are building
                        curstring += repr(result + current) + " "
                    curstring += word + " "
                    result = current = 0
                    onnumber = False
                    lastunit = False
                    lastscale = False
                else:
                    scale, increment = from_numword(word)
                    onnumber = True
    
                    if lastunit and (word not in scales):
                        # Assume this is part of a string of individual numbers to
                        # be flushed, such as a zipcode "one two three four five"
                        curstring += repr(result + current)
                        result = current = 0
    
                    if scale > 1:
                        current = max(1, current)
    
                    current = current * scale + increment
                    if scale > 100:
                        result += current
                        current = 0
    
                    lastscale = False
                    lastunit = False
                    if word in scales:
                        lastscale = True
                    elif word in units:
                        lastunit = True
    
        if onnumber:
            curstring += repr(result + current)
    
        return curstring
    

    Some tests...

    one two three -> 123
    three forty five -> 345
    three and forty five -> 3 and 45
    three hundred and forty five -> 345
    three hundred -> 300
    twenty five hundred -> 2500
    three thousand and six -> 3006
    three thousand six -> 3006
    nineteenth -> 19
    twentieth -> 20
    first -> 1
    my zip is one two three four five -> my zip is 12345
    nineteen ninety six -> 1996
    fifty-seventh -> 57
    one million -> 1000000
    first hundred -> 100
    I will buy the first thousand -> I will buy the 1000  # probably should leave ordinal in the string
    thousand -> 1000
    hundred and six -> 106
    1 million -> 1000000
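
    The zipcode behaviour can also be looked at in isolation. The sketch below is not the function above: it is a much smaller, hypothetical helper (collapse_digit_runs, with a hypothetical DIGIT_WORDS table) that only joins runs of single-digit words and leaves every other word alone, assuming whitespace-separated input such as a speech-to-text transcript.

```python
# Hypothetical helper; a sketch of joining spoken digits into one string.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def collapse_digit_runs(text):
    """Replace each run of consecutive digit words with the joined digits."""
    output = []
    run = ""  # digits collected from the current run of digit words
    for word in text.split():
        digit = DIGIT_WORDS.get(word.lower())
        if digit is not None:
            run += digit
            continue
        if run:
            # A non-digit word ends the run, so flush it first.
            output.append(run)
            run = ""
        output.append(word)
    if run:
        output.append(run)
    return " ".join(output)


print(collapse_digit_runs("my zipcode is one two three four five"))
# my zipcode is 12345
```

    Because the digits are concatenated as text rather than summed, leading zeros survive, which matters for zipcodes and phone numbers.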
    
  • 2020-11-22 06:48
    This code works only for numbers up to 99, in both directions: word to int and int to word.
    (Handling larger numbers needs another 10-20 lines of code and some simple logic. This is just simple code for beginners.)
    
    
    num = input("Enter the number you want to convert : ")
    # Number -> word tables: 0-19 and the multiples of ten.
    mydict = {'0': 'Zero', '1': 'One', '2': 'Two', '3': 'Three', '4': 'Four', '5': 'Five', '6': 'Six', '7': 'Seven', '8': 'Eight', '9': 'Nine', '10': 'Ten', '11': 'Eleven', '12': 'Twelve', '13': 'Thirteen', '14': 'Fourteen', '15': 'Fifteen', '16': 'Sixteen', '17': 'Seventeen', '18': 'Eighteen', '19': 'Nineteen'}
    mydict2 = ['', '', 'Twenty', 'Thirty', 'Forty', 'Fifty', 'Sixty', 'Seventy', 'Eighty', 'Ninety']
    if num.isdigit():
        if int(num) < 20:
            print(" :---> " + mydict[str(int(num))])
        else:
            var1 = int(num) % 10
            var2 = int(num) // 10  # integer division, so var2 can index mydict2
            # Omit the units word for exact multiples of ten (20, 30, ...).
            print(" :---> " + mydict2[var2] + (mydict[str(var1)] if var1 else ''))
    else:
        num = num.lower()
        # Word -> number tables.
        dict_w = {'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10, 'eleven': 11, 'twelve': 12, 'thirteen': 13, 'fourteen': 14, 'fifteen': 15, 'sixteen': 16, 'seventeen': 17, 'eighteen': 18, 'nineteen': 19}
        mydict2 = ['', '', 'twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']
        # Whatever follows the "ty" of the tens word, e.g. "one" in "twentyone".
        divide = num[num.find("ty")+2:]
        if num:
            if num in dict_w:
                print(" :---> " + str(dict_w[num]))
            elif divide == '':
                for i in range(len(mydict2)):
                    if mydict2[i] == num:
                        print(" :---> " + str(i*10))
            else:
                str3 = 0
                str1 = num[num.find("ty")+2:]
                str2 = num[:-len(str1)]
                for i in range(len(mydict2)):
                    if mydict2[i] == str2:
                        str3 = i
                if str2 not in mydict2:
                    print("----->Invalid Input<-----")
                else:
                    try:
                        print(" :---> " + str((str3*10) + dict_w[str1]))
                    except KeyError:
                        print("----->Invalid Input<-----")
        else:
            print("----->Please Enter Input<-----")
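
    The same 0-99 range can also be covered with two small functions instead of an interactive script, which makes the logic easier to test. This is only a sketch: int_to_words and words_to_int are hypothetical names, and the hyphenated spelling ("forty-two") is an assumption, so run-together input such as "fortytwo" is not handled here.

```python
# Hypothetical helpers; a sketch of two-way conversion for 0-99.
UNITS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def int_to_words(n):
    """Spell out an integer between 0 and 99."""
    if not 0 <= n <= 99:
        raise ValueError("only 0-99 is supported")
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] if unit == 0 else f"{TENS[tens]}-{UNITS[unit]}"


# Build the reverse lookup once from the forward function.
WORDS_TO_INT = {int_to_words(n): n for n in range(100)}


def words_to_int(text):
    """Parse 'forty-two' or 'forty two'; raises KeyError on anything else."""
    return WORDS_TO_INT[text.strip().lower().replace(" ", "-")]


print(int_to_words(42))           # forty-two
print(words_to_int("Forty two"))  # 42
```

    Deriving the reverse table from int_to_words keeps the two directions consistent, so a spelling fix only has to be made in one place.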
    