How can I consistently convert strings like “3.71B” and “4M” to numbers in Python?

点点圈 submitted on 2019-12-10 14:27:43

Question


I have some rather mangled code that almost produces the tangible price/book from Yahoo Finance for companies (a nice module called ystockquote gets the intangible price/book value already).

My problem is this:

For one of the variables in the calculation, shares outstanding, I'm getting strings like 10.89B and 4.9M, where B and M stand for billion and million respectively. I'm having trouble converting them to numbers. Here's where I'm at:

shares=''.join(node.findAll(text=True)).strip().replace('M','000000').replace('B','000000000').replace('.','') for node in soup2.findAll('td')[110:112]

Which is pretty messy, but I think it would work if instead of

.replace('M','000000').replace('B','000000000').replace('.','') 

I was using a regular expression with variables. I guess the question is simply which regular expression and variables. Other suggestions are also good.

EDIT:

To be specific, I'm hoping to have something that works for numbers with zero, one, or two decimal places, but these answers all look helpful.


Answer 1:


>>> from decimal import Decimal
>>> d = {
        'M': 6,
        'B': 9
}
>>> def text_to_num(text):
        if text[-1] in d:
            num, magnitude = text[:-1], text[-1]
            return Decimal(num) * 10 ** d[magnitude]
        else:
            return Decimal(text)

>>> text_to_num('3.17B')
Decimal('3170000000.00')
>>> text_to_num('4M')
Decimal('4000000')
>>> text_to_num('4.1234567891234B')
Decimal('4123456789.1234000000000')

You can also call int() on the result if you want a plain integer.
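A minimal sketch of that last step, restating the function above so the snippet runs on its own. The names `MAGNITUDES` and `shares` are illustrative, not from the answer. Note that int() truncates any fractional part that is left after scaling.

```python
from decimal import Decimal

# Suffix -> power of ten, mirroring the dict used in the answer above.
MAGNITUDES = {'M': 6, 'B': 9}

def text_to_num(text):
    """Return a Decimal for strings such as '3.17B' or '4M'."""
    if text[-1] in MAGNITUDES:
        num, magnitude = text[:-1], text[-1]
        return Decimal(num) * 10 ** MAGNITUDES[magnitude]
    return Decimal(text)

# Decimal keeps the digits exact, so the int() conversion is lossless here.
shares = int(text_to_num('10.89B'))  # 'shares' is a hypothetical variable name
print(shares)
```

Because the arithmetic is done in Decimal rather than float, no binary rounding error is introduced before the value is turned into an integer.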




Answer 2:


Parse the numbers as floats, and use a multiplier mapping:

multipliers = dict(M=10**6, B=10**9)
def sharesNumber(nodeText):
    nodeText = nodeText.strip()
    mult = 1
    if nodeText[-1] in multipliers:
        mult = multipliers[nodeText[-1]]
        nodeText = nodeText[:-1]
    return float(nodeText) * mult



Answer 3:


num_replace = {
    'B' : 1000000000,
    'M' : 1000000,
}

a = "4.9M" 
b = "10.89B" 

def pure_number(s):
    mult = 1.0
    while s[-1] in num_replace:
        mult *= num_replace[s[-1]]
        s = s[:-1]
    return float(s) * mult 

pure_number(a) # 4900000.0
pure_number(b) # 10890000000.0

This will work with idiocy like:

pure_number("5.2MB") # 5200000000000000.0

Because of the dictionary approach, you can add as many suffixes as you want in an easy-to-maintain way. You can also make it more lenient by storing the dict keys in one case and calling .lower() or .upper() on the input so that it matches.
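A rough sketch of that more lenient variant, under a few assumptions: the keys are stored in upper case, the input is upper-cased before the lookup, and the extra 'K' and 'T' suffixes are made up for illustration (the question only mentions M and B).

```python
# Keys are kept in upper case. 'K' and 'T' are assumed extras, not from the question.
NUM_REPLACE = {
    'K': 10**3,
    'M': 10**6,
    'B': 10**9,
    'T': 10**12,
}

def pure_number(s):
    """Convert strings such as '4.9m' or ' 3B ' to a float, ignoring case."""
    s = s.strip().upper()  # normalise so 'm' and 'M' hit the same key
    mult = 1
    # Peel suffixes off the end; 'while s' guards against an empty string.
    while s and s[-1] in NUM_REPLACE:
        mult *= NUM_REPLACE[s[-1]]
        s = s[:-1]
    return float(s) * mult

print(pure_number('4m'))
print(pure_number(' 3B '))
```

Adding another unit is then a one-line change to the dict, and a string with no digits at all still fails loudly with a ValueError from float().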




Answer 4:


num_replace = {
    'B' : 'e9',
    'M' : 'e6',
}

def str_to_num(s):
    if s[-1] in num_replace:
        s = s[:-1]+num_replace[s[-1]]
    return int(float(s))

>>> str_to_num('3.71B')
3710000000
>>> str_to_num('4M')
4000000

So '3.71B' -> '3.71e9' -> 3710000000, and so on. (Older Python 2 builds may display the first result with a trailing L, marking it as a long.)




Answer 5:


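This could be an opportunity to use eval! :-) Passing empty globals and locals keeps your own names out of reach, but Python still makes the builtins available, so only do this with strings you trust.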

Consider the following fragment:

>>> d = { "B" :' * 1e9', "M" : '* 1e6'}
>>> s = "1.493B"
>>> ll = [d.get(c, c) for c in s]
>>> eval(''.join(ll), {}, {})
1493000000.0

Now put it all together into a neat one liner:

d = { "B" :' * 1e9', "M" : '* 1e6'}

def human_to_int(s):
    return eval(''.join([d.get(c, c) for c in s]), {}, {})

print(human_to_int('1.439B'))
print(human_to_int('1.23456789M'))

Gives back:

1439000000.0
1234567.89
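Since the question asked for a regular expression, here is one possible sketch that avoids eval altogether. It is a different technique from the answer above, not a rewrite of it: the pattern, the name `parse_shares`, and the choice of Decimal are all assumptions made for illustration.

```python
import re
from decimal import Decimal

# Suffix -> power of ten. Only M and B appear in the question.
SUFFIX_EXPONENT = {'M': 6, 'B': 9}

# Digits with an optional fractional part, then an optional M or B suffix.
PATTERN = re.compile(r'^\s*(\d+(?:\.\d+)?)\s*([MB])?\s*$')

def parse_shares(text):  # hypothetical name
    """Return a Decimal for strings such as '1.439B', '4M' or '12.5'."""
    match = PATTERN.match(text)
    if match is None:
        # Anything that is not a plain number plus suffix is rejected.
        raise ValueError('unrecognised number: %r' % (text,))
    number, suffix = match.groups()
    exponent = SUFFIX_EXPONENT.get(suffix, 0)  # no suffix means scale by 1
    return Decimal(number) * 10 ** exponent

print(parse_shares('1.439B'))
print(parse_shares('4M'))
```

The pattern accepts any number of decimal places, which covers the zero, one, or two decimals mentioned in the question, and anything else raises ValueError instead of being evaluated.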


Source: https://stackoverflow.com/questions/11896560/how-can-i-consistently-convert-strings-like-3-71b-and-4m-to-numbers-in-pytho
