Question
I have some rather mangled code that almost produces the tangible price/book value from Yahoo Finance for companies (a nice module called ystockquote already gets the intangible price/book value).
My problem is this:
For one of the variables in the calculation, shares outstanding, I'm getting strings like 10.89B and 4.9M, where B and M stand for billion and million respectively. I'm having trouble converting them to numbers. Here's where I'm at:
shares = [''.join(node.findAll(text=True)).strip().replace('M','000000').replace('B','000000000').replace('.','') for node in soup2.findAll('td')[110:112]]
Which is pretty messy, but I think it would work if instead of
.replace('M','000000').replace('B','000000000').replace('.','')
I was using a regular expression with variables. I guess the question is simply which regular expression and variables. Other suggestions are also good.
EDIT:
To be specific, I'm hoping for something that works for numbers with zero, one, or two decimal places, but these answers all look helpful.
Answer 1:
>>> from decimal import Decimal
>>> d = {
...     'M': 6,
...     'B': 9
... }
>>> def text_to_num(text):
...     if text[-1] in d:
...         num, magnitude = text[:-1], text[-1]
...         return Decimal(num) * 10 ** d[magnitude]
...     else:
...         return Decimal(text)
...
>>> text_to_num('3.17B')
Decimal('3170000000.00')
>>> text_to_num('4M')
Decimal('4000000')
>>> text_to_num('4.1234567891234B')
Decimal('4123456789.1234000000000')
You can call int() on the result if you want, too.
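As a rough illustration of that last remark, the sketch below wraps the same Decimal approach in a helper that returns a plain int. The names EXPONENTS and text_to_int are made up for this example and are not part of the original answer; int() truncates any fractional part that remains.

```python
from decimal import Decimal

# Exponent of ten for each suffix, as in the answer above.
EXPONENTS = {'M': 6, 'B': 9}

def text_to_num(text):
    """Return an exact Decimal for strings such as '3.17B' or '4M'."""
    text = text.strip()
    if text[-1] in EXPONENTS:
        return Decimal(text[:-1]) * 10 ** EXPONENTS[text[-1]]
    return Decimal(text)

def text_to_int(text):
    """Hypothetical wrapper: drop any fractional part and return a plain int."""
    return int(text_to_num(text))

print(text_to_int('3.17B'))   # 3170000000
print(text_to_int('10.89B'))  # 10890000000
print(text_to_int('4.9M'))    # 4900000
```

Because the arithmetic is done in Decimal, inputs with zero, one, or two decimal places convert without binary floating-point rounding.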
Answer 2:
Parse the numbers as floats, and use a multiplier mapping:
multipliers = dict(M=10**6, B=10**9)

def sharesNumber(nodeText):
    nodeText = nodeText.strip()
    mult = 1
    if nodeText[-1] in multipliers:
        mult = multipliers[nodeText[-1]]
        nodeText = nodeText[:-1]
    return float(nodeText) * mult
Answer 3:
num_replace = {
    'B' : 1000000000,
    'M' : 1000000,
}

a = "4.9M"
b = "10.89B"

def pure_number(s):
    mult = 1.0
    while s[-1] in num_replace:
        mult *= num_replace[s[-1]]
        s = s[:-1]
    return float(s) * mult

pure_number(a) # 4900000.0
pure_number(b) # 10890000000.0
This will work with idiocy like:
pure_number("5.2MB") # 5200000000000000.0
and because of the dictionary approach, you can add as many suffixes as you want in an easy-to-maintain way, and you can make it more lenient by expressing your dict keys in one capitalisation form and then calling .lower() or .upper() on the input to make it match.
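The case-insensitive variant described above might look something like the following sketch. The name pure_number_lenient and the extra 'K' suffix are assumptions added for illustration; the loop also checks that the string is not empty before indexing it.

```python
# Multipliers keyed by upper-case suffix; 'K' is an extra suffix added for illustration.
NUM_REPLACE = {
    'K': 10**3,
    'M': 10**6,
    'B': 10**9,
}

def pure_number_lenient(s):
    """Convert strings such as '4.9m' or '10.89B' to a float, ignoring suffix case."""
    s = s.strip().upper()
    mult = 1.0
    # Peel suffix letters off the end; stop if the string runs out.
    while s and s[-1] in NUM_REPLACE:
        mult *= NUM_REPLACE[s[-1]]
        s = s[:-1]
    return float(s) * mult

print(pure_number_lenient('4.9m'))    # 4900000.0
print(pure_number_lenient('10.89B'))  # 10890000000.0
print(pure_number_lenient('12k'))     # 12000.0
```

Normalising the input once with .upper() keeps the dictionary small, since each suffix only has to be listed in one capitalisation.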
Answer 4:
num_replace = {
    'B' : 'e9',
    'M' : 'e6',
}

def str_to_num(s):
    if s[-1] in num_replace:
        s = s[:-1]+num_replace[s[-1]]
    return int(float(s))

>>> str_to_num('3.71B')
3710000000L
>>> str_to_num('4M')
4000000
So '3.71B' -> '3.71e9' -> 3710000000L, etc. (The trailing L is Python 2's marker for a long integer; Python 3 prints 3710000000.)
Answer 5:
This could be an opportunity to safely use eval!! :-)
Consider the following fragment:
>>> d = { "B" :' * 1e9', "M" : '* 1e6'}
>>> s = "1.493B"
>>> ll = [d.get(c, c) for c in s]
>>> eval(''.join(ll), {}, {})
1493000000.0
Now put it all together into a neat one-liner:
d = { "B" :' * 1e9', "M" : '* 1e6'}

def human_to_int(s):
    return eval(''.join([d.get(c, c) for c in s]), {}, {})

print(human_to_int('1.439B'))
print(human_to_int('1.23456789M'))
Gives back:
1439000000.0
1234567.89
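One caveat: passing empty dictionaries to eval does not remove access to built-in functions, because Python inserts __builtins__ into the globals dictionary when it is missing, so arbitrary scraped text can still run code. Since the question asked about regular expressions, a more defensive take on the same idea could validate the string with a pattern first and then do the multiplication directly. This is only a sketch; the names MULTIPLIERS, PATTERN, and human_to_number are made up for this example.

```python
import re
from decimal import Decimal

# Suffix -> multiplier; extend as needed.
MULTIPLIERS = {'M': 10**6, 'B': 10**9}

# Digits, an optional fractional part, and an optional single suffix letter.
PATTERN = re.compile(r'^(\d+(?:\.\d+)?)([MB])?$')

def human_to_number(s):
    """Parse '1.439B' or '4M'; raise ValueError for anything else."""
    match = PATTERN.match(s.strip())
    if match is None:
        raise ValueError('unrecognised number: %r' % s)
    digits, suffix = match.groups()
    # suffix is None when no letter is present, so the multiplier defaults to 1.
    return Decimal(digits) * MULTIPLIERS.get(suffix, 1)

print(human_to_number('1.439B'))       # 1439000000.000
print(human_to_number('1.23456789M'))  # 1234567.89000000
print(human_to_number('42'))           # 42
```

Anything that does not look like a number with an optional M or B suffix raises ValueError instead of being evaluated.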
Source: https://stackoverflow.com/questions/11896560/how-can-i-consistently-convert-strings-like-3-71b-and-4m-to-numbers-in-pytho