I'm trying to write a function in Python that determines what type of value a string holds; for example,
if the string is 1 or 0 or True or False, the value is BIT
Before you go too far down the regex route, have you considered using ast.literal_eval?
In [34]: import ast
In [35]: ast.literal_eval('1')
Out[35]: 1
In [36]: type(ast.literal_eval('1'))
Out[36]: int
In [38]: type(ast.literal_eval('1.0'))
Out[38]: float
In [40]: type(ast.literal_eval('[1,2,3]'))
Out[40]: list
May as well use Python to parse it for you!
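To make the idea concrete, here is a minimal sketch (Python 3) of wrapping ast.literal_eval so that strings which are not Python literals are reported instead of raising. The helper name literal_type is my own, purely for illustration:

```python
import ast

def literal_type(text):
    """Return the type name of the literal held in text, or None if it is not a literal."""
    try:
        return type(ast.literal_eval(text.strip())).__name__
    except (ValueError, SyntaxError):
        # ValueError: parses as Python but is not a plain literal (e.g. a bare name)
        # SyntaxError: does not parse as a Python expression at all
        return None

for sample in ['1', '1.0', '[1,2,3]', 'True', 'hello', '35..']:
    print(repr(sample), '->', literal_type(sample))
```

Anything that comes back as None is a candidate for a catch-all category such as plain text.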
OK, here is a bigger example:
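import ast, re
def dataType(str):
    # Python 2 code: it relies on the `long` type
    str = str.strip()
    if len(str) == 0: return 'BLANK'
    try:
        # let Python parse the string as a literal
        t = ast.literal_eval(str)
    except ValueError:
        return 'TEXT'
    except SyntaxError:
        return 'TEXT'
    if type(t) in [int, long, float, bool]:
        # True == 1 and False == 0, so '1' and '0' count as BIT as well
        if t in set((True,False)):
            return 'BIT'
        if type(t) is int or type(t) is long:
            return 'INT'
        if type(t) is float:
            return 'FLOAT'
    # anything else (lists, tuples, quoted strings, ...) is TEXT
    return 'TEXT'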
testSet=[' 1 ', ' 0 ', 'True', 'False',   #should all be BIT
         '12', '34l', '-3','03',              #should all be INT
         '1.2', '-20.4', '1e66', '35.','- .2','-.2e6', #should all be FLOAT
         '10-1', 'def', '10,2', '[1,2]','35.9.6','35..','.'] #should all be TEXT
for t in testSet:
    print "{:10}:{}".format(t,dataType(t))
 1        :BIT
 0        :BIT
True      :BIT
False     :BIT
12        :INT
34l       :INT
-3        :INT
03        :INT
1.2       :FLOAT
-20.4     :FLOAT
1e66      :FLOAT
35.       :FLOAT
- .2      :FLOAT
-.2e6     :FLOAT
10-1      :TEXT
def       :TEXT
10,2      :TEXT
[1,2]     :TEXT
35.9.6    :TEXT
35..      :TEXT
.         :TEXT
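Note that dataType is Python 2 code: it uses long and the print statement. If you are on Python 3, a rough port might look like the sketch below. The name data_type and the stricter BIT check (only bools and the integers 0 and 1, so '1.0' is FLOAT) are my own assumptions rather than part of the original. Also be aware that '34l' and '03' are no longer valid literals in Python 3, so they are classified as TEXT there.

```python
import ast

def data_type(text):
    """Classify text as 'BLANK', 'BIT', 'INT', 'FLOAT' or 'TEXT' (Python 3 sketch)."""
    text = text.strip()
    if not text:
        return 'BLANK'
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return 'TEXT'  # not a Python literal
    # Assumption: only True/False and the integers 0/1 count as BIT.
    if isinstance(value, bool) or (type(value) is int and value in (0, 1)):
        return 'BIT'
    if type(value) is int:
        return 'INT'
    if type(value) is float:
        return 'FLOAT'
    return 'TEXT'  # lists, tuples, quoted strings, None, ...

for sample in [' 1 ', 'True', '12', '-3', '1.2', '1e66', '- .2',
               '10-1', '[1,2]', '34l', '03']:
    print("{:10}:{}".format(sample, data_type(sample)))
```

Catching both exceptions in one except clause keeps the fallback to TEXT in a single place.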
And if you positively MUST have a regex solution, which produces the same results, here it is:
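def regDataType(str):
    # strip surrounding whitespace so ' 1 ' is treated like '1'
    str = str.strip()
    if len(str) == 0: return 'BLANK'
    if re.match(r'True$|^False$|^0$|^1$', str):
        return 'BIT'
    # optional sign, digits, optional Python 2 long suffix
    if re.match(r'([-+]\s*)?\d+[lL]?$', str):
        return 'INT'
    # floats with a leading nonzero digit, e.g. 35. or 1e66
    if re.match(r'([-+]\s*)?[1-9][0-9]*\.?[0-9]*([Ee][+-]?[0-9]+)?$', str):
        return 'FLOAT'
    # floats that need at least one digit after the optional point, e.g. .2 or 0.5
    if re.match(r'([-+]\s*)?[0-9]*\.?[0-9][0-9]*([Ee][+-]?[0-9]+)?$', str):
        return 'FLOAT'
    return 'TEXT'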
I cannot recommend the regex over the ast version, however; just let Python decide what these data types are rather than reimplementing its rules with a regex.
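To illustrate how a hand-written pattern can drift from what the parser accepts, here is a small sketch, assuming Python 3.6 or later. It reuses the INT pattern from regDataType; the helper names regex_says_int and python_says_int are hypothetical, introduced only for this comparison:

```python
import ast
import re

# The INT pattern from regDataType, unchanged.
INT_RE = re.compile(r'([-+]\s*)?\d+[lL]?$')

def regex_says_int(text):
    """True if the regex classifies text as an integer."""
    return bool(INT_RE.match(text.strip()))

def python_says_int(text):
    """True if Python itself parses text as an int literal."""
    try:
        return type(ast.literal_eval(text.strip())) is int
    except (ValueError, SyntaxError):
        return False

for sample in ['12', '-3', '0x1F', '1_000', '34l', '03']:
    print("{:8} regex={!s:6} python={}".format(
        sample, regex_says_int(sample), python_says_int(sample)))
```

On Python 3 the two agree on '12' and '-3', but the regex rejects '0x1F' and '1_000', which the parser accepts, and accepts '34l' and '03', which the parser rejects. Keeping a regex in sync with the language version is exactly the maintenance burden the ast approach avoids.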