I'm doing static code analysis using openstack/bandit. I have a lot of repositories; some of them are in Python 2, others in Python 3. How can I detect if code is syntactically
A basic check is whether the 2to3 tool prints any diffs (see https://docs.python.org/3/library/2to3.html for basic usage).
On a simple file like a.py:
import urllib2
print "printing something"
you'd get:
> 2to3 a.py
RefactoringTool: Skipping optional fixer: buffer
RefactoringTool: Skipping optional fixer: idioms
RefactoringTool: Skipping optional fixer: set_literal
RefactoringTool: Skipping optional fixer: ws_comma
RefactoringTool: Refactored a.py
--- a.py (original)
+++ a.py (refactored)
@@ -1,4 +1,4 @@
-import urllib2
+import urllib.request, urllib.error, urllib.parse
-print "printing something"
+print("printing something")
RefactoringTool: Files that need to be modified:
RefactoringTool: a.py
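To apply this check across many files, the decision "did 2to3 print a diff?" can be scripted. The following is only a sketch: the helper names diff_reported and needs_porting are made up for illustration, and it assumes a 2to3 executable is on the PATH (2to3 was deprecated in Python 3.11 and removed in 3.13, so it may not be available on newer installs).

```python
import shutil
import subprocess


def diff_reported(output):
    """Return True if 2to3-style output contains unified diff headers."""
    lines = output.splitlines()
    has_old = any(line.startswith("--- ") for line in lines)
    has_new = any(line.startswith("+++ ") for line in lines)
    return has_old and has_new


def needs_porting(path):
    """Run 2to3 on path; return True/False, or None if 2to3 is not installed."""
    exe = shutil.which("2to3")
    if exe is None:
        return None  # assumption: without the tool we cannot decide
    proc = subprocess.run([exe, path], capture_output=True, text=True)
    # Inspect both streams so it does not matter where the diff is written.
    return diff_reported(proc.stdout + proc.stderr)
```

A file for which needs_porting returns True is most likely Python 2 code; False only means 2to3 found nothing to rewrite, not that the file is guaranteed to run on Python 3.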
The most basic check you can do to ensure your code is syntactically compatible with Python 3 is to run pylint3 on that particular module and look for errors.
Install pylint3
sudo apt-get install pylint3
Run pylint3 for a python module
pylint3 -E <module.py>
The above could be used to catch syntax errors in a module with respect to python3.
You can use the "compileall" module like so:
python3.6 -m compileall -q .
Adjust the interpreter to the Python version you want to check against.
Python 3 (since 3.2) puts compiled modules into a __pycache__ directory, with an interpreter- and version-specific tag in the file name (for example module.cpython-36.pyc), so they won't conflict with Python 2 or other Python 3 versions.
The command above shows only errors and recurses from the current directory. Run python3.6 -m compileall --help to see all the options.
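The same check can be driven from a script instead of the command line. This is a minimal sketch using compileall.compile_dir from the standard library; the wrapper name tree_compiles is hypothetical. Like the command above, it writes .pyc files into __pycache__ directories as a side effect.

```python
import compileall


def tree_compiles(root):
    """Return True if every .py file under root compiles with this interpreter."""
    # quiet=1 prints only errors, which mirrors the -q flag used above.
    return bool(compileall.compile_dir(root, quiet=1))
```

Run it with the interpreter you want to validate against, for example a Python 3 interpreter to spot Python 2 only syntax.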
Here's one thing you might want to do. I think it's the easiest way to find out whether code is compatible, at least syntactically.
Have a Python 3 program parse the Python modules (without executing them). If the code is compatible, it will parse; if it isn't, a SyntaxError is raised.
Use the ast module.
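import ast

def test_source_code_compatible(code_data):
    # Return the parsed tree, or False if the source is not valid
    # syntax for the interpreter running this script.
    try:
        return ast.parse(code_data)
    except SyntaxError:
        return False

# Read as bytes so ast.parse can honour an encoding declaration in the file.
with open('file.py', 'rb') as source_file:
    ast_tree = test_source_code_compatible(source_file.read())
if not ast_tree:
    print("File couldn't get loaded")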
If the code can't be parsed, ast.parse raises a SyntaxError.
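Since the question is about many repositories, the same idea can be extended to walk a directory tree. This is a sketch only: find_unparsable is a hypothetical helper name, and it assumes every file ending in .py should be checked.

```python
import ast
import os


def find_unparsable(root):
    """Return (path, message) pairs for .py files this interpreter cannot parse."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            # Bytes input lets ast.parse honour the file's encoding declaration.
            with open(path, "rb") as handle:
                source = handle.read()
            try:
                ast.parse(source, filename=path)
            except (SyntaxError, ValueError) as exc:
                # ValueError covers sources containing null bytes on some versions.
                failures.append((path, str(exc)))
    return failures
```

Running this under Python 3 lists the files that are not valid Python 3 syntax; running it under Python 2 gives the opposite view.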
See the documentation of the ast module.
Even when the abstract syntax tree can be loaded, you may still have to check for Python 2 methods that don't exist in Python 3, or methods whose behaviour changed.
For example, division works differently in Python 2 and Python 3. In Python 2, dividing two integers with / yields an integer (floor division), while in Python 3 it yields a float, so the result differs unless both use the same division scheme. In that case, check whether the module uses from __future__ import division to get the same behaviour in Python 2 and Python 3.
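As a rough illustration of that check, the ast module can report whether a module imports division from __future__ and where it uses the / operator. This is a sketch with a hypothetical helper name, division_report; it only works on source that the running interpreter can parse, and it ignores augmented assignments such as x /= 2.

```python
import ast


def division_report(source):
    """Return (has_future_division, lines_using_slash) for the given source."""
    tree = ast.parse(source)
    has_future = any(
        isinstance(node, ast.ImportFrom)
        and node.module == "__future__"
        and any(alias.name == "division" for alias in node.names)
        for node in ast.walk(tree)
    )
    # ast.Div is the plain / operator; // is ast.FloorDiv and is not reported.
    slash_lines = sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div)
    )
    return has_future, slash_lines
```

A module that uses / without the __future__ import is a candidate for manual review, since the operands may be integers.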
Here's an exhaustive list of things that you might want to handle:
http://python-future.org/compatible_idioms.html
Loading the AST of the module immediately tells you about things that absolutely can't work, but judging whether code that parses will actually work in Python 3 is subject to many false positives. It's hard, even impossible, to accurately detect whether code will behave identically in Python 2 and 3 without actually running it and comparing the results.
You can use the PyCharm IDE for this. Just open the Python files in the PyCharm editor; it will show warnings if the code is not compatible with Python 2 or Python 3.
Here is a screenshot where it shows a syntax warning for the print statement.