Static analysis for Python

Fri 13 May 2016 by Rémi Duraffort

I've always been a big fan of static analysis. While I was working with C and C++, I was scanning my code with:

For a longer list of static analyzers for each language, look at Wikipedia.

As I'm now mostly programming in Python (I'm working on LAVA), I had a closer look at the available static analyzers for this language.

I haven't found any static analyzer for Python as powerful as the ones available for C or C++. However, you should run the ones listed in this article on your Python code. That's the bare minimum.

PEP

The first tool to run on any Python source code would be pep8. You can also give pep257 a try.

pep8 is a simple tool that only checks coding style. Python recommends a standardized code style: PEP8. Following it keeps the style consistent across the Python community.

However, you can obviously configure pep8 to match your coding style.

def main():
  print("Hello" + " World")

if __name__ == "__main__":
    main()

If you run pep8 on this file you will get:

$ pep8 code.py
code.py:2:3: E111 indentation is not a multiple of four

Pylint

Pylint is more of a static analyzer than pep8, as it actually checks for some common Python errors. For instance, on this example:

def append(data=[]):
    data.append(1)
    return data

print(append())
[1]
print(append())
[1, 1]
print(append())
[1, 1, 1]

This is usually not what you are expecting. Pylint will warn you:

$ pylint code.py
No config file found, using default configuration
************* Module code
C:  1, 0: Missing module docstring (missing-docstring)
W:  1, 0: Dangerous default value [] as argument (dangerous-default-value)
C:  1, 0: Missing function docstring (missing-docstring)
[...]
Global evaluation
-----------------
Your code has been rated at 0.00/10

As you can see, Pylint also rates the source code (and shows how the rating evolves from one run to the next).
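The dangerous-default-value warning above is usually addressed by using None as the default and creating the list inside the function. The snippet below is only a minimal sketch of that common idiom, not something Pylint generates for you:

```python
def append(data=None):
    """Append 1 to data, creating a fresh list when none is given."""
    # Default values are evaluated once, when the function is defined,
    # so a mutable default such as [] would be shared between calls.
    if data is None:
        data = []
    data.append(1)
    return data


print(append())  # [1]
print(append())  # [1]
```

With this version, each call without an argument starts from a new empty list, while callers can still pass their own list explicitly.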

Vulture

Vulture specializes in finding dead code. Due to the dynamic nature of Python, such a task is not as easy as it is with C, so don't expect any tool to find all dead code.

def append(data=[]):
    data.append(1)
    return data

def unused():
    for i in range(0, 10):
        append()

def main():
    append()

if __name__ == '__main__':
    main()

Running vulture will give:

$ vulture code.py
code.py:5: Unused function 'unused'
code.py:6: Unused variable 'i'
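To illustrate why the dynamic nature of Python makes this hard, here is a small sketch in which functions are only reached through a name built at runtime. The names handle_start, handle_stop and dispatch are hypothetical and only serve the example. A tool that looks for names used in the source is likely to report both handlers as unused, even though they are called:

```python
def handle_start():
    return "starting"


def handle_stop():
    return "stopping"


def dispatch(command):
    # The handler name is computed at runtime, so the identifiers
    # handle_start and handle_stop are never referenced literally.
    handler = globals()["handle_" + command]
    return handler()


print(dispatch("start"))  # starting
```

The reverse also happens: code that looks referenced may never actually run, which is why the reports of such tools are best treated as hints to review.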

Pychecker

Pychecker is another useful tool that provides some interesting information. However, it only works with Python 2.

$ pychecker code.py
code.py:2: Modifying parameter (data) with a default value may have unexpected consequences
code.py:6: Local variable (i) not used

Prospector

I discovered prospector a few weeks ago and it's now the only one I use. In fact, it runs the other static analyzers (dodgy, pep257, pep8, pyflakes, pylint, vulture, pyroma, frosted) and provides a nice report.

$ prospector code.py
Messages
========

code.py
  Line: 1
    pylint: dangerous-default-value / Dangerous default value [] as argument
  Line: 6
    pylint: unused-variable / Unused variable 'i' (col 8)



Check Information
=================
         Started: 2016-05-13 17:29:31.466860
        Finished: 2016-05-13 17:29:31.553182
      Time Taken: 0.09 seconds
       Formatter: grouped
        Profiles: default, no_doc_warnings, no_test_warnings, strictness_medium, strictness_high, strictness_veryhigh, no_member_warnings
      Strictness: None
  Libraries Used: 
       Tools Run: dodgy, mccabe, pep8, profile-validator, pyflakes, pylint
  Messages Found: 2

My advice would be to run prospector regularly to catch common mistakes. It sometimes finds real bugs.