Build a call graph in Python including modules and functions?


Solution 1

The best tool I've found is pyan. It was originally written by Edmund Horner, who later improved it, and Juha Jeronen then added colorization and other features. Juha Jeronen's version has useful command-line options:

Usage: pyan.py FILENAME... [--dot|--tgf]

Analyse one or more Python source files and generate an approximate call graph
of the modules, classes and functions within them.

Options:
  -h, --help           show this help message and exit
  --dot                output in GraphViz dot format
  --tgf                output in Trivial Graph Format
  -v, --verbose        verbose output
  -d, --defines        add edges for 'defines' relationships [default]
  -n, --no-defines     do not add edges for 'defines' relationships
  -u, --uses           add edges for 'uses' relationships [default]
  -N, --no-uses        do not add edges for 'uses' relationships
  -c, --colored        color nodes according to namespace [dot only]
  -g, --grouped        group nodes (create subgraphs) according to namespace
                       [dot only]
  -e, --nested-groups  create nested groups (subgraphs) for nested namespaces
                       (implies -g) [dot only]

Here's the result of running pyan.py --dot -c -e pyan.py | fdp -Tpng:

[Image: pyan's output on itself]

Edmund Horner's original code is now best found in his GitHub repository, and somebody has also made a repository with both versions, from which you can download Juha Jeronen's version. Since both repositories contain lots of other software, I've combined their contributions into a clean version in my own repository, which holds only pyan.
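To give a feel for what "approximate call graph" means here, the following is a minimal sketch of static call-graph extraction using the standard library's ast.NodeVisitor. It is not pyan's implementation; the names CallGraphVisitor, build_call_graph and SAMPLE are made up for illustration, and the sketch only records the names that appear literally at call sites within a single source string.

```python
import ast
from collections import defaultdict


class CallGraphVisitor(ast.NodeVisitor):
    """Collect approximate caller -> callee edges without running the code."""

    def __init__(self):
        self.edges = defaultdict(set)
        self.scope = []  # stack of enclosing class/function names

    def _visit_scope(self, node):
        # Track which definition we are inside while visiting its body.
        self.scope.append(node.name)
        self.generic_visit(node)
        self.scope.pop()

    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope
    visit_ClassDef = _visit_scope

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Name):         # e.g. helper(x)
            callee = func.id
        elif isinstance(func, ast.Attribute):  # e.g. obj.method(x)
            callee = func.attr
        else:                                  # e.g. getattr(o, n)(x)
            callee = None
        if callee is not None:
            caller = ".".join(self.scope) or "<module>"
            self.edges[caller].add(callee)
        self.generic_visit(node)  # also pick up calls nested in the arguments


def build_call_graph(source):
    """Return {caller: sorted list of callee names} for one source string."""
    visitor = CallGraphVisitor()
    visitor.visit(ast.parse(source))
    return {caller: sorted(callees) for caller, callees in visitor.edges.items()}


# Hypothetical sample input, parsed but never executed.
SAMPLE = '''
def helper(x):
    return abs(x)

def main():
    print(helper(-3))

main()
'''

graph = build_call_graph(SAMPLE)
for caller, callees in graph.items():
    print(caller, "->", ", ".join(callees))
```

A real tool has to do much more than this sketch (resolve imports, attribute lookups and name bindings across files), which is why the output is described as approximate.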

Solution 2

You might want to check out pycallgraph:

pycallgraph

A more manual approach is also described at this link:

generating-call-graphs-for-understanding-and-refactoring-python-code
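Note that pycallgraph builds its graph by running the code. As a rough illustration of that dynamic approach, here is a small sketch that records caller/callee pairs with the standard library's sys.setprofile. It is not pycallgraph's API or implementation; trace_calls, square and sum_of_squares are hypothetical names used only for this example.

```python
import sys
from collections import defaultdict


def trace_calls(func, *args, **kwargs):
    """Run func and record which Python function called which."""
    edges = defaultdict(set)

    def profiler(frame, event, arg):
        # "call" fires for Python-level calls; C builtins fire "c_call" instead.
        if event == "call" and frame.f_back is not None:
            caller = frame.f_back.f_code.co_name
            edges[caller].add(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return result, {caller: sorted(callees) for caller, callees in edges.items()}


# Hypothetical functions to trace.
def square(x):
    return x * x


def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total


result, graph = trace_calls(sum_of_squares, 3)
print(result)
print(graph)
```

Only the code paths that actually execute show up in such a graph, so this approach cannot help when the code cannot be run.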

Solution 3

In short, no such tool exists. Python is far too dynamic a language for a call graph to be generated without executing the code.

Here's some code that clearly demonstrates some of Python's very dynamic features:

class my_obj(object):
    def __init__(self, item):
        self.item = item
    def item_to_power(self, power):
        return self.item ** power

def strange_power_call(obj):
    to_call = "item_to_power"
    return getattr(obj, to_call)(4)

a = eval("my" + "_obj" + "(12)")
b = strange_power_call(a)

Note that we're using eval to create an instance of my_obj and getattr to call one of its methods. Both techniques make it extremely difficult to build a static call graph for Python. Additionally, modules can be imported in all sorts of ways that are difficult to analyze.

I think your best bet is to sit down with the code base and a pad of paper and start taking notes by hand. This has two benefits: it makes you more familiar with the code base, and it is not easily tricked by difficult-to-parse scenarios.
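To make the difficulty concrete, here is a small sketch that parses the example above with the standard library's ast module and lists the names that appear literally at call sites. The helper directly_named_calls is a hypothetical name, and this is only a name-based scan, not a full static analyzer.

```python
import ast

# The example from this answer, kept as a string so it is parsed, not executed.
SNIPPET = '''
class my_obj(object):
    def __init__(self, item):
        self.item = item
    def item_to_power(self, power):
        return self.item ** power

def strange_power_call(obj):
    to_call = "item_to_power"
    return getattr(obj, to_call)(4)

a = eval("my" + "_obj" + "(12)")
b = strange_power_call(a)
'''


def directly_named_calls(source):
    """Return the set of names that are called by name in the source."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)
    return names


found = directly_named_calls(SNIPPET)
print(sorted(found))  # ['eval', 'getattr', 'strange_power_call']
```

Neither my_obj nor item_to_power appears in the result, even though both are called at run time: the first is hidden inside the string passed to eval, and the second is looked up by name through getattr.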

Solution 4

Sourcetrail will help you here. https://www.sourcetrail.com/

Sourcetrail is a free and open-source cross-platform source explorer that helps you get productive on unfamiliar source code. It supports C, C++, Java and Python.

https://github.com/CoatiSoftware/Sourcetrail


Here is a link to the documentation

https://www.sourcetrail.com/documentation/

Please note that Python support is relatively new, so please don't expect it to work perfectly yet.

Solution 5

The working version of pyan3 I found is 1.1.1 (pip install pyan3==1.1.1), and its documentation is here

Author by

JohnnyDH

Updated on July 19, 2022

Comments

  • JohnnyDH
    JohnnyDH almost 2 years

    I have a bunch of scripts that perform a task, and I really need to know the call graph of the project because it is very confusing. I am not able to execute the code because it needs extra HW and SW to do so. However, I need to understand the logic behind it. So, I need to know if there is a tool (one that does not require executing any Python file) that can build a call graph using the modules instead of the trace or Python parser. I have such tools for C but not for Python.
    Thank you.

  • JohnnyDH
    JohnnyDH over 11 years
    Yes, I have seen these pages during my research, but I am looking for a "professional" solution. I am afraid such a thing does not exist... New start-up idea? Hehe
  • JohnnyDH
    JohnnyDH over 11 years
    I know. At most, one could search for import, def and func() statements within the modules. I think I will write a program to do exactly that. Of course, it will work only on simple source code.
  • Wilduck
    Wilduck over 11 years
    Only extremely simple ones. You'll also need to parse comments, strings, and docstrings, lest you be fooled by those. I've edited my answer to include what I think you should actually do.
  • JohnnyDH
    JohnnyDH over 11 years
    Yes, I am doing it manually... There are 14 referenced scripts... Wish me luck :)
  • amwinter
    amwinter almost 10 years
    @Wilduck Static analyzers can be useful without being complete. Any language can obfuscate its call graph. For example, I can use a dictionary in C++ to look up function pointers and call those. Static call graphs are a quick way to get a high-level overview before diving into a new codebase.
  • chiffa
    chiffa about 9 years
    Pycallgraph doesn't digest packages well unfortunately
  • David Fraser
    David Fraser over 8 years
    pycallgraph is running the code, which is what he asked not to do. pyan does static analysis (see my answer below)
  • codeshot
    codeshot almost 8 years
    I took a look at your own repository. The code doesn't come with a copyright license, so there's no verifiable relaxation of the reserved rights. That means it is forbidden for people to use it as is... Are you able to add a license like the MIT license so this technique can spread and set a baseline for Python code reports?
  • David Fraser
    David Fraser almost 8 years
    Good point. They were originally published under the GPL v2, so I've updated the code to show this, and left a blog comment to verify this
  • Alexander Reshytko
    Alexander Reshytko over 7 years
    @DavidFraser is it compatible with Python 3.x?
  • David Fraser
    David Fraser over 7 years
    @AlexanderReshytko Unfortunately not. I've pushed a branch called py3-compat to my github repository which makes the most minimal changes. But this uses the compiler module, which was removed in Python 3. The code would need to be restructured to use ast.NodeVisitor subclasses; this shouldn't be too hard, but I don't have time to do it right now. (It would still be compatible with Python 2.6+)
  • Alexander Reshytko
    Alexander Reshytko over 7 years
    @DavidFraser Agreed, same for me. I looked at the code yesterday. Besides compiler, a lot of classes from compiler.symbols.* are missing (SymbolVisitor and its dependencies); maybe they can be adapted easily for ast. Don't know yet. Hopefully I'll have some time to have a look at it.
  • Charlie Parker
    Charlie Parker about 7 years
    second link is dead
  • AlexLordThorsen
    AlexLordThorsen over 6 years
    Looking like the output isn't compatible with Graphviz anymore. =(
  • David Fraser
    David Fraser over 6 years
    It should be Graphviz compatible; that syntax hasn't changed. What error are you getting?
  • Bryce Guinta
    Bryce Guinta about 6 years
    pycallgraph is now unmaintained
  • David Fraser
    David Fraser about 6 years
    A note to anyone following this ; various users including Technologicat have now contributed Python 3 support
  • Pro Q
    Pro Q almost 6 years
    This works wonderfully. I'm on Windows, and I found it helped to make a bash command that did python "C:\path\to\pyan.py" %1 --uses --defines --colored --grouped --annotated --dot >pyan_output.dot && clip < pyan_output.dot so that I could just paste into webgraphviz.com and see the output. Thank you for helping create this and keeping it updated!
  • XoXo
    XoXo over 5 years
    Besides the dot command, its manual points out other commands that take in the generated *.dot file, including circo and fdp
  • Florent
    Florent almost 5 years
    @ProQ thanks for the Windows equivalent. Did you save this command into a .bat file? Then how do you call the script from the root directory of a project? I was using this command to get a list of all Python files in a project: dir . /s/b | findstr ".r*.py" but then I don't really know how to pass it to the script to obtain the final graph output
  • Pro Q
    Pro Q almost 5 years
    @Florent I don't currently have access to that computer to see exactly how I did it, but I believe I followed this: stackoverflow.com/a/39459404/5049813
  • 6005
    6005 almost 5 years
    Sadly it's not running for me: self.visit_Assign(self, node) # TODO: alias for now; add the annotations to output in a future version? TypeError: visit_Assign() takes 2 positional arguments but 3 were given
  • 6005
    6005 almost 5 years
    I tried removing "self", that still results in other errors
  • 6005
    6005 almost 5 years
    I guess it's because I installed from github.com/ttylec/pyan instead of using your branch
  • Kaz
    Kaz over 4 years
    pycallgraph is dynamic; the question asks for a simple static call graph showing which function calls which other function, even if we don't reach that line of code.
  • Kaz
    Kaz over 4 years
    The question says that OP has such a tool for C. Gee, how can that be? C has function pointers ...
  • DataNoob
    DataNoob about 4 years
    Does pyan build the call graph recursively? I have a module with several folders and .py files, and when I ran pyan it only processed the .py files at the root level
  • David Fraser
    David Fraser about 4 years
    You can use shell scripting to generate a list of all the files and pass them to pyan - e.g. pyan.py *.py subfolder1/*.py subfolder2/*.py
  • astrojuanlu
    astrojuanlu almost 4 years
    As of right now, it seems that the best maintained fork is github.com/Technologicat/pyan, although the related PyPI package pypi.org/project/pyan3 has not been updated in a while.
  • DaCruzR
    DaCruzR about 3 years
    @amwinter Newbie here. In layman's terms, could you please briefly say what obfuscating involves in the context of a call graph?
  • DaCruzR
    DaCruzR about 3 years
    @Kaz By dynamic do you mean pycallgraph runs the code in order to generate the call graph, and static is where it doesn't need to run the code to generate the call graph?
  • DaCruzR
    DaCruzR about 3 years
    @amwinter By obfuscate, are you referring to the fact that languages make you use a sugar-coated syntax which makes your source code less verbose but also arguably introduces ambiguity? For example, in Python to instantiate a class you would say Banana() but actually it's being translated to Banana.__init__(), so a static analysis wouldn't pick that up unless the programmer explicitly wrote code to translate such cases.
  • Karol Zlot
    Karol Zlot over 2 years
    The original pycallgraph is not maintained; use a fork instead. You can read more in this answer: stackoverflow.com/a/69866174/8896457
  • pnovotnak
    pnovotnak over 2 years
    Sadly this project has been discontinued :(