How to create a code object in Python?

醉话见心 2021-01-31 05:15

I'd like to create a new code object with the function types.CodeType().
There is almost no documentation about it, and what exists says "not for the faint of heart".

2 Answers
  • 2021-01-31 05:51

    –––––––––––
    Disclaimer :
    Documentation in this answer is not official and may be incorrect.

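    This answer applies to Python 3.0 through 3.7. Python 3.8 added a posonlyargcount argument after argcount, and later releases changed the constructor again.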

    –––––––––––

    In order to create a code object you have to pass to the function CodeType() the following arguments:

    CodeType(
            argcount,             #   integer
            kwonlyargcount,       #   integer
            nlocals,              #   integer
            stacksize,            #   integer
            flags,                #   integer
            codestring,           #   bytes
            consts,               #   tuple
            names,                #   tuple
            varnames,             #   tuple
            filename,             #   string
            name,                 #   string
            firstlineno,          #   integer
            lnotab,               #   bytes
            freevars,             #   tuple
            cellvars              #   tuple
            )
    

    Now I will try to explain the meaning of each argument.

    argcount
    Number of arguments to be passed to the function (*args and **kwargs are not included).

    kwonlyargcount
    Number of keyword-only arguments.

    nlocals
    Number of local variables,
    namely all variables and parameters (*args and **kwargs included) except global names.
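    A minimal sketch of how these three counters relate on a real function. The function `sample` and its parameter names are made up for illustration:

```python
def sample(a, b, *args, c, **kwargs):
    x = a + b  # one ordinary local variable
    return x

code = sample.__code__

# Only the plain positional parameters are counted here: a, b
print(code.co_argcount)        # 2
# Keyword-only parameters are counted separately: c
print(code.co_kwonlyargcount)  # 1
# Every parameter (args and kwargs included) plus the local x
print(code.co_nlocals)         # 6
print(code.co_varnames)
```

    Note that args and kwargs contribute to co_nlocals even though they are not part of co_argcount.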

    stacksize
    The amount of stack (virtual machine stack) required by the code.
    If you want to understand how it works, see the official documentation.

    flags
    A bitmap that says something about the code object:
    1 –> code was optimized
    2 –> newlocals: there is a new local namespace(for example a function)
    4 –> the code accepts an arbitrary number of positional arguments (*args is used)
    8 –> the code accepts an arbitrary number of keyword arguments (**kwargs is used)
    32 –> the code is a generator

    Other flags are used by older Python versions or record which features were imported from __future__.
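    As a rough illustration, the bits listed above can be tested against co_flags with the constants from the inspect module. The helper `decode_flags` and the generator `gen` are hypothetical names used only for this sketch:

```python
import inspect

def decode_flags(code):
    """Return labels for the flag bits listed above that are set in code.co_flags."""
    known = {
        inspect.CO_OPTIMIZED: "optimized",     # 1
        inspect.CO_NEWLOCALS: "newlocals",     # 2
        inspect.CO_VARARGS: "*args",           # 4
        inspect.CO_VARKEYWORDS: "**kwargs",    # 8
        inspect.CO_GENERATOR: "generator",     # 32
    }
    return [label for bit, label in known.items() if code.co_flags & bit]

def gen(*args, **kwargs):
    yield args, kwargs

flags = decode_flags(gen.__code__)
print(flags)
```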

    codestring
    A sequence of bytes representing bytecode instructions.
    For a better understanding, see the documentation (same as above).

    consts
    A tuple containing literals used by the bytecode (for example pre-computed numbers, tuples, and strings).

    names
    A tuple containing names used by the bytecode.
    These names are global variables, functions, and classes, and also attributes loaded from objects.

    varnames
    A tuple containing local names used by the bytecode (arguments first, then local variables)

    filename
    It is the filename from which the code was compiled.
    It can be whatever you want; you are free to lie about this. ;)

    name
    It gives the name of the function. This can also be whatever you want, but be careful:
    this is the name shown in the traceback, so if the name is unclear, the traceback could be unclear too.
    Just think about how annoying lambdas can be.

    firstlineno
    The line number of the first line of the function (for debugging purposes, if you compiled source code).

    lnotab
    A sequence of bytes that maps bytecode offsets to line numbers.
    (I think this is also for debugging purposes; there is little documentation about it.)
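    The raw bytes are rarely decoded by hand. A small sketch, assuming only the standard dis module, that lets dis.findlinestarts() decode the table into (offset, line) pairs. The source string and the "<demo>" filename are invented for the example, and newer Python versions store this table under a different attribute than co_lnotab:

```python
import dis

source = "a = 1\nb = 2\nc = a + b\n"
code = compile(source, "<demo>", "exec")

# findlinestarts() yields (bytecode offset, line number) pairs decoded
# from the code object's line table.
line_starts = list(dis.findlinestarts(code))
for offset, line in line_starts:
    print(offset, line)
```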

    freevars
    A tuple containing the names of free variables.
    Free variables are variables declared in the namespace where the code object was defined; they appear when nested functions are declared.
    This doesn't happen at module level, because in that case the variables are simply global variables.

    cellvars
    A tuple containing names of local variables referenced by nested functions.
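    A short sketch of how freevars and cellvars pair up at run time. The names `outer`, `inner` and `counter` are illustrative only:

```python
def outer():
    counter = 10          # referenced by inner, so it becomes a cell variable

    def inner():
        return counter    # a free variable from inner's point of view

    return inner

inner = outer()
print(outer.__code__.co_cellvars)   # ('counter',)
print(inner.__code__.co_freevars)   # ('counter',)
# Each free variable name is paired, in order, with one cell in __closure__
print(inner.__closure__[0].cell_contents)  # 10
```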

    ––––––––––––
    Examples :
    The following examples should clarify the meaning of what has been said above.

    Note: in finished code objects attributes mentioned above have the co_ prefix,
    and a function stores its executable body in the __code__ attribute

    ––––––––––––
    1st Example

    def F(a,b):
        global c
        k=a*c
        w=10
        p=(1,"two",3)
    
    print(F.__code__.co_argcount)
    print(F.__code__.co_nlocals , F.__code__.co_varnames)
    print(F.__code__.co_stacksize)
    print(F.__code__.co_flags)
    print(F.__code__.co_names)
    print(F.__code__.co_consts)
    

    Output:

    2
    5 ('a', 'b', 'k', 'w', 'p')
    3
    67
    ('c',)
    (None, 10, 1, 'two', 3, (1, 'two', 3))
    
    1. there are two arguments passed to this function ("a","b")

    2. this function has two parameters("a","b") and three local variables("k","w","p")

    3. disassembling the function bytecode we obtain this:

      3         0 LOAD_FAST                0 (a)             #stack:  ["a"] 
                3 LOAD_GLOBAL              0 (c)             #stack:  ["a","c"]
                6 BINARY_MULTIPLY                            #stack:  [result of a*c]
                7 STORE_FAST               2 (k)             #stack:  []
      
      4        10 LOAD_CONST               1 (10)            #stack:  [10]
               13 STORE_FAST               3 (w)             #stack:  []
      
      5        16 LOAD_CONST               5 ((1, 'two', 3)) #stack:  [(1,"two",3)]
               19 STORE_FAST               4 (p)             #stack:  []
               22 LOAD_CONST               0 (None)          #stack:  [None]
               25 RETURN_VALUE                               #stack:  []
      

      As you can see, while executing the function we never have more than three elements on the stack (the tuple counts as its length in this case).

    4. the flags value is dec 67 = bin 1000011 = bin 1000000 + 10 + 1 = dec 64 + 2 + 1, so we understand that

      • the code is optimized (as most automatically generated code is)
      • a new local namespace is created when the function bytecode runs
      • 64 (CO_NOFREE) means the code has no free or cell variables
    5. the only global name that is used in the function is "c" , it is stored in co_names

    6. every explicit literal we use is stored in co_consts:

      • None is the return value of the function
      • we explicitly assign the number 10 to w
      • we explicitly assign (1, 'two', 3) to p
      • if the tuple is a constant each element of that tuple is a constant,so 1,"two",3 are constants
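    The disassembly in point 3 can be regenerated with the standard dis module. This is only a sketch: opcode names, offsets, and the exact instruction sequence differ between Python versions, so the output will not match the listing above byte for byte.

```python
import dis

def F(a, b):
    global c
    k = a * c
    w = 10
    p = (1, "two", 3)

# Collect (opcode name, argument value) pairs for each instruction.
instructions = [(ins.opname, ins.argval) for ins in dis.get_instructions(F)]
for opname, argval in instructions:
    print(opname, argval)
```

    Calling dis.dis(F) instead prints the same information in the familiar column layout.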

    ––––––––––––
    2nd example

    ModuleVar="hi"
    
    def F():
        FunctionVar=106
        UnusedVar=ModuleVar
    
        def G():
            return (FunctionVar,ModuleVar)
    
        print(G.__code__.co_freevars)
        print(G.__code__.co_names)
    
    F()
    print(F.__code__.co_cellvars)
    print(F.__code__.co_freevars)
    print(F.__code__.co_names)
    

    Output:

    ('FunctionVar',)
    ('ModuleVar',)
    ('FunctionVar',)
    ()
    ('ModuleVar', 'print', '__code__', 'co_freevars', 'co_names')
    

    the meaning of the output is this:

    first and second line are printed when F is executed,so they show co_freevars and co_names of G code:
    "FunctionVar" is in the namespace of F function,where G was created,
    "ModuleVar" instead is a module variable,so it is considered as global.

    following three lines are about co_cellvars,co_freevars and co_names attributes of F code:
    "FunctionVar" is referenced in the G nested function ,so it is marked as a cellvar,
    "ModuleVar" is in the namespace where F was created,but it is a module variable,
    so it is not marked as freevar,but it is found in global names.
    also the builtin function print is marked in names , and all the names of attributes used in F.

    ––––––––––––
    3rd example

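    This is a working code object initialization.
    It is not useful in itself, but you can build whatever you want with this function.
    Note that the bytecode below uses the encoding of Python 3.5 and earlier (a one-byte opcode followed by a two-byte argument), so it will not run on Python 3.6 or later.

    from types import CodeType

    MyCode = CodeType(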
            0,
            0,
            0,
            3,
            64,
            bytes([101, 0, 0,    #Load print function
                   101, 1, 0,    #Load name 'a'
                   101, 2, 0,    #Load name 'b'
                   23,           #Take first two stack elements and store their sum
                   131, 1, 0,    #Call first element in the stack with one positional argument
                   1,            #Pop top of stack
                   101, 0, 0,    #Load print function
                   101, 1, 0,    #Load name 'a'
                   101, 2, 0,    #Load name 'b'
                   20,           #Take first two stack elements and store their product
                   131, 1, 0,    #Call first element in the stack with one positional argument
                   1,            #Pop top of stack
                   100, 0, 0,    #Load constant None
                   83]),         #Return top of stack
            (None,),
            ('print', 'a', 'b'),
            (),
            'PersonalCodeObject',
            'MyCode',
            1,
            bytes([14,1]),
            (),
            () )
    
    a=2
    b=3
    exec(MyCode) # code prints the sum and the product of "a" and "b"
    

    Output:

    5
    6
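    Hand-written bytes like the ones above are tied to one bytecode format. A version-independent sketch, assuming you only need an equivalent code object rather than control over every byte, is to let the built-in compile() produce it. The names `source`, `my_code` and `namespace` are illustrative:

```python
source = "print(a + b)\nprint(a * b)\n"

# compile() emits bytecode in whatever format the running interpreter uses.
my_code = compile(source, "PersonalCodeObject", "exec")

namespace = {"a": 2, "b": 3}
exec(my_code, namespace)  # prints the sum and the product of a and b

print(my_code.co_filename)
print(my_code.co_names)
```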
    
  • 2021-01-31 05:56

    Example usage of the CodeType constructor may be found in the standard library, specifically Lib/modulefinder.py. If you look there, you'll see it being used to redefine the read-only co_filename attribute on all the code objects in a file.

    I recently ran into a similar use case where I had a function factory, but the generated functions always had the "generic" name in the traceback, so I had to regenerate the code objects to contain the desired name.

    >>> def x(): raise NotImplementedError
    ...
    >>> x.__name__
    'x'
    >>> x.__name__ = 'y'
    >>> x.__name__
    'y'
    >>> x()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in x
    NotImplementedError
    
    >>> x.__code__.co_name
    'x'
    >>> x.__code__.co_name = 'y'
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: readonly attribute
    
    >>> 'Gah!'
    'Gah!'
    

    But, wait, the function's __code__ member is not read-only, so we can do what the modulefinder does:

    >>> from types import CodeType
    >>> co = x.__code__
    >>> x.__code__ = CodeType(co.co_argcount, co.co_kwonlyargcount,
    ...              co.co_nlocals, co.co_stacksize, co.co_flags,
    ...              co.co_code, co.co_consts, co.co_names,
    ...              co.co_varnames, co.co_filename,
    ...              'MyNewCodeName',
    ...              co.co_firstlineno, co.co_lnotab, co.co_freevars,
    ...              co.co_cellvars)
    >>> x.__code__.co_name
    'MyNewCodeName'
    >>> x()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in MyNewCodeName
    NotImplementedError
    

    The thing to note in this example is that the traceback uses the co_name attribute, not the func.__name__ attribute when producing values in the stack trace.

    One more note: the above is for Python 3 (before 3.8). To make it Python 2 compatible, just leave out the second argument to the constructor (co_kwonlyargcount).

    UPDATE: Victor Stinner added a new method, 'replace', to the CodeType class in Python 3.8, which simplifies the situation quite considerably. This was done to eliminate future compatibility issues, as 3.8 also added a new 'co_posonlyargcount' argument into the call list after 'co_argcount', so at least your 3.8 and later code will be somewhat future proofed if the argument list changes again.

    >>> x.__code__ = x.__code__.replace(co_name='MyNewCodeName')
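    Applied to the function-factory case described earlier, a minimal sketch could look like this. It assumes Python 3.8 or later for replace(); `make_stub` is a hypothetical factory name, not something from the standard library:

```python
import traceback

def make_stub(name):
    """Hypothetical factory: return a function that shows `name` in tracebacks."""
    def stub():
        raise NotImplementedError(name)
    stub.__name__ = name
    # CodeType.replace() needs Python 3.8 or later.
    stub.__code__ = stub.__code__.replace(co_name=name)
    return stub

y = make_stub("y")
try:
    y()
except NotImplementedError as exc:
    # FrameSummary.name is taken from the frame's code object (co_name).
    frame_names = [fs.name for fs in traceback.extract_tb(exc.__traceback__)]
    print(frame_names)
```

    The innermost frame is reported under the new name rather than the generic "stub".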
    