Python embedded : How to pass special characters to PyRun_SimpleFile

Consider the following code, which runs an embedded Python script from C++. It creates an embedded Python module with a function that reports the current file and line when it is executed.

#include <Python.h>

#include <iostream>
#include <fstream>

PyObject * mymodule_meth_test(PyObject * self) {
    // Record a traceback entry for the currently executing Python frame,
    // then fetch it so the file name and line number can be read from it.
    PyTraceBack_Here(PyEval_GetFrame());
    PyObject * exc;
    PyObject * val;
    PyObject * tb;
    PyErr_Fetch(&exc, &val, &tb);
    PyTraceBack_Print(tb, PySys_GetObject("stderr"));
    std::cout << "LINE is " << PyLong_AsLong(PyObject_GetAttrString(PyObject_GetAttrString(tb, "tb_frame"), "f_lineno")) << std::endl;
    std::cout << "FILE is " << PyUnicode_AsUTF8(PyObject_GetAttrString(PyObject_GetAttrString(PyObject_GetAttrString(tb, "tb_frame"), "f_code"), "co_filename")) << std::endl;
    Py_RETURN_NONE;
}

PyMethodDef module_methods[] = {
    {"test", (PyCFunction)mymodule_meth_test, METH_NOARGS, NULL},
    {NULL, NULL, 0, NULL},  // sentinel marking the end of the table
};

PyModuleDef module_def = {PyModuleDef_HEAD_INIT, "mymodule", NULL, -1, module_methods};

extern "C" PyObject * PyInit_mymodule() {
    PyObject * module = PyModule_Create(&module_def);
    return module;
}

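void runScript( const std::string& script, bool utf8 )
{
    Py_SetPythonHome( L"C:\\dev\\vobs_sde\\sde\\3rdparty\\tools_ext\\python\\Python38" );

    PyImport_AppendInittab("mymodule", &PyInit_mymodule);

    // Initialize the Python Interpreter
    Py_Initialize();

    FILE* file = fopen( script.c_str(), "r" );
    if ( file )
    {
        wchar_t* sScriptUTF8 = Py_DecodeLocale(script.c_str(), NULL);
        // When utf8 is true, the wchar_t* buffer is deliberately passed as a
        // const char*; this is the behaviour the question is about.
        if ( PyRun_SimpleFile(file, (utf8) ? (const char*) sScriptUTF8 : script.c_str()) == 0 )
            std::cout << "SUCCESS" << std::endl;
        else
            std::cout << "FAIL" << std::endl;
        PyMem_RawFree( sScriptUTF8 );
        fclose( file );
    }

    Py_Finalize();
}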

int main( int argc, char* argv[] )
{
    // Write the two-line test script to a file whose name contains 'é'
    std::fstream file2;
    file2.open( "mainé", std::ios_base::out );
    file2 << "import mymodule" << std::endl;
    file2 << "mymodule.test()" << std::endl;
    file2.close();

    std::cout << std::endl << "Will fail to execute script" << std::endl;
    runScript( "mainé", false );
    std::cout << std::endl << "Will work! But FILE will be reported as 'm' instead of 'mainé'" << std::endl;
    runScript( "mainé", true );
    return 0;
}

This program outputs:

Will fail to execute script
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 4: invalid continuation byte

Will work! But FILE will be reported as 'm' instead of 'mainé'
Traceback (most recent call last):
  File "m", line 2, in <module>
LINE is 2
FILE is m

So, as you can see:

  • If I pass the regular char* "mainé" to PyRun_SimpleFile, it fails to run the script.
  • If I pass the wchar_t* "mainé" string (cast to const char*) to PyRun_SimpleFile, it is able to run the script, but the file name reported by mymodule_meth_test is m, whereas mainé is expected.

This is likely because, as a wchar_t string, "mainé" is stored as 'm', 0, 'a', 0, ... and is later interpreted as a regular char*, which yields "m" because the second byte is treated as the end-of-string terminator.

How should I invoke PyRun_SimpleFile to have this work correctly?

Note, I ended up being able to call PyRun_SimpleFile as below:

// Requires <locale> and <codecvt>
std::string utf8Str = std::wstring_convert<std::codecvt_utf8<wchar_t>>().to_bytes(Py_DecodeLocale(script.c_str(), NULL));
if ( PyRun_SimpleFile(file, utf8Str.c_str()) == 0 )
    std::cout << "SUCCESS" << std::endl;
else
    std::cout << "FAIL" << std::endl;

However, the later call to PyUnicode_AsUTF8 in mymodule_meth_test then returns NULL, and I could not find out how to retrieve the file name correctly.

