What does the error: `Loaded runtime CuDNN library: 5005 but source was compiled with 5103` mean?

执笔经年 2021-02-20 08:58

I was trying to use TensorFlow with GPU and got the following error:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0)          


        
2 answers
  •  孤街浪徒
    2021-02-20 09:54

    This is an approximate description of what is going on.

    cuDNN has major releases that are numbered e.g. 4.0, 5.0, 5.1, etc.

    These major releases may incorporate API changes. Therefore a program that uses cuDNN v4 (i.e. 4.0) may need some modifications to work with or use new features in cuDNN v5 (i.e. 5.0).

    The major release is encoded in the first two digits of the 4-digit version number. So a cuDNN version number of 5103 belongs to the 5.1 major release and has a sub-version number of 03. For compatibility purposes, such a release should be API-compatible with any other cuDNN library version of the form 51xx, because they all belong to the 5.1 major release (AFAIK this is not strictly guaranteed, but it is the general idea). Any library numbered 51xx therefore has a compatibility version of 5100, indicating that it belongs to, and should be compatible with, the 5.1 major release.

    So when we are referring to a compatibility version (what major release is this library compatible with) we only need to specify the first two digits - 5000 indicates 5.0, 5100 indicates 5.1. But it is possible for a release to have a sub-release version number that is non-zero. There could be a variety of reasons for this, for example to allow for bug-fix releases and the like.
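    The numbering scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming the 4-digit scheme used by the cuDNN 5.x releases discussed here; the function names are made up for this example and are not part of cuDNN or tensorflow.

```python
def split_cudnn_version(version):
    """Split a 4-digit cuDNN version such as 5103 into its three parts.

    Returns (first digit, second digit, two-digit sub-version).
    """
    first, rest = divmod(version, 1000)
    second, sub_version = divmod(rest, 100)
    return first, second, sub_version


def compatibility_version(version):
    """Zero out the sub-version digits, e.g. 5103 -> 5100."""
    return version - version % 100


print(split_cudnn_version(5103))    # (5, 1, 3)  -> release 5.1, sub-version 03
print(split_cudnn_version(5005))    # (5, 0, 5)  -> release 5.0, sub-version 05
print(compatibility_version(5103))  # 5100
print(compatibility_version(5005))  # 5000
```

    The two version numbers in the error message therefore reduce to compatibility versions 5000 and 5100, which differ.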

    When a program (like tensorflow) is designed to use cuDNN, it will generally be coded to work with a particular version of cuDNN. In some cases, this can be handled at compile time, by "compiling against" a particular cuDNN version (and its associated API, i.e. the header files used when building tensorflow). Therefore, at compile time, a program like tensorflow can determine what version of the cuDNN API it was compiled against, and that is a 4-digit version (although generally speaking, only the compatibility version, i.e. the first two digits of the 4-digit version, should really matter).

    At runtime, you have a particular version of the cuDNN library (e.g. a .so file on Linux) installed on your machine somewhere. The version of that library can be queried and reported. If that actual library version does not match (at least from a compatibility version perspective) the version of the cuDNN library that tensorflow was compiled against, then that's a good indication that things may not work, and so tensorflow points this out when it is running:

    Loaded runtime CuDNN library: 5005 but source was compiled with 5103.

    This is tensorflow telling you "hey, I was designed (compiled) to work with cuDNN v5.1 but you are only giving me cuDNN 5.0 to work with".
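    As a rough illustration of that comparison, here is a minimal Python sketch. It is not tensorflow's actual check (that lives in its C++ sources and may apply different rules); `describe_mismatch` is a hypothetical helper that simply compares the first two digits of each version and reuses the wording of the message above.

```python
def describe_mismatch(loaded, compiled):
    """Return a warning string if the compatibility versions differ, else None.

    `loaded` is the 4-digit version of the cuDNN library found at runtime;
    `compiled` is the 4-digit version the program was built against.
    Hypothetical helper for illustration only.
    """
    if loaded // 100 == compiled // 100:
        return None  # same first two digits: same release family
    return (
        f"Loaded runtime CuDNN library: {loaded} "
        f"but source was compiled with {compiled}"
    )


print(describe_mismatch(5005, 5103))  # 50xx vs 51xx: prints the warning text
print(describe_mismatch(5107, 5103))  # both 51xx: prints None
```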

    Differences at the sub-version level should be less significant. If you know what you are doing, it may be ok to use cuDNN runtime version 5107 even if your tensorflow was compiled against version 5103. This is a purely hypothetical example, but such a numbering would indicate a difference in the library that was not intended to change its functionality, behavior, or API. It could be just a bug-fixed version of 5103, for example.

    In the ideal case, you would build tensorflow against the version of cuDNN that you are using. If you have downloaded pre-built tensorflow packages, however, then you may witness this sort of message (since you presumably downloaded cuDNN separately). In that case, you should at least seek to match the cuDNN major version you are using against the compatibility version that tensorflow is expecting. In this particular example, you are not doing that.
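    To find out which cuDNN version you actually have installed, one option is to read the version macros from the cuDNN header. The sketch below assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL`, as the 5.x headers do; the file's location depends on your installation, and newer cuDNN releases may keep these macros in a different header, so treat the path in the usage note as an assumption. `parse_cudnn_header` is a hypothetical helper, and the sample text is made up to keep the example self-contained.

```python
import re


def parse_cudnn_header(header_text):
    """Extract (CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL) from header text."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            raise ValueError(f"{name} not found in header text")
        values.append(int(match.group(1)))
    return tuple(values)


# Made-up sample text standing in for the relevant lines of a cuDNN header.
sample = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 5
"""

major, minor, patch = parse_cudnn_header(sample)
print(major * 1000 + minor * 100 + patch)  # 5005
```

    On a real machine you would pass in the contents of your own header instead of `sample`, for example `open("/usr/local/cuda/include/cudnn.h").read()` if that is where your installation put it.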
