Difference between .pb and .h5

轻奢々 2021-01-13 18:18

What is the main difference between .pb format of tensorflow and .h5 format of keras to store models? Is there any reason to choose one ove

1 answer
  • 2021-01-13 18:57

    These are two different file formats with different characteristics. TensorFlow can save models in both (.h5 comes specifically from Keras).

    .pb - protobuf

    Protocol Buffers (protobuf) is a way to serialize structured data (in this case a neural network). The project is open source and currently maintained by Google.

    Example

    person {
      name: "John Doe"
      email: "jdoe@example.com"
    }
    

    This is a simple message containing two fields. You can load it in any of the supported languages (e.g. C++, Go), parse it, modify it and send it to someone else in binary format.
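
    To make the "binary format" part concrete, here is a minimal sketch that hand-encodes the two fields above in the protobuf wire format using only the Python standard library. The field numbers (1 for name, 2 for email) are assumptions, since the example shows no schema, and real code would use classes generated by protoc rather than these hypothetical helpers.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field: tag, payload length, UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def decode_string_fields(data: bytes) -> dict[int, str]:
    """Decode a message made only of string fields into {field_number: text}."""
    fields = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 2:
            raise ValueError(f"unsupported wire type {wire_type}")
        length, pos = decode_varint(data, pos)
        fields[field_number] = data[pos:pos + length].decode("utf-8")
        pos += length
    return fields


# Assumed (hypothetical) schema: field 1 = name, field 2 = email.
message = encode_string_field(1, "John Doe") + encode_string_field(2, "jdoe@example.com")
decoded = decode_string_fields(message)
```

    Each field costs only a tag and a length on top of its payload, which is where the size advantage over text formats such as XML comes from.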

    Advantages

    • really small and efficient to parse (compared to, say, XML), hence often used for data transfer across the web
    • used by TensorFlow Serving when you want to take your model to production (e.g. inference over the web)
    • language agnostic: the binary format can be read from multiple languages (Java, Python, Objective-C and C++, among others)
    • the recommended format since TF 2.0; see the official guide on saving and serializing models
    • saves various metadata (optimizer, losses etc. when saving a Keras model)

    Disadvantages

    • SavedModel is conceptually harder to grasp than a single file
    • creates a folder rather than one file (saved_model.pb holds the graph, while the weights live in a variables subfolder)

    Sources

    You can read about this format here

    .h5 - HDF5 binary data format

    Originally used by Keras to save models (Keras is now officially part of TensorFlow). It is less general and more "data-oriented", less programmatic than .pb.

    Advantages

    • Designed for storing very large amounts of data (so neural networks fit well)
    • Common, widely supported file format
    • Everything is saved in one file (weights, plus the losses and optimizer used with Keras, etc.)
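
    The folder-versus-single-file difference is easy to check on disk. Below is a small sketch, using only the standard library, that guesses which of the two formats a path holds. The helper name detect_model_format is hypothetical; it assumes a SavedModel export contains a saved_model.pb file at its top level and that the HDF5 signature sits at byte 0 of the file (HDF5 allows a user block before it, which this sketch ignores).

```python
from pathlib import Path

# The 8-byte signature that starts an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_model_format(path: str) -> str:
    """Return 'savedmodel', 'hdf5' or 'unknown' for the given path."""
    p = Path(path)
    if p.is_dir():
        # A SavedModel export is a directory with saved_model.pb inside.
        return "savedmodel" if (p / "saved_model.pb").is_file() else "unknown"
    if p.is_file():
        with p.open("rb") as fh:
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "hdf5"
    return "unknown"
```

    Checking the file contents rather than the extension means a renamed .h5 file is still recognised.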

    Disadvantages

    • Cannot be used with TensorFlow Serving directly, but you can simply convert it to .pb via keras.experimental.export_saved_model(model, 'path_to_saved_model') (deprecated in TF 2.x in favour of model.save('path_to_saved_model', save_format='tf'))
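
    A rough sketch of that conversion, assuming TensorFlow 2.x with its bundled Keras, where save_format="tf" writes a SavedModel folder. Newer Keras releases may need a different call, and models with custom layers would also need custom_objects passed to load_model. The helper name and both paths are hypothetical.

```python
def convert_h5_to_saved_model(h5_path: str, export_dir: str) -> None:
    """Load a Keras .h5 model and re-save it as a SavedModel folder."""
    # Imported lazily so that defining this helper does not require TensorFlow.
    import tensorflow as tf

    model = tf.keras.models.load_model(h5_path)
    # Assumption: TF 2.x Keras, where save_format="tf" selects SavedModel.
    model.save(export_dir, save_format="tf")
```

    Called as convert_h5_to_saved_model("model.h5", "path_to_saved_model"), this should leave a folder that TensorFlow Serving can load.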

    All in all

    Use the simpler one (.h5) if you don't need to productionize your model (or production is reasonably far away). Use .pb if you are going to production or just want to standardize on a single format across all the tools TensorFlow provides.
