How to cryptographically hash a JSON object?

醉话见心 2020-12-01 01:30

The following question is more complex than it may first seem.

Assume that I've got an arbitrary JSON object, one that may contain any amount of data including oth

7 answers
  • 2020-12-01 01:44

    JSON-LD can do normalization.

    You will have to define your context.

  • 2020-12-01 01:48

    We encountered a simple issue with hashing JSON-encoded payloads. In our case we use the following methodology:

    1. Convert the data into a JSON object.
    2. Encode the JSON payload in base64.
    3. Compute a message digest (HMAC) over the generated base64 payload.
    4. Transmit the base64 payload.

    Advantages of using this solution:

    1. Base64 will produce the same output for a given payload.
    2. Since the signature is derived directly from the base64-encoded payload, and since that same base64 payload is what the endpoints exchange, we can be certain that the signature still matches the payload on arrival.
    3. This solution solves problems that arise from differences in the encoding of special characters.

    Disadvantages

    1. Encoding and decoding the payload may add overhead.
    2. Base64-encoded data is about 33% larger than the original payload (4 output characters for every 3 input bytes).
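
    The methodology above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: the shared key, the choice of HMAC-SHA256, and the function names `sign_payload` and `verify_payload` are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-shared-secret"  # hypothetical key, for illustration only


def sign_payload(data, key=SECRET_KEY):
    """Steps 1-3: serialize to JSON, base64-encode, HMAC the base64 text."""
    json_bytes = json.dumps(data).encode("utf-8")
    payload_b64 = base64.b64encode(json_bytes)
    signature = hmac.new(key, payload_b64, hashlib.sha256).hexdigest()
    return payload_b64.decode("ascii"), signature


def verify_payload(payload_b64, signature, key=SECRET_KEY):
    """Receiver side: recompute the HMAC over the received base64 text,
    and only decode the JSON once the signature matches."""
    expected = hmac.new(key, payload_b64.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    return json.loads(base64.b64decode(payload_b64))


if __name__ == "__main__":
    payload, sig = sign_payload({"Name1": "Value1", "Name2": "Value2"})
    print(verify_payload(payload, sig))
```

    The receiver never re-serializes the JSON, so differences between JSON libraries on the two endpoints do not affect the check.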
  • 2020-12-01 01:51

    RFC 7638: JSON Web Key (JWK) Thumbprint includes a type of canonicalization. Although RFC 7638 expects a limited set of members, the same calculation can be applied to any set of members.

    https://tools.ietf.org/html/rfc7638#section-3
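
    A rough sketch of that idea, assuming SHA-256 and Python's `json` module as the serializer: keep only the chosen members, order them lexicographically, emit no whitespace, hash the UTF-8 bytes, and base64url-encode the digest. The function name `thumbprint` and its `members` parameter are made up for this example, and `json.dumps` is only an approximation of the RFC's serialization rules (number formatting and string escaping in particular would need care for arbitrary data).

```python
import base64
import hashlib
import json


def thumbprint(obj, members=None):
    """Hash a JSON object in the style of an RFC 7638 thumbprint.

    members: names to include (hypothetical parameter); None means all.
    """
    names = list(obj.keys()) if members is None else list(members)
    subset = {name: obj[name] for name in names}
    # Sorted member names, no insignificant whitespace, UTF-8 output.
    canonical = json.dumps(
        subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    # base64url without padding, as used for JWK thumbprints.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    a = {"kty": "EC", "crv": "P-256", "kid": "example"}
    b = {"kid": "other", "crv": "P-256", "kty": "EC"}
    # Same result: member order is irrelevant and "kid" is excluded.
    print(thumbprint(a, ["crv", "kty"]) == thumbprint(b, ["crv", "kty"]))
```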

  • 2020-12-01 01:54

    This is the same issue that causes problems with S/MIME signatures and XML signatures. That is, there are multiple equivalent representations of the data to be signed.

    For example in JSON:

    {  "Name1": "Value1", "Name2": "Value2" }
    

    vs.

    {
        "Name1": "Value\u0031",
        "Name2": "Value\u0032"
    }
    

    Or depending on your application, this may even be equivalent:

    {
        "Name1": "Value\u0031",
        "Name2": "Value\u0032",
        "Optional": null
    }
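
    To make the point concrete, here is a small check (SHA-256 is an arbitrary choice for the illustration) showing that the first two representations parse to the same object while their bytes hash differently:

```python
import hashlib
import json

# The two serializations from the examples above, as raw bytes.
compact = b'{  "Name1": "Value1", "Name2": "Value2" }'
escaped = b'{\n    "Name1": "Value\\u0031",\n    "Name2": "Value\\u0032"\n}'

# A JSON parser sees the same logical object in both.
same_object = json.loads(compact) == json.loads(escaped)

# A hash function sees two different byte strings.
same_hash = hashlib.sha256(compact).digest() == hashlib.sha256(escaped).digest()

print(same_object)  # True
print(same_hash)    # False
```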
    

    Canonicalization could solve that problem, but it's a problem you don't need at all.

    The easy solution if you have control over the specification is to wrap the object in some sort of container to protect it from being transformed into an "equivalent" but different representation.

    I.e. avoid the problem by not signing the "logical" object but signing a particular serialized representation of it instead.

    For example, JSON Objects -> UTF-8 Text -> Bytes. Sign the bytes as bytes, then transmit them as bytes e.g. by base64 encoding. Since you are signing the bytes, differences like whitespace are part of what is signed.

    Instead of trying to do this:

    {  
       "JSONContent": {  "Name1": "Value1", "Name2": "Value2" },
       "Signature": "asdflkajsdrliuejadceaageaetge="
    }
    

    Just do this:

    {
       "Base64JSONContent": "eyAgIk5hbWUxIjogIlZhbHVlMSIsICJOYW1lMiI6ICJWYWx1ZTIiIH0=",
       "Signature": "asdflkajsdrliuejadceaageaetge="
    }
    

    I.e. don't sign the JSON, sign the bytes of the encoded JSON.

    Yes, it means the signature is no longer transparent.
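
    A minimal sketch of such a container, using the field names from the example above. HMAC-SHA256 with a shared key stands in for whatever signature scheme is actually used; the key and the function names `wrap` and `unwrap` are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json

KEY = b"demo-shared-secret"  # hypothetical key, for illustration only


def wrap(json_text, key=KEY):
    """Sign the exact UTF-8 bytes of json_text and put both in an envelope."""
    raw = json_text.encode("utf-8")
    tag = hmac.new(key, raw, hashlib.sha256).digest()
    return json.dumps({
        "Base64JSONContent": base64.b64encode(raw).decode("ascii"),
        "Signature": base64.b64encode(tag).decode("ascii"),
    })


def unwrap(envelope_text, key=KEY):
    """Verify the signature over the transported bytes, then return them."""
    envelope = json.loads(envelope_text)
    raw = base64.b64decode(envelope["Base64JSONContent"])
    tag = base64.b64decode(envelope["Signature"])
    expected = hmac.new(key, raw, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("signature mismatch")
    return raw  # call json.loads(raw) only after the check has passed


if __name__ == "__main__":
    original = '{  "Name1": "Value1", "Name2": "Value2" }'
    print(unwrap(wrap(original)).decode("utf-8") == original)
```

    The envelope itself may be reformatted freely in transit; only the base64 string has to survive intact, and it carries the inner whitespace byte for byte.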

  • 2020-12-01 01:59

    I would process all fields in a fixed order (alphabetically, for example). Why does arbitrary data make a difference? You can just iterate over the properties (à la reflection).

    Alternatively, I would look into converting the raw JSON string into some well-defined canonical form (removing all superfluous formatting) and hashing that.
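
    One way to read the first suggestion is to walk the parsed value and feed it into the hash field by field, visiting object keys in sorted order. The sketch below does that; the type tags and length prefixes are an ad hoc encoding invented for this example (not a standard), and `hash_json` is a hypothetical name.

```python
import hashlib
import json


def _feed(h, value):
    """Feed one parsed JSON value into hash object h, tagged by type."""
    if value is None:
        h.update(b"n")
    elif isinstance(value, bool):  # must precede the int check: bool is an int subclass
        h.update(b"t" if value else b"f")
    elif isinstance(value, (int, float)):
        # Caveat: 1 and 1.0 hash differently under this scheme.
        text = repr(value).encode("ascii")
        h.update(b"#" + str(len(text)).encode("ascii") + b":" + text)
    elif isinstance(value, str):
        raw = value.encode("utf-8")
        h.update(b"s" + str(len(raw)).encode("ascii") + b":" + raw)
    elif isinstance(value, list):
        h.update(b"[" + str(len(value)).encode("ascii") + b":")
        for item in value:
            _feed(h, item)
    elif isinstance(value, dict):
        h.update(b"{" + str(len(value)).encode("ascii") + b":")
        for key in sorted(value):  # fixed order, independent of the input order
            _feed(h, key)
            _feed(h, value[key])
    else:
        raise TypeError(f"not a JSON value: {type(value).__name__}")


def hash_json(json_text):
    """Return a SHA-256 hex digest that ignores key order and whitespace."""
    h = hashlib.sha256()
    _feed(h, json.loads(json_text))
    return h.hexdigest()


if __name__ == "__main__":
    print(hash_json('{"a":1,"b":[true,null]}') == hash_json('{ "b": [true, null], "a": 1 }'))
```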

  • 2020-12-01 02:02

    Instead of inventing your own JSON normalization/canonicalization you may want to use bencode. Semantically it is close to JSON (a composition of integers, strings, lists and dicts, though without floats, booleans or null), but with the property of unambiguous encoding that is necessary for cryptographic hashing.

    bencode is used as the torrent file format, so every BitTorrent client contains an implementation.
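
    For illustration, a minimal bencode encoder fits in a few lines. This sketch assumes dictionary keys are strings (as they are in parsed JSON) and encodes text as UTF-8; a real project would more likely reuse an existing, tested implementation.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are emitted sorted as raw byte strings, which is what makes
        # the encoding of a given dict unique. Assumes str keys.
        items = [(key.encode("utf-8"), val) for key, val in value.items()]
        items.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(key) + bencode(val) for key, val in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


if __name__ == "__main__":
    encoded = bencode({"spam": ["a", "b"]})
    print(encoded)  # b'd4:spaml1:a1:bee'
    print(hashlib.sha256(encoded).hexdigest())
```

    Floats, booleans and null are rejected here, so JSON data that uses them would need an agreed mapping before it could be hashed this way.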
