converting Google Cloud NLP API entity sentiment output to JSON


Question


I have this output from the Google Cloud Natural Language API (it took me quite a while to produce, so I don't want to go with the solution in How can I JSON serialize an object from google's natural language API? (No __dict__ attribute)):

Mentions: 
Name: "Trump"
  Begin Offset : 0
  Content : Trump
  Magnitude : 0.0
  Sentiment : 0.0
  Type : 2
Salience: 0.6038374900817871
Sentiment: 

Mentions: 
Name: "hand"
  Begin Offset : 19
  Content : hand
  Magnitude : 0.0
  Sentiment : 0.0
  Type : 2
Salience: 0.20075689256191254
Sentiment: 

Mentions: 
Name: "water boarding"
  Begin Offset : 39
  Content : water boarding
  Magnitude : 0.0
  Sentiment : 0.0
  Type : 2
Salience: 0.13010266423225403
Sentiment: 

Mentions: 
Name: "some"
  Begin Offset : 58
  Content : some
  Magnitude : 0.0
  Sentiment : 0.0
  Type : 2
Salience: 0.04501711577177048
Sentiment: 

Mentions: 
Name: "GOPDebate"
  Begin Offset : 65
  Content : GOPDebate
  Magnitude : 0.0
  Sentiment : 0.0
  Type : 1
Salience: 0.020285848528146744
Sentiment: 

I want to find the Magnitude and Sentiment for a set of candidate names (Donald Trump, Hillary Clinton, Bernie Sanders, and Ted Cruz, or a set of similar names such as trump/hillary/clinton/cruz/bernie/sanders/@realdonaldtrump).

At first I didn't realize that the output files are not JSON. In fact, I am not sure what the format is; I was told it is perhaps malformed YAML. Is there a way to convert these files to JSON? As I said, I have already processed lots of files, so it is not practical for me to modify the protobuf handling and regenerate the output as JSON at this point.
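(For the files that have already been produced, one possible workaround is to parse the printed text back into dictionaries. This is only a sketch and assumes every file follows exactly the layout shown above; the field names, the parsing logic, and the 'dump.txt' filename are illustrative, not part of the API. Note that the entity-level "Sentiment:" lines in the dump are blank, so only the mention-level numbers and salience can be recovered this way.)

import json

def parse_entity_dump(text):
    """Parse the printed entity sentiment dump (layout shown above) into dicts."""
    entities = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith('Name:'):
            current = {'name': stripped.split(':', 1)[1].strip().strip('"'),
                       'mentions': []}
            entities.append(current)
        elif stripped.startswith('Begin Offset'):
            current['mentions'].append(
                {'begin_offset': int(stripped.split(':', 1)[1])})
        elif stripped.startswith('Content'):
            current['mentions'][-1]['content'] = stripped.split(':', 1)[1].strip()
        elif stripped.startswith('Magnitude'):
            current['mentions'][-1]['magnitude'] = float(stripped.split(':', 1)[1])
        elif stripped.startswith('Sentiment :'):  # mention-level line (space before ':')
            current['mentions'][-1]['sentiment'] = float(stripped.split(':', 1)[1])
        elif stripped.startswith('Type'):
            current['mentions'][-1]['type'] = int(stripped.split(':', 1)[1])
        elif stripped.startswith('Salience:'):
            current['salience'] = float(stripped.split(':', 1)[1])
    return entities

# Hypothetical usage: 'dump.txt' stands in for one of the already-produced files.
with open('dump.txt') as f:
    print(json.dumps(parse_entity_dump(f.read()), indent=2))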

The part of the code from the Google Cloud NLP tutorial that produces this output is:

import sys

import six
from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types

# [START def_entity_sentiment_text]
def entity_sentiment_text(text):
    """Detects entity sentiment in the provided text."""
    client = language.LanguageServiceClient()

    if isinstance(text, six.binary_type):
        text = text.decode('utf-8')

    document = types.Document(
        content=text.encode('utf-8'),
        type=enums.Document.Type.PLAIN_TEXT)

    # Detect and send native Python encoding to receive correct word offsets.
    encoding = enums.EncodingType.UTF32
    if sys.maxunicode == 65535:
        encoding = enums.EncodingType.UTF16

    result = client.analyze_entity_sentiment(document, encoding)

    for entity in result.entities:
        print('Mentions: ')
        print(u'Name: "{}"'.format(entity.name))
        for mention in entity.mentions:
            print(u'  Begin Offset : {}'.format(mention.text.begin_offset))
            print(u'  Content : {}'.format(mention.text.content))
            print(u'  Magnitude : {}'.format(mention.sentiment.magnitude))
            print(u'  Sentiment : {}'.format(mention.sentiment.score))
            print(u'  Type : {}'.format(mention.type))
        print(u'Salience: {}'.format(entity.salience))
        print(u'Sentiment: {}\n'.format(entity.sentiment))
# [END def_entity_sentiment_text]
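(For reference, if re-running the analysis on some files is an option, the same function can return the response as JSON instead of printing it. This is only a sketch: it reuses the client setup and imports above, assumes the same older google-cloud-language client shown in the tutorial, and the json_format call matches the approach in the answer below.)

from google.protobuf import json_format

def entity_sentiment_json(text):
    """Sketch: run the same entity sentiment analysis but return a JSON string."""
    client = language.LanguageServiceClient()

    if isinstance(text, six.binary_type):
        text = text.decode('utf-8')

    document = types.Document(
        content=text.encode('utf-8'),
        type=enums.Document.Type.PLAIN_TEXT)

    # Detect and send native Python encoding to receive correct word offsets.
    encoding = enums.EncodingType.UTF32
    if sys.maxunicode == 65535:
        encoding = enums.EncodingType.UTF16

    result = client.analyze_entity_sentiment(document, encoding)

    # MessageToJson serializes the protobuf response to a JSON string,
    # which can be written straight to a .json file.
    return json_format.MessageToJson(result)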

So I am not even sure how to apply the answer from the other SO question here.


Answer 1:


Adding an answer to an old thread. A response from the NLP API in protobuf format can be converted to JSON using MessageToDict or MessageToJson, as below:

from google.protobuf import json_format
import json

# "result" is the protobuf response returned by analyze_entity_sentiment above.
# MessageToJson gives a JSON string; json.loads turns it into a Python dict.
response_json = json.loads(json_format.MessageToJson(result))
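(Once response_json is a plain dict, the magnitude and sentiment score for the candidate names from the question can be read directly. A sketch follows: the keys match the API's JSON representation of the entity sentiment response, the candidate list is just an illustration, and the .get() defaults cover the fact that MessageToJson omits zero-valued fields by default.)

# Hypothetical follow-up: pull entity-level sentiment for the candidate names.
candidates = {'trump', 'donald trump', '@realdonaldtrump',
              'hillary', 'clinton', 'hillary clinton',
              'bernie', 'sanders', 'bernie sanders',
              'cruz', 'ted cruz'}

for entity in response_json.get('entities', []):
    if entity.get('name', '').lower() in candidates:
        sentiment = entity.get('sentiment', {})
        print(entity['name'],
              'magnitude:', sentiment.get('magnitude', 0.0),
              'score:', sentiment.get('score', 0.0))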


Source: https://stackoverflow.com/questions/49139188/converting-google-cloud-nlp-api-entity-sentiment-output-to-json
