Python API to launch template: "Unknown name ... Cannot find field"

六眼飞鱼酱① submitted on 2019-12-13 03:54:35

Question


I've created and run a DataPrep job, and I'm trying to launch its template from Python on App Engine. I can successfully start a job using:

gcloud dataflow jobs run 
    --parameters "inputLocations={\"location1\":\"gs://bucket/folder/*\"},
outputLocations={\"location1\":\"project:dataset.table\"},
customGcsTempLocation=gs://bucket/DataPrep-beta/temp"
--gcs-location gs://bucket/DataPrep-beta/temp/cloud-dataprep-templatename_template

However, when I try the same thing from Python on App Engine:

from googleapiclient.discovery import build

service = build('dataflow', 'v1b3', credentials=credentials)
input1  = {"location1": "{i1}".format(i1=input)}
output1 = {"location1": "{o1}".format(o1=output)}

print('input location: {}'.format(input1))

GCSPATH="gs://{bucket}/{template}".format(bucket=BUCKET, template=template)
BODY = {
    "jobName": "{jobname}".format(jobname=JOBNAME),
    "parameters": {
        "inputLocations":  input1,
        "outputLocations": output1,
        "customGcsTempLocation": "gs://{}/DataPrep-beta/temp".format(BUCKET)
     }
}

print("dataflow request body: {}".format(BODY))
request = service.projects().templates().launch(projectId=PROJECT, gcsPath=GCSPATH, body=BODY)
response = request.execute()

I get back:

"Invalid JSON payload received. Unknown name "location1" at 
  'launch_parameters.parameters[1].value': Cannot find field.
Invalid JSON payload received. Unknown name "location1" at 
  'launch_parameters.parameters[2].value': Cannot find field."

Nothing I've tried works: "inputLocations" and "outputLocations" do not seem to accept a dict, the result of json.dumps(), or the result of str().


Answer 1:


The issue is the format in which you pass input1 and output1. Each one must be a single string that contains the JSON, quotation marks included, like this:

input1 = '{"location1":"' + input + '" }'
output1 = '{"location1":"' + output + '" }'

I tried sending the request with the same approach as yours and it fails. It also fails if I later convert the value back to a string or to JSON, because the quotes are not parsed correctly.
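The string format above can also be produced without manual concatenation. The following is only a sketch: `encode_locations` is a hypothetical helper, the paths are the placeholder values from the question, and it has not been run against the Dataflow API. Note that the question reports json.dumps() did not work for the asker, so treat this as an assumption to verify; it yields the same kind of text as the concatenation above, with the added benefit that quotes inside a path are escaped.

```python
import json


def encode_locations(locations):
    """Serialize a {name: path} dict into compact JSON text (a str, not a dict).

    Hypothetical helper: the template parameters are sent as strings,
    so the dict is encoded before it goes into the request body.
    """
    return json.dumps(locations, separators=(",", ":"))


# Placeholder values copied from the question.
input1 = encode_locations({"location1": "gs://bucket/folder/*"})
output1 = encode_locations({"location1": "project:dataset.table"})

print(input1)   # {"location1":"gs://bucket/folder/*"}
print(type(output1).__name__)  # str
```

If this is used, input1 and output1 go into BODY["parameters"] exactly as in the question's code.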




Answer 2:


The format is almost certainly the cause of your problem. I had the same use case, except that my output was files rather than a Google BigQuery dataset. For me, the following BODY starts the Google Dataflow pipeline:

BODY = {
        "jobName": "{jobname}".format(jobname=JOBNAME),
        "parameters": {
            "inputLocations" : "{{\"location1\":\"gs://{bucket}/employee/input/patient.json\"}}".format(bucket=BUCKET),
            "outputLocations": "{{\"location1\":\"gs://{bucket}/employee/employees.json/file\",\"location2\":\"gs://{bucket}/jobrun/employees_314804/.profiler/profilerTypeCheckHistograms.json/file\",\"location3\":\"gs://{bucket}/jobrun/employees_314804/.profiler/profilerValidValueHistograms.json/file\"}}".format(bucket=BUCKET)
         },
         "environment": {
            "tempLocation": "gs://{bucket}/employee/temp".format(bucket=BUCKET),
            "zone": "us-central1-f"
         }
    }
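A body of this shape can be assembled from plain dicts and checked before it is sent. This is a minimal sketch under assumptions: `build_launch_body` and `non_string_parameters` are hypothetical helpers, the bucket and job names are placeholders, and nothing here calls the Dataflow API. The check reflects the error in the question, where a dict was passed as a parameter value instead of a string.

```python
import json


def build_launch_body(job_name, inputs, outputs, temp_location, zone):
    """Build a launch body whose parameter values are JSON-encoded strings.

    Hypothetical helper mirroring the structure of the BODY above.
    """
    return {
        "jobName": job_name,
        "parameters": {
            "inputLocations": json.dumps(inputs),
            "outputLocations": json.dumps(outputs),
        },
        "environment": {"tempLocation": temp_location, "zone": zone},
    }


def non_string_parameters(body):
    """Return the names of parameters whose value is not a str."""
    return [k for k, v in body["parameters"].items() if not isinstance(v, str)]


BUCKET = "my-bucket"  # placeholder, not from the original post
body = build_launch_body(
    "my-job",  # placeholder job name
    {"location1": "gs://{}/employee/input/patient.json".format(BUCKET)},
    {"location1": "gs://{}/employee/employees.json/file".format(BUCKET)},
    "gs://{}/employee/temp".format(BUCKET),
    "us-central1-f",
)

# An empty list means every parameter value is a string.
print(non_string_parameters(body))
```

Passing the question's original BODY to non_string_parameters would report inputLocations and outputLocations, which are the two parameters named in the error message.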


Source: https://stackoverflow.com/questions/50098046/python-api-to-launch-template-unknown-name-cannot-find-field
