google-cloud-data-fusion

Salesforce plug-in error in Google Cloud Data Fusion

孤街醉人 submitted on 2020-06-27 22:56:11

Question: I'm testing Salesforce connectivity from Google Cloud Data Fusion. When I click the Get Schema button in the connector, I get this error: "Error: No discoverable found for request POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/default/validations/stage HTTP/1.1". The authentication details are all correct; I have tested them outside Data Fusion using Postman.

Answer 1: Can you go to the System Admin link in the top right-hand corner and check the status of Pipeline Studio? If it's…
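The status check suggested in the answer can also be done against the CDAP REST API that every Data Fusion instance exposes. A minimal sketch, assuming the standard CDAP Monitor endpoint path; the instance URL is a placeholder, and a real call also needs an `Authorization: Bearer <token>` header:

```python
# Build the URL of CDAP's system-service status endpoint. GETting it
# (with a bearer token) reports the status of system services such as
# Pipeline Studio. The instance URL passed in is a placeholder.
def service_status_url(instance_url: str) -> str:
    """Return the CDAP Monitor endpoint listing all system-service statuses."""
    return f"{instance_url.rstrip('/')}/v3/system/services/status"

print(service_status_url("https://example-instance.example.com"))
```

If the studio service is not running, validation requests like the failing POST above have nothing to route to, which would explain a "No discoverable found" error.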

How to use Custom Transform in Wrangler?

我是研究僧i submitted on 2020-06-27 17:15:47

Question: I'm trying to apply a custom transform to a column in the Wrangler plugin. Is there any documentation listing the functions available for custom transforms? Also, for a specific case, I want to replace the value of a column based on an IF-ELSE condition (or multiple cases). Is there any way to do that?

Answer 1: The custom transform supports JEXL, so you can find a list of functions to apply here: JEXL syntax. See the Conditional section of that page for information on how to write an if-else.

Answer 2: …
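As a concrete illustration of the conditional, a JEXL ternary can drive a column replacement in a Wrangler directive. The sketch below assumes hypothetical column names (`score`, `status`):

```
set-column status (score > 50) ? 'pass' : 'fail'
```

Chained ternaries (`a ? x : b ? y : z`) give the multiple-case behavior the question asks about.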

Failed to connect to MySQL using Google Data Fusion

◇◆丶佛笑我妖孽 submitted on 2020-06-27 13:57:25

Question: I failed to connect to MySQL from Google Data Fusion. The steps: first, I added the connector from https://dev.mysql.com/downloads/file/?id=462850; second, I tried to add a connection, which failed. Screenshot of the MySQL error: "Communications link failure. The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server." **Edit**: I think this is related to allowing Data Fusion to access our production data. My second question is: How can I…
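A "Communications link failure … 0 milliseconds ago" with zero packets received usually means the machines running the pipeline cannot reach the database at the TCP level (firewall rules, private IP, wrong port), not that the credentials are wrong. A minimal reachability sketch, with placeholder host and port:

```python
# Quick TCP reachability check for the MySQL host/port before debugging
# the Data Fusion connection itself. Host and port are placeholders.
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running this from an environment with the same network access as the pipeline (e.g. a VM in the same VPC) distinguishes a network problem from a driver or credential problem.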

Import/export Data Fusion pipelines

帅比萌擦擦* submitted on 2020-06-26 04:07:16

Question: Does anyone know if it is possible to programmatically import/export Data Fusion pipelines (deployed or in draft status)? The idea is to write a script that drops and recreates a Data Fusion instance, to avoid billing when it's not in use. Via the gcloud command line it's possible to provision a Data Fusion instance and destroy it, but it would be useful to automatically export and import all my pipelines too. The official documentation, unfortunately, didn't help me. Thanks!

Answer 1: You could…
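Each Data Fusion instance fronts a CDAP REST API, so a drop-and-recreate script can export a deployed pipeline's configuration JSON with a GET and redeploy the saved JSON later with a PUT to the same path. A sketch that only builds the URL; the instance, namespace, and pipeline names are placeholders, and real calls need an `Authorization: Bearer <token>` header:

```python
# GET this URL to export a deployed pipeline's configuration JSON;
# PUT the saved JSON back to the same URL to redeploy the pipeline
# on a fresh instance. All names below are placeholders.
def app_url(instance: str, namespace: str, pipeline: str) -> str:
    """Return the CDAP app endpoint for one pipeline."""
    return f"{instance.rstrip('/')}/v3/namespaces/{namespace}/apps/{pipeline}"

print(app_url("https://my-instance.example.com", "default", "gcs_to_bq"))
```

Draft pipelines live in the Studio rather than as deployed apps, so they need to be exported from the UI or saved separately.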

How to schedule a Google Data Fusion pipeline?

淺唱寂寞╮ submitted on 2020-06-16 03:49:46

Question: I have deployed a simple Data Fusion pipeline that reads from GCS and writes to a BigQuery table. I am looking for a way to schedule the pipeline but could not find relevant documentation. Can anyone point me to documentation or pages that cover scheduling Data Fusion pipelines?

Answer 1: You can schedule a pipeline after deployment by clicking the Schedule button on the pipeline detail page. Once you click it, you can configure the pipeline to run periodically. Please see the screenshots below:

Answer 2: I…
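Beyond the built-in Schedule button, a deployed batch pipeline can also be triggered externally (for example from Cloud Scheduler or Cloud Composer) by POSTing to its workflow start endpoint. A sketch, assuming the batch workflow is named DataPipelineWorkflow (the usual name for Data Fusion batch pipelines); instance and pipeline names are placeholders:

```python
# Build the URL that starts one run of a deployed batch pipeline.
# POST to it with an Authorization: Bearer <token> header to trigger
# a run. All names below are placeholders.
def start_url(instance: str, namespace: str, pipeline: str) -> str:
    """Return the workflow start endpoint for a batch pipeline."""
    return (f"{instance.rstrip('/')}/v3/namespaces/{namespace}"
            f"/apps/{pipeline}/workflows/DataPipelineWorkflow/start")

print(start_url("https://my-instance.example.com", "default", "gcs_to_bq"))
```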

Google Data Fusion execution error "INVALID_ARGUMENT: Insufficient 'DISKS_TOTAL_GB' quota. Requested 3000.0, available 2048.0."

梦想与她 submitted on 2020-02-24 12:20:29

Question: I am trying to load a simple CSV file from GCS to BigQuery using the Google Data Fusion free version. The pipeline fails with an error that reads:

com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Insufficient 'DISKS_TOTAL_GB' quota. Requested 3000.0, available 2048.0.
    at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:49) ~[na:na]
    at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)…
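For context on where 3000 GB comes from: the error refers to the regional Persistent Disk quota (DISKS_TOTAL_GB) consumed by the Dataproc cluster that Data Fusion provisions for each run. With illustrative, assumed numbers of one master and two workers at 1000 GB of boot disk each, the request works out to exactly the 3000 GB in the message:

```python
# Total disk a provisioned Dataproc cluster requests against the
# regional DISKS_TOTAL_GB quota. Node counts and the per-node disk
# size are illustrative assumptions, not Data Fusion constants.
def total_disk_gb(masters: int, workers: int, disk_gb_per_node: int) -> int:
    """Sum the boot-disk gigabytes across all cluster nodes."""
    return (masters + workers) * disk_gb_per_node

print(total_disk_gb(1, 2, 1000))  # 3000 -> exceeds a 2048 GB quota
```

The fix is therefore either to raise the DISKS_TOTAL_GB quota in that region or to shrink the per-node disk size in the pipeline's compute profile.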

GCP Data Fusion StatusRuntimeException: INVALID_ARGUMENT: Insufficient 'DISKS_TOTAL_GB' quota. Requested 3000.0, available 2048.0

你说的曾经没有我的故事 submitted on 2020-01-24 19:39:07

Question: I'm trying to deploy a pipeline in GCP Data Fusion. I was initially working on the free account, but upgraded in order to increase quotas, as recommended in the question linked here. However, based on the accepted answer, I am still unclear which specific quota to increase in GCE to enable the pipeline to run. Could someone either add clarity to the linked question or respond here to elaborate on which of the IAM Quotas needs to be increased to resolve the issue seen…

Failed to start program run program_run

自古美人都是妖i submitted on 2020-01-24 18:01:49

Question: The source of the error is io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService#543-runtime-startup-1. The error message:

java.io.IOException: com.jcraft.jsch.JSchException: java.net.ConnectException: Connection timed out (Connection timed out)
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:82) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillPreparer.lambda$start$0(RemoteExecutionTwillPreparer.java:429)…

Connecting to Cloud SQL MySQL

一世执手 submitted on 2020-01-02 16:43:32

Question: We would like to test connecting Cloud SQL (MySQL) to BigQuery using Cloud Data Fusion. What is the proper way to connect to Cloud SQL, as that does not appear to be built in at this point in time? What driver is recommended, and are there any instructions available?

Answer 1: Here are instructions for using Cloud SQL MySQL in Data Fusion. Note that in the Wrangler section, currently, Cloud SQL instances with a private IP cannot be used. However, they can still be used when running Data Fusion pipelines…
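For reference, with the Cloud SQL MySQL socket-factory JDBC driver, the connection string names the instance connection name rather than an IP address. A sketch of the commonly documented format; the project/region/instance and database names below are placeholders:

```python
# Build a JDBC URL for Cloud SQL MySQL using the socket-factory driver,
# which connects by instance connection name ("project:region:instance")
# instead of an IP. All names passed in are placeholders.
def cloudsql_jdbc_url(connection_name: str, database: str) -> str:
    """Return a socket-factory JDBC URL for a Cloud SQL MySQL database."""
    return ("jdbc:mysql://google/" + database
            + "?cloudSqlInstance=" + connection_name
            + "&socketFactory=com.google.cloud.sql.mysql.SocketFactory")

print(cloudsql_jdbc_url("my-project:us-central1:my-instance", "mydb"))
```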
