Can't connect Cloud Data Fusion with Google Cloud SQL for PostgreSQL

♀尐吖头ヾ 提交于 2019-12-04 04:05:44

问题


My goal is to read data from Cloud SQL Postgres to BigQuery via a Cloud Data Fusion pipeline.

For this, I set up a Cloud Data Fusion instance and assigned the following two permissions to the service account: (see https://cloud.google.com/data-fusion/docs/how-to/create-instance#setting_up_permissions)

  • Cloud SQL Client
  • Cloud Data Fusion API Service Agent

As a next step I connected myself to the Cloud Data Fusion Instance, and navigated to Wrangler -> Add Connection -> Database -> Google Cloud SQL for PostgreSQL.

As the driver I uploaded the postgres-socket-factory-1.0.13-jar-with-dependencies.jar which I downloaded here: https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory

For the driver configuration, I set:

  • Name: cloudsql-postgresql
  • Class name: org.postgresql.Driver

For the database connection, I set:

  • Connection name: <PROJECT_NAME>:<REGION>:<INSTANCE_CONNECTION_NAME>
  • Connection string: jdbc:postgresql://google/<DATABASE_NAME>?cloudSqlInstance=<INSTANCE_CONNECTION_NAME>&socketFactory=com.google.cloud.sql.postgres.SocketFactory
  • Username: Database username
  • Password: Database password

After clicking on Test Connection, I receive the org.postgresql.Driver error message.


回答1:


Most likely you need a driver and the JDBC Socket Factory to get it to work properly. You should bundle the postgres connector with the socket factory into a uber-jar and see if that solves the problem.

Here is a quick (untested) pom that should accomplish this:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <groupId>com.example</groupId>
   <artifactId>postgres-with-cloud-sql-socket-factory</artifactId>
   <packaging>pom</packaging>
   <version>0.0.1</version>

   <dependencies>
      <dependency>
         <groupId>org.postgresql</groupId>
         <artifactId>postgresql</artifactId>
         <version>42.2.5</version>
      </dependency>
      <dependency>
         <groupId>com.google.cloud.sql</groupId>
         <artifactId>postgres-socket-factory</artifactId>
         <version>1.0.13</version>
      </dependency>
   </dependencies>

   <build>
      <plugins>
         <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.4.1</version>
            <configuration>
               <!-- get all project dependencies -->
               <descriptorRefs>
                  <descriptorRef>jar-with-dependencies</descriptorRef>
               </descriptorRefs>
            </configuration>
            <executions>
               <execution>
                  <id>make-assembly</id>
                  <!-- bind to the packaging phase -->
                  <phase>package</phase>
                  <goals>
                     <goal>single</goal>
                  </goals>
               </execution>
            </executions>
         </plugin>
      </plugins>
   </build>
</project>



回答2:


@Sebastian Auberger,

We had the same issue, basically Data Fusion creates a Service Account which tries to create a IAP Tunnel to the Cloud SQL Instance. We could get through this issue by giving the role "Cloud SQL Client".

Give it a shot and let us know!.



来源:https://stackoverflow.com/questions/55880595/cant-connect-cloud-data-fusion-with-google-cloud-sql-for-postgresql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!