I am using the code below to write a DataFrame of 43 columns and about 2,000,000 rows into a table in SQL Server:
    dataFrame
      .write
      .format("jdbc")
      .mode(SaveMode.Append)                  // save mode truncated in the original; Append shown as a placeholder
      .option("url", jdbcUrl)                 // connection URL (placeholder)
      .option("dbtable", "dbo.target_table")  // target table (placeholder)
      .save()
Try adding the batchsize option to your write with a value of at least 10000 (tune this value for your environment to get better performance) and execute the write again.
From the Spark docs:
The JDBC batch size, which determines how many rows to insert per round trip. This can help performance on JDBC drivers. This option applies only to writing. It defaults to 1000.
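For example, a minimal sketch of the write with batchsize set. The JDBC URL, table name, and save mode here are placeholders, not values from your code:

    import org.apache.spark.sql.SaveMode

    dataFrame
      .write
      .format("jdbc")
      .mode(SaveMode.Append)                  // placeholder save mode
      .option("url", jdbcUrl)                 // e.g. jdbc:sqlserver://host:1433;databaseName=mydb
      .option("dbtable", "dbo.target_table")  // placeholder target table
      .option("batchsize", "10000")           // rows inserted per JDBC round trip; default is 1000
      .save()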
It's also worth checking out (both shown in the sketch below):

- the numPartitions option, to increase parallelism (this also determines the maximum number of concurrent JDBC connections)
- the queryTimeout option, to increase the timeout for the write operation
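A sketch layering these on top of batchsize; the partition count, timeout, URL, and table name are illustrative values, not recommendations:

    import org.apache.spark.sql.SaveMode

    dataFrame
      .write
      .format("jdbc")
      .mode(SaveMode.Append)                  // placeholder save mode
      .option("url", jdbcUrl)                 // placeholder SQL Server JDBC URL
      .option("dbtable", "dbo.target_table")  // placeholder target table
      .option("batchsize", "10000")
      .option("numPartitions", "8")           // caps concurrent JDBC connections; Spark coalesces the DataFrame if it has more partitions
      .option("queryTimeout", "300")          // seconds before a statement times out; 0 means no limit
      .save()

Raising numPartitions increases the number of parallel connections writing to SQL Server, so balance it against what the database can handle.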