How to deal with enormous line lengths created by mysqldump

Submitted by 假如想象 on 2019-12-17 15:15:14

Question


I'm using mysqldump in a cron job to backup a database with over 2 million rows.

It creates a text file which can be used to restore the database from the command line.

I thought it would be useful to edit the dump before a restore as a quick way of changing values and table or column names - at least until I learn more and become confident about doing it with ALTER and UPDATE.

Editing large text files does not bother me, but I was surprised to find that in a 250 megabyte dump of my database, there were only about 300 lines. Each line was something like 800k characters long.

Is there another way of generating dumps with more control over line length?

Or should I post-process the dump with tools like sed or Perl?


Answer 1:


By default, mysqldump generates only one INSERT command per table, resulting in one (very long) line of inserted data for each table that got dumped. This is essentially because "batch" inserts are much faster than executing a separate INSERT query for every record in every table.

So it's not that mysqldump has created arbitrarily long lines that you can simply cut off at some other length. The lines are long for a reason.

If it's really important to get the INSERTs broken down onto multiple lines, you can indicate that with:

mysqldump --extended-insert=FALSE --complete-insert=TRUE ...

Note, however, that restoring tables will take longer in this format.
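For the cron-job scenario in the question, here is a minimal sketch of driving mysqldump this way from Python. The database name, output path, and reliance on credentials in ~/.my.cnf are assumptions for illustration, not part of the answer:

import subprocess

# Hypothetical database name and output path; credentials are assumed
# to live in ~/.my.cnf so they never appear on the command line.
cmd = [
    "mysqldump",
    "--extended-insert=FALSE",  # one INSERT statement per row
    "--complete-insert=TRUE",   # include column names in each INSERT
    "mydatabase",
]
with open("/tmp/mydatabase.sql", "w") as out:
    subprocess.run(cmd, stdout=out, check=True)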




Answer 2:


I was browsing the MySQL source code looking for a solution to this problem today. The maximum line length is enforced by the variable opt_net_buffer_length which is supposed to match the MySQL server's buffer size. It is comically large.

But anyway, it's an option, so just do this:

mysqldump --net_buffer_length=5000 ...

The minimum value is 4096.
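If you want to confirm the effect, here is a quick sketch (the dump path is a placeholder) that reports the longest line in the resulting file:

# Report the longest line in a dump file, in bytes.
max_len = 0
with open("/tmp/mydatabase.sql", "rb") as f:
    for line in f:
        max_len = max(max_len, len(line))
print("longest line:", max_len, "bytes")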




Answer 3:


I came across an answer on the MySQL forums which conclusively shows that adding '\n' after each INSERT group is not possible using mysqldump alone, without modifying the source:

The extended format cannot be 100% properly parsed based on the comma or parenthesis; you would have to count fields. The best solution: fix mysqldump to line-break on output.

Very minor change: On line 3506, you can see where the row-ending comma is output:
fputc(',',md_result_file); /* Always row break */

Simply insert this line immediately after line 3506:
fputc('\n',md_result_file); /* Lon Binder says wrap that line! */

Re-compile, and done.

@see http://forums.mysql.com/read.php?28,420002,426110#msg-426110

Thanks Lon B!

(I've included the content from the MySQL forum just in case the forum disappears.)




Answer 4:


This flag also works:

mysqldump --skip-extended-insert 

It's the same as --extended-insert=FALSE.




Answer 5:


Using a regex to split lines is not enough; you need a parser that properly understands quotes and escape characters.

I just wrote a parser since I couldn’t find one: http://blog.lavoie.sl/2014/06/split-mysqldump-extended-inserts.html
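For a sense of what such a parser has to do, here is a rough sketch of the idea (not the linked script): walk each extended INSERT character by character, track whether you are inside a '...' string literal, honour the backslash escapes mysqldump emits, and only split at the ),( boundaries found outside strings.

def split_extended_insert(line):
    # Insert a newline at each ),( row boundary that sits outside a
    # quoted string. Handles backslash escapes as mysqldump writes
    # them; doubled '' quoting is not handled in this sketch.
    out = []
    in_string = False
    escaped = False
    for i, ch in enumerate(line):
        out.append(ch)
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == "'":
                in_string = False
        elif ch == "'":
            in_string = True
        elif ch == "," and i > 0 and line[i - 1] == ")" and line[i + 1:i + 2] == "(":
            out.append("\n")  # split between row tuples: ),( becomes ),\n(
    return "".join(out)

with open("dump.sql") as src, open("dump-split.sql", "w") as dst:
    for line in src:
        if line.startswith("INSERT INTO"):
            line = split_extended_insert(line)
        dst.write(line)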




Answer 6:


Post-process the dump file with Python. You may be happier with it than with Perl or sed.

If you're running on Linux, you already have it installed. If you're running on Windows, the installer is painless.

Before that, however, learn to use SQL UPDATE and SQL ALTER. You'll be happiest doing things the right way.
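As a concrete (and entirely hypothetical) example of the kind of quick edit the questioner wants, here is a sketch that renames a backquoted table identifier throughout a dump before restoring. Note that a matching backquoted occurrence inside row data would also be rewritten, so this is a convenience, not a guaranteed-safe transformation:

import re

# Rename `old_name` to `new_name` everywhere in the dump. The table
# names and file paths are placeholders.
pattern = re.compile(r"`old_name`")
with open("dump.sql") as src, open("dump-renamed.sql", "w") as dst:
    for line in src:
        dst.write(pattern.sub("`new_name`", line))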



Source: https://stackoverflow.com/questions/1293529/how-to-deal-with-enormous-line-lengths-created-by-mysqldump
