问题
I'm posting this as a Q&A to document a workaround for a problem that seems to come up frequently—how to put MySQL Workbench files under version control—but for which I have been unable to find any solutions. Feedback is welcome!
How can I tell Mercurial to diff
the contents of a zipped archive and ignore some of the changes to those contents? Specifically, how can I use hg
to diff the contents of a MySQL Workbench (.mwb
) file, ignoring the many unimportant changes that MySQL Workbench makes every time the file is opened? Can I use a custom script that ignores certain irrelevant changes?
Background
I am trying to diff
a file in an hg repository. The file, document.mwb.xml
, is an XML document extracted from a .mwb
file (a MySQL Workbench model file). Basically, I am looking to keep the model's contents—the table structure, visual model, etc.—under version control, but not commit the .mwb
file itself, which is a zip archive and thus a binary file.
Anytime I save the .mwb
file, I unzip it. I keep the unzipped contents in my repository, and just zip them up again when I need to work with the .mwb
in MySQL.
The XML in question looks like this:
<?xml version="1.0"?>
<data grt_format="2.0" document_type="MySQL Workbench Model" version="1.4.4">
<value type="object" struct-name="workbench.Document" id="8551CCFA-3AD0-4207-BC76-15ED589CF22C" struct-checksum="0x7131bf99">
<value type="object" struct-name="workbench.logical.Model" id="B48E1CD2-3386-40B7-8E59-AA191598F667" struct-checksum="0xf4220370" key="logicalModel">
<value _ptr_="0x7fbcd1cc3270" type="list" content-type="object" content-struct-name="workbench.logical.Diagram" key="diagrams"/>
<value _ptr_="0x7fbcd1cc3210" type="dict" key="customData"/>
<value _ptr_="0x7fbcd1cc32d0" type="list" content-type="object" content-struct-name="model.Marker" key="markers"/>
<value _ptr_="0x7fbcd1cc3330" type="dict" key="options"/>
<value type="string" key="name"></value>
<link type="object" struct-name="GrtObject" key="owner">8551CCFA-3AD0-4207-BC76-15ED589CF22C</link>
</value>
<value _ptr_="0x7fbcd1cc2b70" type="list" content-type="object" content-struct-name="workbench.OverviewPanel" key="overviewPanels"/>
<value _ptr_="0x7fbcd1cc2c00" type="list" content-type="object" content-struct-name="workbench.physical.Model" key="physicalModels">
<value type="object" struct-name="workbench.physical.Model" id="34B9E967-5C9B-4D1B-8759-C417F6C33AA3" struct-checksum="0x5f896d18">
...
The problem is all of those _ptr_
attributes: there are literally thousands of them in this file, and every one of them changes every single time the file is saved, even if nothing is modified. As a result, the repository can rapidly get cluttered with completely meaningless "changes" to this file.
Is there a way to use a custom diff
routine to ignore these irrelevant changes?
回答1:
I have not found a true solution, but I have developed a satisfactory workaround, inspired by this mwb-diff gist. This allows me to unzip and diff the .mwb
file's contents, commit those contents and their changes to the repository, and use the .mwb
normally when necessary.
Project Structure
My project is set up like this:
project_root
/dist
/schema
/src
/test
I save the .mwb
file - call it MyModel.mwb
- in project_root/schema
. Obviously, you can use a different structure, but you will need to modify the instructions below accordingly.
The Scripts
I created the following scripts and saved them in project_root/schema
:
unpack.sh
#!/bin/bash
# Unzip the model (MyModel.mwb) into a particular directory (project_root/schema/MyModel)
unzip -o MyModel.mwb -d MyModel/
# Replace all _ptr_="...." attributes with _ptr_="xxx"
sed -i presed -E 's/_ptr_="0x[0-9a-f]+"/_ptr_="xxx"/g' MyModel/document.mwb.xml
pack.sh
#!/bin/bash
# This file goes into the directory containing the model contents, zips them up, and saves them as a .mwb model
cd MyModel/
zip -r ../MyModel.mwb ./* -x lock
cd ..
Getting Mercurial Ready to Rock
We need to tell hg
to ignore the model (and all other .mwb
files). Also, when MySQL Workbench is open, it adds a lock
file to the .mwb
archive, which we need to ignore. So, add these lines to your .hgignore
file:
*.mwb
*.mwb.bak
schema/MyModel/lock
An Aside About the
data.db
FileOptionally, also ignore the
data.db
file (a SQLite database) in the.mwb
file. It is a binary file that contains anyINSERT
s or other non-create SQL statements that are part of your model. As a rule, I don't use MySQL Workbench for this stuff; I use it only to create and edit tables, views, etc. So, I added this line to.hgignore
:schema/MyModel/data.db
If you want to track changes to the
data.db
file, you may need to modify this workaround.
How to Use the Scripts
When you want to modify the .mwb
file, rebuild it from its components by running pack.sh
above. This could be added as a hook to happen automatically when you hg pull
, update, etc., but I haven't explored this, yet.
When you are done editing the .mwb
file and want to commit your changes, run the unpack.sh
script. If you want, you can set up a file-watching utility on your system to do this automatically when the file changes, but that's beyond the scope of this answer.
The Results
Mercurial is now perfectly happy to track changes to the contents of your .mwb
file without tracking thousands of apparently-useless _ptr_
attributes. Also, while I am using this with Mercurial, the basic logic (and the shell scripts) will work with git, SVN, etc.
IMPORTANT CAVEAT: As far as I can tell, the _ptr_
attributes are irrelevant. The scripts I have posted above actually replace the contents of those attributes. _ptr_="0x98a7b3e4"
(or whatever) becomes _ptr_"xxx"
. Based on my testing, this doesn't matter, and MySQL Workbench will happily work with the reconstituted file, apparently disregarding the _ptr_
values. I may be wrong, and these values may matter! You are strongly encouraged to test this for yourself before relying on my solution.
来源:https://stackoverflow.com/questions/33614759/hg-diff-on-mysql-workbench-files