Do you put your database static data into source control? How?

逝去的感伤 2020-12-14 18:17

I'm using SQL Server 2008 with Visual Studio Database Edition.

With this setup, keeping your schema in sync is very easy. Basically, there's a 'compare schema' tool that synchronizes any schema changes. What I'm not sure about is the best way to keep the static data in sync.

9 Answers
  • 2020-12-14 18:47

    I have come across this when developing CMS systems.

    I went with appending the static data (the stuff referenced in the code) to the database creation scripts, then a separate script to add in any 'initialisation data' (like countries, initial product population, etc.).
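
    A minimal sketch of what that might look like; the table and column names are illustrative, not from any real CMS schema:

        -- End of the database creation script: static data the code relies on.
        INSERT INTO dbo.PageType (PageTypeId, Name) VALUES (1, 'Article');
        INSERT INTO dbo.PageType (PageTypeId, Name) VALUES (2, 'Landing');

        -- Separate initialisation script: seed data such as countries.
        INSERT INTO dbo.Country (CountryCode, Name) VALUES ('AU', 'Australia');
        INSERT INTO dbo.Country (CountryCode, Name) VALUES ('NZ', 'New Zealand');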

  • 2020-12-14 18:51

    If you are changing the static data (adding a new item to the table that is used to generate a drop-down list) then the insert should be in source control and deployed with the rest of the code. This is especially true if the insert is needed for the rest of the code to work. Otherwise, this step may be forgotten when the code is deployed and not so nice things happen.
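
    For example, a deployment script along these lines (the table and values are hypothetical) keeps the insert in source control and safe to re-run:

        -- Deployed with the release; guarded so re-running the script is harmless.
        IF NOT EXISTS (SELECT 1 FROM dbo.Status WHERE StatusId = 4)
            INSERT INTO dbo.Status (StatusId, Name) VALUES (4, 'On Hold');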

    If static data comes from another source (such as an import of the current airport codes in the US), then you may simply need to run an already documented import process. The import process itself should be in source control (we do this with all our SSIS packages), but the data need not be.

  • 2020-12-14 18:53

    I have explained the technique I used in my blog, Version Control and Your Database. I use database metadata (in this case SQL Server extended properties) to store the deployed application version. I only have scripts that upgrade from version to version. At startup, the application reads the deployed version from the database metadata (lack of metadata is interpreted as version 0, i.e. nothing is deployed yet). For each version there is an application function that upgrades to the next version. Usually this function runs an internal resource T-SQL script that does the upgrade, but it can be something else, like deploying a CLR assembly in the database.

    There is no script to deploy the 'current' database schema. New installations iterate through all intermediate versions, from version 1 to the current version.
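
    A minimal sketch of the version-stamp idea, assuming a database-level extended property named 'SchemaVersion' (the property name and the upgrade steps are illustrative; in my setup the application drives this loop and runs one resource script per step):

        -- Read the deployed version; no property means nothing is deployed (version 0).
        DECLARE @version INT = 0;
        SELECT @version = CAST(value AS INT)
        FROM sys.extended_properties
        WHERE class = 0 AND name = 'SchemaVersion';

        -- Apply each upgrade step in order until the current version is reached.
        IF @version < 1
        BEGIN
            -- v0 -> v1: create the initial schema, then stamp the version.
            EXEC sp_addextendedproperty @name = 'SchemaVersion', @value = '1';
        END;
        IF @version < 2
        BEGIN
            -- v1 -> v2: whatever DDL/DML this version needs, then restamp.
            EXEC sp_updateextendedproperty @name = 'SchemaVersion', @value = '2';
        END;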

    There are several advantages I enjoy with this technique:

    • It is easy for me to test a new version. I have a backup of the previous version, I apply the upgrade script, then I can revert to the previous version, change the script, and try again until I'm happy with the result.
    • My application can be deployed on top of any previous version. Various clients have various deployed versions; when they upgrade, my application supports upgrading from any of them.
    • There is no difference between a fresh install and an upgrade; it runs the same code, so I have fewer code paths to maintain and test.
    • There is no difference between DML and DDL changes (your original question). They are all treated the same way: as scripts run to change from one version to the next. When I need to make a change like you describe (changing a default), I actually increase the schema version even if no other DDL change occurs. So at version 5.1 the default was 'foo', at 5.2 the default is 'bar', and that is the only difference between the two versions; the 'upgrade' step is simply an UPDATE statement, followed of course by the version metadata change (i.e. sp_updateextendedproperty). See the short script after this list.
    • All changes are in source control, as part of the application sources (mostly T-SQL scripts).
    • I can easily get to any previous schema version, e.g. to reproduce a customer complaint, simply by running the upgrade sequence and stopping at the version I'm interested in.
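
    To make the 5.1 to 5.2 example from the list above concrete, the entire upgrade script could be as small as this (the table and column names are made up):

        -- v5.1 -> v5.2: the default changes from 'foo' to 'bar'; nothing else.
        UPDATE dbo.Settings SET DefaultValue = 'bar' WHERE DefaultValue = 'foo';
        EXEC sp_updateextendedproperty @name = 'SchemaVersion', @value = '5.2';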

    This approach has saved my skin a number of times and I'm a true believer now. There is only one disadvantage: there is no obvious place in source to find 'what is the current form of procedure foo?'. Because the latest version of foo might have been changed 2 or 3 versions ago and hasn't been touched since, I need to look at the upgrade script for that version. I usually resort to just looking into the database to see what's in there, rather than searching through the upgrade scripts.

    One final note: this is actually not my invention. This is modeled exactly after how SQL Server itself upgrades the database metadata (mssqlsystemresource).

  • 2020-12-14 18:54

    Here at Red Gate we recently added a feature to SQL Data Compare allowing static data to be stored as DML (one .sql file for each table) alongside the schema DDL that is currently supported by SQL Compare.


    The idea is that when you want to push changes to your target server, you do a comparison using the scripts as the source, which generates the necessary DML synchronization script to update the target. This means you don't have to assume that the target is being recreated from scratch each time. In time we hope to support static data in our upcoming SQL Source Control tool.
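
    Roughly, the 'one .sql file per table' layout can be pictured like this; this is an illustration of the idea only, not the tool's actual output format, and the file and table names are made up:

        -- dbo.Country.sql: the table's static data as plain DML, versioned
        -- alongside the schema DDL.
        INSERT INTO dbo.Country (CountryCode, Name) VALUES ('AU', 'Australia');
        INSERT INTO dbo.Country (CountryCode, Name) VALUES ('NZ', 'New Zealand');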

    David Atkinson, Product Manager, Red Gate Software

  • 2020-12-14 18:59

    First off, I have never used Visual Studio Database Edition. You are blessed (or cursed) with whatever tools this utility gives you. Hopefully that includes a lot of flexibility.

    I don't know that I'd make that big a difference between your type 1 and type 2 static data. Both are sets of data that are defined once and then never updated, barring subsequent releases and updates, right? In which case the main difference is in how or why the data is as it is, and not so much in how it is stored or initialized. (Unless the data is environment-specific, as in "A" for development, "B" for Production. This would be "type 4" data, and I shall cheerfully ignore it in this post, because I've solved it using SQLCMD variables, and they give me a headache.)

    First, I would make a script to create all the tables in the database--preferably only one script, otherwise you can have a LOT of scripts lying about (and find-and-replace when renaming columns becomes very awkward). Then, I would make a script to populate the static data in these tables. This script could be appended to the end of the table script, or made its own script, or even made one script per table--a good idea if you have hundreds or thousands of rows to load. (Some folks make a csv file and then issue a BULK INSERT on it, but I'd avoid that, as it just gives you two files and a complex process [configuring drive mappings on deployment] to manage.)
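
    A minimal sketch of the combined approach, with the static data appended to the table script (the names are illustrative):

        CREATE TABLE dbo.OrderStatus (
            StatusId   INT         NOT NULL PRIMARY KEY,
            StatusName VARCHAR(50) NOT NULL
        );

        -- Static data loaded immediately after the table is created.
        INSERT INTO dbo.OrderStatus (StatusId, StatusName)
        VALUES (1, 'Pending'), (2, 'Shipped'), (3, 'Cancelled');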

    The key thing to remember is that data (as stored in databases) can and will change over time. Rarely (if ever!) will you have the luxury of deleting your Production database and replacing it with a fresh, shiny, new one devoid of all that crufty data from the past umpteen years. Databases are all about changes over time, and that's where scripts come into their own. You start with the scripts to create the database, and then over time you add scripts that modify the database as changes come along -- and this applies to your static data (of any type) as well.
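
    For instance, once the script above has shipped, a later release changes the data with a new script rather than by editing the original (again, the names are illustrative):

        -- Change script shipped in a later release; the original script is never edited.
        UPDATE dbo.OrderStatus SET StatusName = 'Dispatched' WHERE StatusId = 2;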

    (Ultimately, my methodology is analogous to accounting: you have accounts, and as changes come in you adjust the accounts with journal entries. If you find you made a mistake, you never go back and modify your entries, you just make subsequent entries to reverse and fix them. It's only an analogy, but the logic is sound.)

  • 2020-12-14 19:04

    For the first two steps, you could consider using an intermediate format (e.g. XML) for the data, then using a home-grown tool, or something like CodeSmith, to generate the SQL, and possibly source files as well, if (for example) you have lookup tables which relate to enumerations used in the code - this helps enforce consistency.

    This has the added benefit that if the schema changes, in many cases you don't have to regenerate all your INSERT statements - you just change the tool.
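
    As a sketch of the idea, the generated SQL might carry a header marking its provenance; the XML source file, the generator, and the table here are all hypothetical:

        -- AUTO-GENERATED from LookupData.xml; do not edit by hand.
        -- The same XML also generates a matching enumeration in the
        -- application source, so code and lookup table cannot drift apart.
        DELETE FROM dbo.OrderStatus;
        INSERT INTO dbo.OrderStatus (StatusId, StatusName) VALUES (1, 'Pending');
        INSERT INTO dbo.OrderStatus (StatusId, StatusName) VALUES (2, 'Shipped');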
