Question
Is there a good way to handle large assets (e.g. thousands of images, Flash movies, etc.) with a DVCS tool such as hg or git? As I see it, cloning repositories that are filled with 4 GB of assets is unnecessary overhead, since everyone has to check out those files. It also seems rather cumbersome to have source code mixed together with asset files.
Does anyone have any thoughts or experience in doing this in a web development context?
Answer 1:
These are some thoughts I've had on this subject. In the end you may need to keep assets and code as separate as possible. I can think of several possible strategies:
Distributed, Two Repositories
Assets in one repo and code in the other.
Advantages
- In a web development context you won't need to clone the giant asset repository if you're not working directly with the graphic files. This is possible if you have a web server that serves assets separately from dynamic content (PHP, ASP.NET, RoR, etc.) and syncs with the asset repo.
Disadvantages
- DVCS tools don't keep track of repositories other than their own, so there isn't any direct BOM (Bill of Materials) support, i.e. there is no clear-cut way to tell when both repositories are in sync. (I guess this is what git-submodule or repo is for.) Example: an artist adds a new picture in one repository and a programmer adds a function that uses the picture; when someone later has to backtrack versions, they are forced to keep track of these paired changes on their own.
- The asset repository overhead remains, even though it only affects those who actually use that repository.
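One way to approximate a BOM across two repositories is to commit a small pin file in the code repo that records which asset-repo revision the code expects. The following is only a minimal sketch of that idea: the file name `assets.pin.json` and the function names are hypothetical, not part of hg or git.

```python
import json
from pathlib import Path

# Hypothetical file name; it would be committed in the code repository.
PIN_FILE = "assets.pin.json"


def write_pin(code_repo: Path, asset_revision: str) -> None:
    """Record which asset-repo revision this code revision expects."""
    payload = {"asset_revision": asset_revision}
    (code_repo / PIN_FILE).write_text(json.dumps(payload, indent=2) + "\n")


def read_pin(code_repo: Path) -> str:
    """Return the asset revision pinned in the code repository."""
    return json.loads((code_repo / PIN_FILE).read_text())["asset_revision"]


def in_sync(code_repo: Path, current_asset_revision: str) -> bool:
    """True if the checked-out asset revision matches the pinned one."""
    return read_pin(code_repo) == current_asset_revision
```

The current asset revision would have to come from the asset checkout itself, for example from the output of `git rev-parse HEAD` run there. Because the pin file is versioned with the code, backtracking the code also tells you which asset revision belonged to it.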
Distributed, One Repository
Assets and code reside in the same repository but they are in two separate directories.
Advantages
- Versioning of code and assets is interwoven, so a BOM is practical. Backtracking is possible without much trouble.
Disadvantages
- Since distributed version control tools keep track of the whole project structure there is usually no way to just check out one directory.
- You still have the problem of repository overhead, even more so because everyone needs to check out the assets as well as the code.
Both strategies listed above still have the disadvantage of a large overhead, since you need to clone the large asset repository. One solution to this problem is a variant of the first strategy above (two repositories): keep the code in the distributed VCS repo and the assets in a centralized VCS repo (such as SVN, Alienbrain, etc.).
Considering how most graphic designers work with binary files, there is usually no need to branch unless it is really necessary (e.g. new features requiring lots of assets that aren't needed until much later). The disadvantage is that you will need to find a way to back up the central repository. Hence a third strategy:
Off-Repository Assets (or Assets in CMS instead)
The code is in the repository as usual and the assets are not. Assets should be put in some kind of content/media/asset management system instead, or at least in a folder that is regularly backed up. This assumes that there is very little need to back-track versions of graphics, or that when back-tracking is needed the graphic changes are negligible.
Advantages
- Does not bloat the code repository (helpful for e.g. git as it frequently does file checking)
- Enables flexible handling of assets such as deployment of assets to servers dedicated for just assets
- If on a CMS with an API, assets should be relatively easy to handle in code
Disadvantages
- No BOM support
- No easy, extensive version back-tracking support; this depends on the backup strategy for your assets
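If the missing BOM is a concern, one possible workaround is to commit a small checksum manifest of the off-repository assets alongside the code, so that each code revision at least records which asset contents it was built against. This is a sketch under that assumption; the function names are made up for illustration and the assets themselves stay outside the repository.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large assets are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(asset_dir: Path) -> dict[str, str]:
    """Map each asset's relative path to its content digest."""
    return {
        p.relative_to(asset_dir).as_posix(): file_digest(p)
        for p in sorted(asset_dir.rglob("*"))
        if p.is_file()
    }


def changed_assets(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Paths that were added, removed, or modified between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

The manifest is tiny compared to the assets, so it does not bloat the code repository. Restoring an old asset version still depends on the backup strategy mentioned above.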
Answer 2:
Thoughts, no experience: I would indeed separate code from data. Assuming that there is a set of images that belongs to the application, I would just keep that on a centralized server. In the code, I would then arrange (through explicit coding) that the application can use either local or remote assets. People contributing can then put new images in their local store at first, integrating them into the central store with some kind of (explicit) upload procedure when required and approved.
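The "local or remote" lookup could be sketched roughly like this. It is only an illustration: `resolve_asset`, the local store directory, and the base URL are hypothetical names, and a real application might download or cache the remote file rather than return a URL.

```python
from pathlib import Path


def resolve_asset(name: str, local_store: Path, remote_base_url: str) -> str:
    """Prefer a contributor's local copy; otherwise point at the central server."""
    candidate = local_store / name
    if candidate.is_file():
        # A new or modified image that has not been uploaded yet.
        return str(candidate)
    # Fall back to the centralized asset server (hypothetical URL layout).
    return f"{remote_base_url.rstrip('/')}/{name}"
```

For example, `resolve_asset("logo.png", Path("local_assets"), "https://assets.example.com")` returns the local path if `local_assets/logo.png` exists, and the URL on the central server otherwise.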
Answer 3:
I've struggled with this myself. As you said, versioning GBs of assets can be a huge pain.
For projects that require external participation I've found Mercurial to be a working solution, but not a great one. It eats up disk space for large files and can be fairly slow depending on the circumstances.
For my in-house design work I prefer to use simple syncing tools (rsync, SyncToy, whatever else) to keep directories up to date between servers/machines and then do version control manually. I find I rarely need version control for anything beyond major revisions.
Answer 4:
One fairly popular option within the game development industry (with huge repositories) is to use Plastic SCM.
They have options to store blobs in the file system instead of the database.
https://www.plasticscm.com
Source: https://stackoverflow.com/questions/1284669/how-do-i-manage-large-art-assets-appropriately-in-dvcs