I have a submodule in my Git repository, and my directory structure looks like this:
app
-- folder1
-- folder2
-- submodule @5855
I have deployed
I ran into this issue myself and, thanks to the awesome suggestions by @matt-bucci, I was able to come up with what seems like a robust solution.
My specific use case is slightly different: I am using Lambda Layers to reduce Lambda redundancy, but still need to include the layers as submodules in the Lambda function repositories so that CodeBuild can build and test PRs. I am also using CodePipeline to assist with continuous delivery, so I need a system that works both with CodePipeline and with CodeBuild by itself.
1. I created a new SSH key for use by a "machine user" following these instructions. I am using a machine user so that a new SSH key doesn't need to be generated for every project, and so that multiple private submodules can be supported later.
2. I stored the private key in the AWS Parameter Store as a SecureString. This doesn't actually change anything within CodeBuild, since it decrypts the key on its own.
3. I gave the "codebuild" role the AWS managed policy AmazonSSMReadOnlyAccess, which allows CodeBuild to read the private key.
4. I made my buildspec.yml file, using a bunch of the commands suggested by @matt-bucci, as well as some new ones.
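The one-time setup in the first three steps could be scripted roughly as follows. This is only a sketch under assumptions: the key file path, the key comment, and the role name are hypothetical placeholders, and the parameter name simply mirrors the placeholder used in the buildspec. The `run` helper is also my own addition; it prints each command instead of executing it unless `DRY_RUN=0` is set.

```shell
# Hypothetical placeholders: adjust to your own project.
KEY_FILE="./machine_user_key"        # where the new key pair is written (assumption)
PARAM_NAME="ssh_key_name_goes_here"  # must match the parameter-store entry in buildspec.yml
ROLE_NAME="codebuild"                # name of the CodeBuild service role (assumption)

# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Generate a passphrase-less RSA key pair for the machine user.
run ssh-keygen -t rsa -b 4096 -N "" -C "machine-user" -f "$KEY_FILE"

# 2. Store the private key in the Parameter Store as a SecureString.
run aws ssm put-parameter --name "$PARAM_NAME" --type SecureString --value "file://$KEY_FILE"

# 3. Attach the AWS managed read-only SSM policy to the CodeBuild role.
run aws iam attach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn "arn:aws:iam::aws:policy/AmazonSSMReadOnlyAccess"
```

The public half of the key (`machine_user_key.pub` in this sketch) still has to be added to the machine user's GitHub account by hand.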
# This example buildspec will enable submodules for CodeBuild projects that are both
# triggered directly and via CodePipeline
#
# This buildspec is designed with help from Stack Overflow:
# https://stackoverflow.com/questions/42712542/how-to-auto-deploying-git-repositories-with-submodules-on-aws
version: 0.2 # Always use buildspec version 0.2
env:
  variables:
    # The remote origin that will be used if building through CodePipeline
    remote_origin: "git@github.com:your/gitUri"
  parameter-store:
    # The SSH RSA Key used by our machine user
    ssh_key: "ssh_key_name_goes_here"
phases:
  install:
    commands:
      # Add the "machine user's" ssh key and activate it - this allows us to get private (sub) repositories
      - mkdir -p ~/.ssh # Ensure the .ssh directory exists
      - echo "$ssh_key" > ~/.ssh/ssh_key # Save the machine user's private key
      - chmod 600 ~/.ssh/ssh_key # Adjust the private key permissions (avoids a critical error)
      - eval "$(ssh-agent -s)" # Initialize the ssh agent
      - ssh-add ~/.ssh/ssh_key # Add the machine user's key to the ssh "keychain"
      # SSH Credentials have been set up. Check for a .git directory to determine if we need to set up our git package
      - |
        if [ ! -d ".git" ]; then
          git init # Initialize Git
          git remote add origin "$remote_origin" # Add the remote origin so we can fetch
          git fetch # Get all the things
          git checkout -f "$CODEBUILD_RESOLVED_SOURCE_VERSION" # Checkout the specific commit we are building
        fi
      # Now that setup is complete, get submodules
      # (--init also registers nested submodules before updating them)
      - git submodule update --init --recursive
      # Additional install steps... (npm install, etc)
  build:
    commands:
      # Build commands...
artifacts:
  files:
    # Artifact Definitions...
This install script performs three discrete steps:
1. It installs and enables the SSH private key used to access private repositories.
2. It determines whether there is a .git folder. If there isn't, the script initializes Git and checks out the exact commit that is being built. Note: According to the AWS docs, the $CODEBUILD_RESOLVED_SOURCE_VERSION environment variable is not guaranteed to be present in CodePipeline builds. However, I have not seen this fail.
3. Finally, it actually gets the submodules.
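Since $CODEBUILD_RESOLVED_SOURCE_VERSION may be missing, one possible guard is to fall back to a branch ref when it is empty. The sketch below is an assumption on my part, not something from the AWS docs: `pick_ref` and `FALLBACK_REF` are hypothetical names, and `origin/master` only makes sense if that branch exists on your remote.

```shell
# Print the ref to check out: the resolved commit if set, else the fallback.
# Both arguments are plain strings; nothing here talks to Git.
pick_ref() {
  resolved="$1"
  fallback="$2"
  if [ -n "$resolved" ]; then
    printf '%s\n' "$resolved"
  else
    printf '%s\n' "$fallback"
  fi
}

# FALLBACK_REF is a hypothetical variable you would define yourself.
ref="$(pick_ref "${CODEBUILD_RESOLVED_SOURCE_VERSION:-}" "${FALLBACK_REF:-origin/master}")"
echo "Would check out: $ref"

# In the buildspec, the checkout line would then become:
#   git checkout -f "$ref"
```

Falling back to a branch means the build may not match the exact commit that triggered the pipeline, so it is worth logging which ref was used.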
Obviously, this is not a great solution to this problem. However, it's the best I can come up with given the (unnecessary) limitations of CodePipeline. A side effect of this process is that the "Source" CodePipeline stage is completely worthless, since we just overwrite the archived source files; it's only used to listen for changes to the repository.
Better functionality has been requested for over 2 years now: https://forums.aws.amazon.com/thread.jspa?threadID=248267
I realized (the hard way) that my previous response didn't support CodePipeline builds, only builds run through CodeBuild directly. When CodeBuild responds to a GitHub webhook, it will clone the entire GitHub repository, including the .git folder.
However, when using CodePipeline, the "Source" action will clone the repository, check out the appropriate branch, then package the raw files as an artifact without the .git folder. This means that we do have to initialize the Git repository ourselves to get access to submodules.
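To see which of the two cases a build is in, a small check like the one below can be dropped into the install phase for debugging. It is only a sketch: `source_kind` is a hypothetical helper name, and I am assuming the CODEBUILD_INITIATOR variable is available in your build environment (it falls back to "unknown" here when it is not).

```shell
# Report how the source arrived in the given directory:
# "clone" when Git metadata is present, "archive" when it is not.
source_kind() {
  if [ -d "$1/.git" ]; then
    echo "clone"
  else
    echo "archive"
  fi
}

# Log the result for the current directory along with who started the build.
echo "source: $(source_kind .) initiator: ${CODEBUILD_INITIATOR:-unknown}"
```

When this prints "archive", the buildspec's `git init` branch is the one that will run.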