Resolving Application Dependencies with Git Submodules
Last updated December 02, 2024
Most modern applications rely heavily on third-party libraries and must specify these dependencies within the application repository. Tools like RubyGems for Ruby, Maven for Java, and pip for Python are dependency managers that translate a list of stated application dependencies into the code or binaries that the application uses during execution.
Sometimes the dependency manager can’t resolve the required third-party libraries. Examples are private libraries that aren’t publicly accessible or libraries whose maintainers haven’t packaged them for distribution via the dependency manager. In these cases, you can use Git submodules to manually manage external dependencies.
This guide discusses the pros and cons of dependency management with Git submodules and describes vendoring, an alternative to consider if you want to avoid submodules.
Git Submodules
Git submodules are a feature of the Git SCM that includes the contents of one repository within another by recording the referenced repository's location and the specific commit to check out. They are a mechanism for including an external library’s source in an application’s source tree.
For example, to include the FooBar source into the heroku-rails project, use the git submodule add command.
$ cd ~/Code/heroku-rails
$ git submodule add https://github.com/myusername/FooBar lib/FooBar
Cloning into 'lib/FooBar'...
remote: Counting objects: 26, done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 26 (delta 8), reused 19 (delta 5)
Unpacking objects: 100% (26/26), done.
This command creates a submodule called FooBar and places a FooBar directory with the library’s full source tree into the application’s lib directory.
After a Git submodule is added locally, commit the new submodule reference to your application repository.
$ git commit -am "adding a submodule for FooBar"
[main 314ef62] adding a submodule for FooBar
2 files changed, 4 insertions(+)
create mode 160000 lib/FooBar
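To make concrete what the commit above records, here is a minimal sketch that rebuilds the scenario in a throwaway directory and inspects the result. The foobar-src and app repositories are hypothetical local stand-ins for FooBar and heroku-rails, and the protocol.file.allow setting is only an assumption needed because the demo adds the submodule from a local path rather than an https:// URL.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real repository is touched.
work=$(mktemp -d)
cd "$work"

# Hypothetical local stand-in for the FooBar library repository.
git init -q foobar-src
git -C foobar-src -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial FooBar commit"

# Hypothetical application repository.
git init -q app
cd app

# protocol.file.allow=always is needed only because this demo uses a
# local path as the submodule URL; an https:// URL does not need it.
git -c protocol.file.allow=always submodule add "$work/foobar-src" lib/FooBar

# .gitmodules maps the submodule path to the URL it is fetched from.
cat .gitmodules

# The index stores the submodule as a gitlink entry (mode 160000) that
# points at one specific commit of the library.
git ls-files --stage lib/FooBar
```

The two staged entries, .gitmodules and the gitlink, are the "2 files changed" in the commit output: the application repository stores a pointer to a library commit, not the library's files.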
Heroku resolves and fetches submodules as part of deployment.
$ git push heroku main
Counting objects: 13, done.
...
-----> Heroku receiving push
-----> Git submodules detected, installing
Submodule 'lib/FooBar' (https://github.com/myusername/FooBar.git) registered for path 'lib/FooBar'
Initialized empty Git repository in /tmp/build_2qfce3fkvrug9/lib/FooBar/.git/
Submodule path 'lib/FooBar': checked out '667e0b5717631a8cca657a0aa306c045f06cfda4'
-----> Ruby/Rails app detected
...
If Heroku can’t fetch a submodule, the build fails.
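One way to catch an unreachable submodule before pushing is to attempt a fresh recursive clone locally. The sketch below is an approximation, not a reproduction of what Heroku runs during a build, and the check_submodules helper is a hypothetical name introduced here for illustration.

```shell
#!/bin/sh

# check_submodules <repo-path-or-url>
# Clones the repository together with its submodules into a temporary
# directory. Returns 0 if everything could be fetched, 1 otherwise.
check_submodules() {
    repo=${1:?repository path or URL required}
    tmp=$(mktemp -d)

    if git clone --quiet --recurse-submodules "$repo" "$tmp/checkout"; then
        # List each submodule with the commit that was checked out.
        git -C "$tmp/checkout" submodule status --recursive
        status=0
    else
        echo "clone or submodule fetch failed for $repo" >&2
        status=1
    fi

    rm -rf "$tmp"
    return "$status"
}
```

Calling check_submodules with the path or URL of the application repository reports each submodule and its checked-out commit, or exits non-zero if the clone or any submodule fetch fails.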
If possible, use your language’s preferred dependency resolution mechanisms. Submodules can be confusing and error-prone.
Using submodules for builds on Heroku is only supported for builds triggered with Git pushes. Builds created with the API don’t resolve submodules. The same is true for GitHub sync.
Vendoring
While Git submodules are one way to quickly reference external library sources, users often run into issues with their nuanced update lifecycle. If you find submodules counterproductive to work with, you can vendor the code into the project instead.
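The update lifecycle is easier to see in a small experiment. The following sketch, which uses hypothetical local repositories named foobar-src and app and a throwaway commit identity, shows that a new upstream commit doesn't reach the application until the submodule is updated and the new pointer is committed. The protocol.file.allow setting is an assumption that is needed only because the demo uses local paths.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Shorthand that supplies a throwaway identity and permits local-path
# submodule URLs (both are needed only for this local demo).
g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
}

# Hypothetical library repository with one commit.
g init -q foobar-src
g -C foobar-src commit -q --allow-empty -m "FooBar v1"

# Hypothetical application that includes the library as a submodule.
g init -q app
cd app
g submodule add "$work/foobar-src" lib/FooBar
g commit -q -m "adding a submodule for FooBar"

# Upstream publishes a new commit, but the application still records
# the commit it was pinned to when the submodule was added.
g -C "$work/foobar-src" commit -q --allow-empty -m "FooBar v2"
pinned_before=$(git rev-parse HEAD:lib/FooBar)

# Step 1: fetch and check out the newest upstream commit in the submodule.
g submodule update --remote lib/FooBar

# Step 2: commit the moved pointer in the application repository.
g commit -q -am "update FooBar submodule"
pinned_after=$(git rev-parse HEAD:lib/FooBar)

echo "pinned before update: $pinned_before"
echo "pinned after update:  $pinned_after"
```

Skipping either step leaves the application on the old library commit, which is a common source of confusion with submodules.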
Many frameworks allow the use of vendored code. Vendoring simply copies the source of the referenced library into the application’s source tree.
$ git clone <remote repo> /path/to/some/directory
$ cp -R /path/to/some/directory app/vendor/directory
$ rm -rf app/vendor/directory/.git
$ git add app/vendor/directory
A downside of this approach is that it requires a manual download and copy process whenever the external library is updated. However, for an external library that changes slowly, or one whose upstream changes you don’t want to pick up automatically, this approach is an option.
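If the manual refresh becomes tedious, it can be wrapped in a small helper. The sketch below is one possible approach, not a standard tool: vendor_update is a hypothetical function name, and it assumes that it runs from inside the application repository and that a snapshot of the library's default branch is what you want.

```shell
#!/bin/sh

# vendor_update <repo-url> <dest-dir>
# Replaces <dest-dir> with a fresh snapshot of the repository's default
# branch and stages the result. Run it from inside the application repo.
vendor_update() {
    repo_url=${1:?repository URL required}
    dest=${2:?destination directory required}
    tmp=$(mktemp -d)

    # A shallow clone is enough because only the latest snapshot is copied.
    git clone --quiet --depth 1 "$repo_url" "$tmp/src" || return 1

    # Drop Git metadata so the copy is added as plain files rather than
    # as a nested repository.
    rm -rf "$tmp/src/.git"

    # Replace the old copy wholesale so files deleted upstream disappear too.
    rm -rf "$dest"
    mkdir -p "$(dirname "$dest")"
    cp -R "$tmp/src" "$dest"
    rm -rf "$tmp"

    # Stage additions, modifications, and deletions under the vendored path.
    git add -A "$dest"
}
```

For example, vendor_update https://github.com/myusername/FooBar vendor/FooBar stages a fresh copy that you can review with git diff --cached before committing.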