In a capsule's environment area, you can switch between viewing the base environment (including available package managers), the postInstall script (if there is one), and the underlying Dockerfile, which is the formal recipe for the environment. In general, a Dockerfile will be accompanied by a warning about editing it.
We advise you to use available package managers whenever possible. Doing so creates a transparent, user-friendly overview of installed packages, and will also automatically implement certain best practices in the Dockerfile.¹ When something needs more customized installation, the postInstall script should be your next resource (see this article for a tutorial on writing such scripts in general).
Sometimes, however, you will need a hand-edited Dockerfile. One such example is the published capsule TabbyXL: rule-based spreadsheet data extraction and transformation (version 1.0.4). In this capsule's environment, you'll see that the package managers have been disabled, and that there is a pom.xml file beneath the Dockerfile. The Dockerfile has been edited to copy that file into an accessible directory (COPY pom.xml /tmp/) and to install the dependencies listed therein (RUN cd /tmp && mvn package && rm -rf /tmp/target).²
If you have a use case like this, here's how you would address it.
Building a capsule from a pom.xml or build.sbt file:
First, if you have any packages that can be installed via an available package manager, add them while the package managers are still available. This will pin versions when possible and make your Dockerfile clear and easy to read by default.
Second, move your project manifest file to the environment area (it should appear below the Dockerfile).
Third, click the 'Unlock' button in the Dockerfile.
Fourth, add a line to copy the manifest file to an accessible location (either COPY pom.xml /tmp/ or COPY build.sbt /tmp/).
Fifth, add a RUN command to change into the /tmp directory and install the package's dependencies. For pom.xml, this looks like RUN cd /tmp && mvn package && rm -rf /tmp/target; for build.sbt, it will look like RUN cd /tmp && sbt clean && sbt compile.
Sixth, edit a run script in the /code folder to make the manifest file accessible again. For a pom.xml file:
mv /tmp/pom.xml .
while for build.sbt:
mv /tmp/project .
mv /tmp/build.sbt .
Seventh, continue the package creation process in your run script from the point after all the dependencies have been installed. In TabbyXL: rule-based spreadsheet data extraction and transformation (version 1.0.4), the next step is mvn package -o; for a build.sbt, it will look like sbt "set offline := true" run.
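Steps six and seven can be sketched as a single run script. This is only an illustration under stated assumptions, not the script used in the TabbyXL capsule: the helper names restore_manifests, run_maven_offline, and run_sbt_offline are hypothetical, and the script assumes the Dockerfile has already copied the manifest to /tmp and installed the dependencies as described in steps four and five.

```shell
#!/usr/bin/env bash
# Sketch of a run script; function names are hypothetical, not part of any tool.
set -eu

# Move each named manifest from the build-time staging directory into the
# current directory, failing loudly if one was never copied there.
restore_manifests() {
  staging_dir="$1"
  shift
  for name in "$@"; do
    if [ ! -e "${staging_dir}/${name}" ]; then
      echo "missing ${staging_dir}/${name}; was it copied in the Dockerfile?" >&2
      return 1
    fi
    mv "${staging_dir}/${name}" .
  done
}

# Maven: assumes the Dockerfile already ran
#   COPY pom.xml /tmp/
#   RUN cd /tmp && mvn package && rm -rf /tmp/target
run_maven_offline() {
  restore_manifests /tmp pom.xml
  mvn package -o   # -o: work offline, reusing dependencies fetched at build time
}

# sbt: assumes the Dockerfile already ran
#   COPY build.sbt /tmp/
#   RUN cd /tmp && sbt clean && sbt compile
run_sbt_offline() {
  restore_manifests /tmp project build.sbt
  sbt "set offline := true" run
}

# Nothing is built unless a mode is passed, e.g. `./run maven` or `./run sbt`.
case "${1:-}" in
  maven) run_maven_offline ;;
  sbt)   run_sbt_offline ;;
esac
```

Checking that the manifest exists before moving it is a design choice rather than a requirement: it turns a forgotten COPY line in the Dockerfile into a clear error message instead of a less obvious failure later in the build.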
This is hard/boring, can you lend a hand?
Absolutely! If you have any questions, please write to us via live chat or an email to firstname.lastname@example.org, and we'll be happy to help.
What will a successful build look like?
A successful build will have clean, transparent code; will install all dependencies as part of the build phase rather than the run phase; and will output an executable that can be used to reproduce concrete results.
Clean, transparent code means, for instance, spacing for readability, pinned versions, and commands that clean up and optimize the environment (in the screenshot above, rm -rf /var/lib/apt/lists*).
This is preferable to uploading the pom.xml file to the /code folder and building it from there, because installing those dependencies as part of the build phase rather than the run phase guarantees that all packages are available as part of the environment, and so don't need to be re-downloaded each time (which could fail on account of link rot).
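One way to confirm that dependencies really were installed during the build phase is a small preflight check at the top of the run script. The sketch below is an assumption-laden illustration: has_cached_dependencies is a hypothetical helper, MAVEN_REPO is a hypothetical variable, and the path is Maven's default local repository (~/.m2/repository), which may differ if your environment configures another location or builds as a different user.

```shell
# Hypothetical preflight helper: succeeds only if the directory exists and
# contains at least one file.
has_cached_dependencies() {
  repo_dir="$1"
  [ -d "$repo_dir" ] && [ -n "$(find "$repo_dir" -type f | head -n 1)" ]
}

# Assumption: Maven's default local repository; override MAVEN_REPO if yours differs.
MAVEN_REPO="${MAVEN_REPO:-$HOME/.m2/repository}"

if has_cached_dependencies "$MAVEN_REPO"; then
  echo "dependencies found in $MAVEN_REPO; an offline build should not need the network"
else
  echo "no cached dependencies in $MAVEN_REPO; install them in the Dockerfile" >&2
fi
```

If the check reports no cached dependencies, the run phase would have to download them, which is exactly the situation that build-phase installation is meant to avoid.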