Many sbt tasks depend on a collection of files. For example, the package task
generates a jar file containing the resources and class files, which are
generated by the compile task, for a project. Starting with version 1.3.0, sbt
provides a file management system that tracks the inputs and outputs of any
task. The task can query which of its file dependencies have changed since the
task last completed, allowing it to incrementally rebuild only the modified
files. This system integrates with Triggered execution so that the file
dependencies of a task are automatically monitored in a continuous build.
To illustrate the file tracking system, we construct a build.sbt that exercises
all of its essential features. The example is a project that builds a shared
library in C using gcc. This is done with two tasks: buildObjects, which
compiles C source files to object files, and linkLibrary, which links the
object files into a shared library. These can be defined with:
import java.nio.file.Path
val buildObjects = taskKey[Seq[Path]]("Compiles c files into object files.")
val linkLibrary = taskKey[Path]("Links objects into a shared library.")
The buildObjects task will depend on *.c source file inputs. The linkLibrary
task depends on the output *.o object files generated by buildObjects. This
creates a build pipeline: if none of the input sources to buildObjects are
modified between calls to linkLibrary, then neither compilation nor linking
should occur. Conversely, when input source changes are detected, sbt should
both generate new object files corresponding to the modified source files and
link the shared library.
It is natural for a task to specify the inputs on which it depends. These are
set with the fileInputs key, which has type Seq[Glob] (see Globs). The
fileInputs are specified as Seq[Glob] so that more than one search query may be
provided, which may be necessary if sources are located in multiple
directories or different file types are needed within the same task.
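As a minimal sketch of a multi-glob setting, the fragment below registers two source directories for buildObjects. The vendor/src directory is a hypothetical layout chosen for illustration; it is not part of the example project.

```scala
import sbt.nio.Keys._

// Hypothetical layout: C sources live in both src/ and vendor/src/.
// Each Glob is an independent search query; the task sees the files matched by any of them.
buildObjects / fileInputs ++= Seq(
  baseDirectory.value.toGlob / "src" / "*.c",
  baseDirectory.value.toGlob / "vendor" / "src" / "*.c" // assumed directory
)
```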
When the fileInputs key is set in a given scope, sbt automatically generates a
task named allInputFiles for that scope that returns a Seq[Path] containing all
of the files matching the fileInputs queries. For convenience, there is an
extension method defined for Task[_] that translates foo.inputFiles to
(foo / allInputFiles).value. We can use these to write a simple implementation
of buildObjects:
import scala.sys.process._
import java.nio.file.{ Files, Path }
import sbt.nio._
import sbt.nio.Keys._
val buildObjects = taskKey[Seq[Path]]("Compiles c files into object files.")
buildObjects / fileInputs += baseDirectory.value.toGlob / "src" / "*.c"
buildObjects := {
  val outputDir = Files.createDirectories(streams.value.cacheDirectory.toPath)
  // Map foo.c to <outputDir>/foo.o; the dot is escaped so it matches literally.
  def outputPath(path: Path): Path =
    outputDir / path.getFileName.toString.replaceAll("\\.c$", ".o")
  val logger = streams.value.log
  buildObjects.inputFiles.map { path =>
    val output = outputPath(path)
    logger.info(s"Compiling $path to $output")
    Seq("gcc", "-c", path.toString, "-o", output.toString).!!
    output
  }
}
This implementation gathers all of the files with the .c extension and shells
out to gcc to compile them to the output directory. sbt automatically monitors
any file matched by the globs specified by fileInputs. In this case, modifying
any file with the .c extension in the src directory will trigger a build in a
continuous build.
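To check which files a glob actually matches, it can help to read allInputFiles directly. The sketch below assumes the buildObjects definition above; showSources is a hypothetical helper task introduced only for illustration.

```scala
import sbt.nio.Keys._

// Hypothetical helper task, not part of the original build.
val showSources = taskKey[Unit]("Logs the files matched by buildObjects / fileInputs.")
showSources := {
  val logger = streams.value.log
  // Same files that buildObjects.inputFiles returns inside buildObjects itself.
  (buildObjects / allInputFiles).value.foreach(path => logger.info(path.toString))
}
```

The task returns Unit rather than Seq[Path] on purpose, so that sbt does not treat the source files as outputs of showSources.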
Every time that buildObjects is invoked from the sbt shell, it will recompile
all of the source files. This becomes expensive as the number of source files
increases. In addition to fileInputs, sbt also provides another API,
inputFileChanges, that provides information about which source files have
changed since the last time the task successfully completed. Using
inputFileChanges, we can make the build above incremental:
import scala.sys.process._
import java.nio.file.{ Files, Path }
import sbt.nio._
import sbt.nio.Keys._
val buildObjects = taskKey[Seq[Path]]("Generate object files from c sources")
buildObjects / fileInputs += baseDirectory.value.toGlob / "src" / "*.c"
buildObjects := {
  val outputDir = Files.createDirectories(streams.value.cacheDirectory.toPath)
  val logger = streams.value.log
  def outputPath(path: Path): Path =
    outputDir / path.getFileName.toString.replaceAll("\\.c$", ".o")
  def compile(path: Path): Path = {
    val output = outputPath(path)
    logger.info(s"Compiling $path to $output")
    Seq("gcc", "-fPIC", "-std=gnu99", "-c", s"$path", "-o", s"$output").!!
    output
  }
  // Map each expected object file to the source file that produces it.
  val sourceMap = buildObjects.inputFiles.view.map(p => outputPath(p) -> p).toMap
  // Delete object files whose source no longer exists; keep the rest.
  val existingTargets = fileTreeView.value.list(outputDir.toGlob / **).flatMap { case (p, _) =>
    if (!sourceMap.contains(p)) {
      Files.deleteIfExists(p)
      None
    } else {
      Some(p)
    }
  }.toSet
  val changes = buildObjects.inputFileChanges
  val updatedPaths = (changes.created ++ changes.modified).toSet
  // Recompile changed sources plus any source whose object file is missing.
  val needCompile = updatedPaths ++ sourceMap.filterKeys(!existingTargets(_)).values
  needCompile.foreach(compile)
  sourceMap.keys.toVector
}
The FileChangeReport makes it possible to write an incremental task without
manually tracking the input files. It is a sealed trait implemented by three
case classes:
1. Changes: indicates that one or more source files have been modified.
2. Unmodified: none of the source files have been modified since the last run.
3. Fresh: there is no cache entry for the previous source file hashes.
It is sometimes convenient to pattern match on the result of inputFileChanges:
foo.inputFileChanges match {
  case FileChanges(created, deleted, modified, unmodified)
      if created.nonEmpty || modified.nonEmpty =>
    build(created ++ modified)
    delete(deleted)
  case _ => // no changes
}
The input file report says nothing about the outputs. This is why the
buildObjects implementation needs to check the target directory to see which
outputs exist. In that example, there is a 1:1 mapping between inputs and
outputs, but this need not be the case in general. An implementation of
buildObjects may include header files in the fileInputs. These are not compiled
themselves, but they may trigger recompilation of one or more *.c source files.
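One way to handle header files can be sketched as follows. This is an illustration under stated assumptions, not the approach prescribed by sbt: the include directory, the staleSources task, and the rule "any header change invalidates every source" are all hypothetical. The task only logs its decision, so that it stays independent of the gcc invocation above.

```scala
import java.nio.file.Path
import sbt.nio.Keys._

// Hypothetical task: reports which C sources would need to be recompiled.
val staleSources = taskKey[Unit]("Logs the C sources that need recompilation.")
staleSources / fileInputs ++= Seq(
  baseDirectory.value.toGlob / "src" / "*.c",
  baseDirectory.value.toGlob / "include" / "*.h" // assumed header location
)
staleSources := {
  val logger = streams.value.log
  def hasExt(path: Path, ext: String): Boolean =
    path.getFileName.toString.endsWith(ext)
  val changes = staleSources.inputFileChanges
  val touched = changes.created ++ changes.modified ++ changes.deleted
  val allSources = staleSources.inputFiles.filter(hasExt(_, ".c"))
  // Conservative rule (an assumption): a changed header invalidates every source,
  // because the build does not know which sources include which headers.
  val stale =
    if (touched.exists(hasExt(_, ".h"))) allSources
    else (changes.created ++ changes.modified).filter(hasExt(_, ".c"))
  stale.foreach(path => logger.info(s"Needs recompilation: $path"))
}
```

A finer-grained version would need a dependency map from sources to headers, for example one derived from the dependency files that gcc can emit, which is outside the scope of this sketch.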
Note that calling buildObjects.inputFileChanges also causes
buildObjects / fileInputs to automatically be watched in a continuous build.
The outputs of a task are often best specified as the result of the task. In
the example above, buildObjects is a Task returning a Seq[Path] containing the
object files generated by compilation. sbt will automatically track the outputs
of any task that returns one of the following result types: Path, Seq[Path],
File or Seq[File]. We can use this to build on the buildObjects example to
write a task that links the objects into a shared library:
val linkLibrary = taskKey[Path]("Links objects into a shared library.")
linkLibrary := {
  val outputDir = Files.createDirectories(streams.value.cacheDirectory.toPath)
  val logger = streams.value.log
  val isMac = scala.util.Properties.isMac
  val library = outputDir / s"mylib.${if (isMac) "dylib" else "so"}"
  val linkOpts = if (isMac) Seq("-dynamiclib") else Seq("-shared", "-fPIC")
  if (buildObjects.outputFileChanges.hasChanges || !Files.exists(library)) {
    logger.info(s"Linking $library")
    (Seq("gcc") ++ linkOpts ++ Seq("-o", s"$library") ++
      buildObjects.outputFiles.map(_.toString)).!!
  } else {
    logger.debug(s"Skipping linking of $library")
  }
  library
}
Here the tracking was simpler because linking a shared library is not
incremental. Thus we have to rebuild if any of the outputs of buildObjects has
changed or if the library doesn’t exist.
Similar to fileInputs, there is a fileOutputs key. This can be used as an
alternative to returning the output files in the task when the outputs have a
known pattern. For example, buildObjects could have been defined as:
val buildObjects = taskKey[Unit]("Compiles c files into object files.")
buildObjects / fileOutputs += target.value.toGlob / "objects" / ** / "*.o"
This can be useful when using an opaque external tool where the mapping of inputs to outputs is not known.
Like allInputFiles, there is an allOutputFiles task of return type Seq[Path]
that is automatically generated for a task, foo, if the return type of foo is
one of Seq[Path], Path, Seq[File] or File. It is also generated if
foo / fileOutputs is specified. When fileOutputs is specified and the return
type also represents a file or collection of files, the result of
allOutputFiles is the distinct union of the files returned by the task and the
files described by fileOutputs. Calling foo.outputFiles is syntactic sugar for
(foo / allOutputFiles).value.
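As a small illustration of consuming another task's outputs, the sketch below logs the size of each object file. It assumes the buildObjects definition above; objectSizes is a hypothetical helper task.

```scala
import java.nio.file.Files
import sbt.nio.Keys._

// Hypothetical helper task, not part of the original build.
val objectSizes = taskKey[Unit]("Logs the size of each object file.")
objectSizes := {
  val logger = streams.value.log
  // buildObjects.outputFiles is shorthand for (buildObjects / allOutputFiles).value.
  // The existence check is defensive, in case a file was removed outside of sbt.
  buildObjects.outputFiles.filter(Files.exists(_)).foreach { path =>
    logger.info(s"$path: ${Files.size(path)} bytes")
  }
}
```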
The fileInputs and fileOutputs can be filtered beyond what is specified by
their Glob patterns. sbt provides four settings of type
sbt.nio.file.PathFilter:
1. fileInputIncludeFilter: only include file inputs that also match this filter
2. fileInputExcludeFilter: exclude any file inputs that also match this filter
3. fileOutputIncludeFilter: only include file outputs that also match this
filter
4. fileOutputExcludeFilter: exclude any file outputs that also match this filter
By default, sbt sets
fileInputExcludeFilter := HiddenFileFilter.toNio || DirectoryFilter
Both fileInputIncludeFilter and fileOutputIncludeFilter are set to
AllPassFilter.toNio. The fileOutputExcludeFilter is set to
NothingFilter.toNio.
To exclude files with test in the name from buildObjects, write:
buildObjects / fileInputExcludeFilter := "*test*"
To preserve the previous excludes of hidden files and directories, write:
buildObjects / fileInputExcludeFilter :=
  (buildObjects / fileInputExcludeFilter).value || "*test*"
or
buildObjects / fileInputExcludeFilter ~= { ef => ef || "*test*" }
In most cases, it shouldn’t be necessary to set the fileInputIncludeFilter
since path name filtering should be handled by fileInputs itself. It also
shouldn’t commonly be necessary to filter the outputs.
sbt automatically generates an implementation of clean scoped to the task foo
whenever it also generates the allOutputFiles task. Calling foo / clean will
remove all of the files previously generated by foo. It will not re-evaluate
foo. For example, calling buildObjects / clean will remove all of the object
files generated by the previous call to buildObjects. The generated clean tasks
are not transitive. Calling linkLibrary / clean will delete the shared library
but will not delete the object files generated by buildObjects.
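Because the generated clean tasks are not transitive, a build that wants to remove both the library and the object files in one step has to combine them explicitly. A minimal sketch, where cleanNative is a hypothetical task name:

```scala
// Hypothetical aggregate task, not generated by sbt.
val cleanNative = taskKey[Unit]("Removes the shared library and the object files.")
cleanNative := {
  // Each scoped clean only deletes the outputs of its own task.
  (linkLibrary / clean).value
  (buildObjects / clean).value
}
```

The two scoped clean tasks are independent of each other, so the sketch does not rely on the order in which they run.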
For each input or output file tracked by sbt, there is an associated FileStamp.
This can either be the last modified time of the file or a hash. By default,
inputs are tracked using the hash and outputs are tracked using the last
modified time. To change this, set the inputFileStamper or outputFileStamper:
val generateSources = taskKey[Seq[Path]]("Generates source files from json schema.")
generateSources / fileInputs += baseDirectory.value.toGlob / "schema" / ** / "*.json"
generateSources / outputFileStamper := FileStamper.Hash
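The opposite change is also possible. As a sketch, the fragment below tracks the C sources of buildObjects by last modified time instead of by hash. Whether this trade-off suits a given project is an assumption to verify: a last modified stamp avoids reading file contents, but it also reports a file as changed when it is rewritten with identical contents.

```scala
import sbt.nio._
import sbt.nio.Keys._

// Track inputs of buildObjects by last modified time rather than by content hash.
buildObjects / inputFileStamper := FileStamper.LastModified
```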
In a continuous build, ~bar, of an arbitrary task bar, any call to
foo.inputFiles or foo.inputFileChanges within bar, for some task foo, causes
all of the globs specified by foo / fileInputs to be monitored. Transitive file
input dependencies are automatically monitored. For example, the ~linkLibrary
continuous build command will monitor the *.c source files defined for
buildObjects.
Input files will only trigger a re-build if their hash has changed. This behavior can be overridden with:
Global / watchForceTriggerOnAnyChange := true
Changes to file outputs, which are gathered with either foo.outputFiles or
foo.outputFileChanges, do not trigger a re-build.
The stamps for each file are tracked on a per-task basis. They are only updated
if the incremental task itself succeeds. In the example above, this means that
the current last modified times of the buildObjects outputs are stored by the
linkLibrary task only when it succeeds. As a result, buildObjects can be run
many times between calls to linkLibrary, and linkLibrary will see the
cumulative changes to the outputs of buildObjects. If linkLibrary fails to
complete, sbt will also skip updating the last modified times that linkLibrary
records for the outputs of buildObjects, because it is impossible to know in
general which files were successfully processed.