Up until now, all dependency ordering in dev-pipeline has been managed by a function that returns a valid sequence as a Python list. The simplified code is basically this:
build_order = resolve(self.targets, self.components)
for target in build_order:
    for task in self._tasks:
        task(target)
This has been fine for initial versions, but it’s not very flexible.
- There’s no way to perform multiple tasks at the same time. While we may not want to build two components at the same time (I use ninja, so it already uses every core), there’s no reason I can’t perform source control tasks (network based) at the same time.
- There’s no good way to gracefully handle failures. If something fails in the middle of the tasks, the only safe option is to abort and skip all remaining work. I generally like to fail early, but there are times when make ‑k is useful.
- There are false dependencies with this pattern. If a component uses a support library (e.g., gtest or fmt), this forces its checkout to wait until the support libraries are checked out. The actual dependency is on the support libraries being built (which in turn depends on a complete checkout), but that isn’t expressed here.
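To make the false dependency concrete, here is a rough sketch of component-level ordering. It uses the standard library's graphlib.TopologicalSorter (Python 3.9+) as a stand-in for resolve; the component names, the task names, and the resolve body are all hypothetical and are not dev-pipeline's actual code.

```python
from graphlib import TopologicalSorter

# Hypothetical component graph: each key depends on the components in its set.
components = {
    "gtest": set(),
    "app": {"gtest"},
}
# Hypothetical per-component tasks, run in this order.
tasks = ["checkout", "build"]


def resolve(graph):
    """Stand-in for resolve(): return one valid component order as a list."""
    return list(TopologicalSorter(graph).static_order())


# Every task for a component runs before the next component starts.
schedule = [
    (component, task)
    for component in resolve(components)
    for task in tasks
]
print(schedule)
```

In the resulting schedule, the checkout of app sits behind the build of gtest, even though the checkout itself needs nothing from gtest.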
Starting in version 0.5.0, work is ordered based on tasks. The simplified version of that code looks like this:
for component_tasks in task_queue:
    for component_task in component_tasks:
        try:
            component_task[1](component_task[0])
            task_queue.resolve(component_task)
        except Exception as failure:
            task_queue.fail(component_task)
task_queue is an object that spits out lists of resolved tasks (component_tasks); this list contains every task whose dependencies are met. We can iterate this list like normal, marking tasks as successful or failed as they complete.
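A toy queue with this shape might look like the sketch below. It is not the real dev-pipeline implementation: the TaskQueue class, its constructor, and the checkout/build helpers are assumptions made for illustration, and the sketch is single-threaded. Tasks are (component, function) tuples so that the loop above works unchanged.

```python
class TaskQueue:
    """Toy queue: hands out batches of tasks whose dependencies are resolved."""

    def __init__(self, dependencies):
        # dependencies maps each task to the set of tasks it must wait for.
        self._waiting = {task: set(deps) for task, deps in dependencies.items()}

    def __iter__(self):
        return self

    def __next__(self):
        ready = [task for task, deps in self._waiting.items() if not deps]
        if not ready:
            # In this single-threaded sketch, nothing ready means everything
            # is finished, was removed, or is stuck in a cycle.
            raise StopIteration
        for task in ready:
            del self._waiting[task]
        return ready

    def resolve(self, task):
        """Mark a task as done so its dependents can become ready."""
        for deps in self._waiting.values():
            deps.discard(task)

    def fail(self, task):
        """Recursively drop tasks that depend on task; return the dropped tasks."""
        removed = []
        dependents = [t for t, deps in self._waiting.items() if task in deps]
        for dependent in dependents:
            if dependent in self._waiting:  # may already be gone via recursion
                del self._waiting[dependent]
                removed.append(dependent)
                removed.extend(self.fail(dependent))
        return removed


log = []


def checkout(component):
    log.append(f"checkout {component}")


def build(component):
    if component == "fmt":
        raise RuntimeError("simulated build failure")
    log.append(f"build {component}")


fmt_checkout, fmt_build = ("fmt", checkout), ("fmt", build)
app_checkout, app_build = ("app", checkout), ("app", build)

task_queue = TaskQueue({
    fmt_checkout: set(),
    fmt_build: {fmt_checkout},
    app_checkout: set(),
    app_build: {app_checkout, fmt_build},  # only the build waits on fmt
})

skipped = []
for component_tasks in task_queue:
    for component_task in component_tasks:
        try:
            component_task[1](component_task[0])
            task_queue.resolve(component_task)
        except Exception:
            skipped.extend(task_queue.fail(component_task))

print(log)      # tasks that ran to completion
print(skipped)  # tasks dropped because something they needed failed
```

Here both checkouts land in the first batch, the simulated fmt build failure happens in the second, and the app build is dropped instead of being attempted.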
Let’s see how it compares to the previous design.
- Because component_tasks will only contain fully resolved work, worker threads can resolve or fail tasks as they complete while the main thread waits for more work. There are still considerations like how many threads to create (right now I’m leaning toward one thread per type of task), but there’s at least some plumbing to work off of.
- Handling failures is now trivial, and the above code does it. The actual code here is a bit more complex since failures can be handled in various ways (by default, everything aborts), but again, there’s plumbing to build off of. When task_queue.fail is called, it recursively removes any dependent tasks from consideration (removed tasks are returned as a list).
- False dependencies no longer exist. As of 0.5.0, build.config files support task-specific dependency declarations (see below), permitting much more precise and complex dependency ordering.
So is this just gold plating the dependency manager, or does it make a difference in practice? For some components, this feature is actually required. Consider building something like Google Breakpad, which includes several third-party components that are checked out in the build tree. Using the previous dependency management, we’d need to do something like this:
# This snippet is completely untested
[DEFAULT]
scm.tool = git
build.tool = nothing
[breakpad]
git.uri = https://chromium.googlesource.com/breakpad/breakpad
git.revision = 282996a9
[breakpad-lss]
git.uri = https://chromium.googlesource.com/linux-syscall-support
git.revision = 92920301
scm.src_path = ${breakpad:dp.src_dir}/src/third-party/lss
depends = breakpad
[breakpad-gyp]
git.uri = https://chromium.googlesource.com/external/gyp
git.revision = 1f374df9
scm.src_path = ${breakpad:dp.src_dir}/src/third-party/gyp
depends = breakpad
[breakpad-dummy]
scm.tool = nothing
depends =
    breakpad-lss,
    breakpad-gyp
# build configuration stuff
Because we need the lss and gyp checkouts to complete before we can build, the actual breakpad component can’t build itself; we need to have a dummy component to actually do the work. This is, to say the least, non-intuitive and easy to mess up.
With the new task-based system though, dependencies can be set up properly. The breakpad component is free to have its build task depend on the lss and gyp components, and they can have their scm tasks depend on breakpad. That configuration looks like this:
[DEFAULT]
scm.tool = git
# fake the build for now
build.tool = nothing
[breakpad]
git.uri = https://chromium.googlesource.com/breakpad/breakpad
git.revision = 282996a9
depends.build =
    breakpad-lss,
    breakpad-gyp
# build configuration stuff
[breakpad-lss]
git.uri = https://chromium.googlesource.com/linux-syscall-support
git.revision = 92920301
scm.src_path = ${breakpad:dp.src_dir}/src/third-party/lss
depends.checkout =
    breakpad
[breakpad-gyp]
git.uri = https://chromium.googlesource.com/external/gyp
git.revision = 1f374df9
scm.src_path = ${breakpad:dp.src_dir}/src/third-party/gyp
depends.checkout =
    breakpad
The dummy component has been removed, meaning the configuration is more in line with what’s actually happening. In fact, the above configuration works fine (ignoring the fact that there’s no plugin to build with configure/autotools).
$ dev-pipeline bootstrap --executor=dry-run
breakpad (checkout)
-----------------------
Checking out breakpad
Executing: ['git', 'clone', 'https://chromium.googlesource.com/breakpad/breakpad', '/tmp/dpl-test2/breakpad']
Updating breakpad
Executing: ['git', 'checkout', '282996a9']
breakpad-lss (checkout)
---------------------------
Checking out breakpad-lss
Executing: ['git', 'clone', 'https://chromium.googlesource.com/linux-syscall-support', '/tmp/dpl-test2/breakpad/src/third-party/lss']
Updating breakpad-lss
Executing: ['git', 'checkout', '92920301']
breakpad-gyp (checkout)
---------------------------
Checking out breakpad-gyp
Executing: ['git', 'clone', 'https://chromium.googlesource.com/external/gyp', '/tmp/dpl-test2/breakpad/src/third-party/gyp']
Updating breakpad-gyp
Executing: ['git', 'checkout', '1f374df9']
breakpad-lss (build)
------------------------
breakpad-gyp (build)
------------------------
breakpad (build)
--------------------
$
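As a rough illustration of how task-specific declarations could turn into an ordering like the one above, the sketch below reads a trimmed copy of the configuration with the standard library's configparser and orders the tasks with graphlib (Python 3.9+). Two things here are assumptions rather than documented dev-pipeline behavior: each component has exactly two tasks (checkout, then build), and depends.<task> = X makes that task wait for the same-named task of X. That reading is consistent with the dry-run output, but it is only a guess at the real rules.

```python
import configparser
from graphlib import TopologicalSorter

# Trimmed copy of the configuration; only the dependency keys are kept.
CONFIG = """
[breakpad]
depends.build =
    breakpad-lss,
    breakpad-gyp

[breakpad-lss]
depends.checkout =
    breakpad

[breakpad-gyp]
depends.checkout =
    breakpad
"""

TASKS = ["checkout", "build"]  # assumed per-component task order

parser = configparser.ConfigParser()
parser.read_string(CONFIG)

graph = {}
for component in parser.sections():
    previous = None
    for task in TASKS:
        deps = set()
        if previous is not None:
            # A component's own tasks run in sequence.
            deps.add((component, previous))
        raw = parser.get(component, f"depends.{task}", fallback="")
        for name in raw.split(","):
            if name.strip():
                # Assumption: wait for the same-named task of the dependency.
                deps.add((name.strip(), task))
        graph[(component, task)] = deps
        previous = task

order = list(TopologicalSorter(graph).static_order())
for component, task in order:
    print(f"{component} ({task})")
```

Any order this produces keeps the breakpad checkout ahead of the lss and gyp checkouts and keeps the breakpad build last; the relative order of lss and gyp is not pinned down by the graph.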
There’s still work to do before the new changes are completely ready for prime time, but the changes have already been merged into the master branches of the impacted modules.