When should I use dependsOn?

In short, Gradle works by computing a graph of task dependencies. Suppose you want to build a JAR file: you call the jar task, and Gradle determines that building the JAR requires compiling classes, processing resources, and so on. Determining task dependencies, that is, which other tasks need to be executed, is done by looking at three different things:

  • The task's dependsOn dependencies. For example, assemble.dependsOn(jar) says that if you run assemble, the jar task must be executed first.
  • The task's transitive dependencies, in which case we are not talking about tasks but about "publications". For example, to compile project A you need project B on the classpath, which means running some of B's tasks.
  • Last but not least, the task's inputs, that is, what the task needs in order to do its job.
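
To make the three sources concrete, here is a minimal sketch of a build.gradle for a project A. It assumes the java plugin and a sibling project named :b declared in settings.gradle; the bundle task and its file names are hypothetical and only serve as illustration.

```groovy
plugins {
    id 'java'
}

// 1. Explicit task dependency: running assemble forces jar to run first.
tasks.named('assemble') {
    dependsOn tasks.named('jar')
}

// 2. Dependency on another project's publication: compiling this project
//    needs :b on the classpath, so Gradle schedules the tasks of :b that
//    produce it. (Assumes settings.gradle contains: include 'b')
dependencies {
    implementation project(':b')
}

// 3. Task inputs: the hypothetical bundle task consumes the output of jar,
//    so Gradle infers that jar must run before bundle.
tasks.register('bundle', Zip) {
    from tasks.named('jar')
    archiveFileName = 'bundle.zip'
    destinationDirectory = layout.buildDirectory.dir('dist')
}
```

Only the first form names a task explicitly; in the other two, the task dependency is derived from what the task consumes.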

Look at the following code:
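
A sketch of the pattern under discussion, reconstructed from the description that follows: a Jar task that copies its own archive into classes/groovy/main in a doLast block, plus an explicit dependsOn. The groovy plugin, the archive name and the source folder are assumptions made for illustration.

```groovy
// Anti-pattern: shown only to illustrate the problem, do not copy it.
plugins {
    id 'groovy' // assumption: a Groovy project, hence classes/groovy/main
}

task docFilesJar(type: Jar) {
    description = 'Package up files used for generating documentation.'
    archiveFileName = 'doc-files.jar' // hypothetical archive name
    from 'src/main/template'          // hypothetical source folder
    doLast {
        // Writes into the output directory of the compilation task,
        // which Gradle knows nothing about.
        copy {
            from docFilesJar.archiveFile
            into layout.buildDirectory.dir('classes/groovy/main')
        }
    }
}

// Explicit ordering so that the copied file exists when jar runs.
jar.dependsOn docFilesJar
```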

It’s tempting to think in the same way as with other build tools, such as Maven or Ant, especially if you’re not used to Gradle. You’re thinking “There’s a task, jar, that basically packages everything it finds in classes/groovy/main, so if I want to add more to the JAR, let’s put more into classes/groovy/main”.

This is wrong for different reasons, the most obvious being:

When the docFilesJar task executes, it contributes more files to the “classes” directory. But wait, what we put there aren’t classes, are they? It’s just a JAR, a resource. Shouldn’t we use resources/groovy/main instead? Or classes/groovy/resources? Or what? Well, you shouldn’t have to care, because where the Java compilation task puts its output is none of your business!

It breaks cacheability: Gradle has a build cache, and multiple tasks contributing to the same output directory is a classic way to break caching. In fact, it breaks up-to-date checks altogether, that is, Gradle’s ability to understand that it doesn’t need to execute a task when nothing has changed.

It is opaque to Gradle: the code above performs a copy in a doLast block. Nothing tells Gradle that the “classes” directory has an additional output.

Imagine another task that requires only the classes. Depending on when it executes, its input may or may not include the docFilesJar that it doesn’t care about. This makes the build non-reproducible (note that this is exactly why Maven builds can’t be trusted and you need to run clean: any “target” can write to any directory at any time, so it’s impossible to infer who contributed what).

You need to declare an explicit dependency between the jar task and the docFilesJar task to make sure that our “docs JAR” file exists whenever jar executes. That explicit dependsOn has drawbacks of its own.

dependsOn doesn’t say why the dependency exists: is it because you want to order things, or because you rely on an artifact that the task produces? Or something else?

It’s easy to forget: because you probably run builds a lot, you may think your build works simply because jar happens to be part of the task graph and, by chance, docFilesJar executes before it.

It creates unexpected extra work: in most cases, dependsOn triggers too much work. Gradle is a smart build tool that computes exactly what needs to be done for each particular task. By using dependsOn, you are using a hammer and forcing something unnecessary into the task graph. In short: you’re doing too much work.

It’s hard to get rid of: because a dependsOn doesn’t say why it is needed, it’s often hard to remove the dependency later when optimizing a build.

Use implicit dependencies instead!

The answer to our question is actually easier to reason about: reverse the logic. Instead of thinking “where should I put this stuff so it can be picked up by the JAR”, think “Let’s tell the jar task that it needs to pick up my resources too”.

All in all, it’s about correctly declaring your task input.

Instead of tinkering with the output of another task (seriously, forget this!), think of each task as a function that takes inputs and produces an output: it is isolated. So, what is the input of our docFilesJar? The resources we want to package. What is its output? The JAR itself. There is no question about where we should put the JAR: we let Gradle choose a reasonable place for us.

So what are the inputs of the jar task itself? Well, the regular inputs plus our JAR. It’s easier to reason about and, as a bonus, it’s even shorter!

So, let’s rewrite the above code as:
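
A sketch of what that rewrite could look like, assuming the groovy plugin and the same illustrative names as in the problematic version (the archive name and source folder are hypothetical):

```groovy
plugins {
    id 'groovy' // assumption: a Groovy project
}

task docFilesJar(type: Jar) {
    description = 'Package up files used for generating documentation.'
    archiveFileName = 'doc-files.jar' // hypothetical archive name
    from 'src/main/template'          // hypothetical source folder
    // No doLast copy: the task only produces its own archive.
}

jar {
    // Declare the other task's output as an input of jar.
    // Gradle infers that docFilesJar must run first; no dependsOn needed.
    from docFilesJar
}
```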

Can you see the difference? We got rid of the copy in the docFilesJar task; we don’t want to do that. What we want instead is to say “when you build the JAR, also pick up this docFilesJar”. That is what we are doing by writing from docFilesJar. Gradle is smart enough to know that before it executes the jar task, it first needs to build docFilesJar.

This has several advantages:

  • Dependencies become implicit: if we no longer want to include the JAR, we can simply remove it from the input specification.
  • It does not contaminate the output of other tasks
  • docFilesJar can be executed independently of jar
  • All in all, this is about keeping things separate from each other and reducing the risk of accidentally breaking the build!

The code above works, but it has one drawback: the docFilesJar and jar tasks are configured (instantiated) even when we invoke something that doesn’t need them. For example, suppose you run gradle compileJava: there is no reason to configure the jar tasks, because we will not execute them.
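
One way to avoid this eager configuration is Gradle's lazy task API, tasks.register and tasks.named, which only configures a task when it is actually needed. A minimal sketch, assuming the groovy plugin; the archive name and source folder are hypothetical:

```groovy
plugins {
    id 'groovy' // assumption: a Groovy project
}

// register returns a TaskProvider; the configuration block runs only
// if docFilesJar ends up in the task graph.
def docFilesJar = tasks.register('docFilesJar', Jar) {
    description = 'Package up files used for generating documentation.'
    archiveFileName = 'doc-files.jar' // hypothetical archive name
    from 'src/main/template'          // hypothetical source folder
}

// named configures the existing jar task lazily as well.
tasks.named('jar', Jar) {
    // Passing the provider keeps the implicit dependency on docFilesJar.
    from docFilesJar
}
```

With this form, running gradle compileJava should not need to configure either of the two JAR tasks.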

To conclude:

Avoid explicit dependsOn if possible

I tend to say that the only reasonable use case for dependsOn is lifecycle tasks (a lifecycle task is a task whose only goal is to organize the build, e.g. build, assemble, check: they don’t do anything themselves, they just bind some dependencies together).

If you find a use case that is not a lifecycle task and cannot be expressed through implicit task dependencies (that is, by declaring inputs instead of dependsOn), report it to the Gradle team.
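
For illustration, a lifecycle task in the sense used above might be sketched as follows. The task name qualityGate is hypothetical, and the java plugin is assumed so that test and javadoc exist:

```groovy
plugins {
    id 'java'
}

// Hypothetical lifecycle task: it has no action of its own and only
// groups other tasks, which is where dependsOn remains appropriate.
tasks.register('qualityGate') {
    group = 'verification'
    description = 'Runs the tests and builds the Javadoc.'
    dependsOn tasks.named('test'), tasks.named('javadoc')
}
```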