In the DI test suite, each operator is tested like this:
```julia
(; f, x, grad, x0) = scenario
@test gradient(f, backend, x) == grad # without prep
prep = prepare_gradient(f, backend, x0) # prepare on different input
@test gradient(f, prep, backend, x) == grad # first test with prep
@test gradient(f, prep, backend, x) == grad # second test with prep
```
Preparing on a different input is essential to avoid data leakage; it has allowed me to uncover some nasty bugs (JuliaDiff/FiniteDiff.jl#185). But what I had in mind here is the second prepared test, which makes sure that the preparation result is not invalidated or destroyed by the first differentiation call. One example might be a pullback closure that is invalidated after one reverse pass when arguments are mutated (JuliaDiff/DifferentiationInterface.jl#678).
Originally posted by @gdalle in #389 (comment)
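To illustrate why the second prepared call matters, here is a minimal hypothetical sketch (not DI's actual implementation): a "prepared" gradient whose cache is accidentally accumulated into instead of overwritten, so the first call succeeds but leaks state into the second. The names `Prep`, `prepare_gradient_buggy`, and `gradient_buggy` are made up for this example.

```julia
f(x) = sum(abs2, x)  # true gradient is 2x

# hypothetical preparation result: a preallocated buffer
struct Prep
    buf::Vector{Float64}
end

prepare_gradient_buggy(x) = Prep(zero(x))

function gradient_buggy(prep::Prep, x)
    prep.buf .+= 2 .* x  # bug: accumulates into the cache instead of assigning
    return copy(prep.buf)
end

x = [1.0, 2.0]
prep = prepare_gradient_buggy(x)
g1 = gradient_buggy(prep, x)  # [2.0, 4.0] — correct on the first call
g2 = gradient_buggy(prep, x)  # [4.0, 8.0] — wrong: state leaked from the first call
```

A single prepared test would pass here; only the repeated call with the same `prep` exposes the corrupted state.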