
Programme of the 24th Euro AD Workshop
Tuesday, November 2, 2021
 15:00–16:30 Welcome, Session I (Chair: Nico Gauger, TU Kaiserslautern)
 Diana Necsulescu and Emil Slusanschi (University Politehnica of Bucharest)
Automatic Differentiation of Java Classfiles – Parallel Forward Vector Mode
ADiJaC is a source transformation automatic differentiation tool for Java classfiles that is able to compute derivatives in both Forward and Reverse mode. The purpose of this talk is to present the implementation details of the Parallel Forward Vector Mode. The architecture of the targeted machine and the computational intensity of the code to be differentiated are taken into consideration in order to decide whether the derivative computations are parallelized. A series of tests is run to demonstrate the correctness and efficiency of the proposed solution.
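For readers unfamiliar with the technique, below is a minimal sketch of forward vector mode in C++ (illustrative only; ADiJaC itself operates on Java classfiles, and its parallelization decision logic is the subject of the talk). Each active value carries a vector of tangent components, and the per-component derivative loop is independent, which is what makes it a natural candidate for parallelization:

```cpp
#include <array>
#include <cstddef>

// Forward vector mode: each value carries N tangent components.
template <std::size_t N>
struct VecDual {
    double value;
    std::array<double, N> tangent;
};

// Product rule, applied componentwise to the tangent vector.
template <std::size_t N>
VecDual<N> mul(const VecDual<N>& a, const VecDual<N>& b) {
    VecDual<N> r;
    r.value = a.value * b.value;
    // The iterations are independent, so the loop may be parallelized
    // when N and the cost per component justify the overhead.
    #pragma omp parallel for
    for (std::size_t i = 0; i < N; ++i) {
        r.tangent[i] = a.tangent[i] * b.value + a.value * b.tangent[i];
    }
    return r;
}
```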
 Johannes Blühdorn (TU Kaiserslautern)
Event-Based Automatic Differentiation of OpenMP with OpDiLib
We present OpDiLib, a universal add-on for operator overloading AD tools that enables the automatic differentiation of OpenMP parallel codes. Previously, support for OpenMP in reverse mode operator overloading AD tools was limited by the fact that pragma directives are outside the scope of the programming language and hence inaccessible with overloading techniques. We show how OMPT, a modern OpenMP feature, can be used to achieve fully automatic differentiation while retaining the original pragma directives; alternatively, a set of replacement macros can be used. Both approaches are supported by OpDiLib's event-based differentiation logic. As there are no a priori restrictions on data access patterns and the AD workflow remains unchanged, OpDiLib is very easy to apply. Additionally, fine-grained optimizations can be applied, including elimination of atomic updates on adjoint variables. We demonstrate that very good parallel performance can be achieved with OpDiLib in an OpenMP-MPI hybrid parallel environment.
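To illustrate why atomic adjoint updates arise in the first place, here is a hand-written sketch of the underlying issue (not OpDiLib code): a shared input that is read concurrently in the forward sweep receives concurrent adjoint increments in the reverse sweep, which must be serialized unless an optimization can prove exclusive access:

```cpp
#include <vector>

// Hand-written reverse sweep for y[i] = a * x[i]. The shared input 'a'
// is read by every iteration in the forward sweep, so its adjoint
// a_bar receives concurrent increments in the reverse sweep and must
// be updated atomically -- exactly the kind of overhead that
// fine-grained optimizations aim to eliminate where it is safe.
void reverseSweep(double& a_bar,
                  const std::vector<double>& x,
                  const std::vector<double>& y_bar) {
    #pragma omp parallel for
    for (int i = 0; i < static_cast<int>(x.size()); ++i) {
        const double inc = y_bar[i] * x[i];  // dy[i]/da = x[i]
        #pragma omp atomic
        a_bar += inc;                        // non-exclusive adjoint access
    }
}
```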
 Jan Kieseler (CERN)
Overview of the MODE project
The effective design of instruments that rely on the interaction of radiation with matter for their operation is a complex task. Furthermore, the underlying physics processes are intrinsically stochastic in nature and open a vast space of possible choices for the physical characteristics of the instrument. While even large-scale detectors such as those at the LHC are built using surrogates for the ultimate physics objective, the MODE Collaboration (an acronym for Machine-learning Optimized Design of Experiments) aims at developing tools, also based on deep learning techniques, to achieve end-to-end optimization of the design of instruments via a fully differentiable pipeline capable of exploring the Pareto-optimal frontier of the utility function for future particle collider experiments and related detectors. The construction of such a differentiable model requires the inclusion of information-extraction procedures, including data collection, detector response, and pattern recognition, as well as existing constraints such as cost. This talk will give an introduction to the goals of the newly founded MODE collaboration and highlight some of the already existing ingredients.
 Jan Hueckelheim (Argonne National Laboratory)
OpenMP support in Tapenade
I will present joint work with Laurent, which introduced support for differentiation of OpenMP parallel loops with Tapenade in both forward and reverse mode. Besides a recap of differentiation rules for OpenMP scopes, I will discuss the run-time support library that we developed to support dynamic schedules. If time allows, I would also like to talk about Spray (joint work with Johannes Doerfert), a library that could help Tapenade and other AD tools safely reverse-differentiate through parallel non-exclusive read access.
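A hedged sketch of the dynamic-schedule problem (illustrative only; the actual run-time support library is presented in the talk): with schedule(dynamic), the iteration-to-thread assignment is only known at run time, so the forward sweep must record it for the reverse sweep to replay each thread's iterations in reverse order:

```cpp
#include <omp.h>
#include <vector>

// Illustrative only, not Tapenade's runtime API: record which
// iterations each thread executed under a dynamic schedule, so the
// reverse sweep can replay them on the same thread, backwards.
void forwardAndReverse(int n) {
    std::vector<std::vector<int>> trace(omp_get_max_threads());

    #pragma omp parallel
    {
        const int tid = omp_get_thread_num();

        #pragma omp for schedule(dynamic)
        for (int i = 0; i < n; ++i) {
            // ... forward computation for iteration i ...
            trace[tid].push_back(i);  // record ownership and order
        }
        // The implicit barrier of the loop separates the two sweeps.

        for (auto it = trace[tid].rbegin(); it != trace[tid].rend(); ++it) {
            // ... adjoint computation for iteration *it ...
        }
    }
}
```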
 16:30–17:00 Break, Breakouts
 17:00–18:00 Session II (Chair: Krishna Narayanan, Argonne National Laboratory)
 Tom Streubel (Humboldt-Universität zu Berlin)
Applying Taylor on Nonsmooth DAEs with Flow Network Structures
The simulation of gas transmission and distribution networks is a driving motivation for the development of new approaches towards the numerical approximation of solutions to nonsmooth systems of differential algebraic equations (DAEs) with underlying flow network structure.
In this talk we will derive a class of generalized Taylor-based integrators for such systems, built on higher-order spline expansions of piecewise differentiable functions.
We will generalize Taylor's theorem and the formula of Faà di Bruno, and discuss numerical experiments.
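For reference, the classical smooth-case statements that the talk generalizes to the piecewise differentiable setting are Taylor's theorem and Faà di Bruno's formula (here in its set-partition form, with $\Pi_n$ the set of partitions of $\{1,\dots,n\}$):

```latex
f(x + h) \;=\; \sum_{k=0}^{d} \frac{f^{(k)}(x)}{k!}\, h^{k} \;+\; o\!\left(h^{d}\right)

\frac{d^{n}}{dx^{n}}\, g\bigl(f(x)\bigr)
  \;=\; \sum_{\pi \in \Pi_{n}} g^{(|\pi|)}\bigl(f(x)\bigr)
        \prod_{B \in \pi} f^{(|B|)}(x)
```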
 Kamil Khan (McMaster University)
Building AD-compatible linear underestimators of convex functions by sampling
(Joint work with Yingkai Song and Huiyi Cao.)
Deterministic methods for global optimization typically proceed by generating upper and lower bounds on the unknown objective value at a global minimum, and progressively improving these bounds. Lower bounds in global minimization are typically obtained by minimizing convex relaxations of the original objective function using a local optimization solver. However, in certain cases, this minimization may be difficult: some convex relaxations are significantly nonsmooth or noisy, and for other convex relaxations, crucial gradient/subgradient information may be unavailable. This presentation illustrates the first tractable method to construct a guaranteed linear underestimator by sampling a convex function of n variables (2n+1) times. This is compatible with established forward and reverse AD modes for subgradient evaluation by Mitsos et al. (2009) and Beckers and Naumann (AD2012), and is compatible with the established McCormick convex relaxation procedure that uses the same computational graphs and operator overloading as AD.
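To fix ideas, a sketch of the sampling pattern (the computable intercept correction that makes the result a guaranteed underestimator is derived in the underlying work): sample the convex function $f$ once at a centre $x^{0}$ and at the $2n$ axis-aligned points $x^{0} \pm h e_{i}$, then form centered-difference slopes

```latex
b_{i} \;=\; \frac{f\!\left(x^{0} + h e_{i}\right) - f\!\left(x^{0} - h e_{i}\right)}{2h},
\qquad i = 1, \dots, n
```

so that the candidate affine function is $l(x) = f(x^{0}) + b^{\top}(x - x^{0}) - c$ for a suitable correction $c \ge 0$; convexity of $f$ is what allows such a certified $c$ to be obtained from the same $2n+1$ samples.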
 Uwe Naumann (RWTH Aachen University)
A revised proof for the NP-completeness of the chain rule of differentiation
The chain rule of differentiation is the fundamental prerequisite for computing accurate derivatives of composite functions which perform a potentially very large number of elemental function evaluations. Data flow dependences amongst the elemental functions give rise to a combinatorial optimization problem.
We formulate the Chain Rule Differentiation problem and prove it to be NP-complete. The new proof holds for derivatives of arbitrary order.
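The combinatorial flavour is already visible for a composition of three smooth functions, where the Jacobian is a matrix chain product and the association order alone changes the accumulation cost:

```latex
F = f_{3} \circ f_{2} \circ f_{1}, \qquad
F'(x) = C\,B\,A, \qquad
A \in \mathbb{R}^{k \times n},\;
B \in \mathbb{R}^{l \times k},\;
C \in \mathbb{R}^{m \times l}
```

Evaluating $(CB)A$ costs $mlk + mkn$ multiplications, whereas $C(BA)$ costs $lkn + mln$; for general data flow graphs, choosing the cheapest accumulation ordering is the combinatorial optimization problem shown to be NP-complete.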
 18:00 End of day 1

Wednesday, November 3, 2021
 15:00–16:30 Session III (Chair: Bruce Christianson, University of Hertfordshire)
 Thomas Oberbichler (TU Munich)
Algorithmic Differentiation for interactive CAD-integrated Isogeometric Analysis
In this talk, we discuss the application of algorithmic differentiation at the interface between architecture and civil engineering. The integration of analysis tools in computer-aided design (CAD) enables structures to be generated and explored intuitively. To achieve a high degree of interactivity, the use of the natural CAD geometric parametrization – for example NURBS – is also desirable at the analysis stage. Beyond NURBS, modern CAD systems provide other descriptions of freeform geometries, such as discrete meshes or subdivision surfaces. To perform various types of analysis with different geometric descriptions, it is necessary to generalize the process of CAD-integrated isogeometric analysis (IGA) while also increasing the computational speed. To address this issue, we present a new, efficient, and modular approach for implementing CAD-integrated analysis based on algorithmic differentiation. A feature-rich digital toolbox can be derived from a set of highly optimized mechanical and geometric building blocks. We present this concept for a range of mechanical element types and geometric parametrizations. The method can be employed for classic structural analysis as well as in form-finding and the constraint-driven design of freeform geometries.
 Shreyas Gaikwad (The University of Texas at Austin)
SICOPOLIS-AD v2: An open-source tangent-linear and adjoint modeling framework for ice-sheet simulation enabled by the AD tool Tapenade
We present a new framework for the ice sheet model SICOPOLIS that enables adjoint and tangent-linear code generation via source transformation using the open-source AD tool Tapenade. This framework has several advantages over earlier work using OpenAD: (1) it is up-to-date with the latest SICOPOLIS code; (2) the AD tool Tapenade is open-source and actively maintained; (3) a new tangent-linear code generation capability is introduced; (4) we are now able to deal with inputs in the NetCDF format; (5) we leverage continuous integration in order to track changes in the trunk that "break" the AD-based code generation; (6) we have now correctly incorporated the LIS solver, its tangent-linear code, and its adjoint, which improve the simulation of Antarctic ice shelves and Greenland outlet glaciers.
The adjoint and tangent-linear results are validated using a finite-difference check, increasing confidence in the validity of the code produced by Tapenade. This new framework will be freely available.
 Max Sagebaum (TU Kaiserslautern)
Aggregated type handling in CoDiPack
The development of AD tools focuses mostly on handling floating-point types in the target language. Taping optimizations in these tools largely target specific operations like matrix-vector products.
Aggregated types like std::complex are usually handled by specifying the AD type as a template argument.
This approach provides exact results, but prevents the use of expression templates.
If AD tools are extended and specialized such that aggregated types can be added to the expression framework, this reduces memory utilization and improves the run time of applications that use aggregated types such as complex numbers or matrix-vector operations. Such an integration requires a reformulation of the stored data per expression and a rework of the tape evaluation process. In this talk we demonstrate the overhead of unhandled aggregated types in expression templates and provide the basic ingredients for a tape implementation that supports arbitrary aggregated types for which the user has implemented some type traits. Finally, we demonstrate the advantages of aggregated type handling on a synthetic benchmark case.
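A minimal sketch of the status quo (assuming CoDiPack's documented tape workflow; the aggregated-type handling itself is the subject of the talk): nesting the AD type inside std::complex is exact, but every elemental operation of the complex arithmetic is taped individually, because std::complex<Real> sits outside the expression-template framework:

```cpp
#include <codi.hpp>
#include <complex>

using Real = codi::RealReverse;

int main() {
    Real::Tape& tape = Real::getTape();
    tape.setActive();

    Real ar = 1.0, ai = 2.0;
    tape.registerInput(ar);
    tape.registerInput(ai);

    // AD type as template argument: exact, but opaque to expression
    // templates, so the four multiplications and two additions of the
    // complex product below are recorded as separate tape entries.
    std::complex<Real> a(ar, ai);
    std::complex<Real> b(Real(3.0), Real(4.0));
    std::complex<Real> c = a * b;

    Real y = c.real() * c.real() + c.imag() * c.imag();  // |c|^2
    tape.registerOutput(y);
    tape.setPassive();

    y.setGradient(1.0);
    tape.evaluate();
    // ar.getGradient() and ai.getGradient() now hold d|c|^2/d(ar, ai).
    return 0;
}
```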
 Giles Strong (CERN, University of Padova)
TomOpt: PyTorch-based Differential Muon Tomography Optimisation
The MODE introductory article*, published earlier this year, proposed an end-to-end differentiable pipeline for the optimisation of detector designs directly with respect to the end goal of the experiment, rather than intermediate proxy targets. The TomOpt Python package is the first concrete endeavour in attempting to realise such a pipeline, and aims to allow the optimisation of detectors for the purpose of muon tomography with respect to both imaging performance and detector budget.
This modular and customisable package is capable of simulating detectors which scan unknown volumes by muon radiography, using cosmic-ray muons to infer the density of the material. The full simulation and reconstruction chain is made differentiable, and an objective function including the goal of the apparatus as well as its cost and other factors can be specified. The derivatives of such a loss function can be backpropagated to each parameter of the detectors, which can be updated via gradient descent until an optimal configuration is reached.
*MODE (2021) Toward Machine Learning Optimization of Experimental Design, Nuclear Physics News, 31:1, 25–28, DOI: 10.1080/10619127.2021.1881364
 16:30–17:00 Break, Breakouts
 17:00–19:00 Tribute to Andreas Griewank (Chair: Laurent Hascoët, INRIA)
 Andrea Walther (Humboldt-Universität zu Berlin)
Andreas and AD
 Trond Steihaug (University of Bergen)
So much for plan A
In this talk I will give glimpses of 40 years of optimization, touching on many ideas and works by Andreas Griewank.
 Chris Bischof (TU Darmstadt)
Remembering Andreas Griewank
 Paul Hovland (Argonne National Laboratory)
Twenty Years of Probing for Jacobian Sparsity Patterns
In 2001, Griewank and Mitev introduced a technique for efficiently determining the sparsity pattern of Jacobian matrices via Bayesian Probing. Twenty years later, we describe some extensions to that pioneering work and some alternative probing strategies.
 Jean Utke (Allstate, USA)
Remembering Andreas Griewank
 Torsten Falko Bosse (University of Jena)
Remembering Andreas Griewank
 David Juedes (Ohio University)
Memories of Andreas Griewank
David cannot participate in this session due to overwhelming constraints and commitments. He nevertheless took the time to write down his memories of Andreas in the attached letter.
 Uwe Naumann (RWTH Aachen University)
Remembering Andreas Griewank
 19:00 End of day 2

Thursday, November 4, 2021
 15:00–16:30 Session IV (Chair: Martin Bücker, University of Jena)
 Dominic Jones (Siemens PLM, London)
Block sequencing, partial derivatives and SG7
An eclectic talk touching on a few topics related to AD in C++:
 - reversing a call stack without RAII
 - obtaining full and partial derivatives from a single function
 - the influence of AD on the C++ SG7 reflection group
 Rodrigo Alejandro Vargas Hernandez (University of Toronto)
Applications of automatic differentiation for wavefunction-based quantum chemistry methodologies
The central task of theoretical or quantum chemistry is the simulation of molecular properties using the laws of quantum mechanics.
During the last decade, a great variety of methods has been developed, forming three main branches: density functional theory, wavefunction-based methods, and semiempirical methods. For the last two methodologies, the quantum state that describes the distribution of electrons in a molecule, the wave function, is parametrized by a set of atomic basis functions that are linearly combined.
Here, we show that with automatic differentiation it is possible to optimize
the internal parameters of quantum chemistry methodologies, increasing the accuracy of the wavefunction.
In this talk, we present two applications of automatic differentiation in quantum chemistry: DiffiQult, a software package for ab-initio quantum chemistry calculations, and Huxel, a software package for fully differentiable semiempirical methods. Both packages were implemented in the JAX ecosystem.
DiffiQult is designed to optimize any parameter of the atomic basis set, e.g., Gaussian widths, centers, and contraction coefficients, and to compute other higher-order derivatives. The optimization procedure is highly nontrivial, and as our results indicate, automatic differentiation engines could improve the accuracy of the wavefunction.
Huxel is an implementation of the Hückel method, a semiempirical model capable of qualitatively describing large chemical systems and providing initial guesses for ab-initio methodologies. Our implementation permits us to optimize the parameters, improving the screening procedure for materials science.
 Stefano Carli (KU Leuven)
Algorithmic Differentiation in plasma edge modeling for nuclear fusion reactors
Nuclear fusion has great potential to become a clean and stable source of energy. However, for reliable energy production, several challenges still have to be solved, one of the most critical being power and particle exhaust in the so-called divertor component. In fact, due to the fast and strongly anisotropic transport in the plasma edge, the divertor has to withstand heat fluxes of several tens of megawatts concentrated on an area of a few square meters, thus reaching the engineering limits of state-of-the-art cooling concepts. Moreover, divertor surface damage induced by sputtering from highly energetic plasma particles significantly reduces the component lifetime. Therefore, it is of foremost importance to adequately model and predict the plasma edge behaviour for designing divertor concepts for future reactors.
To this end, plasma edge codes such as SOLPS-ITER [1] are commonly employed, which solve transport equations for the plasma particles, together with transport equations for, and interactions with, neutral particles. Within these codes, the effect of plasma turbulence is only approximated with a diffusion model, adopting ad-hoc coefficients. In the current practice, these coefficients are estimated by visually comparing simulation results to experimental data, with manual tuning performed by the modeler in large and computationally expensive parameter scans. Moreover, such coefficients are spatially dependent, and different for each operational regime and reactor, thus hampering the code's predictive and interpretive capability. Parameter estimation with gradient-based optimization makes it possible to automate the procedure, with Algorithmic Differentiation (AD) providing efficient and accurate gradient calculations [2].
This talk gives an overview of the advancements of AD applied to SOLPS-ITER, following the first demonstration in Ref. [3]. The TAPENADE tool [4] is again employed, and adjoint derivative calculation is now available. Comparison of sensitivities obtained through finite differences and tangent and adjoint AD proves the correctness of the differentiation. Preliminary analysis of adjoint AD performance shows that a better checkpointing strategy is needed. Finally, AD is employed in a recently developed parameter estimation framework implementing both regression and Bayesian MAP estimation [5].
[1] X. Bonnin, W. Dekeyser, R. Pitts, et al. Plasma and Fusion Res., Vol. 11:1403102, 2016
[2] A. Griewank and A. Walther. Evaluating Derivatives. SIAM, 2nd edition, 2008
[3] S. Carli, M. Blommaert, W. Dekeyser, M. Baelmans, Nuclear Materials and Energy 18, 611, 2019
[4] L. Hascoet and V. Pascual, ACM Trans. Math. Softw. 39, 3, Article 20 (May 2013)
[5] S. Carli, W. Dekeyser, M. Blommaert, R. Coosemans, W. Van Uytven, M. Baelmans, submitted to Contributions to Plasma Physics
 Sri Hari Krishna Narayanan (Argonne National Laboratory)
Quantum Derivatives
Quantum optimal control is solved by the GRAPE algorithm, which suffers from exponential growth in storage with increasing number of qubits and linear growth in memory requirements with increasing number of timesteps. These memory requirements are a barrier for simulating larger models or longer time spans. We have created a nonstandard automatic differentiation technique that can compute gradients needed by GRAPE by exploiting the fact that the inverse of a unitary matrix is its conjugate transpose. Our approach significantly reduces the memory requirements for GRAPE, at the cost of a reasonable amount of recomputation. We present our implementation in JAX, as well as benchmark results.
Additionally, we will discuss the so-called parameter-shift rule for computing partial derivatives of a variational circuit. For many quantum functions implemented as quantum circuits in hardware, the same circuit can be used to compute both the quantum function and the gradient of the quantum function. We will explain some of the concepts in this area.
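Two standard facts underpin these techniques. First, in GRAPE the state evolves through a product of unitaries, so intermediate states need not be stored for the reverse sweep: each step can be undone with the conjugate transpose. Second, the parameter-shift rule for a gate generated by an involutory operator gives the exact derivative from two shifted circuit evaluations:

```latex
% Reverse recomputation of states: forward step and its inverse
|\psi_{k+1}\rangle = U_k |\psi_k\rangle
  \quad\Longrightarrow\quad
|\psi_k\rangle = U_k^{\dagger} |\psi_{k+1}\rangle

% Parameter-shift rule for U(\theta) = e^{-i\theta P/2} with P^2 = I
\frac{\partial}{\partial \theta}\, \langle E(\theta) \rangle
  = \frac{1}{2}\Bigl( \bigl\langle E(\theta + \tfrac{\pi}{2}) \bigr\rangle
                    - \bigl\langle E(\theta - \tfrac{\pi}{2}) \bigr\rangle \Bigr)
```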
 16:30–17:00 Break, Breakouts
 17:00–18:00 Session V (Chair: Jan Hückelheim, Argonne National Laboratory)
 William Moses (MIT)
Language-Independent Automatic Differentiation and Optimization of GPU Programs with Enzyme
Derivatives are fundamental to a variety of algorithms in scientific computing and machine learning, such as backpropagation in neural networks, uncertainty quantification, sensitivity analysis, and Bayesian inference. Enzyme is an LLVM compiler plugin for reverse-mode automatic differentiation (AD) and thus generates fast gradients of programs in a variety of languages, including C/C++, Fortran, Julia, and Rust. Our talk will present a combination of novel techniques that make Enzyme the first automatic reverse-mode AD tool to generate gradients of GPU kernels. As Enzyme differentiates within a general-purpose compiler, we are able to introduce novel GPU- and AD-specific optimizations. We differentiate five GPU-based HPC applications, executed on NVIDIA and AMD GPUs. All benchmarks run within an order of magnitude of the original program's runtime. Without GPU- and AD-specific optimizations, gradients of GPU kernels either fail to run from a lack of resources or have infeasible overhead.
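As a flavour of Enzyme's C/C++ entry point, a minimal sketch following the pattern from Enzyme's documentation (the exact plugin flag depends on the installed LLVM version):

```cpp
// Minimal Enzyme usage sketch: the call below is replaced at the LLVM
// level by generated derivative code. Compile with the plugin, e.g.
//   clang++ -O2 -fplugin=ClangEnzyme.so example.cpp
#include <cstdio>

extern double __enzyme_autodiff(void*, double);  // resolved by Enzyme

double square(double x) { return x * x; }

int main() {
    double x = 3.0;
    double dsq = __enzyme_autodiff((void*)square, x);  // d(x*x)/dx
    std::printf("derivative at x=%.1f is %.1f\n", x, dsq);  // 6.0
    return 0;
}
```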
 Ioana Ifrim (Princeton University)
Automatic Differentiation for C++ and CUDA using Clad
Clad enables automatic differentiation (AD) for C++ algorithms through source-to-source transformation. It is based on the LLVM compiler infrastructure and implemented as a Clang compiler plugin. Unlike other tools, Clad manipulates the high-level code representation (the AST) rather than implementing its own C++ parser, and it does not require modifications to existing code bases. This methodology is both easier to adopt and potentially more performant than other approaches. Having full access to the Clang compiler's internals means that Clad is able to follow the high-level semantics of algorithms and can perform domain-specific optimisations; automatically generate code (retargeting C++) for accelerator hardware with appropriate scheduling; and connect directly to the compiler diagnostics engine, producing precise and expressive diagnostics positioned at the desired source locations.
In this talk, we showcase the above-mentioned advantages through examples and outline Clad's features, applications, and support extensions. We describe the challenges of supporting automatic differentiation of broader C++ and present how Clad can compute derivatives of functions, member functions, functors, and lambda expressions. We show the newly added support for array differentiation, which provides the basis for CUDA support and for parallelisation of gradient computation. Moreover, we will demo different interactive use cases of Clad, either within a Jupyter environment as a kernel extension based on xeus-cling, or within a GPU-CPU environment where the gradient computation can be accelerated through GPU code produced by Clad and run with the help of the Cling interpreter.
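A minimal sketch of Clad's user-facing API, following the pattern from Clad's documentation (the plugin must be loaded into Clang for the AST transformation to run):

```cpp
// Minimal Clad usage sketch: source-to-source differentiation inside
// Clang. Compile with the Clad plugin, e.g.
//   clang++ -fplugin=clad.so -I<clad>/include example.cpp
#include "clad/Differentiator/Differentiator.h"
#include <cstdio>

double f(double x, double y) { return x * x * y; }

int main() {
    // clad::differentiate generates df/dx from the AST of f.
    auto f_dx = clad::differentiate(f, "x");
    std::printf("df/dx(3, 4) = %.1f\n", f_dx.execute(3, 4));  // 2*x*y = 24
    return 0;
}
```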
 Lyndon White (Invenia Labs)
ChainRules.jl: AD system agnostic rules
The ChainRules project is a suite of JuliaLang packages that define custom primitives (i.e. rules) for doing AD in JuliaLang.
Importantly, it is AD system agnostic.
It has proved successful in this goal.
At present it works with about half a dozen different JuliaLang AD systems.
It has been a long journey, but as of August 2021, the core packages have now hit version 1.0.
This talk will go through why this is useful, the particular objectives the project had, and the challenges that had to be solved.
This talk is not intended as an educational guide for users (For that see our 2021 JuliaCon talk: Everything you need to know about ChainRules 1.0 (https://live.juliacon.org/talk/LWVB39)).
Rather this talk is to share the insights we have had, and likely (inadvertently) the mistakes we have made, with the wider autodiff community.
We believe these insights can be informative and useful to efforts in other languages and ecosystems.
 18:00 Closing notes, end of day 3

