Call Tree Reversal is NP-Complete
Abstract
The data-flow of a numerical program is reversed in its adjoint. We discuss the combinatorial optimization problem of finding optimal checkpointing schemes at the level of call trees. For a given amount of persistent memory, the objective is to store selected arguments and/or results of subroutine calls such that the overall computational effort of the data-flow reversal (the total number of floating-point operations performed by potentially repeated forward evaluations of the program) is minimized. CALL TREE REVERSAL is shown to be NP-complete.

1 Background

We consider implementations of multivariate vector functions $F : \mathbb{R}^n \to \mathbb{R}^m$ as computer programs $y = F(x)$. The interpretation of reverse-mode automatic differentiation (AD) [8] as a semantic source code transformation performed by a compiler yields an adjoint code $\bar{x} \mathrel{+}= \bar{F}(x, \bar{y})$. For given $x$ and $\bar{y}$, the vector $\bar{x}$ is incremented with $(F'(x))^T \cdot \bar{y}$, where $F'(x)$ denotes the Jacobian matrix of $F$ at $x$. Adjoint codes are of particular interest for the evaluation of large gradients, as the complexity of the adjoint computation is independent of the gradient's size. Refer to [1–4] for an impressive collection of applications where adjoint codes are instrumental in making the transition from pure numerical simulation to optimization of model parameters or even of the model itself.

In this paper we propose an extension to the notion of joint call tree reversal [8] with the potential storage of the results of a subroutine call. We consider call trees as runtime representations of the interprocedural flow of control of a program. Each node in a call tree corresponds uniquely to a subroutine call (see footnote 1). We assume that no checkpointing is performed at the intraprocedural level, that is, a "store-all" strategy is employed inside all subroutines. A graphical notation for call tree reversal under these constraints is proposed in Figure 1.
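The adjoint increment $\bar{x} \mathrel{+}= (F'(x))^T \cdot \bar{y}$ and the store-all strategy inside a single subroutine can be made concrete with a small hand-written sketch. The function $F(x) = (x_0 x_1, \sin x_0)$ and the names `F_tape` and `F_adjoint` are illustrative assumptions, not code from the paper or from any AD tool.

```python
import math

def F_tape(x):
    """Augmented forward run of the hypothetical F(x) = (x0*x1, sin(x0)).

    Every value the reverse sweep needs is pushed onto a stack (store all).
    """
    tape = []
    tape.append((x[0], x[1]))   # operands of the product
    y0 = x[0] * x[1]
    tape.append(x[0])           # argument of sin
    y1 = math.sin(x[0])
    return [y0, y1], tape

def F_adjoint(tape, x_bar, y_bar):
    """Reverse sweep: pops the tape and increments x_bar by F'(x)^T * y_bar."""
    a = tape.pop()              # argument of sin, pushed last, popped first
    x_bar[0] += math.cos(a) * y_bar[1]
    u, v = tape.pop()           # operands of the product
    x_bar[0] += v * y_bar[0]
    x_bar[1] += u * y_bar[0]
    return x_bar

x = [2.0, 3.0]
y, tape = F_tape(x)
x_bar = F_adjoint(tape, [0.0, 0.0], [1.0, 0.5])
```

The values are popped in the reverse of the order in which they were pushed, which is what reverses the data-flow. The reverse sweep costs a small constant multiple of the forward run regardless of how many inputs there are.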
A given subroutine can be executed without modifications ("advance") or in an augmented form where all values that are required for the evaluation of its adjoint are stored (taped) on appropriately typed stacks ("tape (store all)"). We refer to this memory as the tape associated with a subroutine call, not to be confused with the kind of tape generated by AD tools that use operator overloading, such as ADOL-C [9] or variants of the differentiation-enabled NAGWare Fortran compiler [14]. The arguments of a subroutine call can be stored ("store arguments") and restored ("restore arguments"). Results of a subroutine call can be treated similarly ("store results" and "restore results"). The adjoint propagation yields the reversed data-flow by popping the previously pushed values from the corresponding stacks ("reverse (store all)").

Subroutines that only call other subroutines without performing any local computation are represented by "dummy calls." For example, such wrappers can be used to visualize arbitrary checkpointing schemes for time evolutions (implemented as loops whose body is wrapped into a subroutine). Moreover, they occur in the reduction used for proving CALL TREE REVERSAL to be NP-complete. Dummy calls can be performed in any of the other seven modes.

Fig. 1. Calling modes for interprocedural data-flow reversal: advance, tape (store all), store arguments, restore arguments, store results, restore results, reverse (store all), dummy call.

Figure 2 illustrates the reversal in split (b), classical joint (c), and joint with result checkpointing (d) modes for the call tree in (a). The order of the calls is from left to right and depth-first.

Footnote 1: Generalizations may introduce nodes for various parts of the program, thus yielding arbitrary checkpointing schemes.
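The trade-off between split and joint reversal can be sketched with a simplified cost model. The model is an assumption for illustration and is not taken from the paper. A call reversed in split mode is taped together with its caller. A call reversed in joint mode stores its arguments, is advanced while its caller is taped, and is taped only when the reverse sweep reaches it. Under this model a subroutine is run forward once, plus once for every joint-mode call on the path from the root to it. Memory use and result checkpointing are not modeled, and the `Call` class, the costs, and the example tree are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Call:
    name: str
    cost: int                    # local forward cost in hypothetical units
    joint: bool = False          # True: joint reversal, False: split reversal
    children: list = field(default_factory=list)

def forward_cost(call, runs_of_parent=1, is_root=True):
    """Total forward cost of reversing the call tree under the assumed model."""
    # A joint-mode call is run once more than its caller (advance, then tape).
    # A split-mode call is run exactly as often as its caller.
    if is_root or not call.joint:
        runs = runs_of_parent
    else:
        runs = runs_of_parent + 1
    return runs * call.cost + sum(
        forward_cost(child, runs, False) for child in call.children
    )

def example_tree(joint):
    """Hypothetical tree: root calls a and c, and a calls b."""
    b = Call("b", 100, joint)
    a = Call("a", 10, joint, [b])
    c = Call("c", 10, joint)
    return Call("root", 1, False, [a, c])

split_cost = forward_cost(example_tree(joint=False))   # 1 + 10 + 100 + 10 = 121
joint_cost = forward_cost(example_tree(joint=True))    # 1 + 2*10 + 3*100 + 2*10 = 341
```

Split reversal minimizes recomputation but keeps every tape in memory at once. Joint reversal trades repeated forward runs for smaller tapes. Choosing the mode per call under a memory bound is the optimization problem discussed above.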