Hessian transfer for multilevel and adaptive shape optimization

We have developed a multilevel and adaption parametric strategies solved by optimization algorithms which require only the availability of objective function values but no derivative information. The key success of these hierarchical strategies refer to the quality of the downward and upward transfers of information. In this paper, we extend our approach when using a derivative-based optimization algorithms. The aim is to better re-initialize the Hessian and the gradient during the optimization process based on our construction of the downward and upward operators. The efficiency of this proposed approach is demonstrated by numerical experiments on an inverse shape model.


Introduction
We focus on improving the computational efficiency on optimum shape algorithms for application in which the cost functional relies on the prior solution of a complex set of partial-differential equations (PDEs), such as those governing compressible aerodynamics (e.g. the Euler equations), or related coupled disciplines [1] such as structural mechanics (e.g.elasticity), or electromagnetics [2] (e.g. the Maxwell equations).Hence, each evaluation functional is computationally costly.In several papers [3][4][5][6][7][8], we introduced a conceptual parallel from grid to geometrical parametrization by constructing a full and adaptive multi-level optimum-shape algorithm (FAMOSA), analogous to the classical full multigrid method (FMG) known to have optimum linear complexity with respect to the number of degrees of freedom).We demonstrated the efficiency of this construction for problems of drag reduction for transonic flight, external noise reduction for supersonic flight, and aerodynamic and structural optimization of a business-jet wingshape.Figure 1 illustrates an unstructured mesh, composed of 31,124 nodes and 173,445 elements, generated around the wing, including a refined area in the vicinity of the shock.Figure 2 shows a comparison of the convergence for the three strategies; FMOSA is the most efficient, yielding a shape of better fitness using a smaller computational effort.Using such setting, the wing geometry of an aircraft in transonic flow has been optimized by the minimization of a functional of the form: C D0 and C L0 are respectively the drag and lift coefficients corresponding to the initial shape (NACA 0012 section).
Until now, free-optimization techniques, such as the classical simplex search method, the genetic algorithms (GAs) and the particular swarm optimizers (PSOs), has been employed to reach the optimum design.Despite their greater robustness, they require a great number of function evaluations and the cost increases with the number of design variable.Another alternative, according to the literature, consists to utilize a derivative-based optimization algorithms that are able to locate the optimum solutions in the expense of only few function evaluations (i.e. with reasonable CPU time).They are often supported by tools that compute the derivatives of the objective function; such as the adjoint techniques [9,10], the Automatic Diffrentiation (AD) [11,12], the surrogate models (Radial Basis Function(RBF), Artificial Neural Networks (ANN), etc.) [13].
We note that prior to us, we find many applications employing the derivative-based methods with Bézier parametrization.In particular, Marino et al. [14] has used triangular Bézier surface and the optimization is numerically solved by a constrained gradient-based minimization algorithm.In this work, Marino et al. find the gradient-based algorithm more suitable than the evolutionary algorithm.
In reference [15], we find a combination of NURBS (Non-Uniform Rational B-Spline) and a gradient-based optimization algorithm to conduct shape optimization of stratospheric airships.
In this paper, we propose to solve the shape optimisation problem, within multilevel and adaptive strategies, by the BFGS (Broyden Fletcher Goldfarb and Shanno) Quasi Newton method.One of the key properties of BFGS method is its superlinear convergence, which is guarantee under reasonable assumptions [16].In addition, it works well in practice, even when the initial iterate is far from the solution.Despite these advantages, the initialization of BFGS algorithm by identity may affects the performance of our hierarchical and adaption strategies; this later depend on the quality of transfer operators in order to rigorously construct embedded and nested search space.Thus, our proposed approach consists to well re-initialize the Hessian and the gradient at each change of the search parametric space.
This paper is organized as follow.First, we review the construction of multilevel algorithms, in the context of parametric shape optimization.Second, we present the proposed approach for multilevel and adaptive parametric shape optimization.We then present a numerical results on a model problem from caculus of variations.Finally, we conclude and mention some perspectives.

Nested Bézier parameterizations for multilevel shape representation
We begin with the simplest situation of a two-dimensional geometry for which we employ a Bézier shape representation: in which the parameter t varies from 0 to 1, n is the degree of the parameterization, and we refer to the vector X as the support of the parameterization, and the vector Y as the design vector.Typically, we optimize the design vector for fixed support according to some physical criterion, such as drag reduction in aerodynamics.The somewhat unsymmetrical roles dispensed to the vectors X and Y are chosen to reduce (to n essentially) the dimension of the search space in the optimization phase, which is the most numerically costly and subject to numerical stiffness.We also use the notation: in which the vector B n ðtÞ T ¼ ðB 0 n ðtÞ; B 1 n ðtÞ; :::; B n n ðtÞÞ.In all this article, only supports for which the sequence fx k g is monotone increasing are said to be admissible and considered throughout.Thus, the function xðtÞ is monotone-increasing and defines a one-to-one mapping of, say, [0,1] onto itself.Recall also the simple formula for the derivative: in which D denotes the forward-difference operator (Dx k = x k+1 À x k ) as well as the associated n Â ðn þ 1Þ matrix.
In the prototypical case of an airfoil, we use such a parametric representation for both the upper and lower surfaces separately.The vertical slope at the leading edge is enforced by the conditions: for both surfaces which assures a smooth match; at the trailing edge, we simply have: for a continuous match.
Our geometrical construction employs the degree-elevation process, well-known in the Computer-Aided Design literature (see, e.g.[17]).This process permits to cast (Eq.( 6)) into the following equivalent Bézier parameterization of degree n + 1: in which the new control points P 0 k ¼ ðx 0 k ; y 0 k Þ are obtained from the former by convex combinations: obtained by multiplying equation ( 6) by (1 À t) + t and grouping together the monomials in t k (1 À t) n+1Àk , for each k.
From a theoretical viewpoint, our construction guarantees rigorously nested search spaces, and exact upward transfer operators (from low to high-degree parameterization).Note that, apart from the specified endpoints, the abscissas of the initial support X are not a subset of the abscissas of any support of a higher degree parameterization.Nevertheless, any Bézier curve given on the initial support can be expressed exactly on any other support of higher degree provided it results from the degree elevation process.The parameterizations are nested, or embedded in one another in this sense precisely.
Based on this concept of nested Bézier parameterizations, the following theorem introduces the upward and downward transfer operators.Theorem 1.Let X 0 ¼ fx 0 k g and X ¼ fx k g two nested (by degree elevation process) parametrization supports of degree n 0 = n + 1 and n respectively, t i (i = 0, . .., n 0 ) and q i (i = 0, . .., n) two arbitrary partitions of the interval [0, 1].Then, the upward relation (elevation) between X 0 and X is given by the relation: where the generalized Vandermonde matrix B n 0 and the rectangular matrix B n;n 0 are: and the downward relation (reduction) between X 0 and X is given by the relation: where the generalized Vandermonde matrix " B n and the rectangular matrix " B n;n 0 are: Proof.Let X = {x k } and X 0 ={x k 0 } two nested Bézier parametrization supports of degree n 0 and n respectively.It means that: Let {t i } (i = 0, . .., n 0 ) arbitrary partition of [0,1] and {e i } (i = 0, . .., n 0 ) a canonic base of R n 0 .We first muliply the equation ( 18) by e k (k = 0, . .., n 0 ), then: and by adding these equations (19) with respect to k, we find the following formula: where and Since the generalized Vandermonde matrix B n 0 is invertible, it follows that: In what follows, we note by E N n the upward operator from the lower level n to N (N > n) and by R N n the downward operator.By construction, we have this following property: To test these two transfer operators, we use the following model problem: in which x(t) is given, smooth and monotone increasing, p and A are, for specified x(t) > 0 and a > 1, the pseudolength of the arc, and the pseudo-area below the arc.This model problem has been studied in [18] to which we refer for a full description of the numerical test-case which corresponds to a % 2.03, for the functional is known to be convex, and a certain x(t) for which the minimizing shape is the half-thickness distribution of the RAE2822 airfoil.Figure 3a (resp.3b) illustrates the upward process using E 8 6 (resp.the downward process R 8  6 ).The curves corresponding to the both parametrizations are superimposed.

Hessian and gradient transfer
In this section, we present our approach within a multilevel algorithms and adaption procedure.The subject is to good initialize the optimization process when the parametric search space is changed (Sect.3.1) and the control polygon is regularized (Sect.3.2).

First case
As explained in the previous section, the transfer operators guarantee the construction of nested control polygon with higher degree (N > n).To achieve a good transfer of informations, we need to better update the Hessian and the gradient in order to conserve the efficiency of our multilevel strategies.
For fixed support X 0 , let (X 0 , Y 0 ) the solution of the minimization of the problem (25) and c 0 the corresponding shape.By using the upward operator, we construct a new control polygon (X, Y) of degree N corresponding to the same shape c 0 .It follows that: where j N is the cost function of the problem (25) corresponding to the Bézier parametrization of degree N.
To express the Hessian and the gradient on the coarse level (of degree n), we differentiate two times the equation ( 29) with respect to Y 0 : where dY dY 0 ¼ E N n .By differentiating the equation ( 28) with respect to Y , we find: and where R n N ¼ dY 0 dY .We note that (33) and (34) can also be obtained using the property (24).Proposition 1.The Hessian matrix given by ( 34) is positive definite. Proof.Let The matrix B n is the BFGS approximation corresponding to the first optimization phase on the coarse level; thus, it is positive definite.
Let x 2 R N ðx 6 ¼ 0Þ, then where As consequence, the following expression holds: Finally, we can update the Hessian by the matrices B N or B N À1 , it depends on the BFGS method used, instead of identity when the parametrization degree is augmented or reduced.The second phase of the optimization in the fine level of degree N is initialized by an exact Hessian matrix.

Second case
Suppose that we are placed on the search space of degree n. the parametrisation adaption consists to adapt the support X by alternating two complementary phases: 1. Optimization: optimize the design vector Y for fixed support X = X 0 according to some criterion; let Y 0 be the result of this phase.2. Regularization: given the parametrization (X 0 , Y 0 ) of an approximate optimum shape c 0 , the new support X 1 is taken to be the better support for which the total variation (TV) in the components of the corresponding vector Y 1 is minimal, such that the correspondent shape c 1 to (X 1 , Y 1 ) approximates c 0 in the sense of least squares; substitute X 1 to X 0 .
This adaption procedure has been studied extensively in [3] for two-dimentional parametrization.The extension to threedimentional optimum design in aerpdynamics within the framework of the so-called free-form deformation has been proposed in reference [19].It improves noticeably the convergence rate and reduces the numerical stifness.The main idea  consists to redefine the geometrical representation by using regularization techniques.Thus, a new formulation of the Hessian and the gradient is required in order to re-initialize the adaption process by a good regularized solution with equivalent performance.
As the both control polygons (X 0 ,Y 0 ) and (X 1 ,Y 1 ) correspond to the same performance, it follows that: i.e.
J ½B n ðtÞ T X 0 ; B n ðtÞ This intrinsic formulation can be transformed into a parametric optimization by: where y T ðt; X Þ ¼ B n ðsÞ T Y 0 and s ¼ sðt; X Þ is related to the change of support X 0 !X , and defined by this condition: To satisfy the unicity of the solution, we impose that the components of X 0 are monotone increasing.
The derivative of the cost function J X n is given by: where The matrix A(X) is real-symmetric positive definite and depends linearly upon the vector X, thus where A = A 0 (X) is a tensor of order 3, independent of X and stands for the contracted product.The minimization of J X n with respect to Y is equivalent to In particular, (X 0 , Y 0 ) and (X 1 ,Y 1 Þ verify this linear system. For X = X 0 , the first and second derivative of (40) with respect to Y 0 are given by: and By differentiating the linear system (46) with respect to Y 0 , we obtain: where As AðX 1 Þ and dbðX 1 Þ dY 0 do not depend on Y 0 , then: Proposition 2. If all the components of the support X are monotone increasing, then the matrix A(X) given by the formula (43) is positive definite.
Proof.Let Y 2 R n (Y 6 ¼ 0).We have where Z ¼ B n ðtÞ T Y .As the components of X are monotone increasing, then DX > 0 (D is the forward difference operator).So, its enough to prove that Z T Z 6 ¼ 0. We proceed by contradiction and assume that Z T Z ¼ 0, so Z ¼ 0. As B n ðtÞ T form a basis of R n , then Y ¼ 0, which contradicts the first assumption.
h Since X 1 satisfy the condition of the Proposition 2, then the matrix AðX 1 Þ is invertible.Thus: As consequence, The formulas express the relation between the Hessian matrices corresponding to the initial control polygon (X 0 , Y 0 ) and the regularized one (X 1 , Y 1 ).It permits to re-initialize the optimization process after each parametrization adaption.

Numerical experiments
The proposed approach is tested using the model problem, described in Section 3.1, for the first case and the shape-reconstruction problem, described in Section 3.2, for the second one.The numerical optimum is determined by the Quasi-Newton algorithm [20] based on BFGS (Broyden-Fletcher-Goldfarb-Shanno) updated formula.This algorithm is a line search family method, one of the most powerful methods to solve unconstrained optimization problem.It exposes superlinear convergence; resource-intensivity is estimated as Oðn 2 Þ per iteration for n-component argument vector.The search direction is given by s k = ÀH k G k , where G k and H k are respectively the gradient and the Hessian of the  objective function.H k a symmetric definite matrix, constructed at each iteration, which approximates the Hessian matrix and/or its inverse.In the literature, there are several formulas for the calculation of H k+1 from H k .We use here, the BFGS update formula [16], where As a starting point, H 0 can be set to any symmetric positive definite matrix and very often the identity matrix.This initialization has a bad impact on the efficiency of our multilevel and adaption strategies.
In the following numerical experiments, the gradient is calculated analytically and we focus only on the impact of the Hessian transfer.For the model problem, the derivative of the cost function in the search space of degree n is given by:  We use here the formula of the derivative of y 0 (t) given by nB T nÀ1 ÁY , in which D is the n • (n + 1) matrix associated with the forward-difference operator ðÁY ¼ y kþ1 À y k ).
For the shape-reconstruction problem, the gradient corresponds to the linear system (46): To Calculate the gradient on the search space of degree N > n (resp.N < n), we apply the upward operator E N n (resp.the downward operator R N n ) to X and Y and we substitute.Figure 4 provides the convergence history of the basic algorithm for three optimization strategies: at full convergence, with Hessian transfer after 20 iterations and without Hessian transfer (the Hessian is initialized automatically by identity).And Figure 4 gives the corresponding optimal shapes.Clearly, all the optimization achieve the convergence and the main point is the fact that the strategy based on Hessian transfer has no effect on the convergence rate; so, this experiment validate the formulas of the Hessian and gradient.In this experiment, the multilevel optimization is considered here for n = 4 and N = 12.Figures 5a and 5b give respectively the iterative performance and the correspondingoptimal shapes for the multilevel algorithm with and without Hessian transfer.Clearly, the algorithm based on Hessian transfer is faster in term of the convergence rate.This experiment confirms the impact on the performance of the multilevel algorithm when the Hessian matrix is initialized by identity.
Lastly, we measure the effect of the hessian transfer on the accuracy of the optimization when using the adaption procedure.Figure 6a shows a comparison of the basic algorithm for n = 8, with the adaption procedure with and without Hessian transfer.The Figure 6b gives the corresponding optimal shape.As expected, the Hessian transfer guarantees a best convergence rate of the optimization.

Conclusion
The Hessian transfer is essential to preserve and enhance the quality of our multilevel strategies and adaption procedure.This approach permits to better initialize the BFGS Quasi-Newton method.The numerical experiments demonstrate clearly its efficiency in the case of two-dimentional model problem.This study, encourage developing a new multilevel geometrical structure in which, for example in the case of two-level ideal V-cycle [22,23], the problem is solved to complete convergence on the coarse level by robust optimization algorithms (Genetic algorithm [GA], Particle swarm optimization [PSO], simplex method, etc.) and few iterations are enough (acts as preconditionners) in the fine level to alleviate the numerical stifness; BFGS Quasi-Newton can be used in this case.This variant is illustrated by the following schematic.Others strategies can be elaborated mimicking the multigrid-type strategies (saw-touth, V and W cycle, FMG, etc.) [21].
The proposed approach will be generalized for 3D optimum design in aerodynamics, structural mechanics, or coupled discplines within the framework of the so-called Free-Form deformation (FFD) [24].Testing a great number of promising algorithmic strategies is also a part of our ongoing work.

Figure 1 .Figure 2 .
Figure 1.Initial wing shape and mesh in the symmetry plane.

Figure 3 .
Figure 3. Illustration of E N n and R N n by using two embedded control polygons of degree n = 6 and N = 8.(a) Upward process, (b) downward process.