Optimization of wind plant layouts using an adjoint approach

Using adjoint optimization and three-dimensional steady-state Reynolds-averaged Navier–Stokes (RANS) simulations, we present a new gradient-based approach for optimally siting wind turbines within utilityscale wind plants. By solving the adjoint equations of the flow model, the gradients needed for optimization are found at a cost that is independent of the number of control variables, thereby permitting optimization of large wind plants with many turbine locations. Moreover, compared to the common approach of superimposing prescribed wake deficits onto linearized flow models, the computational efficiency of the adjoint approach allows the use of higher-fidelity RANS flow models which can capture nonlinear turbulent flow physics within a wind plant. The steady-state RANS flow model is implemented in the Python finite-element package FEniCS and the derivation and solution of the discrete adjoint equations are automated within the dolfin-adjoint framework. Gradient-based optimization of wind turbine locations is demonstrated for idealized test cases that reveal new optimization heuristics such as rotational symmetry, local speedups, and nonlinear wake curvature effects. Layout optimization is also demonstrated on more complex wind rose shapes, including a full annual energy production (AEP) layout optimization over 36 inflow directions and 5 wind speed bins.


Introduction
Optimizing wind turbine locations within a wind plant is a uniquely challenging problem that combines turbulent flow control with practical engineering challenges concerning the economical development of renewable energy. The problem is further complicated by the strong nonlinear coupling between turbine locations, power production, atmospheric boundary layer turbulence, and mechanical loads on turbine components. Wind plant optimization techniques used in industry often rely on heuristic guidelines and simplified linear flow models that limit computational costs. However, these approaches neglect important turbulent flow physics and can result in the underperformance of wind plants relative to their pre-construction estimates. The reduced power output and increased uncertainty due to optimization with low-fidelity flow models ultimately increases the levelized cost of energy (LCOE) and associated project risk for investors.
Major impediments to improving the LCOE for utilityscale wind plants are the difficulty of performing the turbine layout optimization using fluid dynamical models with sufficient fidelity and performing optimization in highdimensional design spaces. In order to improve wind plant optimization, we present WindSE, a wind plant layout tool that uses a steady-state three-dimensional (3-D) Reynoldsaveraged Navier-Stokes (RANS) flow model and gradientbased optimization. This gradient-based approach, made possible by automated adjoint derivations, yields an efficient, high-fidelity, and high-dimensional wind plant optimization framework. The software is part of the National Renewable Energy Laboratory Wind-Plant Integrated System Design and Engineering Model (WISDEM ™ ) (Dykes et al., 2014), which provides a modular framework for comprehensive analysis and optimization of wind plant designs. Through the use of a higher-fidelity 3-D RANS flow model and gradient-based adjoint optimization, WindSE is able to provide optimized layouts with arbitrary topological complexity at relatively low computational cost, while at the same time more accurately capturing turbulent flow effects present in real wind plants.
This paper is organized as follows: background on wind plant flow modeling, layout optimization, and adjoint optimization techniques is provided in Sect. 2; a description of the methods used in WindSE for implementing the wind plant optimization problem, flow model, turbine representation, numerics, and optimization algorithm is given in Sect. 3; a demonstration of the flow model capabilities and optimization results for idealized and real world test cases with increasingly complex wind roses, culminating in a full annual energy production (AEP) optimization, is presented in Sect. 4; and the paper is concluded with a discussion of the results and implications for wind plant design in Sect. 5.

Wind plant flow modeling
Utility-scale wind plants in the United States typically involve tens to hundreds of turbines arranged in semi-regular arrays. The layout topology is generally an outcome of optimizing the power production or net capacity factor subject to competing influences from site constraints, the local wind resource, and construction costs. These constraints include the patchwork of viable building areas formed by leases and setbacks due to environmental concerns or physical infrastructure, terrain and soil characteristics like slope or vegetation, turbine manufacturer spacing requirements, and continuity requirements imposed by access roads and electrical connections. This results in a complex design problem, with turbine layouts varying drastically between different geographic locations and exhibiting complex topologies.
The AEP from a wind plant layout has traditionally been, and generally continues to be, assessed using reduced-order linear flow models. Such models estimate the relative wind speed across a site based on the linearized Navier-Stokes equations and treat terrain features as perturbations in boundary conditions. The underlying governing equations, introduced by Jackson and Hunt (1975) and implemented in packages such as MS3DJH/3R (Walmsley et al., 1986), are based on analytical perturbation solutions to flow over a low hill. This approach decomposes the terrain into sinusoidal hills and calculates relative speedup effects over each hill. Speedup effects from multiple hills are superimposed to obtain relative velocities over the entire site. The emphasis on relative velocities is motivated by the need to extrapolate from point measurements at meteorological towers to velocity fields covering the entire plant.
A velocity deficit representing the turbine wake is then superimposed on the background wind resource at each turbine location. The PARK model developed by Jensen (1983) and the eddy viscosity model developed by Ainslie (1988) are commonly used, and a comprehensive review of wake modeling can be found in Crespo et al. (1999). Such approaches decouple the wind flow calculation from the wake calculation and use wake models calibrated to a single turbine in isolation. As a result, the wake model approach neglects flow curvature, speed-up effects around turbines, and changes to wakes deep in a plant. This results in known inaccuracies in complex terrain, ad hoc model adjustments for overlapping wakes from multiple turbines, and systemic underpredictions of wake losses and over-predictions of power output (Barthelmie et al., 2009). Despite these limitations, linear flow models and prescribed wake models are commonly used in engineering practice because they are computationally inexpensive and reasonably accurate in simple terrain with few turbines.
Recently, higher-fidelity computational fluid dynamics (CFD) models have been increasingly applied to the study of wind plant performance. Several RANS models have been developed for simulating atmospheric flows in wind plants using actuator disk turbine representations, with a particular emphasis on modified k − closures (Cabezón et al., 2011;El Kasmi and Masson, 2008;van der Laan et al., 2015a, b). These approaches greatly improve upon linearized flow models and, given an appropriate model for the turbulent eddy viscosity, better capture turbulent transport effects in the shear layer at the edge of the wake. Large eddy simulation (LES) is also increasingly being applied to wind plant modeling (Calaf et al., 2010;Porté-Agel et al., 2011;Churchfield et al., 2012), as well as to some aspects of wind plant optimization. These simulations are vastly more expensive than RANS approaches, but resolve time-varying turbulent motions up to the filter cutoff scale. This allows for better characterization of wake meandering and unsteady loads on turbines. Finally, wind plant modeling has also been improved through coupling to numerical weather prediction simulations (Fitch et al., 2012;Mirocha et al., 2014). Such simulations are able to incorporate synoptic-scale weather forcing and interactions between the surface, wind plant, and full atmospheric boundary layer.

Approaches to wind plant optimization
To date, wind plant layout optimization has been performed primarily using linear flow models with analytical wake deficits. The relatively cheap computational cost of such models allows gradient-free methods to be used in the optimization process. A number of layout optimization studies have utilized gradient-free approaches to minimize wake losses (Kusiak and Song, 2010), maximize net present value (González et al., 2010), and minimize noise propagation (Kwong et al., 2012). Other optimization approaches include particle swarm optimizations that determine turbine layouts and rotor diameters (Chowdhury et al., 2012), extended pattern searches for multimodal layout optimization (Du Pont and Cagan, 2012), and even game-theoretic methods (Mar-den et al., 2013). An exhaustive review of wind plant optimization efforts has been compiled by Herbert-Acero et al. (2014) that surveys the wide variety of objective functions, flow models, and constraints that have been studied. Results from these prior optimizations are, however, heavily influenced by the use of analytical wake models which do not fully account for nonlinear and turbulent flow physics. Consequently, significant differences in optimal layouts are expected when using higher-fidelity flow models.
Recently, higher-fidelity CFD flow models have been used in a limited range of wind plant optimization applications. The Technical University of Denmark has developed TOP-FARM (Larsen et al., 2011), which employs an improved wake model (the dynamic wake meandering model) as well as a parabolic Navier-Stokes solver. TOPFARM uses a hybrid optimization approach that combines sequential linear programming (SLP) with gradient-free genetic algorithms. The genetic algorithm is used to find the neighborhood of the global optimum, and then the gradient-based SLP algorithm completes the optimization. However, the genetic algorithm step penalizes large design spaces, limiting TOPFARM to relatively small wind plants. King et al. (2016) used adjoints of a 2-D RANS flow model to optimize turbine locations and observed that the 2-D nature of the flow solver resulted in substantial flow curvature. Meyers and Meneveau (2012) studied the effects of spacing and alignment in infinite wind plants with turbines arranged in regular gridded turbine arrays. In their study, a pseudospectral LES was used to model atmospheric boundary layer turbulence and complex wake interactions, but the optimized layouts were restricted to grids, as opposed to a more general layout topology optimization where turbines are free to arrange in non-gridded configurations. LES results have also been used to tune linear flow models that were used in optimization of yaw control , turbine layouts (Bokharaie et al., 2016), and coupled layout and yaw optimization , or in the tuning of RANS models for wind plant control optimization (Iungo et al., 2016). Wind plant LES has also been used to directly perform adjoint optimization of wind plant controls by adjusting rotor thrust during operation (Goit and Meyers, 2015;Goit et al., 2016). These approaches benefit from high-fidelity CFD and leverage the power of adjoint optimization but have only been applied for fixed layouts. Finally, Funke et al. (2014) performed an adjoint optimization of ocean turbine layouts using an analysis based on the shallow water equations, with ocean turbines represented by increased bottom friction.
In the present study, we use adjoint techniques to enable gradient-based optimization of wind turbine locations within a plant, subject to realistic turbulent flow fields. A steady 3-D RANS flow solver is employed as a first-principles model that can predict new turbulent flow physics, rather than prescribing fixed wake behaviors as in linear flow models. The RANS model provides an accurate model of a neutral atmospheric boundary layer at moderate computational cost and without requiring calibration using LES. The RANS model is also amenable to automatic differentiation and gradientbased optimization, as explained in the next section. This gradient-based approach permits the use of high-dimensional control spaces, thereby providing optimized layouts of arbitrary complexity (i.e., optimized layouts are not restricted to grids or any other regular arrangement). We further demonstrate layout optimization for the full plant AEP based on a real site wind rose, going beyond the uniform speed optimization considered previously (King et al., 2016). The resulting optimization framework thus represents a novel application of adjoint techniques to the optimization of utilityscale wind plants using a higher-fidelity CFD flow model.

Adjoint techniques for efficiently calculating gradients
The greater expense of CFD flow models such as RANS and LES compared to linearized flow models requires the use of an efficient optimization technique that minimizes the number of flow model evaluations. Gradient-based optimization methods are promising candidates for CFD-driven wind plant optimization since they require orders of magnitude fewer function evaluations than gradient-free techniques. However, finding these gradients can be challenging when using complex flow models or when there are many control variables. Calculating such gradients with a finitedifference approach requires a function evaluation for each control variable, making this approach prohibitively expensive for optimizing utility-scale plants with many turbines. In the present optimization framework, the necessary gradients are obtained relatively inexpensively by using adjoint optimization techniques. A comprehensive review of discrete techniques for calculating gradients of engineering design problems, including the adjoint approach, can be found in Martins and Hwang (2013). The adjoint approach allows one to calculate gradients at a cost that scales with the dimension of the objective function rather than with the number of design variables. For wind plant optimizations with scalar objective functions, this means that gradients can be found at a fixed cost regardless of the number of control variables.
Adjoint systems arise naturally from consideration of dynamical systems whose behavior can be described by differential equations, and they have a rich history in fluid dynamics and sensitivity analysis. Reviews of adjoint techniques in various flow control and optimization applications have been published by Jameson (2003), Giles and Pierce (2000), Giannakoglou and Papadimitriou (2008), and Luchini and Bottaro (2014). At its core, the adjoint approach reverses the propagation of information in a dynamical system. Given an ordinary or partial differential equation governing the evolution of a dynamical system and an objective function measuring a quantity of interest, the adjoint approach produces a differential equation whose states evolve backward in time and indicates where a perturbation would maximally influence the objective function. By reversing the flow of information, the adjoint reveals what is effectively the optimal open-loop control input.
The adjoint operator is defined by the bilinear identity Au, v = u, Bv , where operator B is adjoint to operator A. This holds if A is a continuous differential operator, in which case B is found through integration by parts, or if A is a matrix, in which case B = A * , where A * is the complex conjugate transpose of A. This bilinear identity reveals that the adjoint operator is implicitly defined by the forward model, but in order for the adjoint problem to be well posed, additional constraints are required. Because the adjoint travels backwards in time, these constraints are terminal conditions rather than initial conditions, and they come from the specific objective function under consideration as well as from the states produced by the forward model. The RANS model considered in this study is steady state in time and consequently the adjoint system is also steady state, equivalent to solving a single time step in the dynamical systems interpretation.
The following general framework illustrates the computational advantages of the adjoint approach. Consider a dynamical system with governing equations that can be expressed in residual form as F (u,m) ≡ 0, where F is a vector-valued differential equation (e.g., the RANS equations), u ∈ R n is the system state vector (e.g., the flow field velocities), and m ∈ R m is a vector of control variables (e.g., the wind turbine coordinates). Additionally, consider a scalar objective functional J (u,m) ∈ R that measures a quantity of interest (e.g., the LCOE). Many engineering problems can be formulated as constrained optimization problems that seek optimal states and control parameters to minimize J , namely where h and g are additional equality and inequality constraints on the control parameter m, such as upper and lower bounds on a control input (e.g., wind plant site boundaries).
In common engineering problems, F is a partial differential equation (PDE) that is expensive to evaluate, and the dimensions of both the control space and state space are high. Efficiently solving this PDE-constrained optimization problem requires algorithms that scale well to high dimensions and that minimize the number of evaluations of F. A common approach to solving a PDE-constrained problem is to minimize a reduced functionalĴ (m) ≡ J [u (m) , m]. This formulation takes advantage of the fact that the PDE constraint F (u,m) = 0 implicitly defines the state u in terms of the control m. Minimizing the reduced functional is equivalent to minimizing the original functional if the governing equations F produce a unique state u for a given set of controls m, and if F [u (m) , m] is assumed to be continuously invertible so that u (m) is continuously differentiable (Hinze et al., 2009).
Gradient-based optimization algorithms can approach second-order convergence to local minima and can minimize the evaluations of F, but require the gradient of the objective functional with respect to all of the control parameters. This gradient, dĴ /dm, is given by the chain rule as Since J (u,m) is generally a user-defined function, ∂J /∂m ∈ R 1× m and ∂J /∂u ∈ R 1×n are straightforward to determine analytically. However, du/dm ∈ R n× m is expensive to compute for high-dimensional control and state spaces. A finitedifference approach to calculating this gradient would require m evaluations of F, which is intractable for many engineering problems.
In the adjoint approach, du/dm is eliminated from Eq. (2) by taking the derivative of the PDE constraint F (u, m) = 0 with respect to m, resulting in the tangent linear system Solving for du/dm in Eq.
(2) then yields where we have introduced the new adjoint variable . This variable maps source perturbations in F (u,m) = 0 to sensitivities of J . By definition, it is governed by With a solution of the forward model F and the adjoint , the derivative of the objective function expressed in Eq. (2) can then be calculated as Because Eq. (5) is independent of m, the gradient can be calculated at a fixed cost regardless of the dimension of m. This enables efficient gradient-based optimization for systems governed by computationally demanding PDEs. Equation (5) provides an intuitive interpretation of the adjoint as a linear mapping between source perturbation in the governing equations and changes in the objective function. The adjoint thus provides the optimal control input and directly encodes the sensitivities ofĴ . Alternatively, the adjoint can also be interpreted as a Lagrange multiplier enforcing the PDE constraint F.

Methodology
In the present study, wind plant layout optimization is approached as a PDE-constrained optimization problem using the adjoint theory developed in the previous section. The wind plant power output is maximized with gradient-based optimization techniques, subject to a PDE constraint corresponding to the RANS equations, which are used to predict turbulent flow within the plant.

Optimization problem definition
We seek to maximize steady-state power output from N different turbines experiencing K different inflow wind states (each state is defined by a wind speed and direction) by controlling the 2-D Cartesian positions of the turbines, x = (x 1 . . .x N ) and y = (y 1 . . .y N ). The Reynolds-averaged velocity field u i ∈ R 3 and pressure field p ∈ R are taken to be the states (collectively denoted u) and the turbine coordinates are the control variable m = x, y T ∈ R 2N . This leads to the following optimization problem: where J [u (m) , m] is the negative of the total power production, α k are weights for each inflow state from the wind speed distribution and wind rose describing the site climatology, A n is the turbine rotor area, and ρ is the air density. The PDE constraints are the steady 3-D RANS and continuity equations which govern the evolution of u i and p. Boundary conditions corresponding to the log law are applied on inflow, top, and bottom boundaries, and outflow boundary conditions are applied on the sides and exit of the domain as further described in Sect. 3.3. The RANS equations have an additional body force term f AD,n that represents the force imparted on the flow by each wind turbine. The Boussinesq hypothesis is used to close the deviatoric Reynolds stress term τ ij by making use of an eddy viscosity ν T . The eddy viscosity is calculated with a mixing length model (Wilcox, 2006), where the mixing length mix is taken to be one-eighth the vertical distance from the bottom surface. A mixing length model was chosen because of its simplicity, differentiability, and familiarity to the wind energy community from use in traditional wake models (Ainslie, 1988) and in recent RANS models developed for real-time wind plant controls (Boersma et al., 2016). The power and thrust coefficients for turbine n, denoted c p,n and c t,n , respectively, are derived from actuator disk theory and are described in the next section. The variables ϕ n and β n are geometric smoothing kernels and normalization constants, respectively, and are also described further in the next section. The site constraints are imposed as lower, L x,lower and L y,lower , and upper, L x,upper and L y,upper , bounds on the turbine x and y locations. An inter-turbine minimum spacing constraint is further enforced with a minimum spacing D min of 3 times the rotor diameter (RD) used in the AEP simulations. The turbines are assumed to always yaw into the wind and the turbine body force is thus directed into the incoming wind by the unit vectorn k . Note that the subscript n in the above system of equations represents one of the N different turbines and the subscript k denotes one of the K different wind states included in the analysis. As described in the previous section, a reduced functional , m] is used in the adjoint optimization approach.
We stress that the strength of this approach is not the particular form of the RANS closure model, but rather its embedding in an adjoint optimization framework. Different objective functions, turbulence closures, or turbine representations can be easily implemented in this framework and still benefit from the adjoint approach.

Wind turbine representation
Turbines in the simulations are represented as non-rotating actuator disks using actuator disk theory, as described in standard wind energy texts (Burton et al., 2011). The power, P , and thrust force, T , generated by an actuator disk are given in terms of a power coefficient, c p , a thrust coefficient, c t , and an upstream reference velocity, u ref , as where power and thrust coefficients are given in terms of an axial induction factor a as The axial induction factor describes the velocity at the rotor disk, u rotor = u ref (1 − a), as a fraction of the upstream reference velocity. The reference velocity is typically taken to be the far-field upstream velocity for a turbine in isolation, but as discussed by Sanderse et al. (2011), the determination of u ref for waked turbines or in complex terrain is more difficult. We adopt the same approach used by Calaf et al. (2010) and Meyers and Meneveau (2012) and define modified power and thrust coefficients, c p and c t , respectively, that are based on the local rotor disk velocity u rotor rather than u ref . These modified coefficients are resulting in power and thrust given by For this study, turbine operating parameters of c t = 3/4, c p = 0.34, and a = 1/4 are used for all turbines, consistent with values used in previous CFD studies (Calaf et al., 2010;Meyers and Meneveau, 2012;Jimenez et al., 2007). This results in modified power and thrust coefficients of c p = 0.81 and c t = 4/3. Standard actuator disk theory assumes that the rotor disk is uniformly loaded, but this introduces singularities at the rotor edge. To ensure that the thrust force and power production are continuously differentiable, as well as to avoid numerical instabilities, the turbine force and power production are smoothly distributed across the rotor swept area. This differentiability with respect to position is crucial for gradientbased layout optimization, and smoothly distributing rotor forces is common practice in actuator disk (Wu and Porté-Agel, 2010) and actuator line (Churchfield et al., 2012) implementations. This is accomplished here using a geometric smoothing kernel ϕ. The following multivariate exponential distribution is used as the smoothing kernel in order to create a rotor with almost compact support that is still continuously differentiable: where ϕ n (x, y, z ; x n , y n , z n ) is the distribution for the nth turbine centered on (x n , y n , z n ), r is the rotor radius, w is the rotor half-width, and γ controls the sharpness of the rotor edge. Finally, we also introduce a constant β n given by which is used to normalize the values of the smoothed actuator disk. Both 1-D and 2-D prototypes of this smoothing function are shown in Fig. 1a and b for γ = 6 and w = r/4 with r = 40 m, and a hub-height slice of the actuator disk representation of a small wind plant is show in Fig. 1c.

Simulation setup
The 3-D computational domain used in the simulations has horizontal dimensions of 2.4 km × 2.4 km and a vertical dimension of 640 m (equivalent to 30 × 30 × 8 RD, with RD = 80 m), as shown in Fig. 2. Dirichlet boundary conditions are used to prescribe the wind speed and direction on inflow, top, and bottom boundaries, corresponding to the planes at x = −1.2 km, z = 640 m, and z = z 0 , respectively. The inflow velocity profile is specified according to a neutral logarithmic velocity profile where z is the vertical coordinate, z 0 is the roughness height, u * is the friction velocity, and κ = 0.4 is the von Kármán constant. The roughness height is taken to be z 0 = 0.04 m, corresponding to open and relatively smooth terrain, and u * is found by solving for the desired hub-height velocity. A no-slip boundary condition is used on the bottom boundary and the velocity along the top boundary is u in (640 m) from Eq. (14). On outflow boundaries (i.e., the planes at x = 1.2, y = −1.2, and y = 1.2 km), a standard "do-nothing" finiteelement outflow boundary condition (Heywood et al., 1996) is applied. The initial velocities at all locations in the domain are given by the logarithmic profile in Eq. (14), and the initial relative pressure is assumed to be 0 bar everywhere. A coarse computational mesh of finite elements is generated for the entire domain and then further refined within a circle that circumscribes the site boundaries, as shown in Fig. 2. For all tests presented here, the site boundary is assumed to be a square of side 18 RD centered in the computational domain. This area corresponds to a 6 RD spacing in streamwise and spanwise directions if 16 turbines are arranged in a regular grid. The resolution of the finite elements used in the simulation can be quantified by the radius of a circle that circumscribes a single tetrahedral element, termed the circumradius. The mesh is refined such that the turbine rotor diameter is at least 4 times larger than the circumradius of the finite elements within the wind plant site boundaries. The mesh is also stretched by a factor of 1.2 in the vertical direction in order to increase resolution near the bottom boundary. This results in a mesh with approximately 200 000 degrees of freedom.
To confirm that the domain is sufficiently large for the number of turbines, we examined the blockage ratio. In wind tunnel studies, the blockage ratio is typically defined as the ratio of the total tunnel cross-sectional area to the total rotor disk area. A worst-case blockage scenario would involve all 16 turbines impeding the flow with no overlap between rotors. This worst-case results in a blockage ratio of 5.2 %, which is well below the threshold of 10 % blockage that requires a correction in wind tunnel studies of horizontal-axis wind turbines (Chen and Liou, 2011).  Simulations are performed for a range of different inflow directions and inflow speeds, corresponding to both idealized and real-world wind roses and wind speed distributions. Steady-state solutions are found for each of the K wind states. We assume that the turbines are always yawed into the wind such that the rotor disk is perpendicular to the inflow direction. The computational domain and boundary conditions are assumed constant and the control variables containing the turbine locations are defined in a fixed reference frame. Different wind directions are modeled by applying a rota-R. N. King et al.: Adjoint optimization of wind plant layouts tion to the turbine coordinates corresponding to the inflow angle when calculating actuator disk forces. This rotation is included in the adjoint calculation and the resulting gradients are with respect to changes in the reference frame turbine coordinates. A weighted sum of the power production for each wind state (i.e., speed and direction) is performed based on the site wind speed distribution and wind rose, and the adjoint gradient is calculated over the total power output, taking into account the layout rotation. Because the boundary conditions are part of a well-posed PDE used as an optimization constraint, this approach of rotating the layout is preferred to explicitly changing the inflow direction because the PDE constraint in the optimization is kept constant. Figure 3 shows a schematic of the multi-state optimization workflow. The layout is optimized over all inflow states simultaneously in a multilevel optimization process with respect to the total power output rather than in a sequential optimization process over each inflow separately. The basic optimization process is the following:

Gradient-based layout optimization process
1. Begin with an initial layout m i , which can be either a gridded or random layout of the N turbines.
2. Perform flow-field simulations for each of the K desired wind states (corresponding to different wind speeds and directions obtained from either real or idealized wind roses and speed distributions).
3. Calculate the negative of the wind plant power, J k , for each of the K wind states.
4. Calculate the objective function by taking a weighted sum J = α k J k over all K wind states, where α k is the relative probability of each wind state obtained from either a real or idealized wind rose and speed distribution.
5. Compute adjoint simulations for the forward simulations, and calculate the gradient of the reduced functional dĴ /dm.
6. Use the gradient to perform a gradient-based optimization of the layout to obtain m i+1 and go to step 2 above.
This process is repeated until the change in the objective function or its gradient falls below a user-defined threshold, which typically occurs in 30-50 iterations.
It should be noted that, because the optimization algorithm is gradient-based, it finds local rather than global minima in the total objective function J . The layout optimization problem can have many local minima, particularly with just a few inflow wind states. Starting with a regular gridded layout slightly smaller than the site constraint was generally observed to produce reasonable results since none of the variables are initially constrained and a gridded layout is often a "good enough" guess to be in the radius of convergence to the global optimum. This initial layout, used for all simulations described in the following, is shown in Fig. 2. However, gradient-based methods are still fundamentally local searches and cannot provide strong assurances of finding a global minima. We note that with many inflow states (i.e., for large K), the optimization problem actually becomes more convex and convergence is achieved in fewer iterations. Additionally, running the optimization from many different starting configurations generated by random sampling or Latin hypercube samples can be used to further characterize the robustness of the minima.

Numerical implementation
The WindSE flow solver is implemented in a software package called FEniCS (Logg et al., 2012), which automates the solution of PDEs using the finite-element method. FEniCS is written in Python and can be easily integrated with other Python-based systems engineering tools like WISDEM. The FEniCS project is based on the DOLFIN problem-solving environment and connects a number of useful components for the automatic discretization and solution of finite-element problems. These components include a form language that allows users to specify equations in variational form using a syntax that closely resembles their mathematical description, automated compilers that generate finite-element forms for a chosen basis, and just-in-time compilation to C++ to enhance computational speed. FEniCS can interface to common HPC libraries such as PETSc and Trilinos for numerical linear algebra, ParMETIS and SCOTCH for domain decomposition, and MPI and OpenMP for parallelization. FEniCS has been extensively tested and validated on a number of computational problems in solid and fluid mechanics, eigenvalue problems, and coupled PDEs (Logg et al., 2012).
The description of finite-element problems as variational forms in FEniCS lends itself to highly abstracted algorithmic differentiation. The software package dolfin-adjoint (Farrell et al., 2013) performs a highlevel algorithmic differentiation of a forward problem implemented in FEniCS and can derive both discrete adjoint and tangent linear models. The discrete adjoint and tangent linear models are important in the gradient-based optimization algorithms that are used in data assimilation, optimal control, and error estimation. The algorithmic differentiation routines in dolfin-adjoint act on the forward problem discrete equations described in variational form and stored in memory. The corresponding discrete adjoint equations are derived in dolfin-adjoint and passed back to FEniCS as an additional PDE problem. This approach operates at a higher level of abstraction than traditional algorithmic differentiation, which typically treats forward models as a series of elementary instructions involving native operations in lowlevel code like addition and multiplication. This higher level of abstraction gives dolfin-adjoint greater flexibility and automation across a wide range of PDE applications because it avoids differentiating across low-level code where the mathematical and implementation details have been intermixed. Moreover, dolfin-adjoint can be implemented on unsteady and nonlinear PDEs, and can also be run in parallel. It can directly interface to the optimization algorithms in SciPy and also contains routines for checking the correctness of adjoint gradients and checkpointing.
The 3-D RANS and continuity equations that form the PDE constraint in Eq. (7) are solved with a nonlinear Newton solver in a coupled fashion using a mixed finite-element space with piecewise linear elements for both the velocity and pressure fields. To satisfy the Ladyzhenskaya-Babuška-Brezzi (LBB) (or inf-sup) compatibility condition (Brezzi and Fortin, 1991) with equal-order basis functions, we augment the momentum equation with an additional pressurestabilized Petrov-Galerkin term that weights the residual of the momentum equation by the gradient of the pressure test function. This pressure-based stabilization alleviates the saddle-point nature of the equal-order finite-element problem (Donéa and Huerta, 2003) but still vanishes for the exact solution to the momentum equation. Each nonlinear solve is initialized with the base logarithmic velocity profile and the relative residual is converged to below 10 −7 with Newton's method. The Newton solver uses Jacobians derived automatically within FEniCS and linear systems are solved directly with the sparse, parallel solver MUMPS (Amestoy et al., 2000). The choice of equal-order piecewise linear mixed finite-element spaces differs from previous studies on wind and ocean turbine layout optimization in FEniCS (King et al., 2016;Funke et al., 2014) that used Taylor-Hood mixed finite-element spaces that are piecewise quadratic for the velocity field. The lower-order representation in this study was necessary when implementing a 3-D solver to keep the to-tal degrees of freedom sufficiently low so that a direct linear algebra solver could be used.
The gradients obtained from dolfin-adjoint are used to optimize turbine locations with Python's SciPy implementations of the sequential least-squares programming (SLSQP) or limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm with bounds (L-BFGS-B) algorithms. SLSQP is a gradient-based optimization algorithm that can also handle constraints (Nocedal and Wright, 1999). SLSQP minimizes a quadratic approximation to the objective function at each optimization iteration, with a linear approximation of the constraints. The L-BFGS-B algorithm (Byrd et al., 1995) is a limited-memory version of the popular BFGS algorithm that approximates the inverse Hessian matrix used in quasi-Newton methods. We use the L-BFGS-B algorithm for simpler test cases without inter-turbine spacing constraints as in Sect. 4.2 and 4.3. In Sect. 4.4 we consider a real-world AEP optimization over a full wind rose and do enforce inter-turbine spacing constraints, which requires the use of the SLSQP algorithm. Gradients of the objective function are provided by dolfin-adjoint and gradients of the minimum turbine spacing constraint are calculated analytically.
The forward and adjoint problems are parallelized with MPI and can be run on a desktop or in a high performance computing environment. The discrete adjoint calculation is automatically parallelized by dolfin-adjoint if the forward model is run in parallel, which drastically simplifies code development.

Results
In the following, we present results for several different layout optimization cases. First, we provide results for standard test cases of flow past a single turbine and flow through a  very deep wind plant in order to demonstrate that the RANS flow solver accurately captures wind turbine wakes, thereby providing confidence that subsequent layout optimizations are performed according to the correct flow physics. Second, we optimize a 16-turbine wind plant using wind roses with evenly weighted wind directions and a constant wind speed of 8 m s −1 in order to demonstrate new layout insights and optimization heuristics when accounting for nonlinear flow effects with a high-fidelity model. Third, we optimize a 16turbine wind plant using unevenly weighted wind roses that exhibit complex directional preferences, again with a constant wind speed of 8 m s −1 . Finally, we perform a full AEP optimization using data from the M2 meteorological tower at the National Wind Technology Center (NWTC) to demonstrate the capabilities of WindSE when optimizing layouts for conditions that are representative of real wind plants. In the full AEP optimization, both the wind direction and wind speed are varied, resulting in approximately 100 total wind states over which the optimization is performed. In all cases, we assume a turbine with a rotor diameter of RD = 80 m operating with fixed power and thrust coefficients (as described in Sect. 3.2) and a site constraint that is a square of side 18 RD centered in the computational domain.  Table 1. Relative efficiency of power production from the test cases compared to 16 turbines with no wake effects. Results from a "naïve" case with two parallel turbine rows and no layout optimization are also included to demonstrate the improvements achieved by the layout optimization.

RANS model testing
As a test of the qualitative performance of the RANS flow solver, WindSE was used to simulate flow fields for both a single turbine and a deep wind plant. Figure 4 shows vertical and horizontal velocity profiles in the wake of a single wind turbine. Consistent with theoretical expectations and previous results, a logarithmic velocity profile is observed in the undisturbed flow upstream of the turbine, a Gaussian velocity deficit is observed in the turbine wake, and a gradual wake recovery is observed with increasing distance downstream from the turbine. Moreover, a slight speedup around the edges of the wake is observed near the turbine rotorsuch speedups are not captured by traditional linear wake flow models and appear here due to the use of the higherfidelity RANS flow solver, which more accurately captures nonlinear flow physics. Figure 5 shows the velocity field from a simulation of a wind plant with 10 turbine rows perpendicular to the incoming wind direction. This flow field qualitatively agrees with time-averaged LES results reported by Wu and Porté-Agel (2013). The results presented in this section are only intended to demonstrate the qualitative agreement of the RANS solver with prior high-fidelity studies and analytical wake theory. This demonstrates that the present flow solver captures a rea-sonable level of fidelity to introduce the adjoint optimization framework. A detailed turbulence model verification and validation is beyond the scope of this study as we are instead focused on the integration of the model into a flexible and automated adjoint optimization framework. A comprehensive study on the implementation of more sophisticated turbulence models within the WindSE framework is left for future research.

Layout optimization for evenly weighted wind roses
In order to demonstrate many of the characteristics of optimal layouts and flow fields found using WindSE, Fig. 6 shows results obtained for wind roses with an increasing number of evenly weighted inflow directions. In all cases, the wind speed was assumed constant at 8 m s −1 . During the optimization, the turbines are constrained to stay within an 18 RD × 18 RD region in the center of the computational domain (see also Fig. 2). This region serves as the site boundary in the present tests. The turbines are initialized on a regular grid that spans two-thirds of the site constraints (as shown in Fig. 2) and no inter-turbine spacing constraint was applied. Convergence was achieved in approximately 30-50 iterations of the optimization loop shown in Fig. 3.  Figure 7. Spanwise velocities produced by the optimized layout found in the five-inflow-direction case shown in Fig. 6d. The spanwise velocity is normalized by the hub-height inflow velocity, indicating strong curvature at the rotor and a slight overall deflection of the flow away from the center of the wind plant. Figure 6 shows that optimal layouts are symmetric about a rotation or reflection when the wind directions are evenly weighted. For the two-direction and six-direction cases shown in Fig. 6a and e, the layout is symmetric about a 180 • rotation and for the four-direction case shown in Fig. 6c, the layout is symmetric about a 90 • rotation. For odd numbers of inflow directions, Fig. 6b and d show that optimal layouts are symmetric about a horizontal reflection across the northsouth axis. The rotational and reflectional symmetries of the layouts for evenly weighted, uniform speed wind roses are useful heuristics for checking the flow solver and optimization results.
The flow fields shown in Fig. 6 indicate that the RANS solver captures flow curvature due to nonlinear turbulent transport effects and pressure increases upwind of the rotor disk. These effects are neglected when superimposing a wake deficit on a flow field obtained from a linear model, as is typically done in many industry-standard optimization frameworks. The flow curvature results in local speedup effects between two turbines, or just outside a wake, that downwind turbines take advantage of in strongly directional wind roses. This effect is further demonstrated in Fig. 7, which shows nonzero spanwise velocities generated by the flow curving around the turbines. This curvature is responsible for the "staggered" appearance of many of the layouts in Fig. 6 where the optimizer takes advantage of these local speedups. Because the power production scales with the cube of the wind speed, these small speedups can have a strong nonlinear effect on the power output. This speedup effect due to flow curvature is particularly enhanced in 2-D, but is still present in the 3-D simulations.
Flow curvature also affects the propagation and interaction of the turbine wakes. Wakes near the edge of the plant are slightly deflected away from the plant center and reflect the overall spreading of the flow streamlines. This curvature can be observed near the edges of the plant in the cases shown in Fig. 6. Such curvature is again not captured by engineering wake models which prescribe wakes that always travel perpendicular to the rotor. Additionally, the wakes are pinched, curved, or merged when encountering speedups around downwind turbines or other wakes. The RANS flow solver accounts for the effects of other turbines and their wakes on the expansion and dissipation of wakes beyond what is accounted for in prescribed wake models. The deflection and curvature of wakes likely has important ramifications for yaw control strategies that attempt to steer wakes away from downwind turbines.
It is emphasized that the results in Fig. 6 show that the optimizer does not place turbines in straight rows perpendicular to a single predominant wind direction or in a regularly spaced grid for these evenly weighted wind rose cases. Instead, the turbines are placed closer together and staggered to take advantage of local speedups between laterally placed turbines. This is a different strategy than the maximized spacing found when optimizing with linear flow models and prescribed wakes, and is likely sensitive to the number and size of wind direction bins.
We compared the optimized layout for two inflow directions shown in Fig. 6a to a case with two parallel rows of turbines aligned perpendicular to the inflow directions and with maximal spacing between the rows as allowed by the site constraints. This "naïve" strategy is often found when using linear flow models. The optimized layout increases power production by 18.4 % over the "naïve" layout due to the strongly directional speedups in this case. We also examined the power production of the test cases when normalized by the total power available if all turbines were unwaked, shown in Table 1. The optimized layouts substantially reduce, but do not entirely eliminate, the wake losses.

Layout optimization with unevenly weighted wind roses
In the previous section, each inflow direction was given an equal weight (i.e., α k ) in creating the total objective function J . However, real-world wind roses are seldom so simple and typically have several preferred wind directions, with many other less dominant directions. Here we demonstrate optimization of a 16-turbine layout using the same computational domain and setup as in the previous section, but with unevenly weighted wind roses that have two and three dominant directions. Once again, in both cases we assume a constant 8 m s −1 wind speed and do not enforce inter-turbine spacing constraints. The first wind rose considered has two prominent wind directions spaced 107 • apart, with a random normal distribution of directions around the primary axes. The wind rose is binned into 36 directions and is roughly shaped like a  boomerang, as shown in Fig. 8. Compared to the initial uniform gridded layout (see Fig. 2), the optimized layout shown in Fig. 8 improves power production by 9.4 %. Despite the uneven weighting of the wind rose, the resulting flow field in the right panel of Fig. 8 once again conforms to many of the heuristics outlined in the previous section, including turbines that take advantage of local speedups between upstream turbines and slight flow curvature at the edge of the plant.
The second wind rose considered is given by the directional distribution of 8 m s −1 wind speeds from the M2 mast at NWTC (Jager and Andreas, 1996), as shown in Fig. 9. The wind rose is constructed from publicly available data recorded over the 2015 calendar year. For this wind speed, the wind rose has three prominent directions roughly aligned with inflow from the north, south, and west, along with many other much less prominent inflow directions. The wind rose is again binned into 36 directions (giving K = 36) and the optimized layout and resulting flow field are shown in Fig. 9. Compared to the initial regular gridded layout shown in Fig. 2, the power production of the optimized layout is improved by 7.0 %. Once again, despite the much more topologically complex nature of the optimized layout shown in Fig. 9, heuristics such as local speed-ups, staggered spacing, and flow curvature are shown to be important in determining the final layout. It is also emphasized that for such a complex wind rose, the resulting layout bears little resemblance to a regular grid where the turbines would be spaced as far from each other as possible.

Layout optimization based on annual energy production
As a final demonstration of the power and flexibility of WindSE, we optimize a 16-turbine wind plant based on the full AEP from real-world site data. In this case, wind states corresponding to both different wind speeds and wind directions are considered, instead of simply considering a single uniform wind speed as in the tests described in the previous two sections. Data are once again used from the M2 mast at NWTC (Jager and Andreas, 1996), and distributions are formed by binning the data into 36 wind directions and 5 wind speed classes centered on 4, 6, 8, 10, and 12 m s −1 , giving a total of K = 180 possible wind states in the analysis. Wind states that occurred less than 0.2 % of the time were neglected in the optimization, reducing the total number of states considered to 99. We further enforce a minimum interturbine spacing D min of 3 times the rotor diameter. As shown in Fig. 10, the wind rose for the M2 mast is predominately distributed along the west-northwest direction, with secondary influences from the north and south. The highest wind speeds are also observed for winds from the west-northwest, and so it can be anticipated that a full AEP optimization will result in a layout that is preferentially suited for winds that blow from this direction. This is indeed the case, as shown in the final optimized layout in Fig. 10. The turbines are loosely arranged in two northsouth rows that result in relatively large separations between upstream and downstream turbines when the wind is from the west-northwest. As with other tests for evenly and unevenly weighted wind roses with uniform wind speeds, the turbines in the full AEP optimization are staggered with respect to each other in order to take advantage of local speedups between upwind turbines. The resulting optimized layout im-proves AEP by 8.6 % compared to the initial regular gridded layout shown in Fig. 2.
It should be noted that despite the predominant high-speed winds from the west-northwest, the turbines in the northsouth rows shown in Fig. 10 do not each fall perfectly on the site boundaries. That is, if only winds from the westnorthwest were included in the analysis, one might naïvely place all turbines along the upstream and downstream site boundaries in order to maximize the separation between turbines in the direction of the dominant wind. However, since other wind directions are also included in the analysis, the final optimal layout is more complicated and ensures some degree of turbine staggering in the east-west direction to take advantage of local speedups and minimize wake losses when the wind blows from the north or south. It is also emphasized that simply by accounting for the full AEP in the optimization, the optimal layout in Fig. 10 is substantially different from the optimal layout shown in Fig. 9, where only a single uniform wind speed was considered.

Summary and conclusions
WindSE represents a fundamentally new approach to utilityscale wind plant layout optimization that implements adjoint optimization of turbine locations in a flexible CFD framework. The steady-state 3-D RANS flow model advances the fidelity of the fluid dynamics in wind plant optimization simulations, but also requires gradient-based optimization techniques because of the expense of solving the RANS PDE constraint. The adjoint optimization framework in WindSE provides these gradients, and further enables efficient, high-dimensional optimization of very large control spaces. This enables layout optimization with a first-principles flow model without running expensive LES for tuning purposes. The results presented in this paper are achieved at a relatively low computational cost as all optimization results were obtained on a single workstation with a six-core Intel Xeon processor and 32 GB of memory.
The results presented in this paper show that the nonlinear flow effects leading to wake curvature and local speedups are significant when optimizing over a few prominent wind directions. We find consistent rotational symmetry in the optimal layouts with evenly weighted inflow directions, suggesting that evenly weighted wind roses may be useful diagnostic tests for wind plant optimization. As the wind rose is refined into more bin directions, the optimizer is able to take advantage of prominent wind directions and increase energy production. As the number of wind direction and inflow speed combinations increases, WindSE is able to perform a full AEP optimization and achieve sizable gains of almost 9 % compared to initial gridded layouts. In the full AEP optimization, the optimizer emphasizes the high speed winds from the west-northwest, since these winds contain the most energy. However, rather than aligning the turbines into rows perpendicular to the incoming wind, the turbines are offset within the general row structure. This is beneficial when the wind is blowing parallel to the row, which is the most common secondary wind direction. It also allows the turbines to take advantage of slight speedup effects around the edges of the wake from upstream turbines. The gradient-based techniques used in this study are inherently constrained to a local search, however, and further research is needed to assess what types of initial layouts are needed to draw conclusions about global optima.
A number of different future studies utilizing WindSE can be imagined. Because the adjoint approach is insensitive to the number of control variables, coupled optimization of turbine locations, hub height, rotor diameter, control settings, etc. can be considered. The flow curvature captured in WindSE is observed to deflect or modify turbine wakes and will likely have important implications in yaw control optimization applications. The high level of abstraction and automation also makes WindSE a useful framework for studying the effects of different turbulence closure models on optimal layouts. We further intend to compare layouts optimized using WindSE and linear flow models by testing both layouts using LES. Additionally, the finite-element method used in WindSE is, in principle, also well suited to examining layout optimization in the presence of terrain-induced complex flows. Finally, the integration of WindSE within NREL systems engineering tools will enable economic analysis and a consideration of LCOE alongside AEP in future optimization studies.
Data availability. Simulations were performed using the WindSE code that will be made available through NREL's WISDEM software tool (Dykes et al., 2014), and can be accessed at https://nwtc. nrel.gov/WISDEM. The NWTC wind data used for layout optimizations in Sect. 4.3 and 4.4 are available at https://www.nrel.gov/ midc/nwtc_m2/ (Jager and Andreas, 1996).