We define and demonstrate a procedure for quick assessment of site-specific lifetime fatigue loads using simplified load mapping functions (surrogate models), trained by means of a database with high-fidelity load simulations. The performance of five surrogate models is assessed by comparing site-specific lifetime fatigue load predictions at 10 sites using an aeroelastic model of the DTU 10 MW reference wind turbine. The surrogate methods are polynomial chaos expansion, quadratic response surface, universal Kriging, importance sampling, and nearest-neighbor interpolation. Practical bounds for the database and calibration are defined via nine environmental variables, and their relative effects on the fatigue loads are evaluated by means of Sobol sensitivity indices. Of the surrogate-model methods, polynomial chaos expansion provides accurate and robust predictions of the different site-specific loads. Although the Kriging approach showed slightly better accuracy, it also demanded more computational resources.

Before installing a wind turbine at a particular site, it must be ensured
that the wind turbine structure is sufficiently robust to withstand the
environmentally induced loads during its entire lifetime. As the design of
serially produced wind turbines is typically based on a specific set of wind
conditions, i.e., a site class defined in the

Various methods and procedures have been attempted for simplified load
assessment for wind energy applications.

In the present work, we analyze, refine, and expand the existing simplified
load assessment methods, and provide a structured approach for practical
implementation of a surrogate modeling approach for site feasibility
assessment. The study aims at fulfilling the following four specific goals:

define a simplified load assessment procedure that can take into account all the relevant external parameters required for full characterization of the wind fields used in load simulations;

define feasible ranges of variation in the wind-related parameters, dependent on wind turbine rotor size;

demonstrate how different surrogate modeling approaches can be successfully employed in the problem, and compare their performance; and

obtain estimates of the statistical uncertainty and parameter sensitivities.

The scope of the present study is loads generated under normal power
production, which encompasses design load cases (DLCs) 1.2 and 1.3 from the
IEC 61400-1 standard

Figure

Schematic overview of the site-specific load analysis procedure.

The turbulent wind field serving as input to aeroelastic load simulations can
be fully statistically characterized by the following variables:

mean wind field across the rotor plane as described by the

average wind speed at hub height,

vertical wind shear exponent,

wind veer (change of mean flow direction with height),

turbulence described via

variance of wind fluctuations,

turbulence probability density function (e.g., Gaussian);

turbulence spectrum defined by the

turbulence length scale

anisotropy factor

turbulence dissipation parameter

air density

mean wind inflow direction relative to the turbine in terms of

vertical inflow (tilt) angle

horizontal inflow (yaw) angle

The loads experienced by a wind turbine are a function of the wind-derived factors described above, and of the structural properties and control system of the wind turbine. Therefore, a load characterization database taking only wind-related factors into account is going to be turbine-specific.

Bounds of variation for the variables considered. All values are defined as statistics over a 10 min reference period.

The variables describing the wind field often have a significant correlation
between them, and any site-specific load or power assessment has to take this
into account using an appropriate description of the joint distribution of
input variables. At the same time, most probabilistic models require inputs
in terms of a set of independent and identically distributed (i.i.d.)
variables. The mapping from the space of i.i.d. variables to the joint
distribution of physical variables requires applying an
isoprobabilistic transformation such as
the Nataf transform

The choice for ranges of variation in the input variables needs to ensure a balance between two objectives: (a) covering as wide a range of potential sites as possible, while (b) ensuring that the load simulations produce valid results. To ensure validity of load simulations, the major assumptions behind the generation of the wind field and computation of aerodynamic forces should not be violated, and the instantaneous wind field should have physically meaningful values.

For the case of building a high-fidelity load database, all variables given
in Table

The turbulence intensity,

Sample distributions obtained using 1024 low-discrepancy points
within a 6-dimensional variable space

Building a large database with high-fidelity load simulations covering the
entire variable space is a central task in the present study as such a
database can serve several purposes:

be directly used as a site assessment tool by probability weighting the relative contribution of each point to the design loads;

serve as an input for calibrating simplified models, i.e., surrogate models and response surfaces.

Characterizing the load behavior of a wind turbine over a range of input
conditions requires an experimental design covering the range of variation in
all variables with sufficient resolution. With more than three or four
dimensions, a full factorial design with multiple levels quickly becomes
impractical due to the exponential increase in the number of design points as
a function of the number of dimensions. Therefore, in the present study we resort
to Monte Carlo (MC) simulation as the main approach for covering the joint
distribution of wind conditions. For assuring better and faster convergence,
we use the low-discrepancy Halton sequence in a quasi-MC approach
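To illustrate the quasi-MC idea, the following is a minimal sketch of a Halton point generator. It is not the implementation used in the study: the function names `radical_inverse` and `halton`, and the use of the first nine primes as bases (one per dimension), are assumptions made for this example only.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


# One prime base per dimension (assumption: at most nine dimensions).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23]


def halton(n_points, n_dims):
    """Return n_points Halton points in the unit hypercube [0, 1)^n_dims."""
    if n_dims > len(PRIMES):
        raise ValueError("add more prime bases for higher dimensions")
    return [
        [radical_inverse(i, PRIMES[d]) for d in range(n_dims)]
        for i in range(1, n_points + 1)  # start at 1 to skip the origin
    ]


points = halton(1024, 9)  # 1024 points in a 9-dimensional unit cube
```

The points live in the unit hypercube; mapping them to physical variables still requires a transformation from uniform variables to the joint distribution of interest.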

A large-scale generic load database is generated in order to serve as a
training data set for the load mapping functions. The point sampling is done
using a Halton low-discrepancy sequence within the 9-dimensional variable
space defined in Sect.

Up to

The physical values of the stochastic variables for all quasi-MC samples
are obtained by applying a Rosenblatt transformation using the conditional
distribution bounds given in Table
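As a hedged illustration of this step, the sketch below maps uniform samples to physical values sequentially, with the bounds of each variable allowed to depend on the variables already drawn. It assumes uniform conditional distributions between the bounds. The function name `rosenblatt_uniform` and all numerical bounds are hypothetical placeholders and are not the values from the paper's table.

```python
def rosenblatt_uniform(u, bound_functions):
    """Map uniform samples u in [0, 1] to physical values, one variable at a time.

    Each entry of bound_functions takes the list of already-computed physical
    values and returns the (low, high) bounds of the next variable.
    """
    x = []
    for u_k, bound_fn in zip(u, bound_functions):
        low, high = bound_fn(x)
        x.append(low + u_k * (high - low))  # inverse CDF of a uniform variable
    return x


# Hypothetical bounds, for illustration only (not taken from the paper).
bounds = [
    lambda x: (4.0, 25.0),                  # mean wind speed [m/s]
    lambda x: (0.05 * x[0], 0.25 * x[0]),   # turbulence std, conditional on wind speed
]

sample = rosenblatt_uniform([0.5, 0.5], bounds)
```

Because each bound function sees the previously generated values, dependence between variables enters only through the conditional bounds, which keeps the sampling itself a simple chain of one-dimensional inversions.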

For each sample point, eight simulations, with 3800 s duration each, are carried out. The first 200 s of the simulations are discarded in order to eliminate simulation run-in time transients, and the output is 3600 s (1 h) of load time series from each simulation.

The Mann model simulation parameters (

Each 1 h time series is split into six 10 min series, which on average will have the required statistics. This leads to a total of 48 time series of 10 min each for every quasi-MC sample point.

Simulation conditions are kept stationary over each 1 h simulation period.
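The bookkeeping behind these numbers can be sketched as follows. This is only an illustration: the helper `split_into_windows`, the constant names, and the 1 Hz dummy signal are assumptions, and the actual post-processing scripts may differ.

```python
SIM_DURATION_S = 3800    # total simulated time per run
RUN_IN_S = 200           # discarded transient at the start of each run
WINDOW_S = 600           # 10 min analysis window
SEEDS_PER_SAMPLE = 8     # simulations per quasi-MC sample point


def split_into_windows(series, sample_rate_hz, run_in_s=RUN_IN_S, window_s=WINDOW_S):
    """Drop the run-in period and cut the remainder into equal-length windows."""
    start = int(run_in_s * sample_rate_hz)
    step = int(window_s * sample_rate_hz)
    usable = series[start:]
    n_windows = len(usable) // step
    return [usable[k * step:(k + 1) * step] for k in range(n_windows)]


# Dummy 1 Hz signal standing in for a load channel (assumption).
series = list(range(SIM_DURATION_S))
windows = split_into_windows(series, sample_rate_hz=1)
windows_per_sample_point = SEEDS_PER_SAMPLE * len(windows)
```

With 3600 s of usable output per run, each run yields six windows, and eight runs give 48 windows per sample point.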

The DTU 10 MW reference wind turbine model

By running 1 h simulations and then splitting the time series, rather than directly simulating 10 min periods, we aim to capture some of the low-frequency fluctuations generated by the Mann model turbulence, especially at larger turbulence length scales. A longer turbulence box includes more of these low-frequency variations, which in fact introduce some degree of nonstationarity when looking at 10 min windows.

The main quantities of interest from the load simulation output are the
short-term (10 min) fatigue damage-equivalent loads (DELs), and the 10 min
extremes (minimum or maximum, depending on the load type). For each load
simulation, four statistics (mean, standard deviation, minimum, and maximum
values) are calculated for each load channel. For several selected load
channels, the 1 Hz DEL for a reference period

Obtaining site-specific lifetime fatigue loads from a discrete set of
simulations requires integrating the short-term damage contributions over the
long-term joint distribution of input conditions. The lifetime
damage-equivalent fatigue load is defined as

With the present problem of evaluating the uncertainty in aeroelastic
simulations – for any specific combination of environmental conditions,

Confidence intervals (CIs) reflecting
such an uncertainty can be determined in a straightforward way using the
bootstrapping technique
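A minimal sketch of a percentile bootstrap is given below, assuming the quantity of interest can be computed as a statistic of a list of simulation outputs. The function name `bootstrap_ci`, the default of 2000 resamples, and the synthetic input values are assumptions for illustration; the study may use a different bootstrap variant.

```python
import math
import random
import statistics


def bootstrap_ci(values, statistic=statistics.fmean, n_resamples=2000,
                 level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(
        statistic(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((1.0 - level) / 2.0 * n_resamples)
    upper_index = int((1.0 + level) / 2.0 * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Synthetic stand-in for 48 short-term load values at one sample point.
dels = [100.0 + 5.0 * math.sin(k) for k in range(48)]
ci_low, ci_high = bootstrap_ci(dels)
```

Replacing `statistic` with a function that computes an integrated fatigue quantity would give the corresponding interval for that quantity, at the cost of recomputing it for every resample.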

In this section we present five different approaches that can be used to map
loads from a high-fidelity database to integrated site-specific design loads:

importance sampling,

nearest-neighbor interpolation,

polynomial chaos expansion,

universal Kriging, and

quadratic response surface.

The first two methodologies carry out a direct numerical integration over the
high-fidelity database presented in Sect.

Figure

One of the simplest and most straightforward (but not necessarily most
precise) ways of carrying out the integrations needed to obtain predicted
statistics is to use importance sampling (IS), where probability weights
are applied on each of the database sample points

Estimating an expected function value with a true multidimensional
interpolation from the high-fidelity database would require finding a set of
neighboring points that form a convex polygon. For problem dimensions higher
than 3, this is quite challenging due to the unstructured sample
distribution. However, it is much easier to obtain a cruder approximation
by simply finding the database point closest to the function evaluation point
in a nearest-neighbor approach. This is similar to the table
look-up technique often used with structured grids; the denser
the distribution of the sample points, the closer the results will be to
an actual MC simulation. Finding the nearest neighbor to a function
evaluation point requires determining the distances between this point and
the rest of the points in the sample space. This is done most consistently in
a normalized space, i.e., where the input variables have equal scaling. The
cumulative distribution function (CDF) of the variables is an
example of such a space, as all CDFs have the same range of

Since some of the input variables may have a significantly larger influence on
the result than other variables, it may be useful to weight the CDF of
different variables according to their importance (e.g., by making the weights
proportional to the variable sensitivity indices; see
Sect.
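A brute-force version of the weighted nearest-neighbor search in CDF space might look like the sketch below. The function name `nearest_neighbor`, the Euclidean distance metric, and the toy database are assumptions; the weights shown are arbitrary and are not sensitivity indices from the study.

```python
import math


def nearest_neighbor(query_cdf, database_cdf, weights=None):
    """Return the index of the database point closest to the query in CDF space.

    All coordinates are CDF values in [0, 1]; `weights` optionally scales each
    dimension so that more influential variables dominate the distance.
    """
    n_dims = len(query_cdf)
    if weights is None:
        weights = [1.0] * n_dims
    best_index, best_distance = None, math.inf
    for i, point in enumerate(database_cdf):
        distance = math.sqrt(
            sum((w * (q - p)) ** 2 for w, q, p in zip(weights, query_cdf, point))
        )
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index


database = [[0.1, 0.9], [0.5, 0.5], [0.9, 0.1]]
unweighted = nearest_neighbor([0.4, 0.85], database)
weighted = nearest_neighbor([0.4, 0.85], database, weights=[1.0, 0.1])
```

In this toy case, down-weighting the second variable changes which database point is selected, which is the intended effect of importance weighting. For large databases a spatial index would be faster than the linear scan used here.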

Polynomial chaos expansion (PCE) is a popular method for approximating a
stochastic function of multiple random variables using an orthogonal
polynomial basis. For the present problem, using a Wiener–Askey generalized
PCE

Kriging

The functional form of the mean field

The main practical difference between regression- or expansion-type models
such as regular PCE and the Kriging approach is

A quadratic-polynomial response surface (RS) method based on central
composite design (CCD) is a reduced-order model which, among other
applications, has been used for wind turbine load
prediction

Example of a rotatable central composite design (CCD) in a
2-dimensional standard normal space
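For reference, the points of a rotatable CCD can be generated as in the sketch below, which uses a full two-level factorial core, axial points at a distance equal to the fourth root of the number of factorial points, and a single center point. The function name `rotatable_ccd` and the unit scaling of the factorial levels are assumptions; the design used in the study may be scaled or fractionated differently.

```python
import itertools


def rotatable_ccd(n_dims):
    """Return the points of a rotatable central composite design."""
    # Rotatability condition: alpha = (number of factorial points) ** (1/4).
    alpha = (2.0 ** n_dims) ** 0.25
    factorial = [list(p) for p in itertools.product((-1.0, 1.0), repeat=n_dims)]
    axial = []
    for d in range(n_dims):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_dims
            point[d] = sign * alpha
            axial.append(point)
    center = [[0.0] * n_dims]
    return factorial + axial + center


design = rotatable_ccd(2)  # 4 factorial + 4 axial + 1 center point
```

In two dimensions all non-center points lie on a circle of radius equal to the square root of 2. The number of factorial points doubles with every added dimension, so with a full factorial core the design grows quickly in higher-dimensional problems.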

We use the global Sobol indices,

For any polynomial-based regression model that includes dependence between
variables, the problem grows steeply in size when the number of dimensions,

One useful corollary of the orthogonality in the PCE polynomial basis is that
the contribution of each individual term to the total variance of the
expansion (i.e., the individual Sobol indices) can be easily computed based on
the coefficient values (see Appendix

Convergence of a PCE of dimension 6 and order 6, as a function of
number of collocation points and hours of simulation per collocation point.
The

We assess the convergence of PCE by calculating the normalized
root-mean-square error (NRMSE) between a set of observed quantities (i.e.,
DELs from simulations)

Figure

The IS procedure has relatively slow convergence compared to, for example, a quasi-MC
simulation. Figure

Convergence of an importance sampling (IS) calculation of the blade
root moment from the high-fidelity database towards
site-specific lifetime fatigue loads for reference site 0
(Table

Normalized root mean square error characterizing the difference
between aeroelastic simulations and reduced-order models. Load channel
abbreviations are the following: TB: tower base; TT: tower top; MS: main
shaft; BR: blade root. Loading directions consist of

PCE-based Sobol sensitivity indices for the high-fidelity load database variable ranges.

Since the prediction of lifetime fatigue loads is the main purpose of the
present study, the performance of the load prediction methods with respect to
estimating the lifetime DEL is the main criterion for evaluation. However,
the lifetime DEL as an integrated quantity will efficiently identify model
bias but may not reveal the magnitude of some uncertainties that result in
zero-mean error. As an additional means of comparison we calculate the
NRMSE, defined in Eq. (

The RMS error analysis reveals a slightly different picture. In contrast to the lifetime DEL where the Kriging, PCE, and RS methods showed very similar results, the RMS error of the quadratic RS is for some channels about twice the RMS error of the other two approaches.

As described earlier in Sect.

Site-specific Sobol sensitivity indices derived for site

The low-fidelity site-specific load calculation methods presented in this
study are validated against a set of reference site-specific load
calculations on a number of different virtual sites, based on real-world
measurement data that cover most of the variable domain included within the
high-fidelity database. In order to show a realistic example of situations
where a site-specific load estimation is necessary, the majority of the
virtual sites chosen are characterized by conditions that slightly exceed
the standard conditions specified by a certain type-certification class.
Exceptions are site 0, which has the most measured variables available and is
therefore chosen as a primary reference site, and the virtual “sites”
representing standard IEC class conditions. The IEC classes are included as
test sites as they are described by only one independent variable (mean wind
speed). They are useful test conditions as it may be challenging to correctly
predict loads as a function of only one variable using a model based on up to
nine random variables. The list of test sites is given in
Table

Reference virtual sites used for validation of the site-specific load estimation methods.

Site 0 (also referred to as the reference site) is located at the
Nørrekær Enge wind farm in northern Denmark

Sites 5 and 6 are located at NREL's National Wind Technology Center (NWTC),
near the base of the Rocky Mountain foothills just south of Boulder,
Colorado

For each site, the joint distributions of all variables are defined in terms
of conditional dependencies, and generating simulations of site-specific
conditions is carried out using the Rosenblatt transformation,
Eq. (

With this procedure, 1000 quasi-MC samples of the environmental
conditions at each site are generated from the respective joint distribution.
All realizations where the wind speed is between the DTU 10 MW wind turbine
cut-in and cut-out wind speeds are fed as input to load simulations. The
actual number of load simulations for each site is given in
Table

Comparison of predictions of the lifetime damage-equivalent loads (DELs) for six different estimation approaches. All values are normalized with respect to the mean estimate from a site-specific Monte Carlo (MC) simulation, and the error bars represent the bounds of the 95 % confidence intervals (CIs). Results from two PCEs are shown: the blue bar corresponds to the output of a fourth-order PCE, while the black bar corresponds to a sixth-order PCE.

Parameters defining the conditional distribution relationships used in computing joint distributions of the environmental conditions for the test sites/conditions.

Comparison of predictions of the lifetime damage-equivalent loads (DELs) for six different estimation approaches. All values are normalized with respect to the mean estimate from a site-specific Monte Carlo simulation.

Comparison of predictions of the lifetime damage-equivalent loads (DELs) for six different estimation approaches. All values are normalized with respect to the mean estimate from a site-specific Monte Carlo simulation.

The lifetime damage-equivalent loads (DELs) are computed for all reference
sites in Table

Lifetime-equivalent load predictions normalized with respect to
MC simulations and averaged over 10 reference sites. Load channel
abbreviations are the following: TB: tower base; TT: tower top; MS: main
shaft; BR: blade root. Loading directions consist of

Model execution times for the lifetime damage-equivalent fatigue load computations for site 0.

The results for site 0 show that for all methods the prediction of blade root
and tower top loads is more accurate than the prediction of tower base loads.
Also, overall the predictions from the reduced models – the quadratic RS and
the PCE, as well as from the Kriging model – are more robust than the IS and
nearest-neighbor (NN) interpolation techniques. Similar performance is
observed for most other validation sites. The summarized site-specific
results for all surrogate-based load estimation methods are shown in
Table

Predictions of lifetime damage-equivalent tower loads for five different estimation approaches and four load channels for the different sites (0–6) and IEC conditions (virtual sites 7–9). All values are normalized with respect to the mean estimate from a site-specific Monte Carlo (MC) simulation. The abbreviations refer to PCE: polynomial chaos expansion; RS: quadratic response surface; IS: importance sampling; NN: nearest-neighbor interpolation; and KM: universal Kriging model.

Another important aspect to consider when comparing the performance of the
surrogate models is the model execution speed, and whether there is a
tradeoff between speed and accuracy. A comparison of the model evaluation
times for the site-specific lifetime load computation for site 0 is given in
Table

The previous sections of this paper described a procedure for estimating
site-specific lifetime damage-equivalent loads (DELs), using several simplified
model techniques applied to 10 different sites and conditions. Based on the
site-specific lifetime DEL comparisons, for quick site-specific load
estimation, the three models based on machine learning were the most
viable (sufficiently accurate over the majority of the sampling space):
polynomial chaos expansion, Kriging, and the quadratic response surface (RS). When
estimating lifetime DEL, these methods showed approximately equal levels of
uncertainty. However, in the one-to-one comparisons, the quadratic RS model
showed larger error, especially for sample points corresponding to more
extreme combinations of environmental conditions. This is due to the lower
order and the relatively small number of calibration points of the quadratic
RS, which means that the model accuracy decreases in the sampling space away
from the calibration points, especially if there is any extrapolation. This
inaccuracy is reflected in the NRMSE from one-to-one comparisons, but
is less obvious in the lifetime fatigue load computations that average out
errors with zero mean. The universal Kriging model demonstrated the smallest
overall uncertainty, both in sample-to-sample comparisons and in lifetime DEL
computations. This is to be expected since the Kriging model employs a
well-performing model (the PCE) and combines it with an interpolation scheme
that subsequently reduces the uncertainty even further. However, in most
cases the observed improvement over a pure PCE is not significant. This
indicates that the sources of the remaining uncertainty are outside the
models – e.g., the seed-to-seed turbulence variations: the models being
calibrated with turbulence realizations different from the ones used to
compute the reference site-specific loads. As a result, the trend function
(the

Predictions of lifetime damage-equivalent loads (yaw, shaft torsion, blade-root) for five different estimation approaches and four load channels. All values are normalized with respect to the mean estimate from a site-specific Monte Carlo (MC) simulation. The abbreviations refer to PCE: polynomial chaos expansion; RS: quadratic response surface; IS: importance sampling; NN: nearest-neighbor interpolation; and KM: universal Kriging model.

For all site-specific load assessment methods discussed, the estimations are trustworthy only within the bounds of the variable space used for model calibration – extrapolation is either not possible or may lead to unpredictable results. It is therefore important to ensure that the site-specific distributions used for load assessment are not outside the bounds of validity of the load estimation model.

The variable bounds presented in this paper are based on a certain degree of consideration of atmospheric physics employed in the relationships between wind speed, turbulence, wind shear, wind veer, and turbulence length scale. The primary scope is to encompass the ranges of conditions relevant for fatigue load analysis, and the currently suggested variable bounds include all normal-turbulence (NTM) classes. However, for some other calculations it may be more practical to choose other bound definitions; for example, for the extreme turbulence models prescribed by the IEC 61400-1, the currently suggested bounds do not include ETM class A.

For the more advanced methods like PCE and Kriging, there is a practical
limitation on the number of training points to be used in a single-computer
setup. For a PCE the practical limit is mainly subject to memory availability
when assembling and inverting the information matrix, and for a PCE of order
6 and with nine dimensions, this limit is on the order of 1–

Considering the overall merits of the load prediction methods analyzed, the PCE provided an accurate and robust performance. The Kriging approach showed slightly better accuracy but at the expense of increased computational demands. Taking this together with the other useful properties of the PCE, such as orthogonality facilitating creation of sparse models through variance-based sensitivity analysis, we consider the PCE as the most useful method overall.

In addition to the load-mapping approaches presented in this paper,
artificial neural networks (ANNs) are interesting alternative candidates.
ANNs

The results from the site validations showed that for the majority of sites
and load channels, the simplified load assessment techniques can predict the
site-specific lifetime fatigue loads to within about 5 % accuracy.
However, it should be noted that this accuracy is relative to full-fidelity
load simulations, and not necessarily to the actual site conditions, where
additional uncertainties (e.g., uncertainty in the site conditions or the
turbine operating strategy) can lead to even larger errors. The procedures
demonstrated in this study are thus very suitable for carrying out quick site
feasibility assessments; the latter can help to decide in a timely fashion
whether to discard a given site as unfeasible, or to make additional
high-fidelity computations or more measurements of site conditions. The same
procedure, but with additional variables (e.g., three variables for wake-induced
effects as in

In the present work we defined and demonstrated a procedure for quick
assessment of site-specific lifetime fatigue loads using load surrogate
models calibrated by means of a database with high-fidelity load simulations.
The performance of polynomial chaos expansion, quadratic response surface,
universal Kriging, importance sampling, and nearest-neighbor interpolation in
predicting site-specific lifetime fatigue loads was assessed by training the
surrogate models on a database with aeroelastic load simulations of the DTU
10 MW reference wind turbine. Practical bounds of variation were defined for
nine environmental variables and their effect on the lifetime fatigue loads
was studied. The study led to the following main conclusions.

The variable sensitivity analysis showed that mean wind speed
and turbulence (standard deviation of wind speed fluctuations) are the
factors having the highest influence on fatigue loads. The wind shear and the
Mann turbulence length scale were also found to have an appreciable
influence, with the effect of wind shear being more pronounced for rotating
components such as blades. Within the studied ranges of variation, the Mann
turbulence parameter

The best-performing models had errors of less than 5 % for most sites and load channels, which is of the same order of magnitude as the variations due to realization-to-realization uncertainty.

A universal Kriging model employing polynomial chaos expansion as a trend function achieved the most accurate predictions, but also required the longest computing times.

A polynomial chaos expansion with Legendre basis polynomials was concluded to be the approach with best overall performance.

The procedures demonstrated in this study are well suited for carrying out quick site feasibility assessments conditional on a specific wind turbine model.

Due to storage limitations, only 10 min statistics are stored, as well as the scripts that can be used to regenerate the full data sets. Model training and evaluation were done entirely based on the 10 min statistics. These data are available upon request.

Polynomial chaos expansion (PCE) is a popular method for approximating
stochastic functions of multiple random variables, using an orthogonal
polynomial basis. In the classical definition of PCE

In the
classical definition of the PC decomposition used in, for example, spectral
stochastic finite element methods

Using the notation defined by

With the above, each multivariate polynomial is built as the product of

The aim of using PCE is to represent a scalar quantity

The solution of Eq. (

Kriging

Here

It follows that the model prediction

For a problem with

The main practical difference between regression- or expansion-type models
such as regular PCE and the Kriging approach is in the way the training
sample is used in the model: in the pure regression-based approaches the
training sample is only used to calibrate the regression coefficients, while
in Kriging as in other interpolation techniques the training sample is
retained and used in every new model evaluation. As a result, the Kriging
model may have an advantage in accuracy since the model error tends to zero
in the vicinity of the training points; however, this comes at the expense of
an increase in the computational demands for new model evaluations. The extra
computational burden is mainly the time necessary to assemble

One useful corollary of the orthogonality in the PCE polynomial basis is that
the total variance of the expansion can be expressed as the

Denoting

The Sobol indices estimated using the above procedure represent the relative
contribution to the model variance from variables following the joint input
distribution used to calibrate the PCE. In the present case, this
distribution would span the uniform variable space of the high-fidelity
database defined in Sect.
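As a worked illustration of computing Sobol indices from expansion coefficients, the sketch below assumes an orthonormal polynomial basis, so that the model variance is the sum of squared coefficients of all non-constant terms. The function name `sobol_from_pce` and the toy coefficients are assumptions for illustration and are not results from the study.

```python
def sobol_from_pce(multi_indices, coefficients):
    """First-order and total Sobol indices from PCE coefficients.

    multi_indices[j][d] is the polynomial degree of term j in variable d.
    Assumes an orthonormal basis, so each squared coefficient is a variance share.
    """
    n_dims = len(multi_indices[0])
    variance = sum(
        c ** 2 for alpha, c in zip(multi_indices, coefficients) if any(alpha)
    )
    if variance == 0.0:
        raise ValueError("expansion has zero variance")
    first_order = [0.0] * n_dims
    total = [0.0] * n_dims
    for alpha, c in zip(multi_indices, coefficients):
        active = [d for d, degree in enumerate(alpha) if degree > 0]
        for d in active:
            total[d] += c ** 2 / variance        # every term involving variable d
        if len(active) == 1:
            first_order[active[0]] += c ** 2 / variance  # terms in variable d only
    return first_order, total


# Toy two-variable expansion: constant, x1, x2, and an x1*x2 interaction term.
indices = [(0, 0), (1, 0), (0, 1), (1, 1)]
coeffs = [10.0, 2.0, 1.0, 2.0]
first_order, total = sobol_from_pce(indices, coeffs)
```

For the toy coefficients, the variance is 9, giving first-order indices of 4/9 and 1/9 and total indices of 8/9 and 5/9; the gap between the two sets reflects the interaction term.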

ND carried out the load simulations and the calibration and evaluation of the surrogate models, and wrote a major part of the paper. MCK devised the bounds for environmental conditions, wrote several parts of the paper (environmental conditions and parts of the Appendix), participated in the conceptual development, and provided critical review. AV analyzed data sets with environmental conditions and provided the reference site information, as well as a critical review of the paper. JB participated in the conceptual development of the study, contributed text to the introduction, site-specific calculations, and discussion, and critically reviewed the text.

The authors declare that they have no conflict of interest.

The work reported in this paper was carried out as part of the Wind2Loads internal project at the Technical University of Denmark, Department of Wind Energy. The authors thank their colleagues for the valuable input and support.

Edited by: Michael Muskulus
Reviewed by: two anonymous referees