Constrained and unconstrained optimization and their areas of application. Methods of unconstrained and constrained optimization. Methods of expert assessment

From the total set of options obtained one can build a histogram, estimate how often good options occur and, finally, make a decision: to continue the search or to settle for the solution already found.

Despite the universality and simplicity of the random-search procedure, one cannot rely on it alone because of its significant computational cost. Therefore, directed search methods have become more widespread.

4.5.3. Unconstrained optimization methods

The necessary conditions for an extremum in all the forms discussed above lead to the solution of a system of nonlinear equations, which is itself a very complex and laborious task (in computational mathematics the solution of nonlinear equations is, in fact, often reduced to some optimization problem). Therefore, in practice, other approaches to optimizing functions are used, the consideration of which we begin with the so-called direct methods. In what follows we will talk about minimization, so the extremum sought is a minimum.

Currently, many numerical methods have been developed for both unconstrained and constrained optimization problems. The quality of a numerical method is characterized by many factors: the speed of convergence, the execution time of one iteration, the amount of computer memory required to implement the method, the class of problems that can be solved, etc. The problems themselves are also very diverse: they can be of high or low dimension, unimodal or multi-extremal, and so on. A method that is effective for problems of one type may turn out to be completely unsuitable for problems of another type.

Below is an overview of the main methods for solving nonlinear programming problems. It should be borne in mind that the full list of such methods is very wide and remains open; in addition, various modifications of a number of the methods considered are known. More detailed information can be found, for example, in the literature.

Let us start by considering direct methods of unconstrained optimization, that is, the case where there are no restrictions.

The idea of direct unconstrained optimization methods is to construct a sequence of points X[0], X[1], …, X[k], … such that f(X[0]) > f(X[1]) > … > f(X[k]) > … . An arbitrary point can be chosen as the starting point X[0], but one strives to choose it as close as possible to the minimum point. The transition (iteration) from the point X[k] to the point X[k+1], k = 0, 1, 2, …, consists of two stages:

selecting the direction of movement from the point X[k];

determining the step along this direction.

Methods for constructing such sequences are often called descent methods, since the transition is carried out from larger values of the function to smaller ones.

Mathematically, descent methods are described by the relation

X[k+1] = X[k] + a_k p[k], k = 0, 1, 2, …,

where p[k] is the unit vector defining the direction of descent;

a_k is the step length.

Different descent methods differ from each other in the way p[k] and a_k are chosen. In practice, only convergent methods are used: they allow the minimum point to be obtained, or approached fairly closely, in a finite number of steps. The quality of convergent iterative methods is assessed by their speed of convergence.

Theoretically, in descent methods the problem is solved in an infinite number of iterations. In practice, calculations are stopped when certain criteria (conditions) for stopping the iterative process are met. For example, this may be the condition of a small increment of the argument, ||X[k] − X[k − 1]|| < ε, or of the function, |f(X[k]) − f(X[k − 1])| < γ. Here k is the iteration number, and ε, γ are specified values of the accuracy of solving the problem.

Methods for finding the minimum point are called deterministic if both parameters of the transition from X[k] to X[k+1] (the direction of movement and the step size) are chosen uniquely on the basis of the information available at the point X[k]. If some random mechanism is used during the transition, the search algorithm is called random minimum search.

Deterministic unconstrained minimization algorithms are divided into classes depending on the type of information used. If at each iteration only the values of the minimized function are used, the method is called a zero-order method. If, in addition, it is necessary to calculate the first derivatives of the function being minimized, one has first-order methods; if the calculation of second derivatives is also required, second-order methods are used.

It should be noted that when solving unconstrained minimization problems, first- and second-order methods, as a rule, have a higher convergence rate than zero-order methods. However, in practice, calculating the first and second derivatives of a function of a large number of variables is very laborious, and in some cases they cannot be obtained in the form of analytic functions. Derivatives computed by various numerical methods carry errors that may limit the use of such methods. In addition, the optimality criterion may be specified not explicitly but by a system of equations; in this case it becomes very difficult, and sometimes impossible, to find the derivatives analytically or numerically. Therefore, zero-order methods are discussed here in more detail.

One-dimensional search methods. The list of one-dimensional search methods, i.e. methods of numerical search for the extremum of a function of one argument f(x), is quite wide and well covered in the literature. Therefore, here we limit ourselves to a single method which, in the authors' experience, is one of the most effective: the “golden section” method.

The idea of the method is to successively reduce the uncertainty interval, i.e. the interval of values of the argument x containing the sought minimum point, to a length not exceeding the permissible error of the result ε. The initial interval can be taken as given by the conditions of the problem, as the admissible range of values of the argument or, when the latter has no left and/or right boundary, as some region within the admissible one to which, according to a preliminary analysis, the sought minimum belongs.

Inside any interval [a, b] there are two points x = y_0 and x = z_0 that realize its “golden section”: a division into two unequal parts such that the ratio of the larger part to the length of the entire interval coincides with the ratio of the smaller part to the larger one. Obviously, these points are located symmetrically relative to the centre of the interval (Fig. 26). The coordinates of the “golden section” points can be found from the corresponding proportions:

(b − y_0)/(b − a) = (y_0 − a)/(b − y_0) = δ, (z_0 − a)/(b − a) = (b − z_0)/(z_0 − a) = δ,

from which it is easy to obtain δ = (1 − δ)/δ and arrive at the equation δ² + δ − 1 = 0. As a result, we obtain the relative shares that determine the “golden section” of the interval: δ ≈ 0.618, 1 − δ ≈ 0.382. The “golden section” possesses an important property: the point y_0 is one of the “golden section” points of the interval [a, z_0], and the point z_0 is one of the “golden section” points of the interval [y_0, b]. This is confirmed by a simple calculation: 0.382/0.618 ≈ 0.618 and (0.618 − 0.382)/0.618 ≈ 0.382.

The algorithm for finding the minimum built on the basis of the “golden section” method selects, at each iteration, the left or right “golden section” point as one of the boundaries of the reduced interval in such a way that the sought minimum remains inside it:

1. Set k = 0, the initial uncertainty interval [a_0, b_0] and the permissible error of the result ε.

2. Calculate the coordinates of the “golden section” points: y_k = a_k + 0.382(b_k − a_k), z_k = a_k + 0.618(b_k − a_k).

3. Calculate the values of the objective function at the found points, f(y_k) and f(z_k).

4. If f(y_k) ≤ f(z_k) (Fig. 26, a), set a_{k+1} = a_k, b_{k+1} = z_k, z_{k+1} = y_k, y_{k+1} = a_k + z_k − y_k, k = k + 1. Otherwise (Fig. 26, b) set a_{k+1} = y_k, b_{k+1} = b_k, y_{k+1} = z_k, z_{k+1} = y_k + b_k − z_k, k = k + 1.

5. Check the search termination condition b_{k+1} − a_{k+1} ≤ ε. If it is satisfied, the point x* = (y_{k+1} + z_{k+1})/2 is taken as the solution; otherwise, go to step 2.

The computational efficiency of the “golden section” method is due to the fact that at each iteration it requires only a single calculation of the value of the objective function.
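As an illustration, here is a minimal Python sketch of the algorithm above; the function name golden_section_min, its arguments and the test function are illustrative choices, not part of the original text.

```python
import math

def golden_section_min(f, a, b, eps=1e-6):
    """Minimize a unimodal function f on [a, b] by the golden-section method."""
    delta = (math.sqrt(5) - 1) / 2          # 0.618...
    y = a + (1 - delta) * (b - a)           # left golden-section point (0.382 share)
    z = a + delta * (b - a)                 # right golden-section point (0.618 share)
    fy, fz = f(y), f(z)
    while (b - a) > eps:
        if fy <= fz:                        # the minimum lies in [a, z]
            b, z, fz = z, y, fy             # the old left point is reused as the new right point
            y = a + (1 - delta) * (b - a)
            fy = f(y)
        else:                               # the minimum lies in [y, b]
            a, y, fy = y, z, fz             # the old right point is reused as the new left point
            z = a + delta * (b - a)
            fz = f(z)
    return (y + z) / 2                      # midpoint of the final interval

# usage: minimum of f(x) = (x - 2)^2 on [0, 5]
x_min = golden_section_min(lambda x: (x - 2) ** 2, 0.0, 5.0)
print(x_min)   # approximately 2.0
```

Only one new evaluation of f is required per iteration, which is exactly the property noted above.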

Direct search method (Hooke-Jeeves method). Some starting point X[0] is chosen. By alternately changing the components of the vector X, the neighbourhood of this point is examined, as a result of which a point (a new basis) is found that determines a direction in which the minimized function f(X) decreases. A descent is carried out in the chosen direction as long as the value of the function keeps decreasing. The procedure is repeated cyclically as long as a direction of descent can still be found, taking into account the adopted stopping condition.

The algorithm of the direct search method in its most general form can be formulated as follows:

1. Specify the values of the coordinates x_i, i = 1, 2, …, n, of the starting point (k = 0); the vector of initial coordinate increments ∆X = (∆x_1, ∆x_2, …, ∆x_n) used when examining the neighbourhood; the smallest permissible value ε of the components of ∆X; the accelerating factor λ ≥ 1, which determines the speed of descent; and the scale factor d > 1.

2. Take X as the “old basis”: X_b = X. Calculate the value f(X_b).

3. Alternately change each coordinate x_{b,i}, i = 1, 2, …, n, of the point X_b by the value ∆x_i: first take x_i = x_{b,i} + ∆x_i, then x_i = x_{b,i} − ∆x_i. Calculate the values of f(X) at the resulting trial points and compare them with the value f(X_b). If f(X) < f(X_b), the corresponding coordinate x_i takes the new value computed by one of the above expressions; otherwise the value of this coordinate remains unchanged. If, after changing the last, n-th coordinate, f(X) < f(X_b), go to step 4; otherwise go to step 6.

4. Descend in the direction from the “old” basis to the “new” one through the latter, i.e. calculate the coordinates of a new point X: x_i = x_i + λ(x_i − x_{b,i}), i = 1, 2, …, n. Calculate the value f(X). If the condition f(X) < f(X_b) is satisfied, the “new” basis is accepted as the “old” one (X_b = X, f(X_b) = f(X)) and step 5 is performed. Otherwise take x_i = x_{b,i}, i = 1, 2, …, n.

5. As in step 3, alternately change each coordinate of the point X, comparing the corresponding values of the function with the value of f(X) obtained in step 4. After changing the last coordinate, compare the resulting value f(X) with the value f(X_b) obtained in step 4. If f(X) < f(X_b), go to step 4; otherwise go to step 6.

6. If ∆x_i < ε for all i, the calculations are terminated. Otherwise, the values of ∆x_i are reduced by a factor of d and one returns to step 3.
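A compact Python sketch of a pattern search in the spirit of this algorithm is given below; the bookkeeping of steps 4 and 5 is simplified (the exploration after the pattern move is folded into the next pass of the loop), and all names and test data are illustrative.

```python
import numpy as np

def hooke_jeeves(f, x0, dx, eps=1e-6, lam=1.0, d=2.0):
    """Simplified Hooke-Jeeves pattern search: explore the neighbourhood of the
    basis, make an accelerated move, shrink the increments when stuck."""
    def explore(xb, fb, dx):
        # alternately perturb each coordinate by +dx[i] and -dx[i],
        # keeping any change that decreases f
        x, fx = xb.copy(), fb
        for i in range(len(x)):
            for step in (dx[i], -dx[i]):
                trial = x.copy()
                trial[i] += step
                ft = f(trial)
                if ft < fx:
                    x, fx = trial, ft
                    break
        return x, fx

    xb = np.asarray(x0, dtype=float)        # "old basis"
    fb = f(xb)
    dx = np.asarray(dx, dtype=float)
    while np.any(dx >= eps):                # stop only when all increments are below eps
        x, fx = explore(xb, fb, dx)         # step 3: examine the neighbourhood
        if fx < fb:                         # a new basis was found
            xp = x + lam * (x - xb)         # step 4: move through the new basis
            fp = f(xp)
            xb, fb = (xp, fp) if fp < fx else (x, fx)
        else:                               # step 6: reduce the increments
            dx /= d
    return xb, fb

# usage
xmin, fmin = hooke_jeeves(lambda x: (x[0] - 1) ** 2 + 5 * (x[1] + 2) ** 2,
                          x0=[0.0, 0.0], dx=[0.5, 0.5])
```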

The operation of the algorithm is illustrated in Fig. 27, which shows level lines of the minimized function f(x_1, x_2), i.e. lines obtained from the conditions f(x_1, x_2) = f_1 = const, f(x_1, x_2) = f_2 = const, and so on; here f_1 > f_2 > f_3. The solid lines show the results of a single execution of steps 3-5 (finding the direction of decrease of the function and the descent), the dotted line shows the next descent.

The advantage of the direct search method is the ease of programming it on a computer. It does not require the objective function to be known in explicit form, and it also easily accommodates restrictions on individual variables as well as complex restrictions on the search region.

The disadvantage of the direct search method is that in the case of strongly elongated, curved or sharp-angled level lines of the objective function it may fail to make progress towards the minimum point because of the limited number of directions analysed.

The deformable polyhedron method (Nelder-Mead method) consists in the following: to minimize a function of n variables f(X), a polyhedron containing n + 1 vertices is constructed in n-dimensional space. Each vertex obviously corresponds to a certain vector X_i. The values of the objective function f(X_i), i = 1, 2, …, n + 1, are calculated at each vertex of the polyhedron, and the largest of these values and the corresponding vertex X_h are determined. Through this vertex and the centre of gravity of the remaining vertices a projecting line is drawn, on which a point X_q is found with a smaller value of the objective function than at the vertex X_h (Fig. 28, a). The vertex X_h is then discarded. From the remaining vertices and the point X_q a new polyhedron is built, with which the described procedure is repeated. In the course of these operations the polyhedron changes its size, which gives the method its name.

Let us introduce the following notation: X[i, k] is the vector of coordinates of the i-th vertex of the polyhedron at the k-th search step, i = 1, 2, …, n + 1, k = 1, 2, …; h is the number of the vertex at which the value of the objective function is the largest; X[n + 2, k] is the centre of gravity of all the vertices except X[h, k]. The coordinates of the centre of gravity are calculated by the formula

x_j[n + 2, k] = (1/n)( Σ_{i=1..n+1} x_j[i, k] − x_j[h, k] ), j = 1, 2, …, n.

An approximate algorithm for the deformable polyhedron method is as follows:

1. Specify the reflection coefficient α, the expansion coefficient γ > 1, the contraction coefficient β < 1 and the permissible error ε of determining the coordinates of the minimum point. Choose the coordinates of the vertices of the original polyhedron X[i, 1], i = 1, 2, …, n + 1; set k = 1.

2. Calculate the values of the objective function at all the vertices, f(X[i, k]), i = 1, 2, …, n + 1, and find the vertices X[h, k] and X[l, k] with the largest and the smallest value of the objective function (in Fig. 28, b the points X_2 and X_1, respectively), as well as the centre of gravity X[n + 2, k].

3. Perform the reflection of the point X[h, k] through the centre of gravity: X[n + 3, k] = X[n + 2, k] + α(X[n + 2, k] − X[h, k]).

4. If f(X[n + 3, k]) ≤ f(X[l, k]), perform the expansion operation: X[n + 4, k] = X[n + 2, k] + γ(X[n + 3, k] − X[n + 2, k]). Otherwise, go to step 6.

5. Build a new polyhedron: if f(X[n + 4, k]) < f(X[l, k]), by replacing X[h, k] with X[n + 4, k]; otherwise by replacing X[h, k] with X[n + 3, k]. Continue the calculations from step 2 with k = k + 1.

6. If f(X[h, k]) > f(X[n + 3, k]) > f(X[i, k]) for all i not equal to h, perform the contraction operation: X[n + 5, k] = X[n + 2, k] + β(X[h, k] − X[n + 2, k]). Construct a new polyhedron by replacing X[h, k] with X[n + 5, k] and continue the calculations from step 2 with k = k + 1.

7. If f(X[n + 3, k]) > f(X[h, k]), then, preserving the vertex X[l, k], build a new polyhedron similar to the current one by halving the lengths of all edges, X[i, k] = X[l, k] + 0.5(X[i, k] − X[l, k]), and continue the calculations from step 2 with k = k + 1.

In steps 6 and 7, before returning to step 2, it is necessary to check whether the condition for terminating the search for the minimum is satisfied, for example the condition

max over i = 1, …, n + 1 of Σ_{j=1..n} ( x_j[i, k] − x_j[n + 2, k] )² < ε².

Using the expansion and contraction operations, the size and shape of the deformable polyhedron adapt to the topography of the objective function. As a result, the polyhedron stretches along long inclined surfaces, changes direction in curved depressions and contracts in the vicinity of the minimum, which accounts for the effectiveness of the method considered.

The following values of the coefficients are usually recommended: α = 1, 2 ≤ γ ≤ 3, 0.4 ≤ β ≤ 0.6.
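Below is a minimal Python sketch of the scheme just described with the commonly recommended coefficients; the way the initial polyhedron is built from a single starting point, the function names and the test function are assumptions made for illustration.

```python
import numpy as np

def nelder_mead(f, x0, step=1.0, alpha=1.0, gamma=2.0, beta=0.5,
                eps=1e-8, max_iter=500):
    """Deformable-polyhedron search: reflection, expansion, contraction, shrink."""
    n = len(x0)
    simplex = [np.asarray(x0, dtype=float)]
    for i in range(n):                             # n+1 vertices of the initial polyhedron
        v = np.asarray(x0, dtype=float)
        v[i] += step
        simplex.append(v)
    fvals = [f(v) for v in simplex]

    for _ in range(max_iter):
        order = np.argsort(fvals)                  # sort vertices: best ... worst
        simplex = [simplex[i] for i in order]
        fvals = [fvals[i] for i in order]
        best, worst = simplex[0], simplex[-1]
        centroid = np.mean(simplex[:-1], axis=0)   # centre of gravity without the worst vertex

        # stopping test: the polyhedron has become small enough
        if max(np.sum((v - centroid) ** 2) for v in simplex) < eps ** 2:
            break

        refl = centroid + alpha * (centroid - worst)          # reflection
        f_refl = f(refl)
        if f_refl <= fvals[0]:                                 # expansion
            exp_pt = centroid + gamma * (refl - centroid)
            f_exp = f(exp_pt)
            if f_exp < fvals[0]:
                simplex[-1], fvals[-1] = exp_pt, f_exp
            else:
                simplex[-1], fvals[-1] = refl, f_refl
        elif f_refl < fvals[-2]:                               # accept the reflected point
            simplex[-1], fvals[-1] = refl, f_refl
        elif f_refl < fvals[-1]:                               # contraction towards the worst vertex
            con = centroid + beta * (worst - centroid)
            f_con = f(con)
            if f_con < fvals[-1]:
                simplex[-1], fvals[-1] = con, f_con
            else:                                              # shrink all edges towards the best vertex
                simplex = [best + 0.5 * (v - best) for v in simplex]
                fvals = [f(v) for v in simplex]
        else:                                                  # reflected point is worse than the worst vertex
            simplex = [best + 0.5 * (v - best) for v in simplex]
            fvals = [f(v) for v in simplex]
    i_best = int(np.argmin(fvals))
    return simplex[i_best], fvals[i_best]

# usage on a "ravine"-type function
xmin, fmin = nelder_mead(lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2,
                         x0=[-1.0, 1.0])
```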

Rotating coordinate method (Rosenbrock method). Its essence consists in successive rotations of the coordinate system in accordance with the change in the direction of fastest decrease of the objective function (Fig. 29). From the starting point X[0] a descent is made to the point X[1] in directions parallel to the coordinate axes. At the next iteration, one of the axes should point in the direction x'_1 = X[1] − X[0], and the rest in directions perpendicular to x'_1. Descent along these axes leads to the point X[2], which makes it possible to construct a new vector x''_1 = X[2] − X[1] and, on its basis, a new system of search directions towards the minimum point X*.

Unlike other zero-order methods, Rosenbrock's method is oriented towards finding the optimal point in each direction, rather than simply taking a fixed shift in all directions. The step size changes continuously during the search depending on the topography of the level surface. The combination of coordinate rotation and step control makes Rosenbrock's method effective for solving complex optimization problems.

In particular, this method, unlike many others, is effective in minimizing so-called “ravine” functions (functions with strongly elongated level surfaces), since the resulting search direction tends to align itself with the axis of the ravine.

Parallel tangent method (Powell method). Its essence is a sequential one-dimensional search for the minimum of the objective function along n + 1 directions using one of the known one-dimensional methods. At the first iteration, the first n directions are the coordinate directions, and the first of them is used again as the (n + 1)-th direction (Fig. 30). At each subsequent iteration the search begins from the second direction of the previous iteration, the numbers of the directions being reduced by one accordingly; the (n + 1)-th direction of the subsequent iteration is given by the vector from the minimum point found at the first step of the previous iteration to the minimum point found at its last step.
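A brief Python sketch of this idea is shown below, using scipy's one-dimensional minimizer for each line search; the bookkeeping of the direction list follows a common simplified variant rather than the exact scheme above, and all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def powell(f, x0, eps=1e-8, max_iter=50):
    """Conjugate-direction (Powell-type) search built from one-dimensional minimizations."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    dirs = list(np.eye(n))                      # the first n directions are the coordinate axes
    for _ in range(max_iter):
        x_start = x.copy()
        for d in dirs:                          # one-dimensional search along each direction
            t = minimize_scalar(lambda s: f(x + s * d)).x
            x = x + t * d
        new_dir = x - x_start                   # direction from the first to the last minimum point
        if np.linalg.norm(new_dir) < eps:
            break
        t = minimize_scalar(lambda s: f(x + s * new_dir)).x
        x = x + t * new_dir
        dirs = dirs[1:] + [new_dir]             # drop the oldest direction, keep the new one
    return x

# usage
print(powell(lambda x: (x[0] - 3) ** 2 + (x[0] + x[1]) ** 2, x0=[0.0, 0.0]))  # ~ [3, -3]
```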

5. Multidimensional optimization

Linear programming

Optimization is a purposeful activity aimed at obtaining the best results under appropriate conditions.

The quantitative measure of the quality being optimized is called the optimality criterion or the objective function. It can be written in the form

R = f(x_1, x_2, …, x_n), (5.1)

where x_1, x_2, …, x_n are certain parameters of the optimization object.

There are two types of optimization problems: unconstrained and constrained.

The unconstrained optimization problem consists in finding the maximum or minimum of the real function (5.1) of n real variables and determining the corresponding values of the arguments.

Constrained optimization problems, or problems with restrictions, are those in whose formulation restrictions in the form of equalities or inequalities are imposed on the values of the arguments.

Solving optimization problems in which the optimality criterion is a linear function of independent variables (that is, contains these variables to the first degree) with linear restrictions on them is the subject of linear programming.

The word “programming” here reflects the ultimate goal of the study - determining the optimal plan or optimal program, according to which, from the many possible options for the process under study, the best, optimal option is selected based on some criterion.

An example of such a problem is the problem of optimal distribution of raw materials between different types of production so as to maximize the total cost of the output.

Let two types of products be made from two types of raw materials.

Let us denote by x_1, x_2 the number of units of products of the first and second types, respectively, and by c_1, c_2 the unit price of products of the first and second types, respectively. Then the total cost of all the products will be

R(x_1, x_2) = c_1 x_1 + c_2 x_2. (5.2)

It is desirable to organize production so that this total cost is as large as possible; R(x_1, x_2) is the objective function in this problem.

Let b_1, b_2 be the amounts of raw materials of the first and second types available, and let a_ij be the number of units of the i-th type of raw material required to produce a unit of the j-th type of product.

Considering that the consumption of a given resource cannot exceed its total amount, we write down the resource restrictions:

a_11 x_1 + a_12 x_2 ≤ b_1, a_21 x_1 + a_22 x_2 ≤ b_2. (5.3)

With regard to the variables x_1, x_2 we can also say that they are non-negative and not bounded from above:

x_1 ≥ 0, x_2 ≥ 0. (5.4)

Among the many solutions of the system of inequalities (5.3) and (5.4) it is required to find a solution (x_1, x_2) for which the function R reaches its greatest value.

The so-called transport problems (problems of optimally organizing the delivery of goods, raw materials or products from various warehouses to several destinations with minimum transportation costs) and a number of other problems are formulated in a similar form.
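For concreteness, here is how a problem of the form (5.2)-(5.4) can be solved numerically with scipy's linprog; the prices c, the consumption matrix A and the stocks b below are made-up illustrative numbers, not data from the text.

```python
from scipy.optimize import linprog

# hypothetical data for the raw-material allocation problem (5.2)-(5.4)
c = [3.0, 4.0]                 # assumed unit prices c1, c2
A = [[1.0, 2.0],               # assumed consumption coefficients a11, a12
     [3.0, 1.0]]               #                                  a21, a22
b = [14.0, 18.0]               # assumed stocks b1, b2

# linprog minimizes, so maximize R = c1*x1 + c2*x2 by minimizing -R
res = linprog(c=[-ci for ci in c], A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)         # optimal plan (x1, x2) and the maximum value of R
```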

Graphical method for solving linear programming problems.

Suppose it is required to find x_1 and x_2 satisfying the system of inequalities

a_i1 x_1 + a_i2 x_2 ≤ b_i, i = 1, 2, …, r, (5.5)

and the non-negativity conditions

x_1 ≥ 0, x_2 ≥ 0, (5.6)

for which the function

R = c_1 x_1 + c_2 x_2 (5.7)

reaches its maximum.

Solution.

In the rectangular coordinate system x_1 x_2 let us construct the region of feasible solutions of the problem (Fig. 11). To do this, replacing each of the inequalities (5.5) by an equality, we construct the corresponding boundary straight line:

a_i1 x_1 + a_i2 x_2 = b_i (i = 1, 2, …, r).

Fig. 11

Each such straight line divides the plane into two half-planes. For the coordinates x_1, x_2 of any point A of one half-plane the inequality a_i1 x_1 + a_i2 x_2 < b_i holds, while for the coordinates of any point B of the other half-plane the opposite inequality a_i1 x_1 + a_i2 x_2 > b_i holds. The coordinates of any point of the boundary line satisfy the equation a_i1 x_1 + a_i2 x_2 = b_i.

To determine on which side of the boundary line the half-plane corresponding to a given inequality lies, it is enough to “test” one point (the point O(0; 0) is the easiest). If the inequality is satisfied when the coordinates of this point are substituted into its left-hand side, the half-plane faces the tested point; if the inequality is not satisfied, the corresponding half-plane faces the opposite direction. The direction of the half-plane is shown in the drawing by hatching. The inequalities x_1 ≥ 0 and x_2 ≥ 0 correspond to the half-planes lying to the right of the ordinate axis and above the abscissa axis.

In the figure we construct boundary straight lines and half-planes corresponding to all inequalities.

The common part (intersection) of all these half-planes will represent the region of feasible solutions to this problem.

When constructing a region of feasible solutions, depending on the specific type of system of restrictions (inequalities) on variables, one of the following four cases may occur:

Fig. 12. The region of feasible solutions is empty, which corresponds to an inconsistent system of inequalities; the problem has no solution

Fig. 13. The region of feasible solutions consists of a single point A, which corresponds to the unique solution of the system

Fig. 14. The region of feasible solutions is bounded and has the form of a convex polygon; there is an infinite number of feasible solutions

Fig. 15. The region of feasible solutions is unbounded and has the form of a convex polygonal region; there is an infinite number of feasible solutions

Graphical representation of the objective function: at a fixed value of R, equation (5.7) defines a straight line, and when R is varied it defines a family of parallel lines with parameter R. For all points lying on one of these lines the function R takes one specific value, which is why these lines are called level lines of the function R.

The gradient vector

grad R = (∂R/∂x_1, ∂R/∂x_2) = (c_1, c_2)

is perpendicular to the level lines and shows the direction of increase of R.

The problem of finding an optimal solution of the system of inequalities (5.5) for which the objective function R (5.7) reaches its maximum thus geometrically reduces to determining, in the region of feasible solutions, the point through which the level line corresponding to the largest value of the parameter R passes.

Fig. 16

If the region of feasible solutions is a convex polygon, then the extremum of the function R is reached at least at one of the vertices of this polygon.

If the extreme value of R is reached at two vertices, then the same extreme value is reached at any point of the segment connecting these two vertices. In this case the problem is said to have an alternative optimum.

In the case of an unbounded region, the extremum of the function R either does not exist, or is reached at one of the vertices of the region, or the problem has an alternative optimum.

Example.

Suppose we need to find the values of x_1 and x_2 that satisfy the system of inequalities:

and the non-negativity conditions:

for which the function

reaches its maximum.

Solution.

Let us replace each of the inequalities with equality and construct the boundary lines:

Fig. 17

Let us determine the half-planes corresponding to these inequalities by “testing” the point (0; 0). Taking into account the non-negativity of x_1 and x_2, we obtain the region of feasible solutions of this problem in the form of the convex polygon OABDE.

In the region of feasible solutions we find the optimal solution by constructing the gradient vector showing the direction of increase of R.

The optimal solution corresponds to the point B, whose coordinates can be determined either graphically or by solving the system of two equations corresponding to the boundary straight lines AB and BD:

Answer: x_1 = 2; x_2 = 6; R_max = 22.

Problems. Find the position of the extremum point and the extreme value of the objective function under the given restrictions.

Table 9

Option No. | Extremum | Restrictions


Optimization is the process of finding an extremum (a global maximum or minimum) of a certain function, or of choosing the best (optimal) option from a set of possible ones. The most reliable way to find the best option is a comparative assessment of all possible options (alternatives). If the number of alternatives is large, mathematical programming methods are usually used to find the best one. These methods can be applied if there is a rigorous formulation of the problem: a set of variables is specified, the region of their possible variation is established (constraints are specified), and the form of the objective function (the function whose extremum is to be found) in these variables is defined. The latter is a quantitative measure (criterion) of the degree of achievement of the goal.

The problem of unconstrained optimization is to find the minimum or maximum of a function in the absence of any restrictions. Although most practical optimization problems contain restrictions, the study of unconstrained optimization methods is important from several points of view. Many algorithms for solving a constrained problem involve reducing it to a sequence of unconstrained optimization problems. Another class of methods is based on finding a suitable direction and then minimizing along this direction. The justification of unconstrained optimization methods extends naturally to the justification of procedures for solving problems with constraints.

The constrained optimization problem is to find the minimum or maximum value of a scalar function f(x) of an n-dimensional vector argument. The solution of the problem is based on a linear or quadratic approximation of the objective function to determine the increments of x_1, …, x_n at each iteration. There are also approximate methods for solving nonlinear problems, such as methods based on piecewise-linear approximation. The accuracy of the solution depends on the number of intervals on which a linear problem that is as close as possible to the nonlinear one is solved; this approach allows the calculations to be carried out by the simplex method. Typically, in linear models the coefficients of the objective function are constant and do not depend on the values of the variables; however, there are a number of problems where costs depend nonlinearly on volume.

Solution algorithm:

  • 1. The work begins by constructing a regular simplex in the space of the independent variables and evaluating the objective function at each of its vertices.
  • 2. The vertex with the largest value of the function is determined.
  • 3. This vertex is projected through the centroid of the remaining vertices to a new point, which is used as a vertex of the new simplex.
  • 4. If the function decreases smoothly enough, the iterations continue until either the minimum point is straddled or cyclic movement over two or more simplexes begins.
  • 5. The search ends when either the dimensions of the simplex or the differences between the function values at the vertices become sufficiently small.

Problem: container optimization. Achieve minimum cost for the manufacture of a container with a volume of 2750 litres for storing sand.

Z = C1·X1 + C2·X2 + C3·X3 + C4·X4 + C5·X5 → min,

where: X1 - amount of required metal, kg;

C1 - cost of metal, rub/kg;

X2 - mass of required electrodes, kg;

C2 - cost of electrodes, rub/kg;

X3 - amount of consumed electricity, kWh;

C3 - cost of electricity, rub/kWh;

X4 - welder working time, h;

C4 - welder tariff rate, rub/hour;

X5 - lift operating time, h;

C5 - lift fee, rub/hour.

1. Find the optimal surface area of ​​the container:

F = 2ab + 2bh + 2ah → min, (1)

where a, b, h are the dimensions of the container and its volume V = 2750 litres.

x1=16.331; x2=10.99

The minimum of the function obtained in the optimization process by the Box method is 1196.065 dm².
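The first step can also be reproduced with a general-purpose nonlinear optimizer: the sketch below minimizes the surface area (1) of a closed rectangular tank subject only to the volume constraint a·b·h = 2750 dm³. The original solution was obtained by the Box method under the problem's full set of restrictions, so its dimensions and area differ from what this simplified sketch returns; all names here are illustrative.

```python
from scipy.optimize import minimize

V = 2750.0  # required volume in litres (1 litre = 1 dm^3)

def surface(x):
    a, b, h = x
    return 2 * a * b + 2 * b * h + 2 * a * h   # total surface area (1), dm^2

res = minimize(surface, x0=[10.0, 10.0, 10.0], method="SLSQP",
               bounds=[(1.0, None)] * 3,
               constraints=[{"type": "eq", "fun": lambda x: x[0] * x[1] * x[2] - V}])
a, b, h = res.x
print(a, b, h, surface(res.x))   # with no extra restrictions a closed box of fixed volume tends towards a cube
```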

In accordance with GOST 19903 - 74, we accept:

h=16.50 dm, b=10.00 dm.

Let us express a from (1) and get:

Let's calculate the optimal thickness of the metal sheet

Let's choose ordinary carbon steel St2sp

For this steel the maximum permissible stress is 320 MPa.

Mass of sand.

Load on the wall of the container with the largest area:

Let's calculate the load per 1 linear centimeter of a sheet 100 cm wide:

Let us determine the wall thickness based on the condition:

where: l is the length of the sheet (preferably the largest, so as to leave an extra margin of strength);

q - load per 1 linear centimeter, kg/cm;

Metal sheet thickness, m

Maximum permissible metal stress, N/mm2.

Let us express the wall thickness from (2):

Considering that 320 MPa = 3263 kg/cm2,

Metal weight

where: F - surface area of ​​the container, m2;

Metal wall thickness, m;

Metal density, kg/m3.

The price of St2sp steel is about 38 rubles/kg.

2. Weld length:

For welding we will use “UONI-13/45” electrodes.

Price 88.66 rub/kg;

where: Sweld - cross-sectional area of ​​the weld, m2;

l is the length of the weld, m;

Density of deposited metal, kg/m3.

3. Welding time:

where l is the length of the weld, m;

v - welding speed, m/h.

Total power consumption:

P_sum = 5 · 17 = 85 kWh;

The cost of electricity is 5.7 rubles/kWh.

4. For manual arc welding, the costs of auxiliary, preparatory and final time and time for servicing the workplace average 40 - 60%. Let's use the average value of 50%.

Total time:

Payment for a welder of the VI category is 270 rubles/hour.

Plus a tariff coefficient of 17% for work in a confined, poorly ventilated space:

The assistant's payment will be 60% of the welder's payment:

8055 · 0.6 = 4833 rub.

Total: 8055+4833 = 12888 rubles.

5. A crane is needed to hold sheets of metal during welding, loading and unloading sheets of metal and the finished container itself.

To tack the entire structure together, the welder needs to lay about 30% of the seams.

Payment for the crane is 1000 rubles/hour.

Total cost of the container.

Among the zero-order optimization methods used in CAD are the Rosenbrock method, the configuration (Hooke-Jeeves) method, the deformable polyhedron method, and random search. Methods using derivatives include the steepest descent, conjugate gradient, and variable metric methods.

Rosenbrock's method is an improved version of coordinate descent.

The coordinate descent method is characterized by choosing the search directions alternately along all the coordinate axes; the step is calculated on the basis of one-dimensional optimization, and the search ends when the changes along all the coordinate axes become smaller than the specified accuracy of determining the local extremum, the number of axes being equal to the dimension of the space of controlled parameters. The coordinate descent trajectory for an example of a two-dimensional space of controlled parameters is shown in Fig. 1, which shows the points of the search trajectory in the plane of the controlled parameters. The objective function is represented by its lines of equal level, with the corresponding value written next to each line; there is obviously a minimum point.

Fig. 1. Trajectory of coordinate descent
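A minimal Python sketch of coordinate descent is given below; each step along an axis is found by one-dimensional optimization (scipy's minimize_scalar), and the stopping test on the change of the point is a simple stand-in for the criterion mentioned above. All names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def coordinate_descent(f, x0, eps=1e-6, max_iter=200):
    """Search directions are taken alternately along all coordinate axes;
    the step along each axis comes from one-dimensional optimization."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    for _ in range(max_iter):
        x_prev = x.copy()
        for i in range(n):
            e = np.zeros(n)
            e[i] = 1.0
            t = minimize_scalar(lambda s: f(x + s * e)).x
            x = x + t * e
        if np.linalg.norm(x - x_prev) < eps:    # stop when the point almost stops moving
            break
    return x

# usage on a mildly "ravine-like" quadratic function
print(coordinate_descent(lambda x: x[0] ** 2 + 10 * (x[0] + x[1]) ** 2, x0=[3.0, -1.0]))
```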

When using the coordinate descent method, there is a high probability of the search getting stuck at the bottom of a ravine far from the extremum point. Fig. 2 shows that after reaching a point at the bottom of the ravine, further steps are possible only along the coordinate directions, but they lead to a deterioration of the objective function; therefore the search stops at this point.

Note 1

A ravine is a part of the space of controlled parameters in which weak changes of the derivatives of the objective function are observed in some directions and significant changes, with a change of sign, in certain other directions. The sign of the derivative changes at points belonging to the bottom of the ravine.

Fig. 3. Trajectory of coordinate descent with a favourable orientation of the coordinate axes

The Rosenbrock method consists in rotating the coordinate axes so that one of them turns out to be quasi-parallel to the bottom of the ravine. The rotation is carried out on the basis of data obtained after a series of coordinate-descent steps. The position of the new axes can be obtained by a linear transformation of the previous axes: the first axis coincides in direction with the overall displacement vector of the completed series of steps, and the remaining axes are chosen from the condition of orthogonality to it and to each other.

Another successful modification of coordinate descent is the configuration method (Hooke-Jeeves). In accordance with this method, an ordinary series of coordinate-descent steps is first performed; then an additional step is taken in the direction of the overall displacement vector, as shown in Fig. 4, where the additional step along this vector leads to a new point.

Fig. 4. Illustration of the configuration method

The search for the extremum by the deformable polyhedron method (Nelder-Mead) is based on constructing, at each search step, a polyhedron with n + 1 vertices, where n is the dimension of the space of controlled parameters. At the beginning of the search these vertices are chosen arbitrarily; at subsequent steps the choice is subject to the rules of the method.

These rules are explained in Fig. 5 using the example of a two-dimensional optimization problem, where the vertices of the original triangle are chosen first. A new vertex is located on the ray drawn from the worst vertex (the vertex with the largest value of the objective function) through the centre of gravity of the polyhedron, and it is recommended to place it at a distance from the centre of gravity equal to the distance from the worst vertex to the centre of gravity. The new vertex replaces the worst one. If it turns out to have the best value of the objective function among the vertices of the polyhedron, the distance is increased; in the figure exactly this situation occurs, and the increase gives the next point. In the new polyhedron the worst vertex is again determined and replaced in the same way, and so on. If a new vertex turns out to be worse, the best vertex must be kept in the polyhedron and the lengths of all edges must be reduced, for example by half (contraction of the polyhedron towards the best vertex). The search stops when the condition of reducing the size of the polyhedron to a certain limit is satisfied.

In the steepest descent method the search direction at each point is taken along the antigradient of the objective function, and the optimal step is selected using one-dimensional optimization.

When using the steepest descent method, as with most other methods, the search efficiency drops significantly in ravine situations: the search trajectory takes on a zigzag shape, with slow movement along the bottom of the ravine towards the extremum. Several techniques are used to increase the efficiency of gradient methods.

One of these techniques, used in the conjugate gradient method (also called the Fletcher-Reeves method), is based on the concept of conjugacy of vectors. Vectors u and v are called A-conjugate if u^T A v = 0, where A is a positive definite square matrix of the same order as the dimension of the vectors u and v (a special case of conjugacy is orthogonality of the vectors, when A is the identity matrix), u^T is a row vector and v is a column vector.

The special property of directions that are conjugate with respect to the Hessian matrix in problems with a quadratic objective function is as follows: sequential one-dimensional minimization along n conjugate directions makes it possible to find the extremum point in no more than n steps.

Note 2

The Hessian matrix is ​​the matrix of second partial derivatives of the objective function with respect to the controlled parameters.

The reason for using conjugate search directions is that for functions f(X) of general form a quadratic approximation can be applied, which in practice results in the search being performed in more than n steps.

The search for an extremum is performed in accordance with formula (3), in which the new direction is built from the antigradient at the current point and the previous direction taken with a certain coefficient; in addition, the conjugacy condition is taken into account. Since the step is calculated from the condition of one-dimensional optimization, a corresponding relation between the gradient at the new point and the previous direction also holds. The search algorithm reduces to applying formula (3) until the condition for terminating the calculations is met. The coefficient is determined by solving the system of equations (2)-(7): the values from (3) and (2) are substituted into (4) and relations (6) and (7) are taken into account.


Expression (10) is a system of linear algebraic equations; its root gives the next approximation to the solution. If the process converges, the solution is reached in a small number of iterations, which terminate when the stopping condition is fulfilled.

It can be shown that, as the minimum point is approached, the coefficient of the previous direction tends to zero and the search direction tends to the antigradient direction. After n steps, where n is the dimension of the space of controlled parameters, the search has to be restarted.
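The scheme can be summarised by the following Python sketch of the Fletcher-Reeves method with a restart after n steps; the backtracking line search used here is a simple stand-in for exact one-dimensional minimization, and all names are illustrative.

```python
import numpy as np

def fletcher_reeves(f, grad, x0, eps=1e-6, max_iter=1000):
    """Conjugate-gradient (Fletcher-Reeves) search with a restart every n steps."""
    def line_search(x, d, t=1.0, shrink=0.5, c=1e-4):
        # backtracking: reduce t until a sufficient decrease of f is obtained
        fx, gx = f(x), grad(x)
        while f(x + t * d) > fx + c * t * gx.dot(d):
            t *= shrink
            if t < 1e-12:
                break
        return t

    x = np.asarray(x0, dtype=float)
    n = len(x)
    g = grad(x)
    d = -g                                   # start along the antigradient
    for k in range(max_iter):
        if np.linalg.norm(g) < eps:          # termination: the gradient is small
            break
        t = line_search(x, d)
        x = x + t * d
        g_new = grad(x)
        if (k + 1) % n == 0:                 # restart from the antigradient after n steps
            d = -g_new
        else:
            beta = g_new.dot(g_new) / g.dot(g)   # Fletcher-Reeves coefficient
            d = -g_new + beta * d
        g = g_new
    return x

# usage on a quadratic function
Q = np.array([[4.0, 1.0], [1.0, 3.0]])
f = lambda x: 0.5 * x.dot(Q).dot(x)
grad = lambda x: Q.dot(x)
print(fletcher_reeves(f, grad, x0=[2.0, 1.0]))   # approximately [0, 0]
```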