Lecture
Suppose that player B applies his j-th pure strategy, and player A uses his optimal mixed strategy $(p_1, p_2, \ldots, p_m)$. Then the average winnings of player A will be equal to:
$$\upsilon_j = a_{1j} p_1 + a_{2j} p_2 + \ldots + a_{ij} p_i + \ldots + a_{mj} p_m .$$
Since $\upsilon_j$ cannot be less than the price of the game $\upsilon$, we write the conditions:
$$
\begin{aligned}
a_{11} p_1 + a_{21} p_2 + \ldots + a_{i1} p_i + \ldots + a_{m1} p_m &\ge \upsilon \\
a_{12} p_1 + a_{22} p_2 + \ldots + a_{i2} p_i + \ldots + a_{m2} p_m &\ge \upsilon \\
\ldots \\
a_{1j} p_1 + a_{2j} p_2 + \ldots + a_{ij} p_i + \ldots + a_{mj} p_m &\ge \upsilon \\
\ldots \\
a_{1n} p_1 + a_{2n} p_2 + \ldots + a_{in} p_i + \ldots + a_{mn} p_m &\ge \upsilon
\end{aligned}
$$
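Dividing the left and right sides of each inequality by the price of the game $\upsilon > 0$, we get:
$$a_{1j}\frac{p_1}{\upsilon} + a_{2j}\frac{p_2}{\upsilon} + \ldots + a_{mj}\frac{p_m}{\upsilon} \ge 1, \quad j = 1, \ldots, n.$$
We introduce new notation:
$$x_i = \frac{p_i}{\upsilon}, \quad i = 1, \ldots, m.$$
Then the inequalities take the form:
$$
\begin{aligned}
a_{11} x_1 + a_{21} x_2 + \ldots + a_{i1} x_i + \ldots + a_{m1} x_m &\ge 1 \\
a_{12} x_1 + a_{22} x_2 + \ldots + a_{i2} x_i + \ldots + a_{m2} x_m &\ge 1 \\
\ldots \\
a_{1j} x_1 + a_{2j} x_2 + \ldots + a_{ij} x_i + \ldots + a_{mj} x_m &\ge 1 \\
\ldots \\
a_{1n} x_1 + a_{2n} x_2 + \ldots + a_{in} x_i + \ldots + a_{mn} x_m &\ge 1
\end{aligned}
$$
All $x_i \ge 0$, since $p_i \ge 0$ and $\upsilon > 0$.
From the normalization condition $p_1 + p_2 + \ldots + p_m = 1$, the variables $x_i$ must satisfy the condition:
$$x_1 + x_2 + \ldots + x_i + \ldots + x_m = 1/\upsilon .$$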
Since player A seeks to maximize $\upsilon$, he must minimize $1/\upsilon$, which gives a linear objective function:
$$z = x_1 + x_2 + \ldots + x_i + \ldots + x_m \to \min .$$
Consequently, solving the game has been reduced to the following linear programming problem: find non-negative values of the variables $x_i$ that minimize the function $z$ and satisfy the constraints written above.
From the solution of the LP problem, we find the price of the game $\upsilon$ and the optimal strategy of player A:
$$\upsilon = \frac{1}{z_{\min}}, \qquad p_i = \upsilon x_i, \quad i = 1, \ldots, m.$$
The optimal strategy of player B is found from the expression:
$$q_j = \upsilon u_j, \quad j = 1, \ldots, n,$$
where $u_j$ are the non-negative variables of the LP problem:
$$w = u_1 + u_2 + \ldots + u_j + \ldots + u_n \to \max$$
$$
\begin{aligned}
a_{11} u_1 + a_{12} u_2 + \ldots + a_{1j} u_j + \ldots + a_{1n} u_n &\le 1 \\
a_{21} u_1 + a_{22} u_2 + \ldots + a_{2j} u_j + \ldots + a_{2n} u_n &\le 1 \\
\ldots \\
a_{m1} u_1 + a_{m2} u_2 + \ldots + a_{mj} u_j + \ldots + a_{mn} u_n &\le 1
\end{aligned}
$$
which is dual to the problem for the variables $x_i$ stated above.
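By the duality theorem, the optimal values of the two objective functions coincide: $z_{\min} = w_{\max} = 1/\upsilon$.
Thus, the optimal strategies and the price of a game with payoff matrix $(a_{ij})_{m \times n}$ can be found by solving a symmetric pair of dual LP problems:
| Primal problem | Dual problem |
| $z = \sum_{i=1}^{m} x_i \to \min$ | $w = \sum_{j=1}^{n} u_j \to \max$ |
| $\sum_{i=1}^{m} a_{ij} x_i \ge 1, \ j = 1, \ldots, n$ | $\sum_{j=1}^{n} a_{ij} u_j \le 1, \ i = 1, \ldots, m$ |
| $x_i \ge 0, \ i = 1, \ldots, m$ | $u_j \ge 0, \ j = 1, \ldots, n$ |
Here $\upsilon = 1/z_{\min} = 1/w_{\max}$, $p_i = \upsilon x_i$, $q_j = \upsilon u_j$.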
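As a small check of this reduction, the sketch below takes a 2×2 game without a saddle point, computes the optimal mixed strategies from the standard closed-form 2×2 formulas, and verifies that $x_i = p_i/\upsilon$ and $u_j = q_j/\upsilon$ satisfy the constraints of both LP problems with the same objective value $1/\upsilon$. The matrix and the function names are assumptions made for illustration; they do not come from the lecture.

```python
from fractions import Fraction as F

def solve_2x2(a):
    """Closed-form mixed solution of a 2x2 game without a saddle point.

    Assumes the denominator is non-zero and the price of the game is positive.
    """
    (a11, a12), (a21, a22) = a
    d = a11 + a22 - a12 - a21
    p = [F(a22 - a21, d), F(a11 - a12, d)]   # player A (rows)
    q = [F(a22 - a12, d), F(a11 - a21, d)]   # player B (columns)
    v = F(a11 * a22 - a12 * a21, d)          # price of the game
    return p, q, v

def lp_check(a, p, q, v):
    """Return (z, w, primal_ok, dual_ok) for x = p / v and u = q / v."""
    m, n = len(a), len(a[0])
    x = [pi / v for pi in p]
    u = [qj / v for qj in q]
    primal_ok = all(sum(a[i][j] * x[i] for i in range(m)) >= 1 for j in range(n))
    dual_ok = all(sum(a[i][j] * u[j] for j in range(n)) <= 1 for i in range(m))
    return sum(x), sum(u), primal_ok, dual_ok

A = [[3, 1], [1, 2]]                          # assumed example matrix
p, q, v = solve_2x2(A)
z, w, primal_ok, dual_ok = lp_check(A, p, q, v)
print(p, q, v, z, w, primal_ok, dual_ok)
```

Exact fractions are used so that the constraints, which hold with equality at the optimum, are not disturbed by rounding. For larger games an LP solver would replace `solve_2x2`.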
Recall the example of the Colonel Blotto game.
It can be formulated as an LP problem and solved by the simplex method, which gives the solution of the game.
Under the optimal strategies, the colonel should use the more uneven distribution of forces between the positions with probability 4/9, whereas the general, on the contrary, should use the more even division of forces between the positions with frequency 4/9; the colonel then receives a gain estimated at about 1.5 units.
The converse is also known: for any LP problem an equivalent game-theory problem can be constructed. This "game theory - linear programming" relationship is useful not only for game theory but also for LP. Sometimes approximate numerical methods for solving games (when the problem is large) turn out to be simpler than the "classical" LP methods.
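One well-known approximate method of this kind is Brown-Robinson fictitious play; the lecture does not name a specific method, so this choice is an assumption. In the sketch below each player repeatedly answers the opponent's accumulated history with a best pure reply, and the running averages give lower and upper bounds on the price of the game. The matrix and all names are hypothetical.

```python
def fictitious_play(a, iterations=5000):
    """Approximate the price of a matrix game by fictitious play.

    Returns (lower, upper, p, q): bounds on the price of the game and the
    empirical mixed strategies of players A (rows) and B (columns).
    """
    m, n = len(a), len(a[0])
    row_counts, col_counts = [0] * m, [0] * n
    gain_vs_b = [0] * m   # cumulative payoff of each row against B's past moves
    gain_of_a = [0] * n   # cumulative payoff of A's past moves in each column
    lower, upper = float("-inf"), float("inf")
    i = 0                 # arbitrary first move of player A
    for t in range(1, iterations + 1):
        row_counts[i] += 1
        for c in range(n):
            gain_of_a[c] += a[i][c]
        j = min(range(n), key=lambda c: gain_of_a[c])   # B's best reply
        col_counts[j] += 1
        for r in range(m):
            gain_vs_b[r] += a[r][j]
        lower = max(lower, min(gain_of_a) / t)          # A's guaranteed average
        upper = min(upper, max(gain_vs_b) / t)          # B's guaranteed ceiling
        i = max(range(m), key=lambda r: gain_vs_b[r])   # A's best reply
    p = [c / iterations for c in row_counts]
    q = [c / iterations for c in col_counts]
    return lower, upper, p, q

lower, upper, p, q = fictitious_play([[3, 1], [1, 2]])   # assumed example matrix
print(lower, upper, p, q)
```

The bounds always bracket the price of the game (5/3 for this matrix), so the gap `upper - lower` indicates how far the approximation is from the exact LP solution.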
Non-zero-sum games.
Let $f_k(x_1, \ldots, x_N)$ be the payoff function of the k-th player (N players in total) when each player i applies the strategy $x_i$. In non-zero-sum games, the equality
$$\sum_{k=1}^{N} f_k(x_1, \ldots, x_N) = 0$$
is not necessarily satisfied.
We first note the distinctive properties of two-person zero-sum games:
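1) it is not profitable for either player to inform the opponent of his strategy;
2) there is no point in the players negotiating before the game and agreeing on a joint plan of action;
3) if $(\alpha^{*}, \beta^{*})$ and $(\alpha^{**}, \beta^{**})$ are pairs of optimal strategies of the first and second players, respectively, then $(\alpha^{*}, \beta^{**})$ and $(\alpha^{**}, \beta^{*})$ are also optimal pairs of strategies, and all of them give the same payoff:
$$f(\alpha^{*}, \beta^{*}) = f(\alpha^{**}, \beta^{**}) = f(\alpha^{*}, \beta^{**}) = f(\alpha^{**}, \beta^{*}).$$
As a rule, non-zero-sum games possess none of these properties.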
Formally, a non-zero-sum game can be reduced to a zero-sum game by introducing an (N + 1)-th player, the so-called dummy, whose set of strategies consists of a single point and whose payoff is
$$f_{N+1} = -\sum_{k=1}^{N} f_k(x_1, \ldots, x_N),$$
but this brings no practical benefit. A two-person non-zero-sum game is described as follows. The players' strategy sets are denoted $A = \{\alpha_1, \ldots, \alpha_m\}$ and $B = \{\beta_1, \ldots, \beta_n\}$; at the end of the game the first player receives the amount $f_1(\alpha_i, \beta_j)$ and the second receives $f_2(\alpha_i, \beta_j)$. The matrix of the game then looks like this:
| | $\beta_1$ | ... | $\beta_j$ | ... | $\beta_n$ |
| $\alpha_1$ | $(f_1(\alpha_1, \beta_1), f_2(\alpha_1, \beta_1))$ | ... | $(f_1(\alpha_1, \beta_j), f_2(\alpha_1, \beta_j))$ | ... | $(f_1(\alpha_1, \beta_n), f_2(\alpha_1, \beta_n))$ |
| ... | ... | ... | ... | ... | ... |
| $\alpha_i$ | $(f_1(\alpha_i, \beta_1), f_2(\alpha_i, \beta_1))$ | ... | $(f_1(\alpha_i, \beta_j), f_2(\alpha_i, \beta_j))$ | ... | $(f_1(\alpha_i, \beta_n), f_2(\alpha_i, \beta_n))$ |
| ... | ... | ... | ... | ... | ... |
| $\alpha_m$ | $(f_1(\alpha_m, \beta_1), f_2(\alpha_m, \beta_1))$ | ... | $(f_1(\alpha_m, \beta_j), f_2(\alpha_m, \beta_j))$ | ... | $(f_1(\alpha_m, \beta_n), f_2(\alpha_m, \beta_n))$ |
Elements of the matrix are known to both players.
In zero-sum games the players cannot achieve mutual benefit through any cooperation, whereas in non-zero-sum games this becomes possible. The important question is whether the players are allowed to cooperate. A cooperative game is a game in which the players have complete freedom of communication before the game in order to conclude mutually binding agreements. In a non-cooperative game, no cooperation between the players before the game is allowed. An example is antitrust law, which prohibits certain types of agreements between large firms.
Example. Two people are in a burning house. They can leave the house and escape only through the front door, but it is jammed so badly that it can only be opened by both of them together. In this case, each player has two strategies:
$\alpha_1$, $\beta_1$: push the door and try to open it,
$\alpha_2$, $\beta_2$: do not push the door.
Acting together, the players are saved and the gain of each is equal to one; otherwise both suffer and the gain of each is zero.
| | $\beta_1$ | $\beta_2$ |
| $\alpha_1$ | (1, 1) | (0, 0) |
| $\alpha_2$ | (0, 0) | (0, 0) |
This is, first, a cooperative game and, second, a game in which
$$f_1(\alpha_i, \beta_j) - f_2(\alpha_i, \beta_j) = C = \text{const}, \quad i = 1, \ldots, m, \ j = 1, \ldots, n,$$
i.e. the players win or lose together: this is a game with a constant difference (in this case, zero). In games with a constant difference, the players need to coordinate their actions.
Example. The game "Family dispute".
The game matrix is:
| | $\beta_1$ | $\beta_2$ |
| $\alpha_1$ | (2, 1) | (-1, -1) |
| $\alpha_2$ | (-1, -1) | (1, 2) |
The husband (the first player) and the wife (the second player) can each choose one of two evening entertainments: a football match ($\alpha_1$ and $\beta_1$) or the ballet ($\alpha_2$ and $\beta_2$). It is natural to assume that the husband prefers football and the wife prefers the ballet. However, it is much more important for both of them to go together than alone.
The outcome $(\alpha_1, \beta_1)$ is preferable for the first player, and $(\alpha_2, \beta_2)$ for the second. Each of these pairs is a pair of optimal strategies: each strategy in the pair is the best reply to the other strategy of the same pair:
$$f_1(\alpha_1, \beta_1) = 2 \ge f_1(\alpha_2, \beta_1) = -1,$$
$$f_2(\alpha_1, \beta_1) = 1 \ge f_2(\alpha_1, \beta_2) = -1,$$
$$f_1(\alpha_2, \beta_2) = 1 \ge f_1(\alpha_1, \beta_2) = -1,$$
$$f_2(\alpha_2, \beta_2) = 2 \ge f_2(\alpha_2, \beta_1) = -1.$$
Neither $(\alpha_1, \beta_2)$ nor $(\alpha_2, \beta_1)$ is an optimal pair of strategies. Moreover, the optimal pairs $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ bring the players different payoffs, i.e. this game does not have property 3) of zero-sum games.
The first two properties of zero-sum games do not hold either. If the players do not communicate before the game and both have strong characters, the first will choose $\alpha_1$ because he wants $(\alpha_1, \beta_1)$, the second will choose $\beta_2$ because she wants $(\alpha_2, \beta_2)$, and both lose. The same happens when each spouse has a soft character and decides to give in.
The best variant of this game is the cooperative one. In that case both players will aim for the outcome $(\alpha_1, \beta_1)$ or $(\alpha_2, \beta_2)$, and choosing between them at random will be fair.
Example. Tucker's problem, or the "prisoner's dilemma".
Two people suspected of a serious crime were arrested and isolated from each other. The prosecutor has no solid evidence of their guilt. He informs each of the detainees of the two available alternatives: to confess to the crime ($\alpha_2$, $\beta_2$; marked P in the table) or not to confess ($\alpha_1$, $\beta_1$; marked N).
| | $\beta_1$ (N) | $\beta_2$ (P) |
| $\alpha_1$ (N) | (1, 1) | (10, 1/4) |
| $\alpha_2$ (P) | (1/4, 10) | (8, 8) |
The entries are prison terms in years, so each prisoner wants his own number to be as small as possible. If neither prisoner confesses, both will face a minor charge and receive 1 year in prison each. If both confess, they will be convicted, but not with the utmost severity (8 years each). If one confesses and the other does not, the one who confessed receives a mild sentence and the stubborn one is sentenced to the maximum term. This is the non-cooperative variant.
The pair $(\alpha_2, \beta_2)$ is not the best outcome for the two players taken together.
Nevertheless, since each player strives for his own maximum utility, their rational choices are $\alpha_2$ and $\beta_2$.
Now assume that cooperation is allowed in the game. Then the players will agree on $(\alpha_1, \beta_1)$. But this pair is not an equilibrium, which creates the temptation to break the agreement: if the first player breaks it and the second does not, the position of the first player improves, and vice versa.
Here cooperation is unprofitable.
Nash-optimal solutions of a non-cooperative game.
Strategies $x_1^{*}, \ldots, x_N^{*}$ are called Nash optimal in a non-zero-sum game of N persons (or a Nash solution of the game, or a Nash equilibrium point) if for each k = 1, ..., N
$$f_k(x_1^{*}, \ldots, x_{k-1}^{*}, x_k^{*}, x_{k+1}^{*}, \ldots, x_N^{*}) = \max_{x_k} f_k(x_1^{*}, \ldots, x_{k-1}^{*}, x_k, x_{k+1}^{*}, \ldots, x_N^{*}).$$
The optimality of the Nash solution is related to the equilibrium state of the game, i.e. to a situation in which it is unprofitable for any single player to change his strategy. In the "family dispute" game considered above, the pairs $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ are Nash solutions; in the "prisoner's dilemma" game the only such pair is $(\alpha_2, \beta_2)$.
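The equilibrium condition can be checked mechanically for a small bimatrix game: a cell is a Nash point in pure strategies if the first player's payoff is maximal in its column and the second player's payoff is maximal in its row. The sketch below is one minimal way to do this; the function name is hypothetical, and the prisoner's dilemma payoffs are entered as negated prison terms, since the table lists losses rather than gains.

```python
def pure_nash(f1, f2):
    """Return all cells (i, j) where neither player gains by deviating alone."""
    m, n = len(f1), len(f1[0])
    result = []
    for i in range(m):
        for j in range(n):
            best_row = all(f1[i][j] >= f1[k][j] for k in range(m))
            best_col = all(f2[i][j] >= f2[i][l] for l in range(n))
            if best_row and best_col:
                result.append((i, j))
    return result

# "Family dispute": payoffs of the husband (f1) and the wife (f2)
family_f1 = [[2, -1], [-1, 1]]
family_f2 = [[1, -1], [-1, 2]]

# "Prisoner's dilemma": prison terms are losses, so they are negated here
prison_f1 = [[-1, -10], [-0.25, -8]]
prison_f2 = [[-1, -0.25], [-10, -8]]

print(pure_nash(family_f1, family_f2))   # indices are zero-based
print(pure_nash(prison_f1, prison_f2))
```

With zero-based indices, cell (0, 0) corresponds to $(\alpha_1, \beta_1)$ and cell (1, 1) to $(\alpha_2, \beta_2)$.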
As in zero-sum games, a non-zero-sum game may not have a Nash solution in pure strategies.
Nash theorem. Any non-cooperative non-zero-sum game of N persons (with finite sets of strategies) has at least one Nash solution in the class of mixed strategies.
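For a 2×2 game, a fully mixed equilibrium can be found from indifference conditions: each player mixes so that the opponent's two pure strategies give the same expected payoff. The sketch below applies this to the "family dispute" matrix; the function names are hypothetical, and the formulas assume that a fully mixed equilibrium exists (non-zero denominators, probabilities strictly between 0 and 1).

```python
from fractions import Fraction as F

def mixed_equilibrium_2x2(f1, f2):
    """Fully mixed equilibrium of a 2x2 bimatrix game via indifference.

    p is the probability of the first row, q of the first column.
    """
    # p makes the second player indifferent between the two columns
    p = F(f2[1][1] - f2[1][0], f2[0][0] - f2[0][1] - f2[1][0] + f2[1][1])
    # q makes the first player indifferent between the two rows
    q = F(f1[1][1] - f1[0][1], f1[0][0] - f1[0][1] - f1[1][0] + f1[1][1])
    return p, q

def expected(f, p, q):
    """Expected payoff under mixed strategies (p, 1 - p) and (q, 1 - q)."""
    return (p * q * f[0][0] + p * (1 - q) * f[0][1]
            + (1 - p) * q * f[1][0] + (1 - p) * (1 - q) * f[1][1])

family_f1 = [[2, -1], [-1, 1]]
family_f2 = [[1, -1], [-1, 2]]
p, q = mixed_equilibrium_2x2(family_f1, family_f2)
print(p, q, expected(family_f1, p, q), expected(family_f2, p, q))
```

In this game the mixed equilibrium gives each spouse an expected payoff of 1/5, which is less than either of them receives at the two pure equilibria.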
Consider the problem of decision-making in a non-cooperative two-player game from the point of view of a third party, the "arbiter". What properties should a good joint decision of the players have? First, efficiency in terms of the gains received by the players, and second, stability of the solution. From the standpoint of a neutral person, the two-person game with objective functions
$$f_1(\alpha, \beta) \to \max, \qquad f_2(\alpha, \beta) \to \max$$
can be considered as a multi-criteria problem on the set $L = A \times B$. The argument is the vector $\eta \in L$, $\eta = (\alpha, \beta)$, and the problem takes the form:
$$f_1(\eta) \to \max, \qquad f_2(\eta) \to \max, \qquad \eta \in L.$$
The Pareto principle is used to select efficient solutions. Pareto-optimal (efficient) solutions have the property that the gain of one player can be improved only by reducing the gain of other players. Clearly, a solution outside the Pareto set (the negotiation set) can be improved for all players at once. Within the negotiation set, the interests of the players are antagonistic.
The difficulty is that the choice of $\eta$ is made not by one person but by several. Here the Nash principle becomes important: a rational strategy should be chosen among the Nash equilibrium points.
There is a contradiction between Pareto optimality and Nash optimality, i.e. between efficiency and stability.
The Nash principle can be used effectively only when the stable solutions are at the same time Pareto optimal.
Example. Two cars approach an unregulated intersection at high speed, at right angles to each other. Each driver has two strategies:
- $\alpha_1$, $\beta_1$: reduce speed to a safe level (B);
- $\alpha_2$, $\beta_2$: continue driving at high speed (the risky strategy, P).
If both drivers follow strategy B, the outcome is a happy one, valued at 1 for each. If both choose the risky strategy, an accident occurs and the loss of each is valued at -9. For the other combinations, (B, P) and (P, B), the outcome is valued at 0 for the driver who slows down (for the time lost) and at 3 for the driver who does not (for the time saved).
| | $\beta_1$ (B) | $\beta_2$ (P) |
| $\alpha_1$ (B) | (1, 1) | (0, 3) |
| $\alpha_2$ (P) | (3, 0) | (-9, -9) |
The situations (B, B) and (P, P) are obviously unstable, because each player can gain more by changing his decision unilaterally. The situations (B, P) and (P, B) are both stable. The situation (P, P) is not Pareto optimal. Thus the problem has two situations, (B, P) and (P, B), that are optimal both in the Pareto sense and in the Nash sense.
In the "prisoner's dilemma" game, the solution (P, P) is Nash stable but not Pareto optimal.
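The Pareto set of a small bimatrix game can be found by direct comparison of outcomes. The sketch below treats an outcome as dominated if another outcome is at least as good for both players and strictly better for at least one; this is one common reading of Pareto optimality and is assumed here. It is applied to the intersection game, with the stable situations taken from the text rather than recomputed. The function name is hypothetical.

```python
def pareto_optimal(f1, f2):
    """Return all cells (i, j) whose payoff pair is not dominated."""
    cells = [(i, j) for i in range(len(f1)) for j in range(len(f1[0]))]

    def dominates(c, d):
        """True if c is at least as good as d for both and better for one."""
        a1, a2 = f1[c[0]][c[1]], f2[c[0]][c[1]]
        b1, b2 = f1[d[0]][d[1]], f2[d[0]][d[1]]
        return a1 >= b1 and a2 >= b2 and (a1 > b1 or a2 > b2)

    return [d for d in cells if not any(dominates(c, d) for c in cells)]

# Intersection game: strategy 0 = slow down (B), strategy 1 = keep speed (P)
cross_f1 = [[1, 0], [3, -9]]
cross_f2 = [[1, 3], [0, -9]]
stable = {(0, 1), (1, 0)}            # (B, P) and (P, B), as stated in the text
efficient = set(pareto_optimal(cross_f1, cross_f2))
print(sorted(efficient), sorted(efficient & stable))

# Prisoner's dilemma with negated prison terms: (P, P) is cell (1, 1)
prison = pareto_optimal([[-1, -10], [-0.25, -8]], [[-1, -0.25], [-10, -8]])
print(prison)
```

In the intersection game the situation (B, B) is also Pareto optimal, but it is not stable, so only (B, P) and (P, B) satisfy both principles.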
Games with nature, or decision making under uncertainty.
In the problems of game theory considered so far, operations were carried out under uncertainty, and we associated this uncertainty with the behavior of the opponent (or opponents). It was assumed automatically that the adversary is rational (the principle of equal rationality) and acts consciously, choosing precisely those actions that are disadvantageous for us; figuratively speaking, the opponent was considered "malicious". Very often, however, the uncertainty accompanying an operation is not associated with the conscious behavior of an opponent but depends only on some objective reality unknown to player A, on a lack of information about the objective situation. Uncertainty of this type may have various causes: the complexity of a situation that is hard to assess, market conditions, changes in demand, government policy, the reliability of a partner, equipment failures, exchange rates, inflation, the economic situation, natural disasters.
Situations of this kind are called games with nature (often written with a capital letter: Nature). The second player in this case is called nature and, in game theory, is not a purposeful, conscious player. It is regarded as a disinterested party. The possible strategies of nature, i.e. its states, are realized at random.
Thus only one player acts in a game with nature: player A, who is the operations researcher and the decision maker. Nature P is the second player but not an opponent of player A. Player A in a game with nature is sometimes called the statistician, and the theory of games with nature is sometimes called the theory of statistical decisions.
Let player A need to perform an operation in an insufficiently known situation, about whose states n assumptions $P_1, P_2, \ldots, P_n$ can be made. These assumptions are regarded as the strategies of nature. Player A has m possible strategies $A_1, A_2, \ldots, A_m$ at his disposal. The winnings $a_{ij}$ of player A for each pair of strategies $(A_i, P_j)$ are assumed to be known and are given by the payoff matrix $A = \|a_{ij}\|$. The task is to determine the strategy (pure or mixed) that provides the player with the greatest gain.
Example 1. Student X, boarding a tram, decides whether to buy a ticket. The outcome is determined by two circumstances: the student's decision and the possible appearance of a ticket inspector (controller). The student acts as the player, and the appearance of the inspector is the state of nature.
| $A_i \backslash P_j$ (state of nature) | Inspector appears | Inspector does not appear |
| Take a ticket | -2 c.u. | -2 c.u. |
| Do not take a ticket | -8 c.u. | 0 c.u. |
The winnings (in conventional units, c.u.) are shown in the payoff matrix above.
Analyzing the matrix of a game with nature is in some respects simpler and in others harder than analyzing the matrix of an antagonistic game. Because nature does not counteract the statistician, it may seem that a game with nature is simpler than a strategic game. In fact, it is not. The opposition of the players' interests in a strategic game in a sense removes uncertainty, which cannot be said of a game with nature. On the one hand, the operating side has it easier in a game with nature in the sense that it will most likely win more than against a rational opponent. On the other hand, it is harder to make a decision, especially a well-founded one, because uncertainty plays a much greater role in a game with nature.
It is advisable not only to estimate the gain in a given game situation, but also to determine the difference between the maximum possible gain for a given state of nature and the gain obtained by applying strategy $A_i$ under the same conditions. In game theory this difference is called the risk.
The maximum gain in the j-th column is denoted by $\beta_j$, i.e. $\beta_j = \max_i a_{ij}$. The value of $\beta_j$ characterizes how favorable the state of nature is. The risk of the player when he applies strategy $A_i$ under the conditions $P_j$ is denoted by $r_{ij}$. Then $r_{ij} = \beta_j - a_{ij}$, where $r_{ij} \ge 0$.
The risk matrix $R = \|r_{ij}\|_{m \times n}$ in many cases gives a deeper understanding of the uncertainty of the situation than the payoff matrix.
$\beta_j$ is called the indicator of favorability of the state $P_j$ of nature for increasing the gain. The player's risk $r_{ij}$ when using strategy $A_i$ in the state of nature $P_j$ is the lost opportunity to obtain the maximum gain $\beta_j$ (the part of $\beta_j$ that was not won).
Example 2. An operation is planned under conditions that are not known in advance and that relate, for example, to the market situation: $P_1, P_2, P_3, P_4$.
The profitability of the operation (expected profit) is given by the payoff matrix.
| $A_i \backslash P_j$ | $P_1$ | $P_2$ | $P_3$ | $P_4$ |
| $A_1$ | 1 | 4 | 5 | 9 |
| $A_2$ | 3 | 8 | 4 | 3 |
| $A_3$ | 4 | 6 | 6 | 2 |
| $\beta_j$ | 4 | 8 | 6 | 9 |
We obtain the risk matrix:
| $A_i \backslash P_j$ | $P_1$ | $P_2$ | $P_3$ | $P_4$ |
| $A_1$ | 3 | 4 | 1 | 0 |
| $A_2$ | 1 | 0 | 2 | 6 |
| $A_3$ | 0 | 2 | 0 | 7 |
In the second row of the payoff matrix, $a_{21} = a_{24} = 3$. The gains are equal, but the risks differ: under the state of nature $P_1$ the choice of $A_2$ is almost optimal ($r_{21} = 1$), whereas under the state $P_4$ the choice of strategy $A_2$ is very poor ($r_{24} = 6$).
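The construction of the risk matrix is mechanical: take the maximum of each column and subtract every entry of that column from it. A minimal sketch for the payoff matrix of Example 2 follows; the function name is hypothetical.

```python
def risk_matrix(a):
    """Return (beta, r): beta[j] = max_i a[i][j], r[i][j] = beta[j] - a[i][j]."""
    beta = [max(col) for col in zip(*a)]
    r = [[b - x for b, x in zip(beta, row)] for row in a]
    return beta, r

payoff = [[1, 4, 5, 9],
          [3, 8, 4, 3],
          [4, 6, 6, 2]]
beta, risk = risk_matrix(payoff)
print(beta)
for row in risk:
    print(row)
```

Every column of the result contains at least one zero, at the strategy that is best for that state of nature.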
Decision criteria in games with nature.
A criterion based on known probabilities of the states of nature (Bayes criterion).
Suppose that from past experience the statistician knows not only the states $P_1, P_2, \ldots, P_n$ in which nature can be, but also the corresponding probabilities $q_1, q_2, \ldots, q_n$ with which nature realizes these states.
The indicator of effectiveness of strategy $A_i$ according to the Bayes criterion is the mean value, or expectation, of the winnings in the i-th row:
$$\bar a_i = q_1 a_{i1} + q_2 a_{i2} + \ldots + q_n a_{in} = \sum_{j=1}^{n} q_j a_{ij}, \quad i = 1, \ldots, m,$$
i.e. $\bar a_i$ is the weighted average of the gains of the i-th row, taken with weights $q_1, q_2, \ldots, q_n$.
The strategy optimal by the Bayes criterion is the one with the maximum indicator of effectiveness, i.e.
$$\bar a = \max_{1 \le i \le m} \bar a_i .$$
Let us return to the ticket example.
| $A_i \backslash P_j$ | $P_1$ (probability $q_1$) | $P_2$ (probability $q_2 = 1 - q_1$) |
| $A_1$ | -2 | -2 |
| $A_2$ | -8 | 0 |
$$\bar a_1 = q_1 (-2) + (1 - q_1)(-2) = -2;$$
$$\bar a_2 = q_1 (-8) + (1 - q_1) \cdot 0 = -8 q_1 .$$
According to the Bayes criterion, strategy $A_1$ ("take a ticket") should be preferred if $-2 > -8 q_1$, i.e. if $q_1 > 1/4$. Otherwise strategy $A_2$ should be preferred. If we assume that every tram car has an equal chance of being visited by an inspector (controller), and that there are k cars and r inspectors ($r \le k$), then $q_1 \approx r/k$. So if there is more than one inspector per four tram cars, it is more profitable to take a ticket!
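The threshold can be checked numerically. The sketch below computes the expected gain of each strategy for a given probability that the inspector appears and picks the larger one; the function names and the two sample probabilities are assumptions made for illustration.

```python
def bayes_scores(a, q):
    """Expected gain of each strategy for state probabilities q."""
    return [sum(qj * aij for qj, aij in zip(q, row)) for row in a]

def bayes_choice(a, q):
    """Index of the strategy with the largest expected gain."""
    scores = bayes_scores(a, q)
    return max(range(len(a)), key=lambda i: scores[i])

ticket = [[-2, -2],   # A1: take a ticket
          [-8, 0]]    # A2: do not take a ticket

for q1 in (0.1, 0.5):  # assumed probabilities that the inspector appears
    q = [q1, 1 - q1]
    print(q1, bayes_scores(ticket, q), bayes_choice(ticket, q))
```

For $q_1 = 0.1$ the choice is index 1 (do not take a ticket) and for $q_1 = 0.5$ it is index 0 (take a ticket), in agreement with the threshold $q_1 = 1/4$.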
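It can be shown that the Bayes criterion is equivalent to the criterion
$$\bar r = \min_{1 \le i \le m} \bar r_i = \min_{1 \le i \le m} \sum_{j=1}^{n} q_j r_{ij},$$
i.e. the Bayes criterion minimizes the average risk.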
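The equivalence follows in one line from the definition of risk, $r_{ij} = \beta_j - a_{ij}$. A brief sketch of the derivation, writing $\bar a_i = \sum_j q_j a_{ij}$ for the average gain of strategy $A_i$ (notation assumed here):

```latex
\bar r_i = \sum_{j=1}^{n} q_j r_{ij}
         = \sum_{j=1}^{n} q_j (\beta_j - a_{ij})
         = \sum_{j=1}^{n} q_j \beta_j - \bar a_i .
```

The first term does not depend on i, so the strategy that minimizes the average risk $\bar r_i$ is exactly the strategy that maximizes the average gain $\bar a_i$.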
Another important point: when the probabilities $q_1, q_2, \ldots, q_n$ of the states of nature are known, there is no point in player A using mixed strategies. Indeed, if the player applies a mixed strategy $p_A = (p_1, p_2, \ldots, p_m)$, the average gain will be
$$\bar a = \sum_{i=1}^{m} p_i \bar a_i ,$$
where $\bar a_i = \sum_j q_j a_{ij}$ is the average gain of the pure strategy $A_i$. But a weighted average cannot exceed the largest of the averaged values, i.e. $\bar a \le \max_i \bar a_i$. Hence no mixed strategy $p_A$ can be more profitable for player A than the optimal pure strategy.
Laplace criterion.
Often the probabilities of the states of nature cannot be estimated even from previous experience, or doing so is very expensive (it requires bringing in experts and analysts). Then the following principle is applied: we cannot give preference to any of the states of nature and therefore consider them equally probable, i.e. $q_1 = q_2 = \ldots = q_n = 1/n$. This is called Laplace's principle of insufficient reason, and the Laplace criterion is based on it.
The indicator of effectiveness of strategy $A_i$ by the Laplace criterion is the arithmetic mean of the winnings in the i-th row:
$$\bar a_i = \frac{1}{n} \sum_{j=1}^{n} a_{ij}, \quad i = 1, 2, \ldots, m.$$
The strategy $A_{i_0}$ optimal by the Laplace criterion is the one whose indicator of effectiveness is maximal:
$$\bar a_{i_0} = \max_{1 \le i \le m} \bar a_i .$$
Obviously, the Laplace criterion is a special case of the Bayes criterion with $q_1 = q_2 = \ldots = q_n = 1/n$.
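Sometimes we know which states of nature are more probable and which are less probable, but not by how much. Then the probabilities of the states (ordered from most to least probable) can be taken proportional to the terms of a decreasing arithmetic progression:
$$q_1 : q_2 : q_3 : \ldots : q_n = n : (n-1) : \ldots : 1,$$
or, taking into account that $q_1 + q_2 + q_3 + \ldots + q_n = 1$,
$$q_i = \frac{2(n - i + 1)}{n(n + 1)}, \quad i = 1, 2, \ldots, n.$$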
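The normalization uses the sum of the progression, $n + (n-1) + \ldots + 1 = n(n+1)/2$. A short sketch that generates these probabilities exactly is given below; the function name is hypothetical.

```python
from fractions import Fraction as F

def progression_probabilities(n):
    """Probabilities proportional to n : (n-1) : ... : 1 that sum to one."""
    total = n * (n + 1) // 2          # sum of the arithmetic progression
    return [F(n - i + 1, total) for i in range(1, n + 1)]

q = progression_probabilities(4)
print(q, sum(q))
```

For n = 4 the weights 4 : 3 : 2 : 1 are divided by 10, so the most probable state receives probability 2/5 and the least probable 1/10.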
Wald criterion (maximin criterion, or criterion of extreme pessimism).
The Wald criterion is based on the hypothesis of antagonism: the assumption that the environment behaves in the "worst" way for the player.
According to the Wald criterion, the indicator of effectiveness of strategy $A_i$ is
$$W_i = \min_{1 \le j \le n} a_{ij}, \quad i = 1, \ldots, m \quad \text{(the row minimum)},$$
and the optimal strategy $A_{i_0}$ is the one for which $W_{i_0} = W = \max_{1 \le i \le m} \min_{1 \le j \le n} a_{ij}$.
The Wald-optimal strategy guarantees, in any state of nature, a gain not less than the maximin. The number $W$ is called the guaranteed estimate, and the Wald criterion is sometimes called the criterion of guaranteed result.
The Wald criterion orients the player toward extremely cautious, prudent behavior. That is why it is called the criterion of extreme pessimism.
The Wald criterion is often used in everyday life as well, as proverbs confirm: "Measure seven times, cut once", "Better safe than sorry", "A bird in the hand is worth two in the bush".
Maximax criterion (criterion of extreme optimism, or the gambler's criterion).
The indicator of effectiveness of strategy $A_i$ by this criterion is
$$M_i = \max_{1 \le j \le n} a_{ij}, \quad i = 1, \ldots, m \quad \text{(the row maximum)}.$$
The optimal strategy by this criterion is the one with the maximum indicator of effectiveness, $M = \max_{1 \le i \le m} \max_{1 \le j \le n} a_{ij}$.
This is the criterion of maximum optimism: it orients the player toward the largest possible gain. It corresponds to the sayings "All or nothing", "Nothing ventured, nothing gained", "It is a poor soldier who does not dream of becoming a general".
Savage criterion (criterion of extreme pessimism, criterion of minimal risk).
The Savage-optimal strategy minimizes the maximum risk:
$$S = \min_{1 \le i \le m} \max_{1 \le j \le n} r_{ij} .$$
The Wald and Savage criteria are not equivalent (this is shown below). The Savage criterion expresses regret about how far the chosen strategy turned out to be from the best one. Compared with the Wald criterion, it attaches more importance to the gain than to the loss.
Hurwicz criterion (criterion of generalized maximum, or pessimism-optimism criterion).
The Hurwicz criterion introduces an optimism indicator $\lambda$, $0 \le \lambda \le 1$. The optimal strategy is the one that attains
$$H = \max_{1 \le i \le m} \left[ \lambda \max_{1 \le j \le n} a_{ij} + (1 - \lambda) \min_{1 \le j \le n} a_{ij} \right].$$
The payoff matrix is supplemented with a column containing the weighted average of the smallest and largest results of each row. The options whose rows contain the largest elements of this column are chosen.
For $\lambda = 1$ the Hurwicz criterion turns into the maximax criterion, and for $\lambda = 0$ into the Wald criterion. The closer $\lambda$ is to one, the more optimism and the less pessimism. Choosing $\lambda$ is rather difficult; in practice values of 0.3 to 0.7 are used. The more dangerous the situation, the closer $\lambda$ should be to zero, i.e. the more one has to play safe.
Example. To protect information from software attacks $P_j$, three software products $A_1$, $A_2$, $A_3$ have been developed.
The effectiveness matrix looks like this:
| $A_i \backslash P_j$ | $P_1$ | $P_2$ | $P_3$ | $P_4$ |
| $A_1$ | 0.1 | 0.5 | 0.1 | 0.2 |
| $A_2$ | 0.2 | 0.3 | 0.2 | 0.4 |
| $A_3$ | 0.1 | 0.4 | 0.4 | 0.3 |
For example: $P_1$ is network viruses, $P_2$ is attempts at unauthorized access, $P_3$ is hacker attacks, $P_4$ is copying.
1) Bayes criterion. We assume that the probabilities are known:
$q_1 = 0.4$, $q_2 = 0.2$, $q_3 = 0.1$, $q_4 = 0.3$
K(A1) = 0.4*0.1 + 0.2*0.5 + 0.1*0.1 + 0.3*0.2 = 0.21
K(A2) = 0.4*0.2 + 0.2*0.3 + 0.1*0.2 + 0.3*0.4 = 0.28
K(A3) = 0.4*0.1 + 0.2*0.4 + 0.1*0.4 + 0.3*0.3 = 0.25
The optimal solution is A2.
2) Laplace criterion.
$q_1 = q_2 = q_3 = q_4 = 1/n = 0.25$
K(A1) = 0.25(0.1 + 0.5 + 0.1 + 0.2) = 0.225
K(A2) = 0.25(0.2 + 0.3 + 0.2 + 0.4) = 0.275
K(A3) = 0.25(0.1 + 0.4 + 0.4 + 0.3) = 0.3
By the Laplace criterion the optimal solution is A3.
3) Wald criterion
K(A1) = min(0.1; 0.5; 0.1; 0.2) = 0.1
K(A2) = min(0.2; 0.3; 0.2; 0.4) = 0.2
K(A3) = min(0.1; 0.4; 0.4; 0.3) = 0.1
The optimal solution is A2.
4) Gambler's criterion (maximax)
K(A1) = max(0.1; 0.5; 0.1; 0.2) = 0.5
K(A2) = max(0.2; 0.3; 0.2; 0.4) = 0.4
K(A3) = max(0.1; 0.4; 0.4; 0.3) = 0.4
The optimal solution is A1.
5) Savage criterion
Loss (risk) matrix $r_{ij}$:
| $A_i \backslash P_j$ | $P_1$ | $P_2$ | $P_3$ | $P_4$ |
| $A_1$ | 0.1 | 0 | 0.3 | 0.2 |
| $A_2$ | 0 | 0.2 | 0.2 | 0 |
| $A_3$ | 0.1 | 0.1 | 0 | 0.1 |
K(A1) = max(0.1; 0; 0.3; 0.2) = 0.3
K(A2) = max(0; 0.2; 0.2; 0) = 0.2
K(A3) = max(0.1; 0.1; 0; 0.1) = 0.1
The optimal solution is A3 (it does not coincide with the Wald solution).
6) Hurwicz criterion
Take the optimism indicator $\lambda = 0.6$.
K(A1) = 0.6 * 0.5 + (1 - 0.6) * 0.1 = 0.34
K(A2) = 0.6 * 0.4 + (1 - 0.6) * 0.2 = 0.32
K(A3) = 0.6 * 0.4 + (1 - 0.6) * 0.1 = 0.28
The optimal option is A1.
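How strongly the Hurwicz choice depends on the optimism indicator can be seen by repeating the calculation for several values of it. The sketch below does this for the effectiveness matrix of the example; the function names and the sample values other than 0.6 are assumptions made for illustration.

```python
def hurwicz_scores(a, lam):
    """lam * (row maximum) + (1 - lam) * (row minimum) for each strategy."""
    return [lam * max(row) + (1 - lam) * min(row) for row in a]

def hurwicz_choice(a, lam):
    """Index of the strategy with the largest Hurwicz score."""
    scores = hurwicz_scores(a, lam)
    return max(range(len(a)), key=lambda i: scores[i])

effectiveness = [[0.1, 0.5, 0.1, 0.2],
                 [0.2, 0.3, 0.2, 0.4],
                 [0.1, 0.4, 0.4, 0.3]]

for lam in (0.0, 0.3, 0.6, 1.0):
    print(lam, hurwicz_choice(effectiveness, lam))
```

For this matrix the scores of A1 and A2 are 0.1 + 0.4λ and 0.2 + 0.2λ, so the choice switches from A2 to A1 at an indicator value of 0.5: a more cautious decision maker gets the Wald answer, a more optimistic one the maximax answer.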
Comparative results, K(A_i) for each criterion:
| $A_i$ | Bayes | Laplace | Wald | Maximax | Hurwicz | Savage |
| $A_1$ | 0.21 | 0.225 | 0.1 | 0.5 | 0.34 | 0.3 |
| $A_2$ | 0.28 | 0.275 | 0.2 | 0.4 | 0.32 | 0.2 |
| $A_3$ | 0.25 | 0.300 | 0.1 | 0.4 | 0.28 | 0.1 |
| Optimal | $A_2$ | $A_3$ | $A_2$ | $A_1$ | $A_1$ | $A_3$ |
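The whole comparison can be reproduced in a few lines. The sketch below scores every strategy of the example under the six criteria and reports which strategy each criterion selects; the function names and the dictionary layout are assumptions made for illustration, and the Hurwicz weight multiplies the row maximum, as in the worked calculation.

```python
def criteria_table(a, q, lam):
    """Score every strategy under the six criteria of the example."""
    n = len(a[0])
    beta = [max(col) for col in zip(*a)]
    risk = [[b - x for b, x in zip(beta, row)] for row in a]
    return {
        "Bayes":   [sum(qj * x for qj, x in zip(q, row)) for row in a],
        "Laplace": [sum(row) / n for row in a],
        "Wald":    [min(row) for row in a],
        "Maximax": [max(row) for row in a],
        "Hurwicz": [lam * max(row) + (1 - lam) * min(row) for row in a],
        "Savage":  [max(row) for row in risk],   # maximum risk, to be minimized
    }

def best(table):
    """Index of the chosen strategy per criterion (Savage minimizes)."""
    pick = {}
    for name, scores in table.items():
        chooser = min if name == "Savage" else max
        pick[name] = chooser(range(len(scores)), key=lambda i: scores[i])
    return pick

effectiveness = [[0.1, 0.5, 0.1, 0.2],
                 [0.2, 0.3, 0.2, 0.4],
                 [0.1, 0.4, 0.4, 0.3]]
table = criteria_table(effectiveness, q=[0.4, 0.2, 0.1, 0.3], lam=0.6)
choice = best(table)
for name in table:
    print(name, [round(s, 3) for s in table[name]], "-> A%d" % (choice[name] + 1))
```

Floating-point arithmetic is used, so the scores are rounded only for printing; the selected strategies do not depend on the rounding.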
What to choose?
The choice of criterion is influenced by a number of factors:
- the nature of the specific situation and its goal (in some operations risk is acceptable, in others a guaranteed result is required);
- the causes of the uncertainty (is it really nature that is acting, or a conscious opponent after all);
- the character of the decision maker (some people are inclined to take risks in the hope of success, others always act cautiously).
How can the inclination to risk, optimism, or pessimism be assessed?
A lottery is held. A ticket costs 10 c.u. For this money the player may, with equal probability (p = 0.5), win nothing or win 100 c.u. One person will simply not buy a ticket, another is ready to pay 50 c.u. for it, and a third even 60 c.u.
The unconditional monetary equivalent (BDE) of a game is the maximum amount of money that the decision maker is willing to pay to take part in the lottery, or the minimum amount for which he is willing to give up the game. Each individual has his own BDE.
An individual whose BDE coincides with the expected monetary value (EMV), that is, with the price of the game, the average gain, is conventionally called an objectivist; an individual for whom BDE ≠ EMV is a subjectivist. The EMV is regarded as the price of the game (by the Bayes-Laplace criterion). If the subjectivist is inclined to risk, then his BDE > EMV (an optimist, a gambler); if he is not, then BDE < EMV.
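For the lottery above, the expected monetary value is 0.5 · 0 + 0.5 · 100 = 50 c.u., and the three buyers can be classified by comparing their unconditional monetary equivalent (BDE) with it. The sketch below does this; the function names and labels are hypothetical, and the value 5 c.u. for the person who refuses a 10 c.u. ticket is an assumed figure, since the text only implies that it is below 10.

```python
def expected_monetary_value(outcomes):
    """Expected value of a lottery given (probability, prize) pairs."""
    return sum(p * prize for p, prize in outcomes)

def attitude(bde, emv):
    """Classify a decision maker by comparing BDE with the expected value."""
    if bde == emv:
        return "objectivist"
    if bde > emv:
        return "risk-prone subjectivist"
    return "risk-averse subjectivist"

lottery = [(0.5, 0), (0.5, 100)]
emv = expected_monetary_value(lottery)

for bde in (5, 50, 60):   # 5 is an assumed value below the ticket price
    print(bde, attitude(bde, emv))
```

Exact equality is a simplification: in practice one would treat a BDE within some tolerance of the expected value as objectivist behavior.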
Mathematical methods of research operations. The theory of games and schedules.