
14.5. Estimating a probability from a frequency

In practice, one often has to estimate the unknown probability $p$ of an event $A$ from its frequency $p^*$ in $n$ independent experiments.

This problem is closely related to those considered in the preceding sections. Indeed, the frequency of the event $A$ in $n$ independent experiments is nothing other than the arithmetic mean of the observed values of a random variable $X$ which, in each individual experiment, takes the value 1 if the event $A$ occurred and 0 if it did not:

$$p^* = \frac{\sum_{i=1}^{n} X_i}{n}. \tag{14.5.1}$$

Recall that the expectation of $X$ is $p$ and its variance is $pq$, where $q = 1 - p$. The expectation of the arithmetic mean is also equal to $p$:

$$M[p^*] = p, \tag{14.5.2}$$

i.e. the estimate $p^*$ of $p$ is unbiased.

The variance of $p^*$ is

$$D[p^*] = \frac{pq}{n}. \tag{14.5.3}$$

It can be proved that this variance is the smallest possible, i.e. the estimate $p^*$ of $p$ is efficient.
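The unbiasedness and the variance $pq/n$ of the frequency can be checked numerically. The following is a minimal Python sketch: it enumerates the exact binomial distribution of the number of occurrences and computes the mean and variance of the frequency directly. The function name `frequency_moments` and the values $n = 20$, $p = 0.3$ are assumptions chosen only for illustration.

```python
from math import comb

def frequency_moments(n, p):
    """Exact mean and variance of the frequency m/n when m ~ Binomial(n, p)."""
    q = 1.0 - p
    # Probability of exactly m occurrences in n independent experiments.
    pmf = [comb(n, m) * p**m * q**(n - m) for m in range(n + 1)]
    mean = sum((m / n) * w for m, w in enumerate(pmf))
    var = sum((m / n - mean) ** 2 * w for m, w in enumerate(pmf))
    return mean, var

n, p = 20, 0.3  # illustrative values, not taken from the text
mean, var = frequency_moments(n, p)
print(mean, p)                 # mean of the frequency next to p
print(var, p * (1 - p) / n)    # variance of the frequency next to pq/n
```

Both printed pairs should agree up to floating-point rounding.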

Thus, as a point estimate of the unknown probability $p$, it is reasonable in all cases to take the frequency $p^*$. The question then arises of the accuracy and reliability of such an estimate, i.e. of constructing a confidence interval for the probability $p$.

Although this problem is a special case of the confidence-interval problem for an expectation considered earlier, it is still worth solving separately. The specific feature here is that $X$ is a discrete random variable with only two possible values, 0 and 1. In addition, its expectation $p$ and its variance $pq = p(1-p)$ are functionally related. This simplifies the construction of a confidence interval.

We first consider the simplest case, in which the number of experiments $n$ is relatively large and the probability $p$ is neither too large nor too small. Then the frequency $p^*$ of the event can be regarded as a random variable whose distribution is close to normal. Calculations show that this assumption can be used even when $n$ is not very large: it is enough that both $np$ and $nq$ exceed four. We will assume that these conditions are met and that the frequency $p^*$ can be treated as normally distributed. The parameters of this law are

$$m_{p^*} = p; \qquad \sigma_{p^*} = \sqrt{\frac{pq}{n}}. \tag{14.5.4}$$

Suppose first that $p$ is known. Choose a confidence probability $\beta$ and find an interval $(p - \varepsilon_\beta,\ p + \varepsilon_\beta)$ such that $p^*$ falls into it with probability $\beta$:

$$P(|p^* - p| < \varepsilon_\beta) = \beta. \tag{14.5.5}$$

Since $p^*$ is normally distributed,

$$P(|p^* - p| < \varepsilon_\beta) = 2\Phi^*\!\left(\frac{\varepsilon_\beta}{\sigma_{p^*}}\right) - 1 = \beta,$$

whence, as in Section 14.3,

$$\varepsilon_\beta = \sigma_{p^*}\,\arg\Phi^*\!\left(\frac{1+\beta}{2}\right),$$

where $\arg\Phi^*$ is the inverse of the normal distribution function $\Phi^*$.

To determine $\varepsilon_\beta$ we denote, as in Section 14.3,

$$t_\beta = \arg\Phi^*\!\left(\frac{1+\beta}{2}\right).$$

Then

$$\varepsilon_\beta = t_\beta\,\sigma_{p^*}, \tag{14.5.6}$$

where $t_\beta$ is found from Table 14.3.1.
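When the table is not at hand, the same multiplier can be computed as the quantile of the standard normal law at the level $(1+\beta)/2$. A minimal Python sketch using the standard-library class `statistics.NormalDist`; the function name `t_beta` is an assumption made for this example.

```python
from statistics import NormalDist

def t_beta(beta):
    """Value t such that a standard normal variable lies in (-t, t) with probability beta."""
    return NormalDist().inv_cdf((1.0 + beta) / 2.0)

for beta in (0.85, 0.90, 0.95, 0.99):
    print(beta, round(t_beta(beta), 3))
```

Values from a printed table may differ from these in the last digit because of rounding.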

Thus, with probability $\beta$ it can be asserted that

$$|p^* - p| < t_\beta\sqrt{\frac{pq}{n}}. \tag{14.5.7}$$

The actual value of $p$ is unknown to us; however, inequality (14.5.7) holds with probability $\beta$ whether or not we know $p$. Having obtained a specific value of the frequency $p^*$ from an experiment, we can use inequality (14.5.7) to find an interval $I_\beta$ that covers the point $p$ with probability $\beta$. To do so, we transform the inequality into the form

$$(p^* - p)^2 < \frac{t_\beta^2}{n}\,p(1-p) \tag{14.5.8}$$

and give it a geometric interpretation. We plot the frequency $p^*$ on the abscissa and the probability $p$ on the ordinate (Fig. 14.5.1).

Fig. 14.5.1. The confidence region $D$: frequency $p^*$ on the abscissa, probability $p$ on the ordinate.

The locus of points whose coordinates $p^*$ and $p$ satisfy inequality (14.5.8) is the interior of an ellipse that passes through the points $(0, 0)$ and $(1, 1)$ and has tangents at these points parallel to the $p^*$ axis. Since $p^*$ can be neither negative nor greater than one, the region $D$ corresponding to inequality (14.5.8) must also be bounded on the left and right by the straight lines $p^* = 0$ and $p^* = 1$. Now, for any value of $p^*$ obtained from an experiment, we can construct a confidence interval $I_\beta$ that covers the unknown value $p$ with probability $\beta$. To do this, we draw through the point $p^*$ a straight line parallel to the ordinate axis; on this line the boundaries of the region $D$ cut off the confidence interval $I_\beta = (p_1, p_2)$. Indeed, the point $M$ with random abscissa $p^*$ and non-random (but unknown) ordinate $p$ falls inside the ellipse with probability $\beta$, i.e. the interval $I_\beta$ covers the point $p$ with probability $\beta$.

The size and shape of the “confidence ellipse” depend on the number of experiments $n$. The larger $n$ is, the more elongated the ellipse and the narrower the confidence interval.
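The geometric construction can be imitated numerically. The sketch below assumes that inequality (14.5.8) has the form $(p^* - p)^2 < \frac{t^2}{n}\,p(1-p)$, obtained by squaring (14.5.7) with $q = 1 - p$. For a fixed frequency it scans a grid of values of $p$ and keeps those that satisfy the inequality, which gives the vertical section of the ellipse. The function name, the grid size and the numerical inputs are assumptions chosen for illustration.

```python
def section_of_region(p_star, n, t, steps=20_000):
    """Smallest and largest grid value of p in [0, 1] satisfying
    (p_star - p)**2 < (t**2 / n) * p * (1 - p)."""
    inside = []
    for k in range(steps + 1):
        p = k / steps
        if (p_star - p) ** 2 < (t * t / n) * p * (1.0 - p):
            inside.append(p)
    return inside[0], inside[-1]

t = 1.645       # assumed multiplier for a 90% confidence probability
p_star = 0.4    # illustrative observed frequency
for n in (25, 100, 400):
    low, high = section_of_region(p_star, n, t)
    print(n, round(low, 3), round(high, 3), round(high - low, 3))
```

The last printed column is the width of the section; it shrinks as $n$ grows, in line with the remark about the ellipse becoming more elongated.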

The confidence limits $p_1$ and $p_2$ can be found from relation (14.5.8) by replacing the inequality sign with equality. Solving the resulting quadratic equation for $p$, we obtain two roots:

$$\begin{aligned}
p_1 &= \frac{p^* + \dfrac{1}{2}\dfrac{t_\beta^2}{n} - t_\beta\sqrt{\dfrac{p^*(1-p^*)}{n} + \dfrac{1}{4}\dfrac{t_\beta^2}{n^2}}}{1 + \dfrac{t_\beta^2}{n}}, \\
p_2 &= \frac{p^* + \dfrac{1}{2}\dfrac{t_\beta^2}{n} + t_\beta\sqrt{\dfrac{p^*(1-p^*)}{n} + \dfrac{1}{4}\dfrac{t_\beta^2}{n^2}}}{1 + \dfrac{t_\beta^2}{n}}.
\end{aligned} \tag{14.5.9}$$

The confidence interval for the probability $p$ is

$$I_\beta = (p_1, p_2).$$
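Formulas (14.5.9) are easy to program. The sketch below assumes they are the two roots of $(p^* - p)^2 = \frac{t^2}{n}\,p(1-p)$; in the statistical literature this interval is usually called the Wilson score interval. The function name `confidence_interval` and the inputs are assumptions chosen for illustration, and the multiplier $t$ is computed from the normal quantile rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(p_star, n, beta):
    """Roots of (p_star - p)**2 = (t**2 / n) * p * (1 - p) for the given confidence probability."""
    t = NormalDist().inv_cdf((1.0 + beta) / 2.0)
    a = t * t / n
    centre = p_star + 0.5 * a
    # t * sqrt(p*(1-p*)/n + t^2 / (4 n^2))
    spread = t * sqrt(p_star * (1.0 - p_star) / n + 0.25 * a / n)
    return (centre - spread) / (1.0 + a), (centre + spread) / (1.0 + a)

# Illustrative inputs: frequency 0.5 in 100 experiments, beta = 0.95.
p1, p2 = confidence_interval(0.5, 100, 0.95)
print(round(p1, 4), round(p2, 4))
```

For a frequency of exactly 0.5 the interval is symmetric about 0.5; for other frequencies its centre is shifted towards 0.5.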

Example 1. The frequency of an event $A$ in a series of 100 experiments was $p^* = 0.78$. Determine the 90% confidence interval for the probability $p$ of the event $A$.

Solution. First we check that the normal law is applicable by estimating $np$ and $nq$. Taking roughly $p \approx p^*$, we get

$$np \approx np^* = 78; \qquad nq \approx n(1 - p^*) = 22.$$

Both values are much greater than four, so the normal law is applicable. From Table 14.3.1, for $\beta = 0.9$ we find $t_\beta = 1.643$. By formulas (14.5.9) we have

$$p_1 = 0.705; \qquad p_2 = 0.840; \qquad I_\beta = (0.705;\ 0.840).$$

Note that as $n$ increases, the quantities $\dfrac{t_\beta^2}{n}$ and $\dfrac{1}{4}\dfrac{t_\beta^2}{n^2}$ in formulas (14.5.9) tend to zero; in the limit the formulas take the form

$$p_1 = p^* - t_\beta\sqrt{\frac{p^*(1-p^*)}{n}}, \qquad p_2 = p^* + t_\beta\sqrt{\frac{p^*(1-p^*)}{n}}. \tag{14.5.10}$$

These formulas can also be obtained directly by the approximate method of constructing a confidence interval for an expectation given in Section 14.3. Formulas (14.5.10) can be used for large $n$ (of the order of hundreds), provided that the probability $p$ is neither too large nor too small (for example, when both $np$ and $nq$ are of the order of 10 or more).
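How quickly the limiting formulas (14.5.10) approach (14.5.9) can be seen from a small experiment. The sketch below assumes the usual forms of both pairs of formulas (the roots of the quadratic for (14.5.9), and $p^* \pm t\sqrt{p^*(1-p^*)/n}$ for (14.5.10)); the function names and the inputs are illustrative assumptions.

```python
from math import sqrt

def exact_bounds(p_star, n, t):
    """Roots of the quadratic equation, formulas (14.5.9)."""
    a = t * t / n
    centre = p_star + 0.5 * a
    spread = t * sqrt(p_star * (1.0 - p_star) / n + 0.25 * a / n)
    return (centre - spread) / (1.0 + a), (centre + spread) / (1.0 + a)

def approx_bounds(p_star, n, t):
    """Limiting form for large n, formulas (14.5.10)."""
    half = t * sqrt(p_star * (1.0 - p_star) / n)
    return p_star - half, p_star + half

def largest_gap(p_star, n, t):
    """Largest absolute difference between corresponding bounds."""
    e1, e2 = exact_bounds(p_star, n, t)
    a1, a2 = approx_bounds(p_star, n, t)
    return max(abs(e1 - a1), abs(e2 - a2))

t, p_star = 1.645, 0.3   # assumed multiplier (90% level) and illustrative frequency
for n in (20, 200, 2000):
    print(n, round(largest_gap(p_star, n, t), 5))
```

The printed gap decreases as $n$ grows, which is why the simpler formulas are adequate when $n$ is of the order of hundreds.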

Example 2. In 200 experiments the frequency of an event $A$ turned out to be $p^* = 0.34$. Construct an approximate 85% confidence interval for the probability of the event using formulas (14.5.10). Compare the result with the exact one given by formulas (14.5.9).

Solution. $\beta = 0.85$; from Table 14.3.1 we find $t_\beta = 1.439$. Multiplying it by

$$\sqrt{\frac{p^*(1-p^*)}{n}} \approx 0.0335,$$

we get

$$t_\beta\sqrt{\frac{p^*(1-p^*)}{n}} \approx 0.048,$$

from which we find the approximate confidence interval

$$I_\beta \approx (0.292;\ 0.388).$$

By formulas (14.5.9) we find the more accurate values $p_1 \approx 0.294$; $p_2 \approx 0.390$, which hardly differ from the approximate ones.

Above we considered the construction of a confidence interval when the number of experiments is large enough for the frequency to be regarded as normally distributed. When the number of experiments is small (or when the probability $p$ is very large or very small), this assumption cannot be used. In that case the confidence interval is constructed from the exact, rather than the approximate, distribution law of the frequency. It is easy to see that this is the binomial distribution discussed in Chapters 3 and 4. Indeed, the number of occurrences of the event $A$ in $n$ experiments follows the binomial law: the probability that the event $A$ occurs exactly $m$ times is

$$P_{m,n} = C_n^m\,p^m q^{n-m}, \tag{14.5.11}$$

and the frequency $p^*$ is simply the number of occurrences of the event divided by the number of experiments.
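The exact distribution of the frequency is easy to tabulate for small $n$. The sketch below assumes the binomial formula in the form $C_n^m p^m q^{n-m}$ with $q = 1 - p$; the function name and the values $n = 10$, $p = 0.2$ are illustrative assumptions.

```python
from math import comb

def frequency_distribution(n, p):
    """Map each possible frequency m/n to its binomial probability."""
    q = 1.0 - p
    return {m / n: comb(n, m) * p**m * q**(n - m) for m in range(n + 1)}

dist = frequency_distribution(10, 0.2)   # illustrative values
for freq, prob in dist.items():
    print(freq, round(prob, 4))
```

With these inputs the frequencies 0.0 and 0.4 lie at equal distances from $p = 0.2$ but have different probabilities, so the distribution is not symmetric about its expectation.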

Based on this distribution, a confidence interval $I_\beta$ can be constructed in the same way as it was constructed earlier in this section from the normal law for large $n$.

Suppose first that the probability $p$ is known, and find the interval of frequencies $p_1^*$, $p_2^*$ into which the frequency $p^*$ of the event falls with probability $\beta$.

For large $n$ we used the normal distribution law and took an interval symmetric about the expectation. The binomial distribution (14.5.11) is not symmetric. Moreover, because the frequency is a discrete random variable, an interval that contains it with probability exactly $\beta$ may not exist. We therefore choose as the interval $p_1^*$, $p_2^*$ the smallest interval for which the probability of falling to its left and the probability of falling to its right are each greater than $\dfrac{1-\beta}{2}$.

Just as we constructed the region $D$ for the normal law (Fig. 14.5.1), we can construct, for each $n$ and a given $\beta$, a region within which the value of the probability $p$ is compatible with the observed value of the frequency $p^*$.

Fig. 14.5.2 shows the curves bounding such regions for various $n$ at the confidence probability $\beta = 0.9$. The frequency $p^*$ is plotted on the abscissa and the probability $p$ on the ordinate. Each pair of curves corresponding to a given $n$ determines the confidence interval of probabilities for a given value of the frequency. Strictly speaking, the boundaries of the regions should be stepped (because the frequency is discrete), but for convenience they are drawn as smooth curves.

To find a confidence interval $I_\beta$ from such curves, the following construction is performed (see Fig. 14.5.2): mark the observed frequency $p^*$ on the abscissa, draw through this point a straight line parallel to the ordinate axis, and mark the points where the line intersects the pair of curves corresponding to the given number of experiments $n$; the projections of these points onto the ordinate axis give the limits $p_1$, $p_2$ of the confidence interval $I_\beta$.

Fig. 14.5.2.

For a given $n$, the curves bounding the “confidence region” are determined by the equations

$$\sum_{k=m}^{n} C_n^k\,p^k (1-p)^{n-k} = \frac{1-\beta}{2}; \tag{14.5.12}$$

$$\sum_{k=0}^{m} C_n^k\,p^k (1-p)^{n-k} = \frac{1-\beta}{2}, \tag{14.5.13}$$

where $m$ is the number of occurrences of the event:

$$m = np^*.$$

Solving equation (14.5.12) for $p$ gives the lower limit $p_1$ of the “confidence region”; similarly, (14.5.13) gives $p_2$.
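These equations have no closed-form solution, but each side is a monotone function of $p$, so they can be solved by bisection. The sketch below assumes that (14.5.12) equates the probability of at least $m$ occurrences to $(1-\beta)/2$ and that (14.5.13) does the same for the probability of at most $m$ occurrences; in the literature the resulting interval is usually called the Clopper-Pearson interval. All function names and the inputs ($n = 50$, $m = 20$, $\beta = 0.9$) are illustrative assumptions.

```python
from math import comb

def upper_tail(n, m, p):
    """Probability of at least m occurrences in n experiments."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))

def lower_tail(n, m, p):
    """Probability of at most m occurrences in n experiments."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, m + 1))

def solve_monotone(f, target, increasing, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) < target) == increasing:
            lo = mid   # the root lies to the right of mid
        else:
            hi = mid   # the root lies to the left of mid
    return 0.5 * (lo + hi)

def exact_bounds(n, m, beta):
    alpha = (1.0 - beta) / 2.0
    # The upper tail increases with p; the lower tail decreases with p.
    p1 = 0.0 if m == 0 else solve_monotone(lambda p: upper_tail(n, m, p), alpha, True)
    p2 = 1.0 if m == n else solve_monotone(lambda p: lower_tail(n, m, p), alpha, False)
    return p1, p2

p1, p2 = exact_bounds(50, 20, 0.9)   # illustrative inputs
print(round(p1, 3), round(p2, 3))
```

The cases $m = 0$ and $m = n$ are handled separately because the corresponding tail probability is then identically one and the equation has no root.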

To avoid solving these equations anew each time, it is convenient to tabulate (or plot) their solutions in advance for several typical values of the confidence probability $\beta$. For example, the book by I. V. Dunin-Barkovsky and N. V. Smirnov, "Theory of Probability and Mathematical Statistics in Engineering", contains tables of $p_1$ and $p_2$ for $\beta = 0.95$ and $\beta = 0.99$. The graph in Fig. 14.5.2 is taken from the same book.

Example 3. Find the confidence limits $p_1$ and $p_2$ for the probability of an event if in 50 experiments its frequency was $p^* = 0.4$. The confidence probability is $\beta = 0.9$.

Solution. By construction (see the dashed line in Fig. 14.5.2), for $p^* = 0.4$ and $n = 50$ we find $p_1 \approx 0.28$; $p_2 \approx 0.52$.

Using the method of confidence intervals, one can also approximately solve another question of practical importance: how large must the number of experiments $n$ be in order to expect, with confidence probability $\beta$, that the error from replacing the probability by the frequency will not exceed a given value?

In solving such problems it is more convenient not to use graphs like Fig. 14.5.2 directly, but to redraw them so that the confidence limits are shown as functions of the number of experiments $n$.

Example 4. In 25 experiments the event $A$ occurred 12 times. Find the approximate number of experiments $n$ needed so that, with probability $\beta = 0.9$, the error from replacing the probability by the frequency does not exceed 20%.

Solution. We determine the maximum permissible error:

$$\Delta = 0.2 \cdot 0.48 = 0.096 \approx 0.1.$$

Using the curves in Fig. 14.5.2, we construct a new graph: the number of experiments $n$ is plotted on the abscissa and the confidence limits for the probability on the ordinate (Fig. 14.5.3).

Fig. 14.5.3.

The middle straight line, parallel to the abscissa axis, corresponds to the observed frequency of the event, $p^* = \frac{12}{25} = 0.48$. Above and below the line $p = p^* = 0.48$ are drawn the curves $p_1(n)$ and $p_2(n)$, which show the lower and upper confidence limits as functions of $n$. The region between the curves, which determines the confidence interval, is shaded. In the immediate vicinity of the line $p = 0.48$, double hatching marks the narrower region of the 20% permissible error. Fig. 14.5.3 shows that the error falls to the permissible value when the number of experiments $n$ is of the order of 100.

Note that once the required number of experiments has been carried out, the accuracy of the frequency as an estimate of the probability may need to be checked again, since in general the new frequency will differ from that observed in the earlier experiments. It may then turn out that the number of experiments is still insufficient to give the required accuracy and has to be increased somewhat. Nevertheless, the first approximation obtained by the method described above can serve for rough preliminary planning of a series of experiments in terms of the time they require, their cost, etc.

In practice one sometimes meets the peculiar problem of determining a confidence interval for the probability of an event when the frequency obtained from the experiment is zero. Such a problem usually arises in experiments where the probability of the event of interest is very small (or, conversely, very large, in which case the probability of the opposite event is small).

Suppose, for example, that some product is being tested for reliability. During the tests the product did not fail once. It is required to find the maximum practically possible probability of failure.

We state this problem in general form. $n$ independent experiments have been carried out, in none of which the event $A$ occurred. A confidence probability $\beta$ is given; it is required to construct a confidence interval for the probability $p$ of the event $A$ or, more precisely, to find its upper limit $p_2$, since the lower limit $p_1$ is naturally zero.

This problem is a special case of the general problem of a confidence interval for a probability, but its special features make it worth considering separately. First of all, the approximate method of constructing a confidence interval (based on replacing the distribution law of the frequency by the normal law), described at the beginning of this section, is not applicable here, since the probability $p$ is very small. The exact method based on the binomial distribution is applicable in this case, but it can be simplified considerably.

We reason as follows. As a result of $n$ experiments, an event $B$ has been observed, consisting in the fact that $A$ did not occur once. It is required to find the maximum value $p = p_2$ that is "compatible" with the observed event $B$, where we regard as "incompatible" with $B$ those values of $p$ for which the probability of the event $B$ is less than $1 - \beta$.

Obviously, for any probability $p$ of the event $A$, the probability of the observed event $B$ is

$$P(B) = (1-p)^n.$$

Setting $P(B) = 1 - \beta$, we obtain an equation for $p_2$:

$$(1 - p_2)^n = 1 - \beta, \tag{14.5.14}$$

whence

$$p_2 = 1 - \sqrt[n]{1 - \beta}. \tag{14.5.15}$$

Example 5. The probability $p$ of spontaneous operation of a fuse when a shell is dropped from a height $h$ is unknown, but is presumed to be very small. 100 experiments were carried out, in each of which the shell was dropped from the height $h$, and in none of them did the fuse operate. Determine the upper limit $p_2$ of the 90% confidence interval for the probability $p$.

Solution. By formula (14.5.15),

$$p_2 = 1 - \sqrt[100]{1 - 0.9} = 1 - \sqrt[100]{0.1},$$

$$\lg\sqrt[100]{0.1} = 0.01\,\lg 0.1 = -0.0100;$$

$$\sqrt[100]{0.1} \approx 0.977; \qquad p_2 = 1 - 0.977 = 0.023.$$
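The calculation in Example 5 can be repeated in a few lines of Python. The sketch assumes that formula (14.5.15) has the form $p_2 = 1 - (1-\beta)^{1/n}$, which follows from $(1-p_2)^n = 1-\beta$; the function name is an assumption made for this example.

```python
def upper_bound_zero_frequency(n, beta):
    """Largest p for which the probability (1 - p)**n of no occurrences is still 1 - beta."""
    return 1.0 - (1.0 - beta) ** (1.0 / n)

# Conditions of Example 5: 100 experiments, 90% confidence probability.
p2 = upper_bound_zero_frequency(100, 0.9)
print(round(p2, 3))   # 0.023
```

No logarithm tables are needed here, since the $n$-th root is computed directly.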

Let us consider one more problem related to the previous one. An event $A$ with a small probability $p$ was not observed once in a series of $n$ experiments. A confidence probability $\beta$ is given. How large must the number of experiments $n$ be for the upper confidence limit for the probability of the event to equal a given value $p_2$?

The solution follows at once from formula (14.5.14):

$$n = \frac{\lg(1-\beta)}{\lg(1-p_2)}. \tag{14.5.16}$$

Example 6. How many times must the failure-free operation of a product be verified in order to assert with a 95% guarantee that in practical use it will fail in no more than 5% of all cases?

Solution. By formula (14.5.16) with $\beta = 0.95$, $p_2 = 0.05$ we have

$$n = \frac{\lg 0.05}{\lg 0.95} \approx 58.4.$$

Rounding up, we obtain

$$n = 59.$$
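Example 6 can be checked in the same way. The sketch assumes that formula (14.5.16) is $n = \log(1-\beta) / \log(1-p_2)$ (the base of the logarithm cancels in the ratio) and rounds the result up to a whole number of experiments; the function name is an assumption made for this example.

```python
from math import ceil, log

def required_experiments(p2, beta):
    """Smallest whole n with (1 - p2)**n <= 1 - beta."""
    return ceil(log(1.0 - beta) / log(1.0 - p2))

# Conditions of Example 6: 95% guarantee, failures in at most 5% of cases.
n = required_experiments(0.05, 0.95)
print(n)   # 59
```

With 58 failure-free trials the probability $0.95^{58}$ is still slightly above 0.05, so 59 is the first number of trials that meets the requirement.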

Bearing in mind the rough nature of all calculations of this kind, simpler approximate formulas can be proposed in place of formulas (14.5.15) and (14.5.16). They can be obtained by assuming that the number of occurrences of the event $A$ in $n$ experiments follows the Poisson law with expectation $a = np$. This assumption holds approximately when the probability $p$ is very small (see Chapter 5, Section 5.9). Then

$$P(B) \approx e^{-np},$$

and in place of formula (14.5.15) we obtain

$$p_2 \approx -\frac{\ln(1-\beta)}{n}, \tag{14.5.17}$$

and in place of formula (14.5.16),

$$n \approx -\frac{\ln(1-\beta)}{p_2}. \tag{14.5.18}$$

Example 7. Find the approximate value of $p_2$ for the conditions of Example 5.

Solution. By formula (14.5.17) we have

$$p_2 \approx -\frac{\ln 0.1}{100} = \frac{2.303}{100} \approx 0.023,$$

i.e. the same result as that obtained by the exact formula in Example 5.

Example 8. Find the approximate value of $n$ for the conditions of Example 6.

Solution. By formula (14.5.18) we have

$$n \approx -\frac{\ln 0.05}{0.05} = \frac{2.996}{0.05} \approx 59.9.$$

Rounding up, we find $n = 60$, which differs little from the result $n = 59$ obtained in Example 6.
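The two approximate calculations can be sketched in Python as follows. The code assumes that the Poisson-based formulas have the form $p_2 \approx -\ln(1-\beta)/n$ and $n \approx -\ln(1-\beta)/p_2$, which follow from replacing $(1-p)^n$ by $e^{-np}$; the function names are assumptions made for this example.

```python
from math import ceil, log

def approx_upper_bound(n, beta):
    """Poisson approximation to the upper confidence limit when no occurrences were observed."""
    return -log(1.0 - beta) / n

def approx_required_experiments(p2, beta):
    """Poisson approximation to the required number of experiments, rounded up."""
    return ceil(-log(1.0 - beta) / p2)

print(round(approx_upper_bound(100, 0.9), 3))       # conditions of Example 5: 0.023
print(approx_required_experiments(0.05, 0.95))      # conditions of Example 6: 60
```

The approximation is slightly conservative: it gives a marginally larger upper limit and a marginally larger number of experiments than the exact formulas.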

