You get a bonus - 1 coin for daily activity. Now you have 1 coin

15.3. Aggregate functions and query results

Lecture



Aggregate functions (in SQL / 89 standard, they are called functions over sets) are defined in SQL / 89 by the following syntax rules:

   :: =
    COUNT (*) |  
             |  
  :: =
    {AVG |  MAX |  MIN |  SUM |  COUNT}
     (DISTNICT )
  :: =
    {AVG |  MAX |  MIN |  SUM} ([ALL] ) 

As can be seen from these rules, five standard aggregate functions are defined in the SQL / 89 standard: COUNT is the number of rows or values, MAX is the maximum value, MIN is the minimum value, SUM is the total value, and AVG is the average value.

15.3.1. Semantics of aggregate functions

Aggregate functions are designed to calculate a value for a given set of rows. Such a set of rows can be a group of rows if the aggregate function is applied to a grouped table, or the whole table. For all aggregate functions except COUNT (*), the actual (i.e. required by semantics) calculation order is as follows: a list of values ​​is made based on the parameters of the aggregate function from a given set of rows. Then, the function is calculated using this list of values. If the list is empty, the value of the COUNT function for it is 0, and the value of all other functions is null.

Let T denote the type of values ​​from this list. Then the result of calculating the COUNT function is the exact number with scale and accuracy determined in the implementation. The result type of the MAX and MIN functions coincides with T. When calculating the SUM and AVG functions, type T should not be the type of character strings, and the result type of the function is the type of exact numbers defined by the scale and accuracy in the implementation if T is the type of exact numbers and the type of approximate numbers with a defined in the implementation of accuracy, if T - the type of approximate numbers.

The calculation of the COUNT (*) function is performed by counting the number of rows in a given set. All rows are considered different, even if they consist of a single column with a value of null in all rows.

If the aggregate function is specified with the DISTINCT keyword, then the list of values ​​is constructed from the values ​​of the specified column. (We emphasize that in this case, the calculation of arithmetic expressions is not allowed!) Further, undefined values ​​are removed from this list, and duplicate values ​​are eliminated. Then the specified function is calculated.

If the aggregate function is specified without the DISTINCT keyword (or with the ALL keyword), then the list of values ​​is formed from the values ​​of the arithmetic expression calculated for each row of a given set. Next, the undefined values ​​are removed from the list, and the aggregate function is calculated. Please note that in this case the COUNT function is not allowed!

Note: both of the limitations indicated in the two previous paragraphs are more technical than fundamental, and may not be present in particular implementations. However, these are limitations of the SQL / 89 standard, and should be adhered to in mobile programming.

15.3.2. Query results

Aggregate functions can be used wisely in the specification of the cursor, the selector operator, and the subquery after the SELECT keyword (we will call all such constructions in this subsection as a selection list, not forgetting that in the case of a subquery, this list consists of only one element), and in the condition sample section HAVING. The standard allows more exotic uses of aggregate functions in subqueries (an aggregate function on a group of external query tuples), but in practice they are very rare.

Consider the various cases of application of aggregate functions in the sample list, depending on the type of table expression.

If the result of the table expression R is not a grouped table, then the appearance of at least one aggregate function from the set of rows R in the selection list causes R to be implicitly considered as a grouped table consisting of one (or zero) groups with missing grouping columns. Therefore, in this case, the list of sampling does not allow direct use of the specifications of the R strings: they must all be within the specifications of the aggregate functions. The result of the query is a table consisting of no more than one line, obtained by applying aggregate functions to R.

The situation is similar in the case when R is a grouped table, but the table expression does not contain the GROUP BY clause (and, therefore, contains the HAVING clause). If, in the case of the previous paragraph, there were two options for forming the sample list: only with direct indication of the R columns or only with their indication inside the specifications of aggregate functions, then in this case only the second option is possible. The result of a table expression is explicitly declared as a grouped table consisting of one group, and the result of a query can only be generated by applying aggregate functions to this group of rows. Again, the result of the query is a table consisting of no more than one row, obtained by applying aggregate functions to R.

Finally, consider the case when R is a “real” grouped table, i.e. The table expression contains a GROUP BY clause and, therefore, at least one grouping column is defined. In this case, the selection list generation rules are fully consistent with the selection rules for the HAVING section selection condition: it allows direct use of the specification of grouping columns, and the specifications of the remaining R columns can appear only within the specifications of aggregate functions. The result of the query is a table, the number of rows in which is equal to the number of groups in R, and each row is formed based on the values ​​of the grouping columns and aggregate functions for this group.


Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases IBM System R - relational DBMS

Terms: Databases IBM System R - relational DBMS