Lecture
We describe our idea of the RSS interface, which does not correspond exactly to any of the publications from the list of references, but is rather a compilation, consistent with the final publications.
As we have already noted, at the RSS level there is no naming of database objects used at the SQL level. Instead of the names of objects, their unique identifiers are used, which are direct or indirect addresses of internal descriptors of objects in external memory for permanent objects or in RAM for temporary objects.
The following groups of operations can be distinguished:
Scan group operations allow you to sequentially, in the order determined by the type of scan, read the relation or list tuples that satisfy the required conditions. The group includes OPEN, NEXT and CLOSE operations, meaning, respectively, the start of the scan, the requirement of the next tuple that satisfies the conditions, and the end of the scan.
For relationships, two scanning modes are possible: by ratio and by index. When scanning against a relation, the only parameter of the OPEN operation is the identifier of the relation (including, by the way, the identifier of the segment in which this relation is stored). Due to the fact that several relations in a single page of tuples of multiple relationships are allowed in System R, scanning with respect to the system implies sequential viewing of all pages of a segment with highlighting of the tuples that are included in this relationship; This is a very expensive way to scan a relationship. In this case, the order of sampling tuples is determined by their physical placement in the segment pages, i.e. predefined by the system.
When you open a relationship scan for one of its indices, the number of parameters for the OPEN operation includes the index identifier defined earlier in the fields of this relationship. In addition, you can specify a scan range in terms of the field values that make up the index key. When opening a scan by index, the initial scan pointer is set to the position of the index B-tree that corresponds to the left border of the specified range. The scanning process consists in successive advancing along the leaf vertices of the index until reaching the right limit of the scanning range with sampling tids and reading the corresponding tuples. It is easy to see that if a scan is performed not on a clustered index, then in the worst case it may take as many reads of data pages from external memory as there were tids, i.e. scanning efficiency by index is determined by the "latitude" of the specified scanning range. In this case, of course, there is the advantage that the scan order corresponds to the order of increasing or decreasing the index key values.
Finally, when scanning the list, as well as when scanning with respect, the only parameter of the OPEN operation is the list identifier, but, unlike scanning with respect, this scanning is as efficient as possible: only pages containing tuples from the given list are readable, and the scan order is the same as the listing of tuples in the list or the order of the list, if it is ordered.
As a result of successful execution of the opening scan operation (if there are no errors in the parameters), a scan identifier is generated and returned, which is used as an input parameter to other operations of this group.
The NEXT operation reads the next tuple of the specified scan that satisfies the conditions of the operation. A condition is the disjunctive normal form of simple conditions relating to the values of the indicated relationship fields. A simple condition is a condition of the form <field identifier op constant>, where op is a comparison operation <, <=,>,> =, =, or! =. The general condition is a parameter of the NEXT operation. The semantics of the NEXT operation are as follows: starting from the current scan position, the relationship tuples are selected in the order determined by the scan type until a tuple is found whose field values satisfy the specified condition; this tuple is the result of the operation; if when sampling the next tuple, the right border of the scan range is reached (the right border of the key value when scanning by index or the last tuple of a relation or list when scanning without an index), a special indication of the result is generated. After that, the only sensible action is to close the scan - CLOSE operation.
A CLOSE operation can be performed in a given transaction with respect to any previously open scan, regardless of its state (i.e., regardless of whether the right limit of the scan range is reached during the scan). The operation parameter is the scan identifier, and its execution causes the identifier to become invalid (and, accordingly, the service structures of the RSS memory related to this scan are destroyed).
The group of operations for creating and destroying permanent and temporary database objects includes the operations of creating tables (CREATE TABLE), lists (CREATE LIST), indexes (CREATE IMAGE) and destroying any of these objects (DROP TABLE, DROP LIST and DROP IMAGE). The input parameter of the operations of creating tables and lists is the specifier of the object structure, i.e. The number of object fields and their type specifiers. In addition, when specifying relationship fields, the assumption or non-admission of undefined values of fields in tuples of this relation or list is specified (undefined values are encoded in a special way; any comparison operation of a constant of this type with an uncertain value produces, by definition, a FALSE value, except for the comparison operation for matching a special literal constant NULL). As a result of performing these operations, a handle is created in the service relation of relationship descriptors or RAM (depending on whether a permanent object or a temporary object is created), and an object identifier is generated that serves as an input parameter to other operations related to the corresponding object (in particular, the OPEN operations when opening an object scan).
The input parameters of the CREATE IMAGE operation are the identifier of the table for which the index is being created, a list of field numbers whose values make up the index key, and ordering attributes in ascending or descending order for all of the key fields. In addition, the index uniqueness attribute can be specified, i.e. prohibition of the presence of duplicate keys in this index. If the operation is performed with respect to the table that is empty at this moment, the operation is as simple as for the creation of tables and lists: a handle is created in the service relation of index handles and the index identifier is returned (which is used as an input parameter open scan operations by index).
If by the time the index is created, the corresponding table is not empty (and this is allowed), then the operation becomes significantly more expensive, since when it is executed, the actual creation of the B-tree of the index occurs, which requires at least one consecutive review of the relationship. In this case, if the created index has a unique attribute, then it is controlled when creating a B-tree, and if the uniqueness is violated, then the operation is not performed (that is, the index is not created). From this it follows that although the creation of indices in dynamics is not prohibited, it is more efficient to create all the indices on this table before it is filled. Note that the creation of a clustered index for a non-empty relationship is prohibited, since the corresponding clustering of the relationship cannot be obtained without its restructuring.
DROP TABLE, DROP LIST and DROP IMAGE operations can be performed at any time regardless of the state of the objects. The execution of the operation leads to the destruction of the corresponding object and, consequently, the invalidity of its identifier.
It should be noted that the bulk operations on permanent objects (CREATE IMAGE, DROP TABLE and DROP IMAGE) require additional overhead due to the need to ensure the possibility of rollback of the transaction, resulting in the need to perform massive backward actions. This especially affects the operation of destroying non-empty tables, since it requires the journalization of all contained in them at the time of the destruction of the tuples. Therefore, although the destruction of non-empty tables is not prohibited, it must be borne in mind that this is a very expensive operation.
The group of operations for modifying relations and lists includes inserting a tuple into a relation or a list (INSERT), deleting a tuple from a relation (DELETE), and modifying a tuple with respect to (UPDATE).
The parameters for inserting a tuple are a relation or list identifier and a set of tuple field values. Among the field values may be literal null values NULL. Naturally, when performing an operation, the tolerance of undefined values in the corresponding fields is controlled. When entering a tuple into a clustered relation, a search for a place in the segment under a tuple is performed using a clustered index: the system tries to insert a tuple into a data page that already contains tuples with the same or similar values of the clustering fields. When a tuple is entered into a nonclustered relation, the place under the tuple is allocated in the first suitable data page. Finally, when inserting a tuple into the list, it is placed at the end of the list.
When a tuple is entered in a constant relation, all indexes defined on this relation are corrected. This is actually expressed in the insertion of a new record into all B-index trees. In this case, overflow of one or several pages of the index may occur, causing the part of the records to be transfused into neighboring pages or the splitting of pages. If the index is defined with a unique attribute, then compliance with this condition is checked, and if it is violated, the insert operation is considered to be unfulfilled. This shows that the operation of inserting a tuple is all the more demanding the more indices defined for this table (this also applies to the operations of deleting and modifying tuples).
As a result of successful execution of the operation of inserting a tuple into the relation, the tid of the new tuple is generated, which is reported as a response parameter and can be further used as a direct parameter of the operations of deleting and modifying the relation tuples. When a tuple is added to the list, the tid value is not generated (the lists allow only sequential scanning and adding new tuples to the end, no indexes can be defined over them, therefore indirect addressing of tuples of lists through tids is not required).
Deletion and modification of tuples are allowed only for tuples of permanent tables. Naturally, to perform these operations, it is necessary to identify the corresponding tuple. The RSS interface allows for two ways of such identification: using the tid of the tuple (explicit addressing) and using the identifier open by this scan time. The first option is possible because the tid of the tuple is reported as the response parameter of the operation of entering the tuple in the permanent table. When identifying a tuple using a scan identifier, this refers to a tuple read using the last NEXT operation. If such identification is performed with a DELETE or UPDATE operation, which affects the scanning order (ie, scanning is performed by an index and the modification operation changes the tuple field included in the index key), then the current scan tuple is lost and its identifier cannot be used to identify tuple before performing the next NEXT operation.
The only parameter of the DELETE operation is the tuple identifier (tid or scan identifier). The parameters of the UPDATE operation include, in addition to this identifier, the specification of the tuple fields to be changed (a list of field numbers and their new values). Among the values there may be literal images of undefined values if the corresponding relation fields allow for the storage of undefined values.
When performing a DELETE operation, all indexes defined on this relationship are corrected. An UPDATE operation can also cause index correction if it affects the fields that are part of their keys.
In addition to the “atomic” operations of scanning and modifying tables and lists, the RSS interface includes one “macro-operation”, which allows you to build a list sorted by the value of the specified fields in a single access to RSS. This operation — BUILDLIST — includes scanning a specified relation or list, creating a new list that includes the specified fields of selectable tuples, and sorting the constructed list according to the values of the specified fields. The identifier of the newly constructed sorted list is the response parameter of the operation.
Accordingly, the BUILDLIST operation parameters are a set of parameters for opening a scan (any scanning method is allowed), a list of field numbers that make up the tuples of a new list, and a list of field numbers that need to be sorted (as in the case of creating a new index, for each from these fields indicate the requirement for sorting by ascending or descending values of this field). A separate parameter of the BUILDLIST operation is a feature, according to the value of which duplicate tuples in the new list are allowed or not allowed. Looking ahead, we note that the admission or non-admission of duplicates in a sorted list depends on the purpose for which it is built. For example, if the list is built to perform a join operation, then there should be no duplicates in it. If the list is built to calculate aggregate functions (COUNT, AVG, etc.), then duplicates cannot be removed from it. We will discuss this and related issues in more detail in connection with the problems of query optimization in System R.
The RSS operation of adding a field to an existing relation allows you to dynamically change the table layout. The parameters of the CHANGE operation are the identifier of the existing table and the specification of the new field (its type). When performing an operation, only the descriptor of this relationship changes in the service relation of relationship descriptors. As we already noted in the previous subsection, before the first UPDATE operation affecting the new table field is performed, in reality no new tuple of the table will allocate memory for the new field. By default, the values of a new field in all tuples of a table that have not yet explicitly entered a value are considered undefined. Thus, for any field dynamically added to an existing table, the storage of undefined values cannot be prohibited.
Each RSS operation is performed within a certain transaction. The RSS interface includes a set of transaction control operations: start a transaction (BEGIN TRANSACTION), end a transaction (END TRANSACTION), set a save point (SAVE), and roll back to the specified save point or before the start of the transaction (RESTORE).
We have not noted this before, but in fact, when accessing any RSS function, except BEGIN TRANSACTION, one more parameter should be specified - the transaction identifier. This identifier is generated by the BEGIN TRANSACTION operation, which does not require any input parameters.
At any point in the transaction, prior to the END TRANSACTION operation, this transaction can be rolled back, i.e. reverse execution of all changes made in this transaction, and restoration of the scans state. The rollback can be made before the start of the transaction (in this case it is meaningless to speak about the restoration of the scans state) or before the save point set earlier in the transaction.
The save point is set using the SAVE operation. When performing this operation, the state of the scans of this transaction opened by the time SAVE is executed, and the coordinates of the last record of changes in the database in the log made on behalf of this transaction, are stored. The response parameter of the SAVE operation (it does not require direct parameters, except for the transaction identifier) is the savepoint identifier. This identifier can later be used as a direct parameter of the RESTORE operation, during which the database is restored according to the log, using records of its changes from this transaction to the state in which the database was located by the time the specified save point was established. In addition, according to the local information in the operative memory associated with the transaction, the state of its scans is restored.A rollback to the beginning of a transaction is also initiated by a call to a RESTORE operation, but with an indication of some predefined savepoint identifier.
When performing their transactions, System R users are isolated from each other, i.e. do not feel that the system operates in multiplayer mode. This is achieved due to the presence in the RSS mechanism of implicit synchronization (we will discuss this more fully in the next subsection). For now, we only note that until the end of the transaction, no database changes made within this transaction can be used in other transactions (an attempt to use such data leads to temporary synchronization locks for these transactions). When performing an END TRANSACTION operation, the changes made in this transaction are "committed", i.e. they become visible in other transactions. In reality, this means removing synchronization captures from database objects that have been changed in a transaction. Therefore,that after performing END TRANSACTION, individual rollbacks of this transaction are impossible. RSS simply invalidates the identifier of this transaction, and after completing the transaction, it discards all transactions with that identifier.
The last operation of the RSS interface is an explicit synchronization operation LOCK. This operation allows you to set an explicit synchronization capture on the specified relation (the operation parameter is the table identifier). Performing the LOCK operation ensures that no other transaction until the end of this one can change this relationship (insert a new tuple into it, delete or modify an existing one), if a capture relationship is set to read, or even read any tuple of this relationship, if capture is set to change mode.
Из всего, что говорилось раньше по поводу подхода к синхронизации в System R и соответствующего разбиения системы на уровни, следует нелогичность наличия этой операции в интерфейсе RSS. На самом деле, логически эта операция избыточна, т.е. если бы ее не было, можно вполне реализовать SQL на оставшейся части операций. До изложения материала следующего подраздела об этом трудно говорить, но предварительно заметим, что операция LOCK введена в интерфейс RSS для возможности оптимизации выполнения запросов. Дело в том, что, как видно из описания интерфейса, он покортежный. Следовательно, и информация для синхронизации носит достаточно узкий характер. В то же время на уровне SQL имеется более полная информация. Например, если обрабатывается предложение SQL DELETE FROM EMP, то известно, что будут удалены все кортежи указанной таблицы. Понятно, что как бы не реализовывался механизм синхронизации в RSS, в данном случае выгоднее сообщить сразу, что изменения касаются всего отношения. Но снова забегая вперед, заметим, что ситуации в компиляторе SQL, когда очевидна выгода от использования явной синхронизации, достаточно редки. Пользоваться этим средством можно только очень осмотрительно, потому что неоправданные захваты таких крупных объектов могут резко ограничить степень асинхронности выполнения транзакций.
Comments
To leave a comment
Databases IBM System R - relational DBMS
Terms: Databases IBM System R - relational DBMS