You get a bonus - 1 coin for daily activity. Now you have 1 coin

The Basics of Modern Databases Preface

Lecture



Foreword

The subject of the course are database management systems (DBMS). This is a very important topic, without a thorough acquaintance with which in our time it is impossible to be not only a qualified programmer, but even a competent computer user.

Note that there is very little literature in the Russian language devoted to the subject of the DBMS. It is impossible to recommend one or more books, the content of which would cover the material of this course. Among the best are the books of K. Data “Introduction to database systems” (Science, 1980) and the “Guide to relational DBMS DB2” (Finance and Statistics, 1988), as well as the book by J. Ullman “Fundamentals of database systems” (Finance and statistics, 1983). Although these books are somewhat outdated (several supplemented editions have already been published in English), they are worth reading.

The main purpose of this course is a systematic introduction to the ideas and methods used in modern relational database management systems. The course does not deal with any one popular DBMS; The material presented is equally relevant to any modern system. As experience shows, without knowledge of database fundamentals it is difficult to work with concrete systems on a serious level, no matter how well they are documented.

The introductory part of the course covers the main differences between file systems and database management systems. Based on an analysis of the capabilities of modern file systems, areas of applications are distinguished in which it suffices to use files, as well as those for which databases are necessary. In the same part of the course, basic functions are discussed and typical DBMS organization is given. In conclusion, the introductory part briefly discusses the main characteristics of the early, dorielating systems.

The second part of the course contains the necessary theoretical basis on which the further presentation is based. The basic concepts of the relational data model are introduced, the basic properties of relations are discussed, two basic mechanisms for manipulating data are discussed: relational algebra and relational calculus. The following describes the principles of normalization, on which the classical approach to the design of relational databases is based. Finally, a more modern approach to database design is described, based on the use of semantic data models.

In the third part of the course, two classic examples of experimental projects of relational DBMS are discussed, which, in fact, serve as the basis for all modern commercial DBMS. Considered projects System R and Ingres. The idea is to demonstrate balanced sets of solutions applied in these two significantly different systems.

The fourth part of the course is dedicated to the internal organization of modern multi-user relational DBMS. The methods of organizing the external memory of databases and the data structures used are considered. The concept of a transaction is introduced and the known methods of managing asynchronously executed transactions are analyzed. Discusses the need for logging database changes and the relationship of logging algorithms with the policy of managing memory buffers. Finally, the ways of using journal and archive information for restoring databases after various failures are considered.

The fifth part of the course discusses the language of SQL relational databases in detail. The original SQL concepts are discussed on the basis of the version of the language developed in the framework of the System R project. It then briefly describes the history of SQL after System R and, in particular, the international standardization activities. After that, the most common language standard, SQL-89, is systematically presented. Discusses how to use SQL when programming application systems. The fifth part of the course ends with a review of the most important features that appeared in the last standard of the SQL-92 language.

The sixth part of the course is devoted to problems that are solved in the compilers of the SQL language. These problems are centered around the need to obtain efficient programs that execute statements, which are initially formulated in the declarative language SQL. To solve problems in SQL compilers one has to apply various optimizing techniques, knowledge of which helps to understand the operation of any particular system.

In the seventh part of the course the use of client-server architecture in modern DBMS is considered. The principles of interaction between client and server parts of the system, and, in particular, the importance of using remote procedure call protocols to ensure the organization of heterogeneous client-server organizations are considered. A typical separation of functions between clients and servers and the following requirements for hardware capabilities are discussed.

Part eight deals with the specific problems of distributed databases. The categories of distributed systems, issues of naming, administration, maintaining the reliability and availability of data, features of transaction management and compilation of queries are analyzed.

Finally, the ninth part of the final course is devoted to the directions and state of research in the field of databases. Projects and studies related to the so-called next-generation database systems are considered; the basic principles of the organization of systems of object-oriented databases; rule-based database systems, including active and deductive databases.


Content

  1. Introduction
    • Lecture 1. Databases and file systems
      • 1.1. File systems
        • 1.1.1. File structures
        • 1.1.2. File Naming
        • 1.1.3. File protection
        • 1.1.4. Multi-user access mode
      • 1.2. File scopes
      • 1.3. Information System Needs
    • Lecture 2. DBMS Functions. Typical organization DBMS. Examples
      • 2.1. The main functions of the DBMS
        • 2.1.1. Direct data management in external memory
        • 2.1.2. Managing RAM Buffers
        • 2.1.3. Transaction management
        • 2.1.4. Journaling
        • 2.1.5. Database language support
      • 2.2. Typical organization of modern DBMS
      • 2.3. Example: System R
    • Lecture 3. Early approaches to the organization of the database. Systems based on inverted lists, hierarchical and network DBMS. Examples Strengths and weaknesses of early systems
      • 3.1. Main features of systems based on inverted lists
        • 3.1.1. Data structures
        • 3.1.2. Data manipulation
        • 3.1.3. Integrity constraints
      • 3.2. Hierarchical systems
        • 3.2.1. Hierarchical data structures
        • 3.2.2. Data manipulation
        • 3.2.3. Integrity constraints
      • 3.3. Network systems
        • 3.3.1. Network data structures
        • 3.3.2. Data manipulation
        • 3.3.3. Integrity constraints
      • 3.4. Advantages and disadvantages
  2. Theoretical basis
    • Lecture 4. General concepts of the relational approach to database organization. Basic concepts and terms
      • 4.1. Basic concepts of relational databases
        • 4.1.1. Data type
        • 4.1.2. Domain
        • 4.1.3. Relationship scheme, database scheme
        • 4.1.4. Tuple, attitude
      • 4.2. Fundamental properties of relationships
        • 4.2.1. No duplicate tuples
        • 4.2.2. Lack of ordering tuples
        • 4.2.3. Lack of attribute ordering
        • 4.2.4. Atomicity of attribute values
      • 4.3. Relational data model
        • 4.3.1. general characteristics
        • 4.3.2. Entity and link integrity
    • Lecture 5. Basic means of manipulating relational data
      • 5.1. Relational algebra
        • 5.1.1. General interpretation of relational operations
        • 5.1.2. Relational algebra closure and rename operation
        • 5.1.3. Features of set-theoretic operations of relational algebra
        • 5.1.4. Special relational operations
      • 5.2. Relational calculus
        • 5.2.1. Tuple variables and well-constructed formulas
        • 5.2.2. Target lists and expressions of relational calculus
        • 5.2.3. Relational Domain Calculus
    • Lecture 6. Designing relational databases
      • 6.1. Designing relational databases using normalization
        • 6.1.1. Second normal form
        • 6.1.2. Third normal form
        • 6.1.3. Normal form of Boyce-Codd
        • 6.1.4. Fourth normal form
        • 6.1.5. Fifth normal form
      • 6.2. Semantic data modeling, ER charts
        • 6.2.1. Semantic data models
        • 6.2.2. Basic Entity-Relationship Model Concepts
        • 6.2.3. Normal forms of ER-schemes
        • 6.2.4. More complex elements of the ER-model
        • 6.2.5. Getting the relational schema from the ER schema
  3. Two classic experimental systems
    • Lecture 7. System R: general system organization, basics of the SQL language
      • 7.1. Terminology used
      • 7.2. The main objectives of System R and their connection to the system architecture
      • 7.3. Organization of external memory in System R databases
      • 7.4. RSS interface
      • 7.5. Synchronization in System R
      • 7.6. Logging and Recovery in System R
    • Lecture 8. Ingres: general organization of the system, Quel language basics
      • 8.1. Ingres DBMS History
      • 8.2. Ingres as a UNIX-oriented DBMS. Dynamic system structure: a set of processes
      • 8.3. Data structures, access methods, data access interfaces
      • 8.4. General characteristics of the language QUEL. EQUEL programming language
      • 8.5. General approach to the organization of representations, integrity constraints and access control
  4. Internal organization of relational DBMS
    • Lecture 9. External memory structures, index organization methods
      • 9.1. Relationship storage
      • 9.2. Indices
        • 9.2.1. B-trees
        • 9.2.2. Hashing
      • 9.3. Journal information
      • 9.4. Service information
    • Lecture 10. Transaction Management, Transaction Serialization
      • 10.1. Transactions and Database Integrity
      • 10.2. User isolation
      • 10.3. Transaction serialization
    • Lecture 11. Transaction Serialization Methods
      • 11.1. Synchronization captures
        • 11.1.1. Granular Synchronization Hooks
        • 11.1.2. Predicative Synchronization Captures
        • 11.1.3. Dead ends, recognition and destruction
      • 11.2. Timestamp method
    • Lecture 12. Journaling DB Changes
      • 12.1. Logging and buffering
      • 12.2. Individual transaction rollback
      • 12.3. Recovery after a mild failure
      • 12.4. Physical database consistency
      • 12.5. Hard Failure Recovery
  5. Relational Database Language SQL
    • Lecture 13. SQL language. Functions and main features
      • 13.1. SEQUEL / SQL DBMS System R
        • 13.1.1. Queries and data manipulation statements
        • 13.1.2. Operators for defining and manipulating the database schema
        • 13.1.3. Definitions of integrity constraints and triggers
        • 13.1.4. Database Views
        • 13.1.5. Definition of control structures
        • 13.1.6. Authorization of access to relationships and their fields
        • 13.1.7. Save points and transaction rollbacks
        • 13.1.8. Embedded SQL
        • 13.1.9. Dynamic SQL
      • 13.2. SQL language in commercial implementations
      • 13.3. SQL standardization
    • Lecture 14. Standard SQL Database Language
      • 14.1. Data types
      • 14.2. Schema Definition Tools
        • 14.2.1. Schema definition operator
        • 14.2.2. Table definition
        • 14.2.3. Column definition
        • 14.2.4. Defining table integrity constraints
        • 14.2.5. View definitions
        • 14.2.6. Privilege definition
    • Lecture 15. The SQL language. Means of data manipulation
      • 15.1. Query structure
        • 15.1.1. Cursor Specification
        • 15.1.2. Sampling operator
        • 15.1.3. Subquery
      • 15.2. Table expression
        • 15.2.1. FROM section
        • 15.2.2. Section WHERE
        • 15.2.3. GROUP BY section
        • 15.2.4. HAVING section
      • 15.3. Aggregate functions and query results
        • 15.3.1. Semantics of aggregate functions
        • 15.3.2. Query results
    • Lecture 16. The use of SQL in application programming.
      • 16.1. Module language or embedded SQL?
      • 16.2. Module Language
        • 16.2.1. Procedure definition
      • 16.3. Embedded SQL
      • 16.4. A set of data manipulation statements
        • 16.4.1. Cursor-related operators
        • 16.4.2. Single data manipulation operators
      • 16.5. Dynamic SQL in Oracle V.6
        • 16.5.1. Training operator
        • 16.5.2. Operator get the description of the prepared operator
        • 16.5.3. Operator perform the prepared statement
        • 16.5.4. Working with dynamic SQL statements through cursors
    • Lecture 17. Some features of SQL / 92 and SQL-3
      • 17.1. Descriptor memory allocator
      • 17.2. Memory freeing operator
      • 17.3. Operator get information from the SQL descriptor area
      • 17.4. Descriptor Setup Operator
      • 17.5. Training operator
      • 17.6. Operator rejection of the prepared operator
      • 17.7. Operator request query description of the prepared operator
      • 17.8. Operator perform the prepared statement
      • 17.9. Immediate preparation operator
      • 17.10. Cursor declaration operator over a dynamically prepared select statement
      • 17.11. Cursor definition operator over dynamically prepared select statement
      • 17.12. The operator to open a cursor associated with a dynamically prepared sample statement
      • 17.13. The operator of reading the line on the cursor associated with a dynamically prepared sample statement
      • 17.14. The cursor closing operator associated with a dynamically prepared sample statement
      • 17.15. The positional delete operator on the cursor associated with a dynamically prepared sample operator
      • 17.16. The operator of the positional modification of the cursor associated with a dynamically prepared sample operator
      • 17.17. Prepared Positional Delete Operator
      • 17.18. Prepared positional modification operator
      • 17.19. Summary of SQL-3 features
        • 17.19.1. Data types
        • 17.19.2. Some other SQL-3 properties
  6. SQL compilers
    • Lecture 18. SQL compilers. Optimization problems
      • 18.1. General request processing
      • 18.2. Syntax query optimization
        • 18.2.1. Simple logical query transformations
        • 18.2.2 Query Transformations with Reordering Relational Operations
        • 18.2.3 Bringing queries with subqueries nested to queries with connections
      • 18.3. Semantic query optimization
        • 18.3.1. Query Transformations Based on Semantic Information
        • 18.3.2. Using semantic information in query optimization
      • 18.4. Selection and evaluation of alternative query plans
        • 18.4.1. Generation of plans
        • 18.4.2. Query Plan Cost Estimate
        • 18.4.3. More accurate estimates
  7. DBMS in the client-server architecture
    • Lecture 19. Client-server architecture
      • 19.1. Open systems
      • 19.2. Clients and servers of local networks
      • 19.3. Client-server system architecture
      • 19.4. Database servers
        • 19.4.1. Principles of interaction between client and server parts
        • 19.4.2. Benefits of Remote Procedure Call Protocols
        • 19.4.3. Typical separation of functions between clients and servers
        • 19.4.4. Requirements for hardware capabilities and basic software clients and servers
  8. Distributed Databases
    • Lecture 20. Distributed DB
      • 20.1. Varieties of distributed systems
      • 20.2. System R * Distributed Database Management System
        • 20.2.1. Object naming and distributed directory organization
        • 20.2.2. Distributed query compilation
        • 20.2.3. Transaction Management and Synchronization
      • 20.3. Integrated or federated systems and multi-databases
  9. Current research and development directions
    • Lecture 21. Next generation database management systems
      • 21.1. Orientation to Extended Relational Model
      • 21.2. Abstract data types
      • 21.3. Generation of application-oriented database systems
      • 21.4. Rule-based query optimization
      • 21.5. Support for historical information and temporal queries
    • Lecture 22. Object-oriented DBMS
      • 22.1. The connection of object-oriented DBMS with the general concepts of object-oriented approach
      • 22.2. Object Oriented Data Models
      • 22.3. Object-Oriented Database Programming Languages
        • 22.3.1. Loss of correspondence between programming languages ​​and query languages ​​in relational DBMS
        • 22.3.2. Programming languages ​​OODB as object-oriented languages ​​with the support of stable (persistent) objects
        • 22.3.3. Examples of programming languages ​​OOBD
      • 22.4. Object Oriented Database Query Languages
        • 22.4.1. Explicit navigation as a result of overcoming loss of compliance
        • 22.4.2. Non-Navigation Query Languages
        • 22.4.3. Query optimization problems
      • 22.5. Examples of object-oriented DBMS
        • 22.5.1. ORION project
        • 22.5.2. O2 project
    • Lecture 23. Rule-based database systems.
      • 23.1. Extensional and intensional parts of the database
      • 23.2. Active databases
      • 23.3. Deductive databases
created: 2014-09-27
updated: 2024-11-14
213



Rating 9 of 10. count vote: 2
Are you satisfied?:



Comments


To leave a comment
If you have any suggestion, idea, thanks or comment, feel free to write. We really value feedback and are glad to hear your opinion.
To reply

Databases IBM System R - relational DBMS

Terms: Databases IBM System R - relational DBMS