Lecture
Let's consider the various threading models implemented in modern operating systems (preemptive and cooperative threads). We will also take a brief look at how threads and synchronization primitives are exposed in the Win32 API and in POSIX Threads. Scripting languages may be more popular on Habré, but everyone should know the basics ;)
System call (syscall). You will meet this concept quite often in this article, and despite how imposing it sounds, its definition is quite simple :) A system call is the act of invoking a kernel function from a user application. Kernel mode is code that runs in the processor's zero protection ring (ring0) with maximum privileges. User mode is code executed in the third protection ring (ring3) with lower privileges. If code in ring3 uses one of the forbidden instructions (for example, rdmsr/wrmsr, in/out, an attempt to read the cr3 or cr4 register, etc.), a hardware exception occurs and the user process whose code the processor was executing is, in most cases, terminated. A system call performs the transition from user mode to kernel mode via the syscall/sysenter instruction, int 2Eh on Win2k, int 80h on Linux, and so on.
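As a minimal illustration, here is a sketch for Linux (assuming glibc's syscall() wrapper) in which a user-mode program asks the kernel to perform a write directly through the system call interface:

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        /* Ask the kernel to write 14 bytes to stdout (fd 1): the syscall()
           wrapper switches the processor from ring3 to ring0, the kernel
           performs the write, and control returns to user mode. */
        syscall(SYS_write, 1, "hello, kernel\n", 14);
        return 0;
    }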
So, what is a thread? A thread is an operating system entity, the process of executing a set of instructions, or more precisely program code, on a processor. The general purpose of threads is the parallel execution of two or more different tasks on a processor. As you can guess, threads were the first step toward a multitasking OS. The OS scheduler, guided by thread priorities, distributes time slices among the different threads and dispatches the threads for execution.
Along with the thread there is also such an entity as a process. A process is nothing more than a kind of abstraction that encapsulates all of a program's resources (open files, files mapped into memory, ...) and their descriptors, its threads, and so on. Each process has at least one thread. Each process also has its own virtual address space and execution context, and the threads of one process share that address space.
Every thread, like every process, has its own context. A context is a structure that stores the thread's processor state, first of all its registers (including the instruction pointer and the stack pointer).
It should also be noted that when a thread performs a system call and transitions from user mode to kernel mode, its stack is switched to the kernel stack. When switching from a thread of one process to a thread of another, the OS also updates some of the processor registers responsible for the virtual memory mechanisms (for example, CR3), since different processes have different virtual address spaces. Here I deliberately do not touch on kernel-mode aspects, since such things are specific to each particular OS.
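In Win32 the saved context of another thread can even be inspected from user mode. A rough sketch (DumpThreadContext and its hThread parameter are made up for the example; the handle is assumed to have THREAD_GET_CONTEXT and THREAD_SUSPEND_RESUME rights):

    #include <windows.h>
    #include <stdio.h>

    /* Dumps the control registers of another thread.
       The context is only stable while the thread is suspended. */
    void DumpThreadContext(HANDLE hThread)
    {
        CONTEXT ctx;
        ctx.ContextFlags = CONTEXT_CONTROL | CONTEXT_INTEGER;

        SuspendThread(hThread);
        if (GetThreadContext(hThread, &ctx))
    #ifdef _WIN64
            printf("RIP = %llx, RSP = %llx\n", ctx.Rip, ctx.Rsp);
    #else
            printf("EIP = %lx, ESP = %lx\n", ctx.Eip, ctx.Esp);
    #endif
        ResumeThread(hThread);
    }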
In general, the following recommendations are true:
Fiber - a lightweight thread running in user mode. A fiber requires significantly fewer resources and in some cases allows you to minimize the number of system calls and, consequently, to improve performance. Fibers usually execute in the context of the thread that created them and require only saving the processor registers when switching between them. One way or another, fibers never became popular. They were implemented at one time in various BSD OSes, but were thrown out over time. The Win32 API also implements a fiber mechanism, but it is used mainly to ease porting of software written for other OSes. It should be noted that either a process-level scheduler is responsible for switching fibers, or the switching must be implemented in the application itself, in other words manually :)
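A minimal sketch of the Win32 fiber API (the names FiberProc, g_mainFiber and worker are mine; the API calls themselves are real): the fibers switch between each other entirely in user mode, and nothing runs unless it is explicitly switched to.

    #include <windows.h>
    #include <stdio.h>

    PVOID g_mainFiber;   /* fiber of the original thread */

    /* Fiber routine: runs until it explicitly switches away. */
    VOID CALLBACK FiberProc(PVOID param)
    {
        printf("hello from fiber, arg = %d\n", *(int *)param);
        SwitchToFiber(g_mainFiber);      /* cooperative: yield back manually */
    }

    int main(void)
    {
        int arg = 7;
        g_mainFiber = ConvertThreadToFiber(NULL);        /* current thread becomes a fiber */
        PVOID worker = CreateFiber(0, FiberProc, &arg);  /* no kernel thread is created here */
        SwitchToFiber(worker);                           /* user-mode switch, no scheduler involved */
        DeleteFiber(worker);
        return 0;
    }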
Since the classification of threads is an ambiguous question, I propose to classify them in the following way:
As I mentioned, threads can be created not only in kernel mode, but also in user mode. There can be several thread schedulers in the OS:
So, model 1:1 is the simplest one. According to it, any thread created in any process is managed directly by the OS kernel scheduler. In other words, we have a one-to-one mapping of user threads to kernel threads. This model has been implemented in Linux since the 2.6 kernel, and also in Windows.
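A quick way to see the 1:1 mapping on Linux (a sketch assuming glibc and the SYS_gettid syscall number; build with -pthread): every pthread reports a distinct kernel thread id.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    /* Each pthread prints the kernel thread (TID) it is backed by. */
    static void *worker(void *arg)
    {
        (void)arg;
        printf("pthread backed by kernel thread %ld\n", (long)syscall(SYS_gettid));
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("main thread: kernel thread %ld\n", (long)syscall(SYS_gettid));
        return 0;
    }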
The N:M model maps some number N of user threads onto M kernel-mode threads. Simply put, we have a kind of hybrid system, where some threads are scheduled by the OS scheduler and most of them by a user-level scheduler in the process or in the thread library. As an example, GNU Portable Threads. This model is rather difficult to implement, but it offers better performance, since a significant number of system calls can be avoided.
The N:1 model. As you have probably guessed, many user threads are mapped onto a single OS kernel thread. Fibers are an example.
In the days of DOS, when single-tasking OSes ceased to satisfy consumers, programmers and architects set out to implement a multitasking OS. The simplest solution was the following: take the total number of threads, determine some minimal interval for executing one thread, and divide the execution time equally among all the sibling threads. Thus appeared the concept of cooperative multitasking: all threads execute in turn, with equal execution time, and no thread can preempt the currently executing one. This very simple and obvious approach found its application in all versions of Mac OS up to Mac OS X, and in Windows versions prior to Windows 95 and Windows NT. Cooperative multitasking is still used in Win32 to run 16-bit applications. Also, for compatibility, cooperative multitasking is used by the Thread Manager in Carbon applications on Mac OS X.
However, over time cooperative multitasking showed its shortcomings. The volume of data stored on hard drives grew, and so did network transfer speeds. It became clear that some threads should have a higher priority, such as device interrupt service threads, threads handling synchronous I/O operations, and so on. At this point every thread and process in the system acquired such a property as priority. You can read more about thread and process priorities in the Win32 API in Jeffrey Richter's book; we will not dwell on them here ;) Thus, a thread with a higher priority can preempt a thread with a lower one. This principle formed the basis of preemptive multitasking. All modern operating systems now use this approach, with the exception of fibers implemented in user mode.
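On the Win32 side, adjusting a thread's priority boils down to a single call; a minimal sketch (the surrounding code is made up for the example):

    #include <windows.h>

    int main(void)
    {
        /* Raise the priority of the current thread: the scheduler will now
           preempt lower-priority threads in its favor. */
        SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST);
        /* ... time-critical work ... */
        SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_NORMAL);
        return 0;
    }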
As we have already discussed, the thread scheduler can be implemented at different levels. So:
If you are not tired yet, I offer a small overview of the API for working with threads and synchronization primitives in the Win32 API. If you are already familiar with the material, feel free to skip this section ;)
Threads in Win32 are created with the CreateThread function, which is passed a pointer to a function (let's call it the thread function) that will be executed in the created thread. A thread is considered finished when its thread function returns. If you want to forcibly terminate a thread, you can use the TerminateThread function, but do not abuse it! This function "kills" the thread and does not always do so correctly. The ExitThread function is called implicitly when the thread function returns, or you can call it yourself; its main task is to free the thread's stack and release the kernel structures that serve this thread.
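A minimal sketch of the typical usage (ThreadProc is just a name chosen for the example):

    #include <windows.h>
    #include <stdio.h>

    /* Thread function: receives the pointer passed to CreateThread
       and its return value becomes the thread's exit code. */
    DWORD WINAPI ThreadProc(LPVOID param)
    {
        printf("hello from thread, arg = %d\n", *(int *)param);
        return 0;   /* ExitThread(0) is called implicitly here */
    }

    int main(void)
    {
        int arg = 42;
        HANDLE hThread = CreateThread(NULL, 0, ThreadProc, &arg, 0, NULL);
        if (hThread == NULL)
            return 1;
        WaitForSingleObject(hThread, INFINITE);  /* wait for the thread to finish */
        CloseHandle(hThread);
        return 0;
    }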
A thread in Win32 can be in a suspended state. You can "put a thread to sleep" by calling the SuspendThread function and wake it up by calling ResumeThread; you can also create a thread already suspended by passing the CREATE_SUSPENDED flag to CreateThread. Do not be surprised if you do not find such functionality in cross-platform libraries such as boost::threads and Qt. The reason is very simple: pthreads just does not support this functionality.
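A small sketch (StartSuspended is a made-up helper; ThreadProc is assumed to be a thread function like the one above):

    #include <windows.h>

    DWORD WINAPI ThreadProc(LPVOID param);   /* defined elsewhere */

    /* The thread is created asleep and starts running only after ResumeThread. */
    HANDLE StartSuspended(LPVOID arg)
    {
        HANDLE h = CreateThread(NULL, 0, ThreadProc, arg, CREATE_SUSPENDED, NULL);
        if (h != NULL)
            ResumeThread(h);   /* or keep it suspended until some later moment */
        return h;
    }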
Synchronization primitives in Win32 come in two kinds: those implemented at the user level and those implemented at the kernel level. The first group consists of critical sections ( critical section ); the second includes mutexes ( mutex ), events ( event ) and semaphores ( semaphore ).
A critical section is a lightweight synchronization mechanism that works at the level of the user process and does not use heavy system calls. It is based on mutual exclusion by means of spin locks ( spin lock ). A thread that wants to protect certain data from race conditions calls the EnterCriticalSection / TryEnterCriticalSection function. If the critical section is free, the thread takes it; if not, the thread is blocked (that is, it does not execute and does not eat up processor time) until the section is released by another thread calling LeaveCriticalSection. These functions are atomic, so you need not worry about the integrity of your data ;)
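A minimal sketch of the usual pattern (g_counter and the helper functions are invented for the example):

    #include <windows.h>

    static CRITICAL_SECTION g_cs;     /* protects g_counter */
    static long g_counter;

    void Init(void)      { InitializeCriticalSection(&g_cs); }
    void Shutdown(void)  { DeleteCriticalSection(&g_cs); }

    /* Every thread that touches g_counter goes through the same critical
       section, so increments never race with each other. */
    void IncrementCounter(void)
    {
        EnterCriticalSection(&g_cs);   /* spins briefly, then blocks if the section is busy */
        ++g_counter;
        LeaveCriticalSection(&g_cs);
    }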
Much has been said about mutexes, events and semaphores, so I will not dwell on them in detail. It should be noted that all these mechanisms have common features: they are kernel objects, they are manipulated through handles, and waiting on them (WaitForSingleObject / WaitForMultipleObjects) costs a system call each time.
It is hard to imagine a *nix-like operating system that does not implement this standard. It should be noted that pthreads is also used in various real-time operating systems (RTOS), so the requirements placed on this library (or rather, on the standard) are stricter. For example, a pthreads thread cannot be suspended. There are also no events in pthreads, but there is a much more powerful mechanism, condition variables, which more than covers all the necessary needs.
Let's talk about the differences. For example, a thread in pthreads can be cancelled, i.e. simply removed from execution, via the pthread_cancel call: while it is waiting on a mutex or a condition variable, at the time of a pthread_join call (the calling thread is blocked until the thread it is joining terminates), and so on. There are separate calls for working with mutexes and semaphores, such as pthread_mutex_lock / pthread_mutex_unlock, etc.
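As a sketch, here is the pthreads counterpart of the critical-section example above (g_counter and increment_counter are invented names):

    #include <pthread.h>

    static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;  /* protects g_counter */
    static long g_counter;

    /* The pthreads equivalent of the Win32 critical-section pattern. */
    void increment_counter(void)
    {
        pthread_mutex_lock(&g_lock);     /* blocks if another thread holds the mutex */
        ++g_counter;
        pthread_mutex_unlock(&g_lock);
    }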
Condition variables (cv) are commonly used together with mutexes in more complex cases. Whereas a mutex simply blocks a thread until another thread releases it, a cv creates conditions under which a thread can block itself until some unblocking condition occurs. For example, the cv mechanism helps to emulate events in a pthreads environment. The pthread_cond_wait call waits until the thread has been notified that a specific event has occurred; pthread_cond_signal notifies one thread from the queue that the cv has fired; pthread_cond_broadcast notifies all threads that called pthread_cond_wait that the cv has fired.
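A minimal sketch of such an emulation (event_wait / event_set and the signaled flag are invented names; the while loop is there because pthread_cond_wait may wake up spuriously):

    #include <pthread.h>
    #include <stdbool.h>

    /* A manual "event" built from a mutex + condition variable, roughly what
       Win32 events give you out of the box. */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
    static bool signaled = false;

    void event_wait(void)
    {
        pthread_mutex_lock(&lock);
        while (!signaled)                    /* re-check: wakeups may be spurious */
            pthread_cond_wait(&cond, &lock); /* atomically unlocks and sleeps */
        signaled = false;                    /* auto-reset behaviour */
        pthread_mutex_unlock(&lock);
    }

    void event_set(void)
    {
        pthread_mutex_lock(&lock);
        signaled = true;
        pthread_cond_signal(&cond);          /* wake one waiter; _broadcast wakes all */
        pthread_mutex_unlock(&lock);
    }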
In order to structure your understanding of what threads are (in Russian literature this word is rendered as «нити» almost everywhere except books on the Win32 API, where it is rendered as «потоки») and how they differ from processes, you can use the following two definitions:
It is very important to understand that a thread is, conceptually, precisely a virtual processor, and when we write an implementation of threads in the OS kernel or in a user-level library, we are solving exactly the problem of "multiplying" the central processor into many virtual instances, which logically or even physically (on SMP, SMT and multi-core CPU platforms) run in parallel with each other.
At the basic, conceptual level there is no "context". A context is simply the name of the data structure into which the OS kernel, or our library implementing threads, saves the registers of a virtual processor when it switches between them, emulating their parallel operation. Context switching is a way of implementing threads, not a more fundamental concept through which a thread should be defined.
When the concept of a thread is defined through the analysis of the APIs of specific OSes, too many entities are usually introduced: processes, address spaces, contexts, context switches, timer interrupts, time slices with priorities, and even "resources" bound to processes (as opposed to threads). All of this is woven into one tangle, and we often find ourselves going in circles while reading the definitions. Alas, this is a common way of explaining the essence of threads in books, but such an approach badly confuses novice programmers and ties their understanding to the specifics of an implementation.
Of course, all these terms have a right to exist and did not appear by accident; behind each of them stands some important entity. But among them we must distinguish the primary ones from the secondary ones (those introduced in order to implement the primary entities, or layered on top of them at subsequent levels of abstraction).
The main idea of a thread is the virtualization of the central processor's registers: emulating, on one physical processor, several logical processors, each of which has its own register state (including the instruction pointer) and runs in parallel with the others.
The main property of a process, in the context of this discussion, is that it has its own page tables, which form its individual address space. A process is not, by itself, something executable.
We can say in the definition that "every process in the system always has at least one thread". Otherwise the address space is logically meaningless to the user if it is not visible to at least one virtual processor (thread). It is therefore logical that all modern OSes destroy the address space (terminate the process) when the last thread working in this address space terminates. On the other hand, one can also choose not to say in the definition of a process that it has "at least one thread": at the lower, system level a process (as a rule) can exist as an OS object even without having any threads in it.
If you look at the sources of, for example, the Windows kernel, you will see that the address space and the other process structures are constructed before the initial thread of that process is created. In essence, initially there are no threads in a process at all. In Windows you can even create a thread in a foreign address space through the user-level API...
If you look at a thread as a virtual processor, then binding it to an address space amounts to loading the right value into the virtual register holding the page table base :) All the more so because this is exactly what happens at the lower level: every time the kernel switches to a thread belonging to another process, it reloads the page table base register (on processors that do not support working with several address spaces simultaneously in hardware).