DPC++ graph scheduler class. More...
#include <detail/scheduler/scheduler.hpp>
Classes | |
class | GraphBuilder |
Graph builder class. More... | |
class | GraphProcessor |
Graph Processor provides interfaces for enqueueing commands and their dependencies to the underlying runtime. More... | |
struct | StreamBuffers |
Stream buffers structure. More... | |
Public Member Functions | |
EventImplPtr | addCG (std::unique_ptr< detail::CG > CommandGroup, QueueImplPtr Queue) |
Registers a command group, and adds it to the dependency graph. More... | |
EventImplPtr | addCopyBack (Requirement *Req) |
Registers a command group, that copies most recent memory to the memory pointed by the requirement. More... | |
void | waitForEvent (EventImplPtr Event) |
Waits for the event. More... | |
void | removeMemoryObject (detail::SYCLMemObjI *MemObj) |
Removes buffer from the graph. More... | |
void | cleanupFinishedCommands (EventImplPtr FinishedEvent) |
Removes finished non-leaf non-alloca commands from the subgraph (assuming that all its commands have been waited for). More... | |
EventImplPtr | addHostAccessor (Requirement *Req) |
Adds nodes to the graph, that update the requirement with the pointer to the host memory. More... | |
void | releaseHostAccessor (Requirement *Req) |
Unblocks operations with the memory object. More... | |
void | allocateStreamBuffers (stream_impl *, size_t, size_t) |
Allocate buffers in the pool for a provided stream. More... | |
void | deallocateStreamBuffers (stream_impl *) |
Deallocate all stream buffers in the pool. More... | |
QueueImplPtr | getDefaultHostQueue () |
Scheduler () | |
~Scheduler () | |
Static Public Member Functions | |
static Scheduler & | getInstance () |
static MemObjRecord * | getMemObjRecord (const Requirement *const Req) |
Protected Types | |
using | RWLockT = std::shared_timed_mutex |
using | ReadLockT = std::shared_lock< RWLockT > |
using | WriteLockT = std::unique_lock< RWLockT > |
Protected Member Functions | |
void | acquireWriteLock (WriteLockT &Lock) |
Provides exclusive access to std::shared_timed_mutex object with deadlock avoidance. More... | |
void | cleanupCommands (const std::vector< Command * > &Cmds) |
void | waitForRecordToFinish (MemObjRecord *Record, ReadLockT &GraphReadLock) |
This function waits on all of the graph leaves which somehow use the memory object which is represented by Record . More... | |
Static Protected Member Functions | |
static void | enqueueLeavesOfReqUnlocked (const Requirement *const Req, std::vector< Command * > &ToCleanUp) |
Protected Attributes | |
GraphBuilder | MGraphBuilder |
RWLockT | MGraphLock |
std::vector< Command * > | MDeferredCleanupCommands |
std::mutex | MDeferredCleanupMutex |
QueueImplPtr | DefaultHostQueue |
std::recursive_mutex | StreamBuffersPoolMutex |
std::unordered_map< stream_impl *, StreamBuffers * > | StreamBuffersPool |
Friends | |
class | Command |
class | DispatchHostTask |
class | queue_impl |
class | event_impl |
class | stream_impl |
void | initStream (StreamImplPtr, QueueImplPtr) |
DPC++ graph scheduler class.
The Scheduler is a part of DPC++ RT which ensures correct execution of command groups. To achieve this Scheduler manages acyclic dependency graph (which can have independent sub-graphs) that consists of several types of nodes that represent specific commands:
As the main input Scheduler takes a command group and returns an event representing it, so it can be waited on later. When a new command group comes, Scheduler adds one or more nodes to the graph depending on the command groups' requirements. For example, if a new command group is submitted to the SYCL context which has the latest data for all the requirements, Scheduler adds a new "Execute command group" command making it dependent on all commands affecting new command group's requirements. But if one of the requirements has no up-to-date instance in the context which the command group is submitted to, Scheduler additionally inserts copy memory command (together with allocate memory command if needed).
A simple graph looks like:
Where nodes represent commands and edges represent dependencies between them. There are three commands connected by arrows which mean that before executing second command group the first one must be executed. Also before executing the first command group memory allocation must be performed.
At some point Scheduler enqueues commands to the underlying devices. To do this, Scheduler performs topological sort to get the order in which commands should be enqueued. For example, the following graph (D depends on B and C, B and C depends on A) will be enqueued in the following order:
The Scheduler is split up into two parts: graph builder and graph processor.
To build dependencies, Scheduler needs to memorize memory objects and commands that modify them.
To detect that two command groups access the same memory object and create a dependency between them, Scheduler needs to store information about the memory object.
To ensure thread safe execution of methods, Scheduler provides access to the graph that's guarded by a read-write mutex (analog of shared mutex from C++17).
A read-write mutex allows concurrent access to read-only operations, while write operations require exclusive access.
All the methods of GraphBuilder lock the mutex in write mode because these methods can modify the graph. Methods of GraphProcessor lock the mutex in read mode as they are not modifying the graph.
There are two sources of errors that needs to be handled in Scheduler:
If an error occurs during command enqueue process, the Command::enqueue method returns the faulty command. Scheduler then reschedules the command and all dependent commands (if any).
An error with command processing can happen in underlying runtime, in this case Scheduler is notified asynchronously (using callback mechanism) what triggers rescheduling.
Definition at line 358 of file scheduler.hpp.
|
protected |
Definition at line 453 of file scheduler.hpp.
|
protected |
Definition at line 452 of file scheduler.hpp.
|
protected |
Definition at line 454 of file scheduler.hpp.
cl::sycl::detail::Scheduler::Scheduler | ( | ) |
Definition at line 384 of file scheduler.cpp.
References cl::sycl::detail::getSyclObjImpl().
cl::sycl::detail::Scheduler::~Scheduler | ( | ) |
Definition at line 393 of file scheduler.cpp.
References cl::sycl::detail::pi::PI_TRACE_BASIC, and cl::sycl::detail::pi::trace().
|
protected |
Provides exclusive access to std::shared_timed_mutex object with deadlock avoidance.
Lock | is an instance of WriteLockT, created with std::defer_lock |
Definition at line 415 of file scheduler.cpp.
EventImplPtr cl::sycl::detail::Scheduler::addCG | ( | std::unique_ptr< detail::CG > | CommandGroup, |
QueueImplPtr | Queue | ||
) |
Registers a command group, and adds it to the dependency graph.
It's called by SYCL's queue.submit.
CommandGroup | is a unique_ptr to a command group to be added. |
Definition at line 73 of file scheduler.cpp.
References cl::sycl::detail::Command::getEvent(), cl::sycl::detail::initStream(), cl::sycl::detail::Command::MDeps, cl::sycl::detail::EnqueueResultT::MResult, cl::sycl::detail::Command::MUsers, and PI_INVALID_OPERATION.
EventImplPtr cl::sycl::detail::Scheduler::addCopyBack | ( | Requirement * | Req | ) |
Registers a command group, that copies most recent memory to the memory pointed by the requirement.
Req | is a requirement that points to the memory where data is needed. |
Definition at line 173 of file scheduler.cpp.
References cl::sycl::detail::Command::getEvent(), cl::sycl::detail::Command::getQueue(), cl::sycl::detail::EnqueueResultT::MResult, and PI_INVALID_OPERATION.
Referenced by cl::sycl::detail::SYCLMemObjT::updateHostMemory().
EventImplPtr cl::sycl::detail::Scheduler::addHostAccessor | ( | Requirement * | Req | ) |
Adds nodes to the graph, that update the requirement with the pointer to the host memory.
Assumes the host pointer contains the latest data. New operations with the same memory object that have side effects are blocked until releaseHostAccessor(Requirement *Req) is callled.
Req | is the requirement to be updated. |
Definition at line 301 of file scheduler.cpp.
References cl::sycl::detail::Command::getEvent(), cl::sycl::detail::EnqueueResultT::MResult, and PI_INVALID_OPERATION.
void cl::sycl::detail::Scheduler::allocateStreamBuffers | ( | stream_impl * | Impl, |
size_t | StreamBufferSize, | ||
size_t | FlushBufferSize | ||
) |
Allocate buffers in the pool for a provided stream.
Impl | to the stream object |
StreamBufferSize | of the stream buffer |
FlushBufferSize | of the flush buffer for a single work item |
Definition at line 370 of file scheduler.cpp.
Referenced by cl::sycl::detail::stream_impl::stream_impl().
|
protected |
Definition at line 440 of file scheduler.cpp.
Referenced by cl::sycl::detail::DispatchHostTask::operator()().
void cl::sycl::detail::Scheduler::cleanupFinishedCommands | ( | EventImplPtr | FinishedEvent | ) |
Removes finished non-leaf non-alloca commands from the subgraph (assuming that all its commands have been waited for).
FinishedEvent | is a cleanup candidate event. |
Definition at line 233 of file scheduler.cpp.
References cl::sycl::detail::deallocateStreams().
Referenced by cl::sycl::detail::event_impl::cleanupCommand().
void cl::sycl::detail::Scheduler::deallocateStreamBuffers | ( | stream_impl * | Impl | ) |
Deallocate all stream buffers in the pool.
Impl | to the stream object |
Definition at line 378 of file scheduler.cpp.
|
staticprotected |
Definition at line 354 of file scheduler.cpp.
References cl::sycl::detail::MemObjRecord::MReadLeaves, cl::sycl::detail::SYCLMemObjI::MRecord, cl::sycl::detail::EnqueueResultT::MResult, cl::sycl::detail::AccessorImplHost::MSYCLMemObj, cl::sycl::detail::MemObjRecord::MWriteLeaves, and PI_INVALID_OPERATION.
|
inline |
Definition at line 442 of file scheduler.hpp.
|
static |
Definition at line 209 of file scheduler.cpp.
Referenced by cl::sycl::detail::stream_impl::accessGlobalBuf(), cl::sycl::detail::stream_impl::accessGlobalFlushBuf(), cl::sycl::detail::stream_impl::accessGlobalOffset(), cl::sycl::detail::event_impl::cleanupCommand(), cl::sycl::detail::stream_impl::flush(), cl::sycl::detail::Command::processDepEvent(), cl::sycl::detail::stream_impl::stream_impl(), cl::sycl::detail::SYCLMemObjT::updateHostMemory(), cl::sycl::detail::event_impl::wait(), and cl::sycl::detail::event_impl::wait_and_throw().
|
static |
Definition at line 436 of file scheduler.cpp.
References cl::sycl::detail::SYCLMemObjI::MRecord, and cl::sycl::detail::AccessorImplHost::MSYCLMemObj.
void cl::sycl::detail::Scheduler::releaseHostAccessor | ( | Requirement * | Req | ) |
Unblocks operations with the memory object.
Req | is a requirement that points to the memory object being unblocked. |
Definition at line 338 of file scheduler.cpp.
References cl::sycl::detail::AccessorImplHost::MBlockedCmd, and cl::sycl::detail::Command::MEnqueueStatus.
void cl::sycl::detail::Scheduler::removeMemoryObject | ( | detail::SYCLMemObjI * | MemObj | ) |
Removes buffer from the graph.
The lifetime of memory object descriptor begins when the first command group that uses the memory object is submitted and ends when "removeMemoryObject(...)" method is called which means there will be no command group that uses the memory object. When removeMemoryObject is called Scheduler will enqueue and wait on all release commands associated with the memory object, which effectively guarantees that all commands accessing the memory object are complete and then the resources allocated for the memory object are freed. Then all the commands affecting the memory object are removed.
This member function is used by buffer and image.
MemObj | is a memory object that points to the buffer being removed. |
Definition at line 261 of file scheduler.cpp.
References cl::sycl::detail::deallocateStreams().
Referenced by cl::sycl::detail::SYCLMemObjT::updateHostMemory().
void cl::sycl::detail::Scheduler::waitForEvent | ( | EventImplPtr | Event | ) |
Waits for the event.
This operation is blocking. For eager execution mode this method invokes corresponding function of device API.
Event | is a pointer to event to wait on. |
Definition at line 213 of file scheduler.cpp.
Referenced by cl::sycl::detail::event_impl::wait().
|
protected |
This function waits on all of the graph leaves which somehow use the memory object which is represented by Record
.
The function is called upon destruction of memory buffer.
Record | memory record to await graph leaves of to finish |
GraphReadLock | locked graph read lock |
GraphReadLock will be unlocked/locked as needed. Upon return from the function, GraphReadLock will be left in locked state.
Definition at line 29 of file scheduler.cpp.
References cl::sycl::detail::Command::getEvent(), cl::sycl::detail::AllocaCommandBase::getReleaseCmd(), cl::sycl::detail::MemObjRecord::MAllocaCommands, cl::sycl::detail::MemObjRecord::MReadLeaves, cl::sycl::detail::EnqueueResultT::MResult, cl::sycl::detail::MemObjRecord::MWriteLeaves, PI_INVALID_OPERATION, and cl::sycl::detail::Command::resolveReleaseDependencies().
|
friend |
Definition at line 775 of file scheduler.hpp.
|
friend |
Definition at line 776 of file scheduler.hpp.
|
friend |
Definition at line 778 of file scheduler.hpp.
|
friend |
Definition at line 19 of file scheduler_helpers.cpp.
|
friend |
Definition at line 777 of file scheduler.hpp.
|
friend |
Definition at line 810 of file scheduler.hpp.
|
protected |
Definition at line 773 of file scheduler.hpp.
|
protected |
Definition at line 770 of file scheduler.hpp.
|
protected |
Definition at line 771 of file scheduler.hpp.
|
protected |
Definition at line 767 of file scheduler.hpp.
Referenced by cl::sycl::detail::Command::processDepEvent().
|
protected |
Definition at line 768 of file scheduler.hpp.
Referenced by cl::sycl::detail::DispatchHostTask::operator()(), and cl::sycl::detail::event_impl::wait_and_throw().
|
protected |
Definition at line 824 of file scheduler.hpp.
Referenced by cl::sycl::detail::stream_impl::accessGlobalBuf(), cl::sycl::detail::stream_impl::accessGlobalFlushBuf(), and cl::sycl::detail::stream_impl::flush().
|
protected |
Definition at line 814 of file scheduler.hpp.