DPC++ Runtime
Runtime libraries for oneAPI DPC++
cl::sycl::detail::Scheduler::GraphProcessor Class Reference

Graph Processor provides interfaces for enqueueing commands and their dependencies to the underlying runtime.

#include <detail/scheduler/scheduler.hpp>

Static Public Member Functions

static void waitForEvent (EventImplPtr Event, ReadLockT &GraphReadLock, std::vector< Command * > &ToCleanUp, bool LockTheLock=true)
 Waits until the command associated with the passed Event is completed.
 
static bool enqueueCommand (Command *Cmd, EnqueueResultT &EnqueueResult, std::vector< Command * > &ToCleanUp, BlockingT Blocking=NON_BLOCKING)
 Enqueues the command and all its dependencies.
 

Detailed Description

Graph Processor provides interfaces for enqueueing commands and their dependencies to the underlying runtime.

Member functions of this class do not modify the graph.

Command enqueueing

Commands are enqueued whenever they come to the Scheduler. Each command has an enqueue method, which takes a vector of events that represent its dependencies and returns an event that represents the command. GraphProcessor performs a topological sort to get the order in which commands have to be enqueued. It then enqueues each command, passing a vector of events that the command needs to wait on. If an error happens during command enqueue, the whole process is stopped and the faulty command is propagated back to the Scheduler.
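The enqueue order described above can be illustrated with a minimal, self-contained sketch. It is not the runtime's implementation: SketchCommand, SketchEvent, topoSort and enqueueWithDeps are hypothetical names introduced here, the graph is assumed to be acyclic, and the "enqueue" step only records the command name instead of talking to a backend.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for the runtime's event and command types.
struct SketchEvent {
  std::string Producer; // name of the command that produced this event
};

struct SketchCommand {
  std::string Name;
  std::vector<SketchCommand *> Deps; // commands that must be enqueued first
  bool ShouldFail = false;           // simulates an enqueue error
  bool Enqueued = false;
  SketchEvent Event;

  // Takes the events of all dependencies. On success, fills in the event
  // that represents this command and returns true.
  bool enqueue(const std::vector<SketchEvent> &DepEvents) {
    (void)DepEvents; // a real backend would wait on these
    if (ShouldFail)
      return false;
    Event.Producer = Name;
    Enqueued = true;
    return true;
  }
};

// Depth-first post-order traversal: dependencies are appended before the
// command that depends on them, which yields a topological order.
void topoSort(SketchCommand *Cmd, std::unordered_set<SketchCommand *> &Visited,
              std::vector<SketchCommand *> &Order) {
  if (!Visited.insert(Cmd).second)
    return; // already visited
  for (SketchCommand *Dep : Cmd->Deps)
    topoSort(Dep, Visited, Order);
  Order.push_back(Cmd);
}

// Enqueues Root and everything it depends on. Returns nullptr on success,
// or the first command whose enqueue failed.
SketchCommand *enqueueWithDeps(SketchCommand *Root,
                               std::vector<std::string> &Log) {
  std::unordered_set<SketchCommand *> Visited;
  std::vector<SketchCommand *> Order;
  topoSort(Root, Visited, Order);

  for (SketchCommand *Cmd : Order) {
    if (Cmd->Enqueued)
      continue; // enqueued by an earlier call
    std::vector<SketchEvent> DepEvents;
    for (SketchCommand *Dep : Cmd->Deps) {
      assert(Dep->Enqueued && "topological order guarantees this");
      DepEvents.push_back(Dep->Event);
    }
    if (!Cmd->enqueue(DepEvents))
      return Cmd; // stop and report the faulty command
    Log.push_back(Cmd->Name);
  }
  return nullptr;
}
```

Calling enqueueWithDeps on a command C that depends on A and B, where B depends on A, records the commands in the order A, B, C. If a dependency fails, the function returns that command and nothing after it is enqueued.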

A command with dependencies that belong to a context different from its own can't be enqueued directly (a limitation of the OpenCL runtime). Instead, for each such dependency, a proxy event is created in the target context and linked to the original one using the OpenCL callback mechanism. For example, the following SYCL code:

// ContextA and ContextB are different OpenCL contexts
sycl::queue Q1(ContextA);
sycl::queue Q2(ContextB);
Q1.submit(Task1);
Q2.submit(Task2);

is translated to the following OpenCL API calls (pseudocode):

void event_completion_callback(cl_event event, cl_int status, void *data) {
  // Change status of the user event to complete.
  clSetUserEventStatus((cl_event)data, CL_COMPLETE); // Scope of ContextB
}
// Enqueue TASK1
EventTask1 = clEnqueueNDRangeKernel(Q1, TASK1, ..); // Scope of ContextA
// Read memory to host
ReadMem = clEnqueueReadBuffer(A, .., /*Deps=*/EventTask1); // Scope of ContextA
// Create user event with initial status "not completed".
UserEvent = clCreateUserEvent(ContextB); // Scope of ContextB
// Ask OpenCL to call the callback with UserEvent as data when the "read
// memory to host" operation is completed.
clSetEventCallback(ReadMem, CL_COMPLETE, event_completion_callback,
                   /*data=*/UserEvent); // Scope of ContextA
// Enqueue write memory from host, blocked on the user event.
// It is unblocked when the callback changes UserEvent status to completed.
WriteMem = clEnqueueWriteBuffer(A, .., /*Dep=*/UserEvent); // Scope of ContextB
// Enqueue TASK2
EventTask2 = clEnqueueNDRangeKernel(TASK2, .., /*Dep=*/WriteMem); // Scope of ContextB

An alternative approach that was considered is to have a separate dispatcher thread that waits for all events from contexts other than the target context to complete and then enqueues the command with dependencies from the target context only. That approach makes the code significantly more complex and can hurt performance on CPU devices compared with the chosen callback-based approach.
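The chosen callback-based linking can be modelled without OpenCL. The sketch below is only an illustration under stated assumptions: MockEvent and linkProxy are hypothetical names, the model is single-threaded, and contexts are represented by plain strings rather than real OpenCL contexts.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, single-threaded model of an event that lives in one context.
struct MockEvent {
  std::string Context;
  bool Complete = false;
  std::vector<std::function<void()>> Callbacks;

  // Registers a callback; runs it at once if the event is already complete.
  void onComplete(std::function<void()> Fn) {
    if (Complete)
      Fn();
    else
      Callbacks.push_back(std::move(Fn));
  }

  // Marks the event complete and fires every registered callback once.
  void setComplete() {
    if (Complete)
      return;
    Complete = true;
    for (auto &Fn : Callbacks)
      Fn();
    Callbacks.clear();
  }
};

// Turns Proxy into a "not completed" event in TargetContext that becomes
// complete when Original completes. Proxy must stay alive until then,
// because the callback keeps a reference to it.
void linkProxy(MockEvent &Original, MockEvent &Proxy,
               const std::string &TargetContext) {
  assert(!Original.Context.empty() && "original event needs a context");
  Proxy.Context = TargetContext;
  Proxy.Complete = false;
  Original.onComplete([&Proxy] { Proxy.setComplete(); });
}
```

A command in the target context can then depend on the proxy alone: it stays blocked until the original event's callback flips the proxy to complete, so no extra thread has to wait on the foreign context.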

Definition at line 731 of file scheduler.hpp.

Member Function Documentation

◆ enqueueCommand()

bool cl::sycl::detail::Scheduler::GraphProcessor::enqueueCommand ( Command *  Cmd,
EnqueueResultT &  EnqueueResult,
std::vector< Command * > &  ToCleanUp,
BlockingT  Blocking = NON_BLOCKING 
)
static

Enqueues the command and all its dependencies.

Parameters
EnqueueResult  is set to a specific status if enqueue failed.
ToCleanUp  container for commands that can be cleaned up.
Returns
true if the command is successfully enqueued.

The function may unlock and lock GraphReadLock as needed. Upon return the lock is left in locked state.

Definition at line 49 of file graph_processor.cpp.

References cl::sycl::detail::Command::enqueue(), cl::sycl::detail::Command::getPreparedHostDepsEvents(), cl::sycl::detail::Command::isEnqueueBlocked(), cl::sycl::detail::Command::isSuccessfullyEnqueued(), cl::sycl::detail::DepDesc::MDepCommand, and cl::sycl::detail::Command::MDeps.

◆ waitForEvent()

void cl::sycl::detail::Scheduler::GraphProcessor::waitForEvent ( EventImplPtr  Event,
ReadLockT &  GraphReadLock,
std::vector< Command * > &  ToCleanUp,
bool  LockTheLock = true 
)
static

Waits until the command associated with the passed Event is completed.

Parameters
GraphReadLock  read-lock which is already acquired for reading
ToCleanUp  container for commands that can be cleaned up.
LockTheLock  selects if graph lock should be locked upon return

The function may unlock and lock GraphReadLock as needed. Upon return the lock is left in locked state if and only if LockTheLock is true.
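The locking contract above can be sketched as follows. This is an assumption-laden illustration, not the runtime's code: SketchReadLock and waitUnlocked are hypothetical names, the read lock is assumed to be a std::shared_lock over a std::shared_timed_mutex, and the blocking wait is passed in as a callable instead of a real event.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <shared_mutex>

// Assumed shape of the graph read lock; the real ReadLockT may differ.
using SketchReadLock = std::shared_lock<std::shared_timed_mutex>;

// Runs Wait with the graph lock released, then restores the lock state.
// Precondition: GraphReadLock is held on entry.
// Postcondition: the lock is held if and only if LockTheLock is true.
void waitUnlocked(const std::function<void()> &Wait,
                  SketchReadLock &GraphReadLock, bool LockTheLock = true) {
  assert(GraphReadLock.owns_lock());
  GraphReadLock.unlock(); // let writers modify the graph while we block
  Wait();
  if (LockTheLock)
    GraphReadLock.lock();
}
```

Releasing the read lock for the duration of the wait lets other threads take the graph lock for writing; a caller that does not need the graph afterwards can pass LockTheLock=false and skip the re-acquisition.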

Definition at line 24 of file graph_processor.cpp.

References cl::sycl::detail::BLOCKING, cl::sycl::detail::getCommand(), cl::sycl::detail::Command::getEvent(), and cl::sycl::detail::EnqueueResultT::MResult.


The documentation for this class was generated from the following files:
scheduler.hpp
graph_processor.cpp