
Commit d56ace5

Fix doc warnings for invalid tag (#410)
* Clean up white space
* Fix missing end of line markup warning

Signed-off-by: Russell McGuire <russell.w.mcguire@intel.com>
1 parent 39a29ea commit d56ace5

2 files changed

Lines changed: 88 additions & 89 deletions


scripts/core/EXT_DeviceUsablememProperties.rst

Lines changed: 7 additions & 8 deletions
@@ -10,9 +10,9 @@ from templates import helper as th
1010

1111
.. _ZE_extension_device_usablemem_size_properties:
1212

13-
======================================
13+
================================================
1414
Device Usable Memory Size Properties Extension
15-
======================================
15+
================================================
1616

1717
API
1818
----
@@ -29,11 +29,11 @@ API
2929

3030

3131
Extended Device Properties
32-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
32+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
3333

34-
Users may wish to get the status of the available allocatable/usable memory. Since this information is transient and depends on the overall state of device usage, the user needs to invoke the extension at each point of interest. This extension provides extended information about the usable memory size available as part of the device. The extension introduces the ${x}_device_usablemem_size_ext_properties_t struct which can be passed to $xDeviceGetProperties via the `pNext` member of $x_device_properties_t.
34+
Users may wish to get the status of the available allocatable/usable memory. Since this information is transient and depends on the overall state of device usage, the user needs to invoke the extension at each point of interest. This extension provides extended information about the usable memory size available as part of the device. The extension introduces the ${x}_device_usablemem_size_ext_properties_t struct which can be passed to ${x}DeviceGetProperties via the `pNext` member of ${x}_device_properties_t.
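
The `pNext` chaining described in this paragraph can be sketched without the Level Zero headers. The following is a minimal model, not the driver's implementation: every `model_*` name, the `usable_bytes` field, and the embedded `base` header are hypothetical stand-ins chosen for illustration (the real structs declare `stype` and `pNext` directly as their first members). It shows a query function walking the chain and filling in any extension struct it recognizes, and why the caller has to repeat the query to see a fresh value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure-type tags; the real API uses ZE_STRUCTURE_TYPE_* values. */
typedef enum {
    MODEL_STYPE_DEVICE_PROPERTIES = 1,
    MODEL_STYPE_USABLEMEM_EXT = 2
} model_stype_t;

/* Common header placed first in every chainable struct of this model. */
typedef struct model_base {
    model_stype_t stype;
    void *pNext;
} model_base_t;

/* Stand-in for the base device properties struct. */
typedef struct {
    model_base_t base;
    uint32_t deviceId;
} model_device_properties_t;

/* Stand-in for the usable-memory extension struct (field name is an assumption). */
typedef struct {
    model_base_t base;
    uint64_t usable_bytes;
} model_usablemem_ext_t;

/* Fill the base properties, then walk pNext and fill every extension
 * struct whose stype is recognized. Unknown entries are skipped. */
int model_query_properties(model_device_properties_t *props, uint64_t usable_now)
{
    if (props == NULL) {
        return -1;
    }
    props->deviceId = 42; /* arbitrary value for the sketch */
    for (model_base_t *node = props->base.pNext; node != NULL; node = node->pNext) {
        if (node->stype == MODEL_STYPE_USABLEMEM_EXT) {
            ((model_usablemem_ext_t *)node)->usable_bytes = usable_now;
        }
    }
    return 0;
}
```

Because the value is a snapshot taken at query time, calling `model_query_properties` again with a different `usable_now` overwrites `usable_bytes`, which mirrors the need to re-invoke the extension at each point of interest.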
3535

36-
The following pseudo-code demonstrates a sequence for obtaining extended information about the usable memory size
36+
The following pseudo-code demonstrates a sequence for obtaining extended information about the usable memory size:
3737

3838
.. parsed-literal::
3939
@@ -49,9 +49,8 @@ The following psuedo-code demonstrates a sequence for obtaining extended informa
4949
pUsablememProps.stype = ZE_STRUCTURE_TYPE_DEVICE_USABLEMEM_SIZE_EXT_PROPERTIES;
5050
deviceProperties.stype = ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES;
5151
deviceProperties.pNext = pUsablememProps;
52-
52+
5353
//obtain device and extended memory properties
5454
zeDeviceGetProperties(Deviceh, &deviceProperties);
55-
56-
55+
...
5756

scripts/core/PROG.rst

Lines changed: 81 additions & 81 deletions
@@ -1,11 +1,8 @@
1-
2-
<%
1+
<%
32
OneApi=tags['$OneApi']
43
x=tags['$x']
54
X=x.upper()
6-
%>
7-
8-
<%!
5+
%><%!
96
from parse_specs import _version_compare_less, _version_compare_gequal
107
%>
118

@@ -442,69 +439,69 @@ An application can query properties of a physical memory object using ${x}Physic
442439
The following pseudo-code demonstrates querying properties of a physical memory object:
443440

444441
.. parsed-literal::
445-
446-
// Set up the request for an exportable allocation
447442
448-
ze_external_memory_export_desc_t export_desc = {
449-
ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_EXPORT_DESC,
450-
nullptr, // pNext
443+
// Set up the request for an exportable allocation
444+
445+
ze_external_memory_export_desc_t export_desc = {
446+
ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_EXPORT_DESC,
447+
nullptr, // pNext
451448
ZE_EXTERNAL_MEMORY_TYPE_FLAG_OPAQUE_FD
452-
};
449+
};
453450
454-
ze_physical_mem_desc_t alloc_desc = {
455-
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
456-
.pNext = &export_desc,
457-
.flags = 0,
458-
.size = 1024
451+
ze_physical_mem_desc_t alloc_desc = {
452+
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
453+
.pNext = &export_desc,
454+
.flags = 0,
455+
.size = 1024
459456
};
460457
461-
ze_physical_mem_handle_t hPhysicalMemory;
458+
ze_physical_mem_handle_t hPhysicalMemory;
462459
463460
${x}PhysicalMemCreate(hContext, hDevice, &alloc_desc, &hPhysicalMemory);
464-
465-
// Set up the request to export the external memory handle
466461
467-
ze_external_memory_export_fd_t export_fd = {
468-
ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_EXPORT_FD,
469-
nullptr, // pNext
470-
ZE_EXTERNAL_MEMORY_TYPE_FLAG_OPAQUE_FD,
471-
0 // [out] fd
472-
};
462+
// Set up the request to export the external memory handle
463+
464+
ze_external_memory_export_fd_t export_fd = {
465+
ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_EXPORT_FD,
466+
nullptr, // pNext
467+
ZE_EXTERNAL_MEMORY_TYPE_FLAG_OPAQUE_FD,
468+
0 // [out] fd
469+
};
473470
474-
// Link the export request into the query
471+
// Link the export request into the query
475472
476-
ze_physical_mem_properties_t physicalMemProperties = {
477-
ZE_STRUCTURE_TYPE_PHYSICAL_MEM_PROPERTIES
478-
};
473+
ze_physical_mem_properties_t physicalMemProperties = {
474+
ZE_STRUCTURE_TYPE_PHYSICAL_MEM_PROPERTIES
475+
};
479476
480-
physicalMemProperties.pNext = &export_fd;
477+
physicalMemProperties.pNext = &export_fd;
481478
482-
${x}PhysicalMemGetProperties(hContext, hPhysicalMemory, &physicalMemProperties);
479+
${x}PhysicalMemGetProperties(hContext, hPhysicalMemory, &physicalMemProperties);
483480
484-
// User sends export_fd.fd to a peer process
481+
// User sends export_fd.fd to a peer process
485482
int imported_fd = /\* fd received from peer process \*/;
486-
// For importing reuse existing structs
483+
// For importing reuse existing structs
487484
488-
ze_external_memory_import_fd_t import_fd = {
485+
ze_external_memory_import_fd_t import_fd = {
489486
490-
.stype = ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMPORT_FD,
487+
.stype = ZE_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMPORT_FD,
491488
492-
.pNext = nullptr,
489+
.pNext = nullptr,
493490
494491
.flags = ZE_EXTERNAL_MEMORY_TYPE_FLAG_OPAQUE_FD,
495492
496-
.fd = imported_fd
493+
.fd = imported_fd
497494
498495
};
499496
500-
ze_physical_mem_desc_t alloc_desc = {
497+
ze_physical_mem_desc_t alloc_desc = {
501498
502-
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
499+
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
503500
504-
.pNext = &import_fd,
501+
.pNext = &import_fd,
505502
506-
.flags = 0,
507-
.size = 1024
503+
.flags = 0,
504+
.size = 1024
508505
};
509506
510507
${x}PhysicalMemCreate(hContext, hDevice, &alloc_desc, &physicalMemImporter);
@@ -784,7 +781,7 @@ The following pseudo-code demonstrates how to import a Linux dma_buf as an exter
784781
The following pseudo-code demonstrates how to import a Linux dma_buf as an external memory handle for Physical Memory:
785782

786783
.. parsed-literal::
787-
784+
788785
// Set up the request to import the external memory handle
789786
${x}_external_memory_import_fd_t import_fd = {
790787
${X}_STRUCTURE_TYPE_EXTERNAL_MEMORY_IMPORT_FD,
@@ -793,11 +790,11 @@ The following pseudo-code demonstrates how to import a Linux dma_buf as an exter
793790
fd
794791
};
795792
796-
ze_physical_mem_desc_t allocDesc = {
797-
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
798-
.pNext = &import_fd,
799-
.flags = 0,
800-
.size = 1024
793+
ze_physical_mem_desc_t allocDesc = {
794+
.stype = ZE_STRUCTURE_TYPE_PHYSICAL_MEM_DESC,
795+
.pNext = &import_fd,
796+
.flags = 0,
797+
.size = 1024
801798
};
802799
803800
${x}PhysicalMemCreate(hContext, hDevice, &allocDesc, &physicalMemImporter);
@@ -1273,12 +1270,12 @@ A kernel timestamp event is a special type of event that records device timestam
12731270
.. parsed-literal::
12741271
12751272
// Get timestamp frequency
1276-
%if _version_compare_gequal(ver, "1.1"):
1273+
%if _version_compare_gequal(ver, "1.1"):
12771274
const double timestampFreq = NS_IN_SEC / deviceProperties.timerResolution;
1278-
%endif
1279-
%if _version_compare_less(ver, "1.1"):
1275+
%endif
1276+
%if _version_compare_less(ver, "1.1"):
12801277
const uint64_t timestampFreq = deviceProperties.timerResolution;
1281-
%endif
1278+
%endif
12821279
const uint64_t timestampMaxValue = ~(-1L << deviceProperties.kernelTimestampValidBits);
12831280
12841281
// Create event pool
@@ -1331,39 +1328,43 @@ A kernel timestamp event is a special type of event that records device timestam
13311328
double contextTimeInNs = ( tsResult->context.kernelEnd >= tsResult->context.kernelStart )
13321329
? ( tsResult->context.kernelEnd - tsResult->context.kernelStart ) * timestampFreq
13331330
: (( timestampMaxValue - tsResult->context.kernelStart) + tsResult->context.kernelEnd + 1 ) * timestampFreq;
1331+
13341332
...
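
The wraparound handling in the duration expressions above can be isolated into small helpers. This is a sketch under stated assumptions: the function names and the `ns_per_tick` parameter are hypothetical, and it assumes the counter wraps at most once between start and end. It also builds the maximum value with an unsigned shift, which avoids left-shifting a negative `long` as the pseudo-code's `~(-1L << bits)` does.

```c
#include <stdint.h>

/* Largest value representable by a counter with `valid_bits` significant bits.
 * The 64-bit case is handled separately because shifting by 64 is undefined. */
uint64_t timestamp_max_value(uint32_t valid_bits)
{
    return (valid_bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << valid_bits) - 1);
}

/* Ticks elapsed between two readings, allowing for a single wraparound. */
uint64_t elapsed_ticks(uint64_t start, uint64_t end, uint64_t max_value)
{
    return (end >= start) ? (end - start)
                          : ((max_value - start) + end + 1);
}

/* Convert ticks to nanoseconds; `ns_per_tick` plays the role of the
 * timestampFreq multiplier in the pseudo-code (assumed name). */
double ticks_to_ns(uint64_t ticks, double ns_per_tick)
{
    return (double)ticks * ns_per_tick;
}
```

For example, with 32 valid bits, a start of `0xFFFFFFF0` and an end of `0x10` give 32 elapsed ticks rather than a negative or huge difference.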
1335-
1333+
1334+
13361335
Event synchronization mode
1337-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1336+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
1337+
13381338
User can adjust Event synchronization modes by passing ${x}_event_sync_mode_desc_t struct as pNext during Event creation.
13391339

13401340
Low power wait
1341-
^^^^^^^^^^^^^^^^^^
1341+
^^^^^^^^^^^^^^^
13421342

1343-
When ${X}_EVENT_SYNC_MODE_FLAG_LOW_POWER_WAIT flag is enabled, driver will optimize Event host synchronization calls like ${x}EventHostSynchronize to use CPU threads more efficiently. For example, instead of active polling on memory location, it may use OS methods to sleep CPU thread.
1343+
When ${X}_EVENT_SYNC_MODE_FLAG_LOW_POWER_WAIT flag is enabled, driver will optimize Event host synchronization calls like ${x}EventHostSynchronize to use CPU threads more efficiently. For example, instead of active polling on memory location, it may use OS methods to sleep CPU thread.
13441344
Changing this mode may impact completion latency.
13451345

13461346
Interrupts
1347-
^^^^^^^^^^^^^^^^^^^^^
1347+
^^^^^^^^^^
13481348

1349-
When ${X}_EVENT_SYNC_MODE_FLAG_SIGNAL_INTERRUPT flag is enabled, driver may program additional GPU commands related to signaling Event on the Device. Those commands will generate system interrupt.
1350-
Interrupt may be used as additional signal to wake up CPU thread that is waiting for Event completion in low power mode.
1349+
When ${X}_EVENT_SYNC_MODE_FLAG_SIGNAL_INTERRUPT flag is enabled, driver may program additional GPU commands related to signaling Event on the Device. Those commands will generate system interrupt.
1350+
Interrupt may be used as additional signal to wake up CPU thread that is waiting for Event completion in low power mode.
13511351
Driver may select which API calls are applicable for generating interrupts.
13521352

1353-
Additionally, user may provide external interrupt id (${X}_EVENT_SYNC_MODE_FLAG_EXTERNAL_INTERRUPT_WAIT). OS methods will be used for Event host synchronization calls, to optimize waiting for completion. Similar to low power mode.
1353+
Additionally, user may provide external interrupt id (${X}_EVENT_SYNC_MODE_FLAG_EXTERNAL_INTERRUPT_WAIT). OS methods will be used for Event host synchronization calls, to optimize waiting for completion. Similar to low power mode.
13541354
It can be used only with Counter Based Events.
13551355

13561356
.. _counter-based-events:
1357+
13571358
Counter Based Events
1358-
~~~~~~~~~~~~~~~~~~~~~~~
1359+
~~~~~~~~~~~~~~~~~~~~
13591360

1360-
This type of event, referred to as a Counter Based (CB) Event, does not require an event pool, as the related allocations are managed internally by the driver. This reduces the overhead on the host for managing pool allocations.
1361-
The CB Event can only be signaled on the GPU using an in-order command list.
1361+
This type of event, referred to as a Counter Based (CB) Event, does not require an event pool, as the related allocations are managed internally by the driver. This reduces the overhead on the host for managing pool allocations.
1362+
The CB Event can only be signaled on the GPU using an in-order command list.
13621363

13631364
Every in-order command list has an internal submission counter that is updated with each append call. This counter manages internal in-order dependencies. The next append call waits for that counter implicitly.
1364-
Note that some operations may be optimized, and the counter value may not directly correspond to the number of append calls.
1365+
Note that some operations may be optimized, and the counter value may not directly correspond to the number of append calls.
13651366

1366-
When a CB Event is passed as a signal event, it points to a specific counter value and memory location. Since the command list manages the counter allocation, this method avoids producing additional GPU memory operations (except timestamps). As a result, users do not need to explicitly control event completion before reusing it.
1367+
When a CB Event is passed as a signal event, it points to a specific counter value and memory location. Since the command list manages the counter allocation, this method avoids producing additional GPU memory operations (except timestamps). As a result, users do not need to explicitly control event completion before reusing it.
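
The relationship between an in-order command list's counter and a CB Event can be modelled in plain C. This is only a conceptual sketch: all `model_*` types and functions are hypothetical, the "device" is simulated by a function call, and treating a never-signaled event as not ready is a choice made for this sketch rather than a statement about driver behavior. It illustrates that signaling replaces the event's state (counter location plus target value) and that readiness reduces to one compare.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an in-order command list and its submission counter. */
typedef struct {
    uint64_t submitted; /* incremented by every append call */
    uint64_t completed; /* counter memory updated as work finishes */
} model_cmdlist_t;

/* Hypothetical model of a CB Event: a counter location and the value to wait for. */
typedef struct {
    const uint64_t *counter; /* NULL until the event is first signaled */
    uint64_t value;
} model_cb_event_t;

/* Append work that signals `ev`. The event's previous state is simply
 * replaced, so no explicit reset is needed before reuse. */
void model_append_signal(model_cmdlist_t *cl, model_cb_event_t *ev)
{
    cl->submitted += 1;
    ev->counter = &cl->completed;
    ev->value = cl->submitted;
}

/* Simulate the device finishing the next pending append on `cl`. */
void model_complete_next(model_cmdlist_t *cl)
{
    if (cl->completed < cl->submitted) {
        cl->completed += 1;
    }
}

/* Readiness is a single compare against the counter memory. */
bool model_event_ready(const model_cb_event_t *ev)
{
    return ev->counter != NULL && *ev->counter >= ev->value;
}
```

In this model, signaling the same event from a second command list makes it track that list's counter, so completion of the first list no longer affects it.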
13671368

13681369
Key features
13691370
^^^^^^^^^^^^^^^^^^^^^
@@ -1390,32 +1391,31 @@ Regular Event rely on memory state controlled by the user (explicit Reset calls)
13901391
${x}CommandListAppendLaunchKernel(cmdList3, kernel, &groupCount, &event1, 0, nullptr); // Replace state. Assigned counter=Y on memory CL3_alloc
13911392
13921393
// Event1 is implicitly reset to different state.
1393-
// cmdList2 can be still running on GPU. It waits for counter=X on memory CL1_alloc.
1394+
// cmdList2 can be still running on GPU. It waits for counter=X on memory CL1_alloc.
13941395
// It's also safe to delete Event object.
13951396
13961397
${x}EventHostSynchronize(event1, UINT32_MAX); // wait for counter=Y on memory CL3_alloc
13971398
13981399
IPC sharing
1399-
^^^^^^^^^^^^^^^^^^^^^
1400+
^^^^^^^^^^^
14001401
As mentioned previously, signaling CB Event replaces its state. This is why IPC sharing is one-directional. Opened event can be used only for waiting/querying (on host and GPU).
14011402

1402-
Both Event object (original and shared) are independent. There is no need to wait for completion before reusing.
1403-
Second process points to the original state until ${x}EventCounterBasedCloseIpcHandle is called.
1404-
Original Event state may be changed without waiting for completion. Second process is not affected.
1403+
Both Event object (original and shared) are independent. There is no need to wait for completion before reusing.
1404+
Second process points to the original state until ${x}EventCounterBasedCloseIpcHandle is called.
1405+
Original Event state may be changed without waiting for completion. Second process is not affected.
14051406

14061407
Counter Based Event has dedicated API calls to handle IPC operations: ${x}EventCounterBasedGetIpcHandle, ${x}EventCounterBasedOpenIpcHandle, ${x}EventCounterBasedCloseIpcHandle.
14071408

14081409
**Timestamps are not allowed for IPC sharing.**
14091410

14101411
Obtaining counter memory and value
1411-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1412-
User may obtain counter memory location and value using ${x}EventCounterBasedGetDeviceAddress. For example, waiting for completion outside the L0 Driver.
1413-
If Event state is replaced by new append call or ${x}CommandQueueExecuteCommandLists that signals such Event, below API must be called again to obtain new data.
1412+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1413+
User may obtain counter memory location and value using ${x}EventCounterBasedGetDeviceAddress. For example, waiting for completion outside the L0 Driver. If Event state is replaced by new append call or ${x}CommandQueueExecuteCommandLists that signals such Event, below API must be called again to obtain new data.
14141414

14151415
Multi directional dependencies on Regular command lists
1416-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1417-
Regular command list with overlapping dependencies may be executed multiple times. For example, two command lists are executed in parallel with bi-directional dependencies.
1418-
It's important to understand counter (Event) state transition, to correctly reflect the user's intention.
1416+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1417+
Regular command list with overlapping dependencies may be executed multiple times. For example, two command lists are executed in parallel with bi-directional dependencies.
1418+
It's important to understand counter (Event) state transition, to correctly reflect the user's intention.
14191419

14201420

14211421
.. parsed-literal::
@@ -1425,8 +1425,8 @@ Its important to understand counter (Event) state transition, to correctly refle
14251425
V |
14261426
regularCmdList2: (wait for A) -------------> (B) -----> (D)
14271427
1428-
In this example, all Events are synchronized to "ready" state after the first execution.
1429-
It means that second execution of `regularCmdList1` waits again for "ready" `{1->2->3}` state of `regularCmdList2` (first execution) instead of `{4->5->6}`.
1428+
In this example, all Events are synchronized to "ready" state after the first execution.
1429+
It means that second execution of `regularCmdList1` waits again for "ready" `{1->2->3}` state of `regularCmdList2` (first execution) instead of `{4->5->6}`.
14301430
This is because `regularCmdList2` was not yet executed for the second time. And their counters were not updated.
14311431

14321432
First execution:
@@ -1452,13 +1452,13 @@ Second execution:
14521452
14531453
Different approach:
14541454

1455-
To avoid the above situation, the user must remove all bi-directional dependencies, either by using a single command list (if possible) or by splitting the workload into different command lists with single-directional dependencies.
1455+
To avoid the above situation, the user must remove all bi-directional dependencies, either by using a single command list (if possible) or by splitting the workload into different command lists with single-directional dependencies.
14561456

14571457
Using Counter Based Events for such scenarios is not always the most optimal usage mode. It may be better to use Regular Events with explicit Reset calls.
14581458

14591459
External synchronization allocation
14601460
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1461-
User may optionally specify externally managed counter allocation and value. This can be done by passing ${x}_event_counter_based_external_sync_allocation_desc_t as extension of ${x}_event_counter_based_desc_t
1461+
User may optionally specify externally managed counter allocation and value. This can be done by passing ${x}_event_counter_based_external_sync_allocation_desc_t as extension of ${x}_event_counter_based_desc_t
14621462

14631463
Requirements:
14641464

@@ -1472,7 +1472,7 @@ Requirements:
14721472
External aggregate storage
14731473
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14741474

1475-
Aggregated storage event is a special use case for CB Events. It can be signaled from multiple append calls, but waiting requires only one memory compare operation.
1475+
Aggregated storage event is a special use case for CB Events. It can be signaled from multiple append calls, but waiting requires only one memory compare operation.
14761476
It can be created by passing ${x}_event_counter_based_external_aggregate_storage_desc_t as extension of ${x}_event_counter_based_desc_t.
14771477

14781478
Requirements:
