HCC
HCC is a single-source, C/C++ compiler for heterogeneous computing. It's optimized with HSA (http://www.hsafoundation.com/).
accelerator Class Reference
Represents a physical accelerated computing device.
#include <hc.hpp>
Public Member Functions | |
accelerator () | |
Constructs a new accelerator object that represents the default accelerator. | |
accelerator (const std::wstring &path) | |
Constructs a new accelerator object that represents the physical device named by the "path" argument. | |
accelerator (const accelerator &other) | |
Copy constructs an accelerator object. | |
accelerator & | operator= (const accelerator &other) |
Assigns an accelerator object to "this" accelerator object and returns a reference to "this" object. | |
accelerator_view | get_default_view () const |
Returns the default accelerator_view associated with the accelerator. | |
accelerator_view | create_view (execute_order order=execute_in_order, queuing_mode mode=queuing_mode_automatic) |
Creates and returns a new accelerator view on the accelerator with the supplied queuing mode. | |
bool | operator== (const accelerator &other) const |
Compares "this" accelerator with the passed accelerator object to determine if they represent the same underlying device. | |
bool | operator!= (const accelerator &other) const |
Compares "this" accelerator with the passed accelerator object to determine if they represent different devices. | |
bool | set_default_cpu_access_type (access_type type) |
Sets the default_cpu_access_type for this accelerator. | |
std::wstring | get_device_path () const |
Returns a system-wide unique device instance path that matches the "Device Instance Path" property for the device in Device Manager, or a predefined path constant such as cpu_accelerator. | |
std::wstring | get_description () const |
Returns a short textual description of the accelerator device. | |
unsigned int | get_version () const |
Returns a 32-bit unsigned integer representing the version number of this accelerator. | |
bool | get_has_display () const |
This property indicates that the accelerator may be shared by (and thus have interference from) the operating system or other system software components for rendering purposes. | |
size_t | get_dedicated_memory () const |
Returns the amount of dedicated memory (in KB) on an accelerator device. | |
bool | get_supports_double_precision () const |
Returns a Boolean value indicating whether this accelerator supports double-precision (double) computations. | |
bool | get_supports_limited_double_precision () const |
Returns a boolean value indicating whether the accelerator has limited double precision support (excludes double division, precise_math functions, int to double, double to int conversions) for a parallel_for_each kernel. | |
bool | get_is_debug () const |
Returns a boolean value indicating whether the accelerator supports debugging. | |
bool | get_is_emulated () const |
Returns a boolean value indicating whether the accelerator is emulated. | |
bool | get_supports_cpu_shared_memory () const |
Returns a boolean value indicating whether the accelerator supports memory accessible both by the accelerator and the CPU. | |
access_type | get_default_cpu_access_type () const |
Gets the default CPU access_type for buffers created on this accelerator. | |
size_t | get_max_tile_static_size () |
Returns the maximum size of tile static area available on this accelerator. | |
std::vector< accelerator_view > | get_all_views () |
Returns a vector of all accelerator_view objects associated with this accelerator. | |
void * | get_hsa_am_region () const |
Returns an opaque handle which points to the AM region on the HSA agent. | |
void * | get_hsa_am_system_region () const |
Returns an opaque handle which points to the AM system region on the HSA agent. | |
void * | get_hsa_am_finegrained_system_region () const |
Returns an opaque handle which points to the fine-grained AM system region on the HSA agent. | |
void * | get_hsa_kernarg_region () const |
Returns an opaque handle which points to the Kernarg region on the HSA agent. | |
bool | is_hsa_accelerator () const |
Returns whether the accelerator is based on HSA. | |
hcAgentProfile | get_profile () const |
Returns the profile of the accelerator. | |
void | memcpy_symbol (const char *symbolName, void *hostptr, size_t count, size_t offset=0, hcCommandKind kind=hcMemcpyHostToDevice) |
void | memcpy_symbol (void *symbolAddr, void *hostptr, size_t count, size_t offset=0, hcCommandKind kind=hcMemcpyHostToDevice) |
void * | get_symbol_address (const char *symbolName) |
void * | get_hsa_agent () const |
Returns an opaque handle which points to the underlying HSA agent. | |
bool | get_is_peer (const accelerator &other) const |
Checks whether other is a peer of this accelerator. | |
std::vector< accelerator > | get_peers () const |
Returns a std::vector of this accelerator's peers. | |
unsigned int | get_cu_count () const |
Returns the compute unit count of the accelerator. | |
int | get_seqnum () const |
Returns the unique integer sequence number for the accelerator. | |
bool | has_cpu_accessible_am () |
Returns true if the accelerator's memory can be mapped into the CPU's address space, and the CPU is allowed to access the memory directly with CPU memory operations. | |
Kalmar::KalmarDevice * | get_dev_ptr () const |
Static Public Member Functions | |
static std::vector< accelerator > | get_all () |
Returns a std::vector of accelerator objects (in no specific order) representing all accelerators that are available, including reference accelerators and WARP accelerators if available. | |
static bool | set_default (const std::wstring &path) |
Sets the default accelerator to the device path identified by the "path" argument. | |
static accelerator_view | get_auto_selection_view () |
Returns an accelerator_view which when passed as the first argument to a parallel_for_each call causes the runtime to automatically select the target accelerator_view for executing the parallel_for_each kernel. | |
Friends | |
class | accelerator_view |
Represents a physical accelerated computing device.
An object of this type can be created by enumerating the available devices, or getting the default device.
accelerator::accelerator () | inline |
Constructs a new accelerator object that represents the default accelerator.
This is equivalent to calling the constructor
The actual accelerator chosen as the default can be affected by calling accelerator::set_default().
accelerator::accelerator (const std::wstring &path) | inline explicit |
Constructs a new accelerator object that represents the physical device named by the "path" argument.
If the path represents an unknown or unsupported device, an exception will be thrown.
The path can be one of the following:
[in] | path | The device path of this accelerator. |
accelerator::accelerator (const accelerator &other) | inline |
Copy constructs an accelerator object.
This function does a shallow copy with the newly created accelerator object pointing to the same underlying device as the passed accelerator parameter.
[in] | other | The accelerator object to be copied. |
accelerator_view accelerator::create_view (execute_order order=execute_in_order, queuing_mode mode=queuing_mode_automatic) | inline |
Creates and returns a new accelerator view on the accelerator with the supplied queuing mode.
[in] | mode | The queuing mode of the accelerator_view to be created. See "Queuing Mode". Defaults to queuing_mode_automatic if not specified. |
static std::vector< accelerator > accelerator::get_all () | inline static |
Returns a std::vector of accelerator objects (in no specific order) representing all accelerators that are available, including reference accelerators and WARP accelerators if available.
static accelerator_view accelerator::get_auto_selection_view () | inline static |
Returns an accelerator_view which when passed as the first argument to a parallel_for_each call causes the runtime to automatically select the target accelerator_view for executing the parallel_for_each kernel.
In other words, a parallel_for_each invocation with the accelerator_view returned by get_auto_selection_view() is the same as a parallel_for_each invocation without an accelerator_view argument.
For all other purposes, the accelerator_view returned by get_auto_selection_view() behaves the same as the default accelerator_view of the default accelerator (aka accelerator().get_default_view() ).
size_t accelerator::get_dedicated_memory () const | inline |
Returns the amount of dedicated memory (in KB) on an accelerator device.
There is no guarantee that this amount of memory is actually available to use.
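Because the value is reported in KB, callers usually convert it before comparing it with buffer sizes. The following is a minimal sketch of that conversion, not part of hc.hpp: the helper names are hypothetical, and it assumes that 1 KB means 1024 bytes, which this page does not state.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of hc.hpp).
// Assumption: the KB figure uses 1 KB = 1024 bytes.

// Converts a dedicated-memory figure in KB to bytes.
inline std::uint64_t dedicated_memory_bytes(std::uint64_t kilobytes) {
    return kilobytes * 1024u;
}

// Converts a dedicated-memory figure in KB to GiB for display.
inline double dedicated_memory_gib(std::uint64_t kilobytes) {
    return static_cast<double>(kilobytes) / (1024.0 * 1024.0);
}
```

With a real device, the argument would be the value returned by get_dedicated_memory(). The result is an upper bound only, since the memory is not guaranteed to be available.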
accelerator_view accelerator::get_default_view () const | inline |
Returns the default accelerator_view associated with the accelerator.
The queuing_mode of the default accelerator_view is queuing_mode_automatic.
bool accelerator::get_has_display () const | inline |
This property indicates that the accelerator may be shared by (and thus have interference from) the operating system or other system software components for rendering purposes.
A C++ AMP implementation may set this property to false should such interference not be applicable for a particular accelerator.
void * accelerator::get_hsa_agent () const | inline |
Returns an opaque handle which points to the underlying HSA agent.
void * accelerator::get_hsa_am_finegrained_system_region () const | inline |
Returns an opaque handle which points to the fine-grained AM system region on the HSA agent.
This region can be used to allocate fine-grained system memory which is accessible from the specified accelerator.
void * accelerator::get_hsa_am_region () const | inline |
Returns an opaque handle which points to the AM region on the HSA agent.
This region can be used to allocate accelerator memory which is accessible from the specified accelerator.
void * accelerator::get_hsa_am_system_region () const | inline |
Returns an opaque handle which points to the AM system region on the HSA agent.
This region can be used to allocate system memory which is accessible from the specified accelerator.
void * accelerator::get_hsa_kernarg_region () const | inline |
Returns an opaque handle which points to the Kernarg region on the HSA agent.
bool accelerator::get_is_emulated () const | inline |
Returns a boolean value indicating whether the accelerator is emulated.
This is true, for example, with the reference, WARP, and CPU accelerators.
bool accelerator::get_is_peer (const accelerator &other) const | inline |
Checks whether other is a peer of this accelerator.
std::vector< accelerator > accelerator::get_peers () const | inline |
Returns a std::vector of this accelerator's peers.
A peer is another accelerator that can access this accelerator's device memory using the map_to_peer family of APIs.
hcAgentProfile accelerator::get_profile () const | inline |
Returns the profile of the accelerator.
int accelerator::get_seqnum () const | inline |
Returns the unique integer sequence number for the accelerator.
Sequence numbers are assigned in monotonically increasing order starting with 0.
bool accelerator::get_supports_double_precision () const | inline |
Returns a Boolean value indicating whether this accelerator supports double-precision (double) computations.
When this returns true, get_supports_limited_double_precision() also returns true.
unsigned int accelerator::get_version () const | inline |
Returns a 32-bit unsigned integer representing the version number of this accelerator.
The format of the integer is major.minor, where the major version number is in the high-order 16 bits, and the minor version number is in the low-order 16 bits.
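The packed layout described above can be unpacked with a shift and a mask. The following is a minimal sketch using only the standard library; the accelerator_version struct and the decode_version function are hypothetical names introduced for illustration and are not part of hc.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not part of hc.hpp).
struct accelerator_version {
    std::uint16_t major_version;  // taken from the high-order 16 bits
    std::uint16_t minor_version;  // taken from the low-order 16 bits
};

// Splits a packed 32-bit version word into its major and minor parts.
inline accelerator_version decode_version(std::uint32_t packed) {
    return accelerator_version{
        static_cast<std::uint16_t>(packed >> 16),
        static_cast<std::uint16_t>(packed & 0xFFFFu)
    };
}
```

With a real device, the argument would be the value returned by get_version(). For instance, the word 0x00020005 decodes to major 2 and minor 5.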
bool accelerator::has_cpu_accessible_am () | inline |
Returns true if the accelerator's memory can be mapped into the CPU's address space, and the CPU is allowed to access the memory directly with CPU memory operations.
Typically this is enabled with "large BAR" or "resizable BAR" address mapping.
bool accelerator::operator!= (const accelerator &other) const | inline |
Compares "this" accelerator with the passed accelerator object to determine if they represent different devices.
[in] | other | The accelerator object to be compared against. |
accelerator & accelerator::operator= (const accelerator &other) | inline |
Assigns an accelerator object to "this" accelerator object and returns a reference to "this" object.
This function does a shallow assignment, with the assigned-to accelerator object pointing to the same underlying device as the passed accelerator parameter.
[in] | other | The accelerator object to be assigned from. |
bool accelerator::operator== (const accelerator &other) const | inline |
Compares "this" accelerator with the passed accelerator object to determine if they represent the same underlying device.
[in] | other | The accelerator object to be compared against. |
static bool accelerator::set_default (const std::wstring &path) | inline static |
Sets the default accelerator to the device path identified by the "path" argument.
See the constructor accelerator(const std::wstring& path) for a description of the allowable path strings.
This establishes a process-wide default accelerator and influences all subsequent operations that might use a default accelerator.
[in] | path | The device path of the default accelerator. |
bool accelerator::set_default_cpu_access_type (access_type type) | inline |
Sets the default_cpu_access_type for this accelerator.
The default_cpu_access_type is used for arrays created on this accelerator or for implicit array_view memory allocations accessed on this accelerator.
This method only succeeds if the default_cpu_access_type for the accelerator has not already been overridden by a previous call to this method and the runtime-selected default_cpu_access_type for this accelerator has not yet been used for allocating an array or for an implicit array_view memory allocation on this accelerator.
[in] | type | The default CPU access_type to be used for array/array_view memory allocations on this accelerator. |