pub struct GpuGlobal<'a, T: ?Sized> { /* private fields */ }
GpuGlobal marks data that lives in the GPU's global memory space and distinguishes it from other memory spaces; see shared::GpuShared for the shared memory space. When chunking or atomic operations are needed, the GpuGlobal is owned by the chunk or atomic struct. This ensures that the user cannot access the data without going through chunk or atomic operations.
Implementations
impl<'a, T> GpuGlobal<'a, [T]>
pub fn chunk_to_scope<CS, Map: ScopeUniqueMap<CS>>(
    self,
    _scope: CS,
    m: Map,
) -> GlobalGroupChunk<'a, T, CS, Map>
where
    CS: ChunkScope<FromScope = Grid>,
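Converts a GpuGlobal into a GlobalGroupChunk in one step, consuming self. The chunk scope CS must start from Grid (FromScope = Grid), and m is the ScopeUniqueMap used for that scope.
See ChunkScope for more details about chunk scopes.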
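The idea of giving each thread its own chunk can be illustrated on the host with plain slices. The following is only a sketch under stated assumptions: chunk_range and split_disjoint are hypothetical helpers written for this illustration, not part of this crate, and they use a simple contiguous partition rather than whatever mapping a real ScopeUniqueMap defines.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a unique map: thread `tid` out of `n_threads`
/// owns the half-open index range returned here. Not the crate's API.
fn chunk_range(len: usize, n_threads: usize, tid: usize) -> Range<usize> {
    assert!(n_threads > 0 && tid < n_threads);
    // Elements per thread, rounded up so every element is covered.
    let chunk = len.div_ceil(n_threads);
    let start = (tid * chunk).min(len);
    let end = (start + chunk).min(len);
    start..end
}

/// Splits `data` into one disjoint mutable sub-slice per thread.
/// Because the sub-slices never overlap, each one can be mutated
/// independently without any further synchronization.
fn split_disjoint<T>(data: &mut [T], n_threads: usize) -> Vec<&mut [T]> {
    let len = data.len();
    let mut rest = data;
    let mut out = Vec::with_capacity(n_threads);
    for tid in 0..n_threads {
        let width = chunk_range(len, n_threads, tid).len();
        // Take the remaining slice out of `rest` so both halves keep
        // the original lifetime.
        let (head, tail) = std::mem::take(&mut rest).split_at_mut(width);
        out.push(head);
        rest = tail;
    }
    out
}
```

With 10 elements and 4 threads, this partition gives the first three threads 3 elements each and the last thread the remaining element.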
impl<'a, T> GpuGlobal<'a, [T]>
pub fn flatten<T2>(self) -> GpuGlobal<'a, [T2]>
where
    &'a [T]: VecFlatten<T2>,
Converts the slice to one with element type T2, which is useful for optimizing code with vector loads and stores. If the length of the slice is not a multiple of N (the vector width, as in [T; N]), the remaining elements are ignored. For now, flatten is only provided for global memory. For shared memory, the user can use GpuShared<[[T; N]]> directly.
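The remainder rule can be sketched on the host with the standard library's chunks_exact, which likewise skips trailing elements that do not fill a whole group. This is only an analogy under stated assumptions: group_exact is a hypothetical helper, it copies the data instead of reinterpreting it in place, and it does not use this crate's VecFlatten trait.

```rust
/// Groups a flat slice into N-wide vectors, dropping any trailing
/// elements that do not fill a complete group.
/// Hypothetical host-side helper; panics if N is 0.
fn group_exact<const N: usize>(data: &[f32]) -> Vec<[f32; N]> {
    data.chunks_exact(N)
        // Each chunk has exactly N elements, so the conversion cannot fail.
        .map(|c| <[f32; N]>::try_from(c).expect("chunks_exact yields N elements"))
        .collect()
}
```

For a 10-element input and N = 4, the result holds two groups and the last two elements are left out.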
pub fn is_empty(&self) -> bool
pub fn len(&self) -> usize
Trait Implementations
impl<'a, 'b: 'a, T: ?Sized> HostToDev<GpuGlobal<'a, T>> for &'a mut TensorViewMut<'b, T>
Available on non-crate feature codegen_tests only.
Allows converting a host-side T to a device-side T.
impl<'a, T: Sized> !DerefMut for GpuGlobal<'a, T>
DerefMut is deliberately never implemented, which prevents direct mutable access to the data. This ensures that the user cannot access the data mutably without using chunk or atomic operations.
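The same pattern can be sketched in stable Rust by implementing Deref and omitting DerefMut, with a consuming method as the only route to mutation. All names here (Guarded, Chunk, into_chunk, as_mut_slice) are hypothetical and exist only for this illustration; the crate's real types and its explicit negative impl are not reproduced.

```rust
use std::ops::Deref;

/// Hypothetical wrapper, not the crate's GpuGlobal: reads go through
/// Deref, and no DerefMut impl is provided.
struct Guarded<'a, T: ?Sized> {
    data: &'a mut T,
}

impl<'a, T: ?Sized> Guarded<'a, T> {
    fn new(data: &'a mut T) -> Self {
        Self { data }
    }
}

impl<'a, T: ?Sized> Deref for Guarded<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.data
    }
}

// Without DerefMut, a direct write such as `guard[0] = 1` is rejected
// by the compiler.

/// Hypothetical chunk type that takes over the wrapper's data and
/// exposes only one sub-slice of it for mutation.
struct Chunk<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Chunk<'a, T> {
    fn as_mut_slice(&mut self) -> &mut [T] {
        self.slice
    }
}

impl<'a, T> Guarded<'a, [T]> {
    /// Consumes the wrapper, so the original handle is gone once a
    /// chunk exists. Panics if the range is out of bounds.
    fn into_chunk(self, start: usize, len: usize) -> Chunk<'a, T> {
        let Guarded { data } = self;
        Chunk { slice: &mut data[start..start + len] }
    }
}
```

Because into_chunk takes self by value, the caller cannot keep using the wrapper alongside the chunk, which mirrors the ownership rule described for GpuGlobal.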