tilelang.language.copy_op
=========================

.. py:module:: tilelang.language.copy_op

.. autoapi-nested-parse::

   Copy operations exposed on the TileLang language surface.


Functions
---------

.. autoapisummary::

   tilelang.language.copy_op.copy
   tilelang.language.copy_op.c2d_im2col


Module Contents
---------------

.. py:function:: copy(src, dst, *, coalesced_width = None, disable_tma = False, eviction_policy = None, annotations = None, loop_layout = None)

   Copy data between memory regions.

   :param src: Source memory region
   :type src: Union[tir.Buffer, tir.BufferLoad, tir.BufferRegion]
   :param dst: Destination memory region
   :type dst: Union[tir.Buffer, tir.BufferLoad, tir.BufferRegion]
   :param coalesced_width: Width for coalesced memory access. Defaults to None.
   :type coalesced_width: Optional[int], keyword-only
   :param disable_tma: Whether to disable TMA acceleration. Defaults to False.
   :type disable_tma: bool, keyword-only
   :param eviction_policy: Cache eviction policy. Defaults to None.
   :type eviction_policy: Optional[str], keyword-only
   :param annotations: Additional annotations dict. If provided, ``coalesced_width``,
                       ``disable_tma``, and ``eviction_policy`` can also be specified here.
                       Values in ``annotations`` take precedence over the individual arguments.
   :type annotations: Optional[dict], keyword-only
   :param loop_layout: A parallel loop layout hint for the SIMT copy (only valid for
                       normal SIMT copy; incompatible with TMA/LDSM/STSM/TMem). When
                       provided, it is attached to the outermost parallel loop generated
                       by this copy.
   :type loop_layout: Optional[Fragment], keyword-only

   :raises TypeError: If copy extents cannot be deduced from arguments

   :returns: A handle to the copy operation
   :rtype: tir.Call

   Range handling notes:

   - Accepts ``Buffer``/``BufferRegion``/``BufferLoad`` on either side. Extents are
     derived as follows: ``Buffer -> shape``, ``BufferRegion -> [r.extent]``,
     ``BufferLoad -> extents from its inferred/encoded region``.
   - Normally, the extents of both sides must be the same. If they differ, the copy
     follows an internal rule to select one side as the base range and builds the
     iteration space from it. This may generate unexpected code, and dimensions of
     extent 1 may cause unexpected errors.
   - Small optimization: if both ``src`` and ``dst`` are scalar ``BufferLoad`` without
     region extents, the copy lowers to a direct store: ``dst[...] = src[...]``.
   - Syntactic sugar: TileLang accepts the head address of a buffer as shorthand for
     the whole buffer when there is no ambiguity, for example
     ``T.copy(A, A_shared[i, j])``. Supporting this requires some special shape
     checking. Note that broadcast-style copies are not currently supported.
   - The finalized extents are encoded with ``tl.region`` via ``to_buffer_region`` and
     passed through to the backend; low-level loop construction and any
     scope-specific decisions happen during lowering.


.. py:function:: c2d_im2col(img, col, nhw_step, c_step, kernel, stride, dilation, pad, eviction_policy = None)

   Perform im2col transformation for 2D convolution.

   :param img: Input image buffer
   :type img: tir.Buffer
   :param col: Output column buffer
   :type col: tir.Buffer
   :param nhw_step: Step size for batch and spatial dimensions
   :type nhw_step: tir.PrimExpr
   :param c_step: Step size for channel dimension
   :type c_step: tir.PrimExpr
   :param kernel: Kernel size
   :type kernel: int
   :param stride: Stride of the convolution
   :type stride: int
   :param dilation: Dilation rate
   :type dilation: int
   :param pad: Padding size
   :type pad: int

   :returns: A handle to the im2col operation
   :rtype: tir.Call
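
As a reading aid for the ``kernel``, ``stride``, ``dilation`` and ``pad`` parameters of ``c2d_im2col``, the following is a minimal pure-Python sketch of the general im2col technique. It is not TileLang code and does not describe how TileLang lowers the operation. The helper names ``conv_out_size`` and ``im2col_2d`` are hypothetical, and the single-channel input, the zero padding, and the row-per-output-position layout of the result are assumptions made for illustration only.

```python
def conv_out_size(in_size, kernel, stride, dilation, pad):
    """Output length along one spatial axis (standard convolution arithmetic)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + 2 * pad - effective_kernel) // stride + 1


def im2col_2d(img, kernel, stride, dilation, pad):
    """Reference im2col for one single-channel image given as a list of rows.

    Assumed layout (illustrative): one output row per output position,
    kernel * kernel columns, with out-of-bounds taps read as zero.
    """
    height, width = len(img), len(img[0])
    out_h = conv_out_size(height, kernel, stride, dilation, pad)
    out_w = conv_out_size(width, kernel, stride, dilation, pad)
    col = []
    for oh in range(out_h):
        for ow in range(out_w):
            row = []
            for kh in range(kernel):
                for kw in range(kernel):
                    # Map (output position, kernel tap) back to input coordinates.
                    h = oh * stride - pad + kh * dilation
                    w = ow * stride - pad + kw * dilation
                    inside = 0 <= h < height and 0 <= w < width
                    row.append(img[h][w] if inside else 0)
            col.append(row)
    return col


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
col = im2col_2d(img, kernel=2, stride=1, dilation=1, pad=0)
print(col)  # [[1, 2, 4, 5], [2, 3, 5, 6], [4, 5, 7, 8], [5, 6, 8, 9]]
```

Each row of ``col`` holds the input values under one placement of the kernel window, so a convolution becomes a matrix product between ``col`` and the flattened kernel weights.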