
REGISTER_NONE_TILING

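Product Support

Product                                                                   Supported
Ascend 950PR/Ascend 950DT                                                 √
Atlas A3 Training Series Products/Atlas A3 Inference Series Products      √
Atlas A2 Training Series Products/Atlas A2 Inference Series Products      √
Atlas 200I/500 A2 Inference Products                                      x
Atlas Inference Series Products (AI Core)                                 x
Atlas Inference Series Products (Vector Core)                             x
Atlas Training Series Products                                            x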

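Function Description

When a TilingData struct is defined on the kernel side in standard C++ syntax and the user cannot determine in advance which structs need to be registered, this interface tells the framework that unregistered TilingData structs defined in standard C++ syntax will be used. Use it together with GET_TILING_DATA_WITH_STRUCT, GET_TILING_DATA_MEMBER, or GET_TILING_DATA_PTR_WITH_STRUCT to obtain the corresponding TilingData.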
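The three companion accessors correspond to three access styles: a copy of the whole struct, a copy of a single member, and a typed pointer. The host-side sketch below imitates those styles in plain C++ on a raw byte buffer. It is an analogy only, not the Ascend C implementation, and it does not run on the device: `DemoTilingData`, `LoadStruct`, `LoadMember`, `ViewStruct`, and `DemoUsage` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical TilingData written in standard C++ (no registration macro involved).
struct DemoTilingData {
    int32_t scheMode;
    int32_t ubFormer;
    int64_t blockNum;
};
static_assert(std::is_trivially_copyable<DemoTilingData>::value,
              "raw byte copies require a trivially copyable struct");

// Analogy for the whole-struct accessor: copy sizeof(T) bytes into a local object.
template <class T>
T LoadStruct(const uint8_t* tiling)
{
    T out;
    std::memcpy(&out, tiling, sizeof(T));
    return out;
}

// Analogy for the single-member accessor: copy one member located by its byte offset.
template <class M>
M LoadMember(const uint8_t* tiling, std::size_t offset)
{
    M out;
    std::memcpy(&out, tiling + offset, sizeof(M));
    return out;
}

// Analogy for the pointer accessor: typed view without a copy.
// Valid only when the bytes really are the storage of a suitably aligned T.
template <class T>
const T* ViewStruct(const uint8_t* tiling)
{
    return reinterpret_cast<const T*>(tiling);
}

// Usage: the "tiling buffer" here is simply the bytes of a host-side object.
inline void DemoUsage()
{
    DemoTilingData src{1, 128, 40};
    const uint8_t* raw = reinterpret_cast<const uint8_t*>(&src);

    DemoTilingData whole = LoadStruct<DemoTilingData>(raw);
    int64_t blockNumVar = LoadMember<int64_t>(raw, offsetof(DemoTilingData, blockNum));
    const DemoTilingData* ptr = ViewStruct<DemoTilingData>(raw);

    assert(whole.ubFormer == 128);
    assert(blockNumVar == 40);
    assert(ptr->scheMode == 1);
}
```

In this sketch the caller always names the struct type explicitly, which is why no struct has to be known to a framework beforehand.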

Function Prototype

REGISTER_NONE_TILING

Parameters

None.

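Constraints

  • Kernel direct-invocation projects are not supported yet.
  • GET_TILING_DATA requires a TilingData struct registered as the default, but this interface registers no TilingData struct, so it cannot be used together with GET_TILING_DATA.
  • It cannot be mixed with REGISTER_TILING_DEFAULT or REGISTER_TILING_FOR_TILINGKEY; that is, scenarios that register TilingData structs cannot be combined with non-registration scenarios.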

Example

// Provider of the tiling template library: it cannot know in advance which TilingData struct the user will instantiate
template <class BrcDag>
struct BroadcastBaseTilingData {
    int32_t scheMode;
    int32_t shapeLen;
    int32_t ubSplitAxis;
    int32_t ubFormer;
    int32_t ubTail;
    int64_t ubOuter;
    int64_t blockFormer;
    int64_t blockTail;
    int64_t dimProductBeforeUbInner;
    int64_t elemNum;
    int64_t blockNum;
    int64_t outputDims[BROADCAST_MAX_DIMS_NUM];
    int64_t outputStrides[BROADCAST_MAX_DIMS_NUM];
    int64_t inputDims[BrcDag::InputSize][2]; // full block + tail block
    int64_t inputBrcDims[BrcDag::CopyBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputVecBrcDims[BrcDag::VecBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputStrides[BrcDag::InputSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputBrcStrides[BrcDag::CopyBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputVecBrcStrides[BrcDag::VecBrcSize];
    char scalarData[BROADCAST_MAX_SCALAR_BYTES];
};

template <uint64_t schMode, class BrcDag> class BroadcastSch {
public:
    __aicore__ inline explicit BroadcastSch(GM_ADDR& tmpTiling)
        : tiling(tmpTiling)
    {}
    template <class... Args>
    __aicore__ inline void Process(Args... args)
    {
        REGISTER_NONE_TILING; // tell the framework that unregistered TilingData structs are used
        if constexpr (schMode == 1) {
            GET_TILING_DATA_WITH_STRUCT(BroadcastBaseTilingData<BrcDag>, tilingData, tiling);
            GET_TILING_DATA_MEMBER(BroadcastBaseTilingData<BrcDag>, blockNum, blockNumVar, tiling);
            TPipe pipe;
            BroadcastNddmaSch<BrcDag, false> sch(&tilingData); // obtain the schedule
            sch.Init(&pipe, args...);
            sch.Process();
        } else if constexpr (schMode == 202) {
            GET_TILING_DATA_PTR_WITH_STRUCT(BroadcastOneDimTilingDataAdvance, tilingDataPtr, tiling);
            BroadcastOneDimAdvanceSch<BrcDag> sch(tilingDataPtr); // obtain the schedule
            sch.Init(args...);
            sch.Process();
        }
    }
public:
    GM_ADDR tiling;
};

// The user instantiates the template library by passing the schMode and OpDag template arguments
using namespace AscendC;
template <uint64_t schMode>
__global__ __aicore__ void mul(GM_ADDR x1, GM_ADDR x2, GM_ADDR y, GM_ADDR workspace, GM_ADDR tiling)
{
    if constexpr (std::is_same<DTYPE_X1, int8_t>::value) {
        // int8
        using OpDag = MulDag::MulInt8Op::OpDag;
        BroadcastSch<schMode, OpDag> sch(tiling);
        sch.Process(x1, x2, y);
    } else if constexpr (std::is_same<DTYPE_X1, uint8_t>::value) {
        // uint8
        using OpDag = MulDag::MulUint8Op::OpDag;
        BroadcastSch<schMode, OpDag> sch(tiling);
        sch.Process(x1, x2, y);
    }
}
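The example's opening comment says the library provider cannot predict which TilingData struct the user will instantiate. The host-side sketch below illustrates why, using a cut-down struct: each DAG type yields a distinct instantiation with its own size. `DemoDagTwoInputs`, `DemoDagThreeInputs`, `DemoBroadcastTiling`, and `TilingBytes` are hypothetical names, and the trivially-copyable check is an assumption about what makes a struct safe to pass as raw bytes, not a documented requirement of this interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical DAG descriptors; only InputSize matters for this sketch.
struct DemoDagTwoInputs   { static constexpr std::size_t InputSize = 2; };
struct DemoDagThreeInputs { static constexpr std::size_t InputSize = 3; };

// Cut-down, hypothetical analogue of BroadcastBaseTilingData<BrcDag>.
template <class Dag>
struct DemoBroadcastTiling {
    int64_t blockNum;
    int64_t inputDims[Dag::InputSize][2]; // full block + tail block
};

// Each instantiation is a distinct type, so a library author cannot list them all up front.
static_assert(!std::is_same<DemoBroadcastTiling<DemoDagTwoInputs>,
                            DemoBroadcastTiling<DemoDagThreeInputs>>::value,
              "different DAGs give different TilingData types");

// Assumption: the struct must be plain enough to be moved around as raw bytes.
static_assert(std::is_trivially_copyable<DemoBroadcastTiling<DemoDagTwoInputs>>::value &&
              std::is_standard_layout<DemoBroadcastTiling<DemoDagTwoInputs>>::value,
              "TilingData should be a plain standard-layout struct");

// Size of the tiling struct for a given DAG, in bytes.
template <class Dag>
constexpr std::size_t TilingBytes()
{
    return sizeof(DemoBroadcastTiling<Dag>);
}
```

With 8-byte `int64_t` members and no padding, `TilingBytes<DemoDagTwoInputs>()` is 40 and `TilingBytes<DemoDagThreeInputs>()` is 56, so the buffer layout is fixed only once the user picks the DAG.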

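Disclaimer: The content of this site is built automatically from the master branch of the asc-devkit repository. It is a continuously developed version, may contain defects, and is provided for preview and reference only. For stable, commercial-grade documentation, refer to the official Ascend community.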