.. _sdk-rel-notes-cumulative.rst:

SDK Release Notes
=================

The following are the release notes for the Cerebras SDK.

.. _v1-3-0:

Version 1.3.0
-------------

Released 13 December 2024

.. note::
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.3
   supports SDK 1.2.
   `See here for SDK 1.2 documentation `_.
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.4
   supports SDK 1.3, the current version of SDK software.

New features and enhancements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- CSL language and compiler enhancements:

  - For DSD definitions, a tensor access expression is now shorthand for a
    ``comptime_struct`` with ``extent``, ``stride``, and ``base_address``
    fields. DSDs can now also be specified using these fields directly,
    for example:

    .. code-block:: csl

       // These two definitions are equivalent:
       var my_dsd = @get_dsd(mem1d_dsd, .{ .extent = 10, .stride = 2, .base_address = &my_arr });
       var my_dsd = @get_dsd(mem1d_dsd, .{ .tensor_access = |i|{10} -> my_arr[2*i] });

    ``stride`` is an optional parameter with default value 1.
    See :ref:`language-dsds-mem1d-tensor-access` for more information.

  - Memory DSD properties can now take runtime values when using the
    individual field specification format. However, ``mem4d_dsd`` extent and
    stride must still be comptime known.

  - Introduces inline functions, which are expanded during semantic analysis.
    See :ref:`language-syntax` for more information.

  - Introduces labeled ``break`` and the ability to break values from blocks.
    See :ref:`language-syntax` for more information.

  - Improves performance of CSL's parser, potentially improving program
    compile times.

  - Improves DSR allocation diagnostics when using DSDs. Upon failure to
    allocate, diagnostics now contain information about operations that
    prevent a DSR from being allocated.


- CSL library enhancements:

  - Introduces a ```` library which provides wrappers around DSD op builtins
    that select an appropriate builtin depending on the underlying data types,
    enabling more concise and flexible code when supporting multiple data
    types. See :ref:`language-libraries-dsd-ops` for more information.

- ``SdkRuntime`` host runtime enhancements:

  - Introduces a strided version of ``memcpy_h2d`` for strided host-to-device
    data transfers. See ``memcpy_h2d_stride`` in
    :ref:`sdkruntime-api-reference`.

  - Introduces row and column broadcast variants of ``memcpy_h2d`` for
    host-to-device row and column broadcasts. See ``memcpy_h2d_colbcast`` and
    ``memcpy_h2d_rowbcast`` in :ref:`sdkruntime-api-reference`.
    Also see the example program :ref:`sdkruntime-row-col-broadcast`.

- Example programs:

  - Introduces a new example program :ref:`sdkruntime-row-col-broadcast` to
    demonstrate row and column broadcasts for host-to-device data transfers.

  - Introduces a new example program :ref:`sdkruntime-game-of-life` which
    implements Conway's Game of Life.

Resolved issues
~~~~~~~~~~~~~~~

- Fixes an issue in the ```` library where messages were limited to only
  16 wavelets. The maximum message size is 32 wavelets.

- Fixes bugs in the ```` library in which ``encode_payload()`` could index
  out of bounds and did not set the ``NOCE`` bit on unused commands.

- Fixes a bug in which sequential ``@map`` operations within a function
  could not reuse DSRs.

Known issues
~~~~~~~~~~~~

- The ``25-pt-stencil``, ``histogram-torus``, and ``spmv-hypersparse``
  benchmark examples are not yet supported on WSE-3.

- Instruction traces in the SDK GUI are not yet supported on WSE-3.

- The bandwidth of memory transfers saturates at around 8 IO channels.

.. _v1-2-0:

Version 1.2.0
-------------

Released 28 June 2024

.. note::
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.2
   supports SDK 1.1.
   `See here for SDK 1.1 documentation `_.
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.3
   supports SDK 1.2, the current version of SDK software.

New features and enhancements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- CSL language and compiler enhancements:

  - Introduces ``inline`` ``for``-loops, which are unrolled at compile time.
    The body of an ``inline`` ``for``-loop may assign to a ``comptime``
    variable. For example:

    .. code-block:: csl

       fn length(comptime array: anytype) comptime_int {
         comptime var result = 0;
         // This loop will be inlined.
         inline for (array) |v| {
           result += 1;
         }
         return result;
       }

  - Introduces the ``@queue_flush`` and ``@set_empty_queue_handler`` builtins
    for WSE-3. See :ref:`language-builtins-wse3-qflush`.

  - Runtime ``on_control`` values in DSD operations are now supported.
    For example:

    .. code-block:: csl

       fn f(out: fabout_dsd, in: fabin_dsd, act_id: local_task_id) void {
         @fmovh(out, in, .{ .async = true, .on_control = .{ .activate = act_id }});
       }

  - Improves ``void`` type semantics, enabling optionally specified module
    parameters and function arguments.

  - Significantly improves compile times for large programs. Compilation time
    for full-wafer programs may be improved by as much as 10x.

- CSL library enhancements:

  - Introduces a ```` library for runtime debug printing to the simulator
    log. See :ref:`language-libraries-simprint`.

  - Introduces a ```` library for creating control wavelet payloads.
    See :ref:`language-libraries-control`.

  - Introduces a ```` library for WSE-3 point-to-point communication.
    See :ref:`language-libraries-message-passing`.

  - Introduces the ``queue_flush`` module within the ```` library for WSE-3,
    which can be used for querying when a queue is flushed and to exit the
    flushed state.
    See :ref:`language-libraries-wse3-tile-config-queue-flush`.

  - Adds WSE-3 support to the ``collectives_2d`` library.

- ``SdkRuntime`` host runtime enhancements:

  - Adds WSE-3 support for ``memcpy`` streaming mode.


- Example programs:

  - Reorganizes and updates all tutorial example programs with WSE-3 support.

  - Introduces two new tutorial examples for switches, demonstrating use of
    the ```` library. See :ref:`sdkruntime-topic-06-switches` and
    :ref:`sdkruntime-topic-07-switches-entrypt`.

  - Introduces a new tutorial example to demonstrate the ```` library.
    See :ref:`sdkruntime-topic-13-simprint`.

  - Introduces a new tutorial example to demonstrate color swapping on WSE-2.
    See :ref:`sdkruntime-topic-14-color-swap`.

  - Adds WSE-3 support to the ``wide-multiplication``, ``residual``,
    ``mandelbrot``, ``gemv-collectives_2d``, ``gemv-checkerboard-pattern``,
    ``gemm-collectives_2d``, ``7pt-stencil-spmv``, ``bicgstab``,
    ``conjugateGradient``, ``preconditionedConjugateGradient``, and
    ``powerMethod`` benchmark example programs.

Resolved issues
~~~~~~~~~~~~~~~

- Adds ``memcpy`` streaming support for WSE-3.

- Adds WSE-3 support for the ```` library.

- Fixes a potential bug in the ```` library related to reconfiguring the
  library's colors.

- Fixes a potential bug in the ```` library related to reconfiguring the
  library's colors.

Known issues
~~~~~~~~~~~~

- The ``25-pt-stencil``, ``histogram-torus``, and ``spmv-hypersparse``
  benchmark examples are not yet supported on WSE-3.

- The SDK GUI is not yet supported on WSE-3.

- The bandwidth of memory transfers saturates at around 8 IO channels.

Deprecations
~~~~~~~~~~~~

- The deprecated ``@get_color_id`` builtin to get the numerical value of a
  color is now removed. Use ``@get_int`` instead.

- Use of ``@get_color`` on any ID other than a routable color ID is no longer
  supported.

- ``tile_config.reg_ptr`` has been removed. Use ``@get_config`` and
  ``@set_config`` for direct manipulation of config space addresses.

.. _v1-1-0:

Version 1.1.0
-------------

Released 10 April 2024

This version of the Cerebras SDK is the first with experimental support for
the WSE-3, the third generation Cerebras architecture.
The WSE-3 is the wafer-scale processor powering the CS-3 Cerebras system.

.. note::
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.0
   supports SDK 0.9.
   `See here for SDK 0.9 documentation `_.
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.1
   supports SDK 1.0.
   `See here for SDK 1.0 documentation `_.
   The Cerebras Wafer-Scale Cluster appliance running Cerebras ML Software 2.2
   supports SDK 1.1, the current version of SDK software.

New features and enhancements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- CSL language and compiler enhancements:

  - Introduces initial support for WSE-3.

  - Introduces ``ut_id`` type and ``@get_ut_id`` builtin for representing
    microthread IDs. This feature is WSE-3 only.

  - Introduces runtime ``@get_config`` and ``@set_config`` support.

  - Introduces ``i64`` and ``u64`` types, and support in ````, ````, and
    ```` libraries. Like ``i8`` and ``u8``, these types are not allowed in
    memory DSD tensors or ``@map``, nor as arguments to tasks.

- CSL ``memcpy`` library enhancements:

  - ``memcpy/get_params`` no longer requires specifying a ``LAUNCH`` color
    for host kernel launch support.

  - The ``@rpc`` builtin is no longer necessary for host kernel launch
    support. The RPC server is now created internally.

- Other CSL library enhancements:

  - Introduces ``reset_tsc_counter()`` function in ``