Taichi

1 Installation
2 Why Taichi?
3 Initialization
4 Field
5 Kernels
6 Functions
7 Parallel for-loops
8 Taichi-scope and Python-scope
9 Phases of a Taichi program

This is a small tutorial for Taichi based on Yuanming Hu’s 2020 lecture and the original Taichi paper.

1 Installation

Following instructions similar to those we put out for PA1, first create a virtual environment with Python installed. Then install Taichi as a package for that Python interpreter:

conda create --name cs5643 python=3.8
conda activate cs5643
pip install taichi

2 Why Taichi?

Taichi is a domain-specific language embedded in Python that specializes in generating high-performance code for sparse data structures, e.g. voxel grids, particles and 3D hash tables.

One of its major benefits is that it decouples data structures from computation, so the interface for writing computation code is data-structure agnostic. This means that the same computation code can work against different data structures, and Taichi leverages this decoupling to generate data structures optimized for locality at compile time.

In our exploration of Taichi, we will mainly use it to write computation code and pay little attention to how Taichi helps us parallelize that code with domain-specific optimization strategies. Thus, this tutorial focuses on helping you understand the data-structure-agnostic interface and how we can use it to write numerical simulation code.

3 Initialization

Before you start writing any other Taichi code, you need to specify the backend architecture that your parallel code will run on by calling ti.init(arch=ti.gpu). You can also directly specify a CPU architecture or the graphics API used by your driver. In shorthand the choices are ti.init(arch=ti.x64/arm/cuda/opengl/metal/vulkan), meaning that you pass exactly one of these names as arch, for example ti.init(arch=ti.cuda).

4 Field

Fields are how data is represented in Taichi. You can think of a field as a multi-dimensional array whose elements are either scalars (ti.field), vectors (ti.Vector.field) or matrices (ti.Matrix.field).

For example, we can create a 3-channel pixel field by writing: pixels = ti.Vector.field(3, dtype=float, shape=(W, H)). We can access elements of the field using the indexing syntax pixels[i, j].

If you read from or write to an out-of-bounds part of a field, the behavior is undefined and your code may fail silently. Thus, make sure your indices are always in bounds.

5 Kernels

Taichi kernels are instances of the data-structure-agnostic interface we mentioned in the introduction. Kernels are where you write the computation code that you would like to run on Taichi fields. Syntactically, they look like Python functions, but they must meet two requirements:

They are decorated with @ti.kernel.
Both their arguments and their return values are type-hinted.

For example, if we would like to scroll a field img to the left and store the shifted version in another field named pixels, we can write a kernel as follows:

@ti.kernel
def paint(xShift: float):
    for i, j in pixels:
        # Parallelized over all pixels
        pixels[i, j] = img[int(ti.round(i + xShift)) % W, j]

Notice that the kernel is decorated and that we explicitly write out the type of its argument. Since this kernel returns nothing, you don’t need to annotate its return type.

6 Functions

Taichi functions can be called by Taichi kernels or other Taichi functions. They also look like Python functions, except that they need to be decorated with @ti.func. Note that Taichi functions are quite limited compared to functions in a general-purpose language:

Taichi functions can only return a single value.
They are force-inlined, so recursion is not allowed.
Run-time branching isn’t allowed in Taichi functions, yet compile-time branching is allowed.

One use case for Taichi functions is a Taichi kernel that applies a Taichi function to every element of a Taichi field, which makes the Taichi function a mapping function. If the mapping function is $f(x) = 3x$, the code would be as follows:

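import taichi as ti

ti.init(arch=ti.gpu)
a = ti.field(dtype=float, shape=128)

@ti.func
def triple(x):
    return x * 3

@ti.kernel
def triple_array():
    # Parallelized over the 128 elements of a
    for i in range(128):
        a[i] = triple(a[i])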

7 Parallel for-loops

Range-for loops: These are similar to Python for loops that use range.
Struct-for loops: These iterate through the elements of a field, and the operations on the individual elements are parallelized.

Note that Taichi parallelizes only loops that appear at the outermost scope of a kernel (i.e. not inside any other loop or conditional statement). The image-scrolling kernel from the Kernels section is repeated here because it demonstrates a struct-for loop:

@ti.kernel
def paint(xShift: float):
    for i, j in pixels:
        # Parallelized over all pixels
        pixels[i, j] = img[int(ti.round(i + xShift)) % W, j]

8 Taichi-scope and Python-scope

Taichi-scope: Code inside Taichi kernels and functions.
Python-scope: Code outside Taichi kernels and functions.

Code in Taichi-scope is compiled by the Taichi compiler to run in parallel. Thus, be careful when writing to shared global variables in the parallel for-loops inside kernels. Doing so may cause undefined behavior or contention for global resources.

9 Phases of a Taichi program

Initialize your Taichi program with ti.init.
Construct your data with ti.field, ti.Vector.field and ti.Matrix.field.
Run your Taichi Kernels and Functions on the data.

To put the previous image-scrolling kernel to use, we can create a program that constructs a 2D image slice of the HSV color space at the start and then uses the kernel to scroll that image. This program is stored here.