Taichi
Table of Contents
- 1 Installation
- 2 Why Taichi?
- 3 Initialization
- 4 Field
- 5 Kernels
- 6 Functions
- 7 Parallel for-loops
- 8 Taichi-scope and Python-scope
- 9 Phases of a Taichi program
This is a small tutorial for Taichi based on Yuanming Hu’s 2020 lecture and the original Taichi paper.
1 Installation
Following instructions similar to those we put out for PA1, first create a virtual environment with Python installed. Then install Taichi as a module for that Python interpreter:
conda create --name cs5643 python=3.8
conda activate cs5643
pip install taichi
2 Why Taichi?
Taichi is a domain-specific language embedded in Python that specializes in generating high-performance code for sparse data structures, e.g. voxel grids, particles, and 3D hash tables.
One of its major benefits is that it decouples data structures from computation, so the interface for writing computation code is data-structure-agnostic. The same computation code can therefore work against different data structures, and Taichi leverages this decoupling to generate data structures optimized for locality at compile time.
In our exploration of Taichi, we will mainly use it to write computation code and pay little attention to how Taichi helps us parallelize that code with domain-specific optimization strategies. Thus, this tutorial focuses on helping you understand the data-structure-agnostic interface and how we can use it to write numerical simulation code.
3 Initialization
Before writing any other Taichi code, you need to specify the backend architecture on which your parallel code will run, for example with ti.init(arch=ti.gpu). You can also request a specific CPU or GPU backend directly by passing one of ti.x64, ti.arm, ti.cuda, ti.opengl, ti.metal, or ti.vulkan as arch, e.g. ti.init(arch=ti.cuda).
4 Field
A field is how data is represented in Taichi. You can think of a field as a multi-dimensional array whose elements are scalars (ti.field), vectors (ti.Vector.field), or matrices (ti.Matrix.field).
For example, we can create a 3-channel pixel field by writing pixels = ti.Vector.field(3, dtype=float, shape=(W, H)). We can access elements of the field using indexing syntax such as pixels[i, j].
If you read or write outside the bounds of a field, the behavior is undefined and your code may fail silently, so make sure your indices are always in bounds.
5 Kernels
Taichi kernels are instances of the data-structure-agnostic interface we mentioned in the introduction. A kernel holds the computation code that you would like to run on Taichi fields. Syntactically, kernels look like Python functions, but they must be
- decorated with @ti.kernel, and
- type-hinted on both their arguments and their return value.
For example, if we would like to scroll a field img to the left and store the shifted version in another field named pixels, we can write a kernel as follows:
@ti.kernel
def paint(xShift: float):
    for i, j in pixels:
        # Parallelized over all pixels
        pixels[i, j] = img[int(ti.round(i + xShift)) % W, j]
Notice the decorator on the kernel and the explicit type hint on its argument. Since this kernel returns nothing, it does not need a return type hint.
6 Functions
Taichi functions can be called by Taichi kernels or by other Taichi functions. They also look like Python functions, except that they must be decorated with @ti.func. Note that Taichi functions are quite limited compared with functions in a general-purpose language:
- A Taichi function can only return a single value.
- Taichi functions are force-inlined, so recursion is not allowed.
- Run-time branching around a return statement isn't allowed (a function may not return from inside an ordinary if), yet compile-time branching with ti.static is allowed.
One use case for Taichi Functions would be a Taichi kernel applying a Taichi function to every single element of a Taichi Field. This would make the Taichi function a mapping function. If the mapping function is \(f(x) = 3x\), the code would be as follows:
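import taichi as ti

ti.init(arch=ti.cpu)
a = ti.field(dtype=ti.f32, shape=128)  # the field updated by triple_array

@ti.func
def triple(x):
    return x * 3

@ti.kernel
def triple_array():
    for i in range(128):  # outermost loop, so it is parallelized
        a[i] = triple(a[i])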
7 Parallel for-loops
- Range-for loops: similar to Python for loops using range.
- Struct-for loops: iterate over the elements of a field; the operations done on each element are parallelized.
Note that Taichi parallelizes loops that appear at the outermost scope (i.e. not inside any other loop or conditional statement). Earlier, we showed a Taichi kernel that stores a scrolled image. We repeat that code here to demonstrate struct-for loops.
@ti.kernel
def paint(xShift: float):
    for i, j in pixels:
        # Parallelized over all pixels
        pixels[i, j] = img[int(ti.round(i + xShift)) % W, j]
8 Taichi-scope and Python-scope
- Taichi-scope: code inside Taichi kernels and functions.
- Python-scope: code outside Taichi kernels and functions.
Code in Taichi-scope is compiled by the Taichi compiler to run in parallel. Be careful, therefore, when writing to shared global variables in parallel for-loops inside kernels: doing so may cause undefined behavior or contention for global resources.
9 Phases of a Taichi program
- Initialize your Taichi program with ti.init.
- Construct your data with ti.field, ti.Vector.field, and ti.Matrix.field.
- Run your Taichi kernels and functions on the data.
To put the previous image-scrolling kernel to use, we can write a program that constructs an image of a 2D slice of the HSV color space at the start and then uses the kernel to scroll that image. This program is stored here.