A broadcast operator processes two tensors of different shapes. Normally, one of the operands has size 1 along a particular dimension, and it is broadcast along the corresponding dimension of the other operand to perform the given calculation. Common scalar calculations, such as elementary arithmetic and logical operations, can all be broadcast. Fig. 3.1.1 illustrates a broadcast add between two 2-dimensional tensors. Broadcast operators are common in deep learning workloads, e.g. batch normalization.
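
To make the semantics concrete, here is a minimal pure-Python sketch (not TVM code) that treats broadcasting as replicating the size-1 dimension and then adding elementwise. The helper names `expand_cols` and `elementwise_add` are hypothetical and exist only for this illustration.

```python
def expand_cols(t, n):
    """Replicate an (m, 1) nested list along its second dimension to (m, n)."""
    assert all(len(row) == 1 for row in t), "expects a column of shape (m, 1)"
    return [[row[0]] * n for row in t]


def elementwise_add(a, b):
    """Add two nested lists of identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


a = [[1], [2], [3]]                  # shape (3, 1)
b = [[10, 20, 30, 40],
     [50, 60, 70, 80],
     [90, 100, 110, 120]]            # shape (3, 4)

# Broadcast a along its second dimension, then add elementwise.
c = elementwise_add(expand_cols(a, len(b[0])), b)
print(c)
```

A real implementation would normally avoid building the replicated copy; the explicit expansion is used here only because it makes the meaning of the operation easy to see.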

In this section we will demonstrate how to perform a broadcast add between two 2-dimensional tensors. The following code defines the computation.

import numpy as np
import tvm
from tvm import te

# Save to the d2ltvm package.
def broadcast_add(shape1, shape2):
    """Broadcast add between two 2-dimensional tensors

    shape1, shape2 : the shapes of the input tensors
    """
    assert len(shape1) == 2 and len(shape2) == 2, \
        "broadcast tensors should both be 2-dimensional"
    for i in range(len(shape1)):
        assert shape1[i] == shape2[i] or shape1[i] == 1 or shape2[i] == 1, \
            "tensor shapes do not fit for broadcasting"
    A = te.placeholder(shape1, name='A')
    B = te.placeholder(shape2, name='B')
    # Output size along each dimension: take the non-1 size if there is one.
    m = shape1[0] if shape2[0] == 1 else shape2[0]
    n = shape1[1] if shape2[1] == 1 else shape2[1]
    # A dimension of size 1 is always read at index 0, which implements the broadcast.
    f = lambda x, y: A[0 if shape1[0]==1 else x, 0 if shape1[1]==1 else y] + \
        B[0 if shape2[0]==1 else x, 0 if shape2[1]==1 else y]
    C = te.compute((m, n), f, name='C')
    return A, B, C
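
The index selection inside f can be mimicked without TVM. The following is a rough pure-Python sketch, assuming nested lists in place of tensors; `out_shape` and `bcast_add_ref` are hypothetical names, not part of TVM or d2ltvm. It reads index 0 whenever an input dimension has size 1, so no replicated copy is ever materialized.

```python
def out_shape(shape1, shape2):
    """Output shape of a 2-D broadcast, using the same rule as m and n above."""
    return tuple(d1 if d2 == 1 else d2 for d1, d2 in zip(shape1, shape2))


def bcast_add_ref(a, b):
    """Reference broadcast add on 2-D nested lists (illustrative only)."""
    s1 = (len(a), len(a[0]))
    s2 = (len(b), len(b[0]))
    m, n = out_shape(s1, s2)
    # Same trick as the lambda f: a size-1 dimension is always read at index 0.
    f = lambda x, y: (a[0 if s1[0] == 1 else x][0 if s1[1] == 1 else y] +
                      b[0 if s2[0] == 1 else x][0 if s2[1] == 1 else y])
    return [[f(x, y) for y in range(n)] for x in range(m)]


row = [[1, 2, 3, 4]]                 # shape (1, 4)
full = [[10, 10, 10, 10],
        [20, 20, 20, 20],
        [30, 30, 30, 30]]            # shape (3, 4)
shape = out_shape((1, 4), (3, 4))
result = bcast_add_ref(row, full)
print(shape, result)
```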


Then we use it to perform the broadcast add illustrated in Fig. 3.1.1.

m = 3
n = 4
shape1 = (m, 1)
shape2 = (m, n)
# Define the computation for these shapes.
A, B, C = broadcast_add(shape1, shape2)
s = te.create_schedule(C.op)
print(tvm.lower(s, [A, B], simple_mode=True))
mod = tvm.build(s, [A, B, C])

// attr [C] storage_scope = "global"
allocate C[float32 * 12]
produce C {
  for (x, 0, 3) {
    for (y, 0, 4) {
      C[((x*4) + y)] = (A[x] + B[((x*4) + y)])
    }
  }
}
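
For readers less familiar with the lowered IR, the loop nest can be transcribed into plain Python over flat, row-major buffers. This is only an illustrative sketch with made-up input values, not output generated by TVM.

```python
m, n = 3, 4
A = [1.0, 2.0, 3.0]                      # shape (3, 1), stored flat
B = [float(i) for i in range(m * n)]     # shape (3, 4), stored flat, row-major
C = [0.0] * (m * n)                      # output buffer of 12 elements

for x in range(m):
    for y in range(n):
        # A has a single column, so its flat index is just x.
        C[x * n + y] = A[x] + B[x * n + y]

print(C)
```

Because A has shape (3, 1), its flat index does not depend on y, which is why the IR reads A[x] in every iteration of the inner loop.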


The printed pseudocode shows how the broadcast add works: A is indexed by x alone, so each of its 3 values is reused across the 4 columns of B. We verify the results as follows.

# Save to the d2ltvm package.
def get_bcast_data(shape1, shape2, constructor=None):
    """Return random tensors a, b
    and empty tensor c to store broadcast results between a and b

    shape1, shape2: shapes of input tensors
    constructor : user-defined tensor constructor
    """
    np.random.seed(0)
    a = np.random.normal(size=shape1).astype("float32")
    b = np.random.normal(size=shape2).astype("float32")
    out_shape = (shape1[0] if shape2[0] == 1 else shape2[0],
                 shape1[1] if shape2[1] == 1 else shape2[1])
    c = np.empty(out_shape, dtype='float32')
    if constructor:
        a, b, c = [constructor(x) for x in (a, b, c)]
    return a, b, c

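a, b, c = get_bcast_data(shape1, shape2, tvm.nd.array)
mod(a, b, c)
# Compare against NumPy's own broadcast add.
np.testing.assert_allclose(np.add(a.asnumpy(), b.asnumpy()), c.asnumpy(), atol=1e-5)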


Note that broadcasting may occur along multiple dimensions in the same operation. In the following case, each input is broadcast along a different dimension.

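shape1 = (m, 1)
shape2 = (1, n)
# Define the computation for the new shapes.
A, B, C = broadcast_add(shape1, shape2)
s = te.create_schedule(C.op)
mod = tvm.build(s, [A, B, C])
a, b, c = get_bcast_data(shape1, shape2, tvm.nd.array)
mod(a, b, c)
np.testing.assert_allclose(np.add(a.asnumpy(), b.asnumpy()), c.asnumpy(), atol=1e-5)
print(a.shape, b.shape, c.shape)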

(3, 1) (1, 4) (3, 4)
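
When both inputs are broadcast, as with shapes (3, 1) and (1, 4), each input is indexed by only one of the two loop variables. The sketch below shows this in plain Python over flat buffers; the values are made up, and the loop is our own transcription of the computation rather than IR printed by TVM.

```python
m, n = 3, 4
A = [1.0, 2.0, 3.0]              # shape (3, 1), stored flat
B = [10.0, 20.0, 30.0, 40.0]     # shape (1, 4), stored flat
C = [0.0] * (m * n)              # shape (3, 4), stored flat, row-major

for x in range(m):
    for y in range(n):
        # A varies only with the row index, B only with the column index.
        C[x * n + y] = A[x] + B[y]

# Regroup the flat output into rows for easier reading.
rows = [C[x * n:(x + 1) * n] for x in range(m)]
print(rows)
```

The result has the shape of an outer sum: every element of A is combined with every element of B.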