Google has unveiled its eighth-generation Tensor Processing Units (TPUs) as two distinct chips: the training-focused TPU 8t and the inference-optimized TPU 8i. The move is more than a two-chip product split; it reflects Google's ability to attack the bottlenecks specific to each stage of the model lifecycle by restructuring chip design, interconnects, memory, scheduling, and the software stack.
Google’s split TPU chips signal shift from universal to specialized AI accelerators
30 Apr