New Trend – Increased Compute and Storage Capability at the Edge

A growing number of newer edge devices offer increased compute and storage capability.

At Tesla’s AI Day event, the company revealed the D1 chip, whose 354 nodes reportedly deliver one teraflops (1,024 gigaflops) each. The entire chip is capable of up to 363 teraflops of compute, with 10 TBps of on-chip bandwidth and 4 TBps of off-chip bandwidth.

The D1 chip & training tile

Tesla’s earlier training cluster relied predominantly on Nvidia’s A100 GPUs for acceleration. Not so for Dojo, which will consist almost entirely of Tesla’s decidedly unique D1 chip. D1 supports FP32, BFP16 (aka bfloat16 or brain floating point) and a new format called CFP8 (“configurable FP8”). Optimized for machine learning workloads, D1 (which consists of 354 “training nodes”) is manufactured using a 7nm process and, at just 645 square millimeters, contains 50 billion transistors.

Tesla put a strong emphasis on modularity across the hardware. D1 is equipped with 4TBps off-chip bandwidth on each of its lateral edges – all four equipped with connectors – allowing it to connect to and scale with other D1 chips without sacrificing speed.

The next step up is Tesla’s “training tile,” a wedge less than a cubic foot in size that contains 25 of the D1 chips. The training tile operates with similar modularity to the chip itself: power and cooling are conducted through the top of the tile, allowing its four lateral edges to be outfitted with high-output connectors designed for maximum bandwidth (a total of 36TB/s of off-tile bandwidth).
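The figures quoted above can be combined into a rough back-of-the-envelope check. The sketch below is only arithmetic on the numbers in the text: the constant and function names are illustrative, it assumes 1 TFLOPS = 1,000 GFLOPS when aggregating, and it assumes the 36TB/s of off-tile bandwidth is split evenly across the four edges, which the text does not state.

```python
# Figures quoted in the text; all names here are illustrative, not Tesla's.
NODES_PER_CHIP = 354
GFLOPS_PER_NODE = 1_024
CHIPS_PER_TILE = 25
OFF_TILE_BANDWIDTH_TB_S = 36  # total across all edges, per the text
TILE_EDGES = 4


def chip_tflops() -> float:
    """Aggregate per-chip compute in TFLOPS (assumes 1 TFLOPS = 1,000 GFLOPS)."""
    return NODES_PER_CHIP * GFLOPS_PER_NODE / 1_000


def tile_pflops() -> float:
    """Aggregate per-tile compute in PFLOPS, ignoring any scaling overhead."""
    return CHIPS_PER_TILE * chip_tflops() / 1_000


def per_edge_bandwidth_tb_s() -> float:
    """Off-tile bandwidth per edge, assuming an even split (an assumption)."""
    return OFF_TILE_BANDWIDTH_TB_S / TILE_EDGES


if __name__ == "__main__":
    print(f"chip: {chip_tflops():.3f} TFLOPS")
    print(f"tile: {tile_pflops():.4f} PFLOPS")
    print(f"edge: {per_edge_bandwidth_tb_s():.1f} TB/s")
```

Under these assumptions the per-chip total comes to about 362.5 TFLOPS, which is in line with the "up to 363 teraflops" figure, and a 25-chip tile lands at roughly 9 PFLOPS.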

The performance of supercomputers is expressed in floating-point operations per second (FLOPS), a standard rate indicating the number of floating-point arithmetic calculations a system can perform each second.

Prefixes for representing orders of magnitude

Orders of magnitude (in base 10) are expressed using standard metric prefixes, which are abbreviated to single characters when prepended to other abbreviations, such as FLOPS and B (for byte):

Prefix   Abbreviation   Order of magnitude     Computer performance   Storage capacity
                        (as a factor of 10)
giga-    G              10^9                   gigaFLOPS (GFLOPS)     gigabyte (GB)
tera-    T              10^12                  teraFLOPS (TFLOPS)     terabyte (TB)
peta-    P              10^15                  petaFLOPS (PFLOPS)     petabyte (PB)
exa-     E              10^18                  exaFLOPS (EFLOPS)      exabyte (EB)
zetta-   Z              10^21                  zettaFLOPS (ZFLOPS)    zettabyte (ZB)
yotta-   Y              10^24                  yottaFLOPS (YFLOPS)    yottabyte (YB)

Terascale: Refers to methods and processes for using supercomputers capable of performing at least 1 TFLOPS or storage systems capable of storing at least 1 TB.

Petascale: Refers to methods and processes for using supercomputers capable of performing at least 1 PFLOPS or storage systems capable of storing at least 1 PB.

Exascale: Refers to methods and processes for using supercomputers capable of performing at least 1 EFLOPS or storage systems capable of storing at least 1 EB.
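As a small illustration of how the prefixes and scale terms relate, the following sketch formats a raw FLOPS value with the largest fitting decimal prefix and names the highest scale tier it reaches. The function names are hypothetical, and returning only the highest tier is a simplification: a petascale system also satisfies the terascale threshold.

```python
# Decimal prefixes for FLOPS, largest first.
PREFIXES = [
    ("Y", 10**24),
    ("Z", 10**21),
    ("E", 10**18),
    ("P", 10**15),
    ("T", 10**12),
    ("G", 10**9),
]


def format_rate(flops: float) -> str:
    """Format a FLOPS value using the largest prefix that fits."""
    for symbol, factor in PREFIXES:
        if flops >= factor:
            return f"{flops / factor:.2f} {symbol}FLOPS"
    return f"{flops:.0f} FLOPS"


def scale_class(flops: float) -> str:
    """Return the highest scale tier whose 1-unit threshold is met."""
    if flops >= 10**18:
        return "exascale"
    if flops >= 10**15:
        return "petascale"
    if flops >= 10**12:
        return "terascale"
    return "below terascale"


if __name__ == "__main__":
    for rate in (5e9, 2.5e12, 10**15, 10**18):
        print(format_rate(rate), "->", scale_class(rate))
```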

CleanTechnica’s chip comparison – https://cleantechnica.com/

Understand orders of magnitude in computer performance


GigaFLOPS
A 1 gigaFLOPS (GFLOPS) computer system is capable of performing one billion (10^9) floating-point operations per second. To match what a 1 GFLOPS computer system can do in just one second, you’d have to perform one calculation every second for 31.69 years.

TeraFLOPS
A 1 teraFLOPS (TFLOPS) computer system is capable of performing one trillion (10^12) floating-point operations per second. The rate 1 TFLOPS is equivalent to 1,000 GFLOPS. To match what a 1 TFLOPS computer system can do in just one second, you’d have to perform one calculation every second for 31,688.77 years.

PetaFLOPS
A 1 petaFLOPS (PFLOPS) computer system is capable of performing one quadrillion (10^15) floating-point operations per second. The rate 1 PFLOPS is equivalent to 1,000 TFLOPS. To match what a 1 PFLOPS computer system can do in just one second, you’d have to perform one calculation every second for 31,688,765 years.

ExaFLOPS
A 1 exaFLOPS (EFLOPS) computer system is capable of performing one quintillion (10^18) floating-point operations per second. The rate 1 EFLOPS is equivalent to 1,000 PFLOPS. To match what a 1 EFLOPS computer system can do in just one second, you’d have to perform one calculation every second for 31,688,765,000 years.
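The "one calculation every second" comparisons above are simply the operation count divided by the number of seconds in a year. The text does not say which year length was used; the sketch below assumes a mean tropical year of about 365.24219 days, which appears to reproduce the quoted figures. A 365.25-day year gives slightly smaller values.

```python
# Assumption: mean tropical year (~365.24219 days); the source does not say.
SECONDS_PER_YEAR = 365.24219 * 24 * 60 * 60


def years_at_one_per_second(operations: float) -> float:
    """Years needed to perform `operations` calculations at one per second."""
    return operations / SECONDS_PER_YEAR


if __name__ == "__main__":
    rates = [("GFLOPS", 1e9), ("TFLOPS", 1e12), ("PFLOPS", 1e15), ("EFLOPS", 1e18)]
    for name, ops in rates:
        print(f"1 {name}: {years_at_one_per_second(ops):,.2f} years")
```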

100 TB SSD

Nimbus Data describes itself as a leader in scalable and high-performance flash memory solutions for cloud infrastructure, AI, digital content, HPC, virtualization, databases, and much more. Its solutions include the Nimbus Data AFX storage operating system, ExaFlash® all-flash arrays, ExaDrive® solid state drives, and the Tectonic storage experience.

ExaDrive SSDs offer a unique balance of capacity, performance, energy efficiency, and value. With its 3.5” form factor, ExaDrive SSDs are plug-and-play compatible with SATA and SAS-based servers and JBODs, enabling you to upgrade from HDDs to SSDs quickly and affordably. ExaDrive SSDs offer more capacity, better performance, lower power consumption, and better reliability than nearline HDDs.

While NVMe SSDs are faster, they use substantially more power, offer less capacity, and have higher OpEx costs. ExaDrive strikes the ideal balance between nearline HDDs and NVMe SSDs, offering an ideal storage solution for the rapid growth in unstructured data.

Understand orders of magnitude in storage capacity


Gigabyte
A gigabyte is equal to one billion bytes. You can fit 4.7 GB of data on one single-sided DVD (each DVD is about 1.2 mm, or 0.047 inches, thick).

Terabyte
A terabyte is equal to one trillion (one thousand billion) bytes, or 1,000 GB. To hold 1 TB of data, you would need about 213 single-sided DVDs (a stack that’s about 255.6 mm, or 10.06 inches, tall).

Petabyte
A petabyte is equal to one quadrillion (one thousand trillion) bytes, or 1,000 TB. To hold 1 PB of data, you would need about 212,766 single-sided DVDs (a stack that’s about 255.3 meters, or 837.67 feet, tall).

Exabyte
An exabyte is equal to one quintillion (one thousand quadrillion) bytes, or 1,000 PB. To hold 1 EB, you would need about 212,765,958 single-sided DVDs (a stack that’s about 255.3 kilometers, or 158.65 miles, tall).
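The DVD comparisons above follow from two numbers in the text: 4.7 GB per single-sided DVD and 1.2 mm per disc. A minimal sketch of the arithmetic, with illustrative names and decimal units (1 GB = 10^9 bytes), rounding the disc count up to whole DVDs:

```python
import math

# Values taken from the text; names are illustrative.
DVD_CAPACITY_BYTES = 4.7e9  # 4.7 GB per single-sided DVD
DVD_THICKNESS_MM = 1.2


def dvds_needed(capacity_bytes: float) -> int:
    """Whole DVDs required to hold `capacity_bytes`, rounded up."""
    return math.ceil(capacity_bytes / DVD_CAPACITY_BYTES)


def stack_height_mm(capacity_bytes: float) -> float:
    """Height in millimeters of a stack of that many DVDs."""
    return dvds_needed(capacity_bytes) * DVD_THICKNESS_MM


if __name__ == "__main__":
    for name, size in [("TB", 1e12), ("PB", 1e15), ("EB", 1e18)]:
        count = dvds_needed(size)
        print(f"1 {name}: {count:,} DVDs, {stack_height_mm(size) / 1000:,.2f} m tall")
```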