SIMULATED 14 NM AUTO PLACE (SOFT IP) STRESS TEST CASE
This 14 nm “auto placement” design simulates a floorplan constructed almost entirely automatically. It has some processor and memory arrays in the corners, but most of the standard cell regions and individual memory blocks are placed automatically. The blocks are also fairly small, from 50,000 to 200,000 placed standard cells each (other designs have larger floorplan-level blocks that may contain several million placed standard cells). As a result, there are almost 1,200 floorplan-level blocks in the top-level block of the design. As is typical for floorplans created automatically, there are numerous gaps between blocks, and about 10% of the placed area is open space, as shown in the sample enlarged view above. The design is 10,219.7 microns wide and 11,717.7 microns high and, after expanding all hierarchy, features:
45 small 32-bit processors
39 medium 32-bit processors
5 large 32-bit processors
64 large DSP cores
538 small 16-bit processors
258,791,768 standard cell placements
1,108,338,048 bits in all SRAMs
8,747,253,286 transistors in all circuits
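The dimensions and totals listed above imply an overall area and some average densities. As a rough illustration only, the arithmetic can be sketched as follows. The variable names are invented for this sketch, the figures are copied from the list above, and a megabyte is assumed to be 2**20 bytes.

```python
# Figures quoted in the text above; all names here are illustrative only.
WIDTH_UM = 10_219.7
HEIGHT_UM = 11_717.7
TRANSISTORS = 8_747_253_286
STD_CELL_PLACEMENTS = 258_791_768
SRAM_BITS = 1_108_338_048

# 1 square millimeter = 1,000,000 square microns.
area_mm2 = WIDTH_UM * HEIGHT_UM / 1e6

# Averages over the full design extent, open space included.
transistors_per_mm2 = TRANSISTORS / area_mm2
cells_per_mm2 = STD_CELL_PLACEMENTS / area_mm2

# Assumption: 1 megabyte = 2**20 bytes.
sram_mib = SRAM_BITS / 8 / 2**20

print(f"area:        {area_mm2:.2f} mm^2")
print(f"transistors: {transistors_per_mm2 / 1e6:.1f} million per mm^2")
print(f"std cells:   {cells_per_mm2 / 1e6:.2f} million per mm^2")
print(f"SRAM:        {sram_mib:.1f} megabytes")
```

These are averages over the whole design rectangle. Because roughly 10% of the placed area is open space, the density inside individual blocks is presumably somewhat higher than these figures suggest.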
At a larger scale, this design has a collection of parallel processing arrays, each optimized for a different operating style: streaming data, digital signal processing, 16-bit array processing, or 32-bit array processing.
There are sixteen streaming processors in the upper left corner of the design, controlled by two large general-purpose processors. Each general-purpose processor has six megabytes of on-chip static RAM. Each streaming processor has half a megabyte of local static RAM in addition to instruction and data caches.
In the lower left corner of the design, there are 512 16-bit processors in an array. Each processor has 16 kilobytes of local static RAM; each cluster of 16 processors has 512 kilobytes of static RAM dedicated to that cluster. The 512-processor array is, in turn, controlled by a large general-purpose 32-bit processor with two megabytes of static RAM.
There are 64 digital signal processing (DSP) cores in the upper right corner of the design. Each individual DSP core has 256 kilobytes of local static RAM. Additionally, each cluster of 16 DSP cores has three megabytes of shared static RAM. A ring of 32 small 32-bit processors surrounds the 64 DSP cores; each of these processors has 256 kilobytes of local static RAM.
The central area of the top level of the design includes individual static RAM blocks and processor cores, placed randomly. Sometimes the processor cores are tightly clustered with local static RAM and sometimes they are simply placed near static RAM buffers.
This design also uses a point-to-point Network on Chip (NoC) with various small buffers placed around and sometimes inside the top-level blocks.
Some placed-and-routed blocks (notably the DSP cores) use a standard cell library that does not have a row of contacts to the substrate tiedown diffusion under the supply rail; instead, the tiedown contacts are placed in the fill cells scattered throughout the standard cell areas.
Finally, there is a cluster of 32 medium-size 32-bit processors in the lower right corner of the design. Each processor has 512 kilobytes of local static RAM. Another large general-purpose 32-bit processor with two megabytes of static RAM controls this cluster.
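The static RAM figures quoted for the four corner arrays can be tallied and compared with the total SRAM bit count in the feature list. The sketch below is only a consistency check under stated assumptions: a kilobyte is taken as 2**10 bytes and a megabyte as 2**20 bytes, per-cluster RAM is assumed to be in addition to per-processor RAM, and instruction and data caches are left out because the text gives no sizes for them. All names are invented for the example.

```python
KIB = 1024        # assumption: "kilobyte" means 2**10 bytes
MIB = 1024 * KIB  # assumption: "megabyte" means 2**20 bytes

# (count, bytes each) pairs taken from the corner descriptions in the text.
corners = {
    "upper left (streaming)": [
        (2, 6 * MIB),    # large general-purpose processors
        (16, MIB // 2),  # streaming processors, caches not counted
    ],
    "lower left (16-bit array)": [
        (512, 16 * KIB),         # local RAM per processor
        (512 // 16, 512 * KIB),  # RAM per cluster of 16 processors
        (1, 2 * MIB),            # controlling 32-bit processor
    ],
    "upper right (DSP)": [
        (64, 256 * KIB),     # local RAM per DSP core
        (64 // 16, 3 * MIB), # shared RAM per cluster of 16 DSP cores
        (32, 256 * KIB),     # surrounding small 32-bit processors
    ],
    "lower right (32-bit array)": [
        (32, 512 * KIB),  # local RAM per medium processor
        (1, 2 * MIB),     # controlling 32-bit processor
    ],
}

TOTAL_SRAM_BITS = 1_108_338_048  # from the feature list


def corner_bytes(entries):
    """Sum count * size over the (count, size) pairs of one corner."""
    return sum(count * size for count, size in entries)


totals = {name: corner_bytes(entries) for name, entries in corners.items()}
described_bits = 8 * sum(totals.values())
fraction = described_bits / TOTAL_SRAM_BITS

for name, nbytes in totals.items():
    print(f"{name}: {nbytes / MIB:.0f} megabytes")
print(f"described: {described_bits:,} bits ({fraction:.1%} of the quoted total)")
```

Under these assumptions the corner arrays account for roughly three quarters of the quoted SRAM bits. The remainder would then sit in the caches, the central-area static RAM blocks, and the NoC buffers, none of which the text quantifies.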