In part I and part II of this blog series, Greg used Figure 1 below to partition the 3D space (no pun intended) into three parts, and discussed the left and middle portions encompassing '3D-SIC' and '3D-SoC'. In this third and final part of the blog, he will move to the far right and talk about 3D at the transistor level.
Figure 1: Overview and taxonomy of evolution from 3D-SIC to 3D-IC. (Source: Eric Bayne, “The 3-D Interconnect Technology Landscape”, IEEE Design and Test, May/June 2016, pp. 8-20 (reproduced with permission))
At the right edge of Figure 1 is the final, and finest, version of 3D: transistor-level 3D. Unlike 3D at the coarser levels described so far, transistor-level 3D is more of a straight-on attack at furthering Moore’s Law. As such, it has the potential to fit within our existing design flows and help silicon providers deliver continued cost and power benefits, as we’ve been accustomed to receiving from process node advances. For these reasons, it is a separate, parallel thread in the 3D story compared to what we’ve discussed so far. Those of you who are higher up in the design/product ecosystem may not have to worry about the details, but people who do care about these details, such as Arm’s Physical IP division, could literally find a whole new dimension here, providing opportunity to fold memories and logic into the third dimension.
There are several ways to go about stacking transistors. One possible path follows the transistor scaling plan of record, shown in Figure 2. We switched from the planar MOSFET at the left to the FinFET so that we could increase the electric field confinement by the gate, allowing us to further shrink the physical gate length. The FinFET looks like it will last for around four technology nodes, meaning we will soon need to switch to a new device with even better gate confinement, and the most logical step is the 'Gate All-Around' (GAA) FET (also called a horizontal nanowire FET). A wrinkle is that, over the last few process nodes, we’ve made the FinFET’s fins so tall that we’ll need to stack 2-4 nanowires on top of each other to meet our performance goals for the first post-FinFET node (and yes, that would be true even if we use nanosheets instead of nanowires).
Figure 2: Possible transistor-level scaling path includes 3D transistors (Image source: IMEC)
The interesting development here is that to maintain sufficient transistor channel confinement, we force ourselves to learn how to, in effect, stack transistors. The GAA-FET may help us along the path from 5nm to 3nm to 2nm, and if that comes to pass, then it is not a huge leap of faith to think that once we’re good at making stacked N-type nanowires next to stacked P-type nanowires, we could then stack N-type on P-type, essentially folding an inverter and/or an SRAM bitcell over on top of itself and creating one Moore’s Law node of shrink without reducing the patterned dimensions.
Figure 3: From stacked nanowire FETs to CFETs enabling transistor density scaling (Image sources: Left IMEC. Right: my 'hack' illustrating a CFET)
In Figure 2, the next step beyond the GAA-FET is labelled 'CFET', or 'complementary FET', since it contains stacked N and P nanowires. Furthermore, if we can pull off the CFET, there is a possibility that we could figure out how to make more than two layers of N/P FETs, and push transistor density even further without having to rely solely on pitch reduction.
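To put a rough number on "one Moore's Law node of shrink", here is a back-of-the-envelope sketch. It assumes the traditional rule of thumb that one node doubles transistor density, and it idealizes stacking as a clean n-fold density gain with no area lost to inter-layer contacts or routing. The function names are made up for this illustration.

```python
import math


def equivalent_pitch_scale(layers: int) -> float:
    """Linear pitch scale factor that would give the same density gain
    as stacking `layers` device layers at unchanged patterned dimensions."""
    if layers < 1:
        raise ValueError("layers must be at least 1")
    # Density goes as 1 / pitch^2, so an n-fold density gain
    # corresponds to scaling the pitch by 1 / sqrt(n).
    return 1.0 / math.sqrt(layers)


def equivalent_nodes(layers: int, density_gain_per_node: float = 2.0) -> float:
    """Number of 'nodes' of density scaling matched by stacking.

    ASSUMPTION: each node multiplies density by `density_gain_per_node`
    (2x is the classic rule of thumb, not a measured value).
    """
    if layers < 1:
        raise ValueError("layers must be at least 1")
    return math.log2(layers) / math.log2(density_gain_per_node)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, round(equivalent_pitch_scale(n), 3), round(equivalent_nodes(n), 2))
```

Under these idealized assumptions, folding N on P (two layers) matches a roughly 0.71x pitch shrink, or one node, and four layers would match two nodes. Real cells will give back some of that gain to the extra vertical connections.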
What I’ve described above is sequential processing of transistor channels (nanowires) on one wafer, 'monolithically'. You could also transfer a device layer grown on a second wafer and overlay it on a transistor you’ve already built on a first wafer, using the wafer bonding technologies we’ve already discussed. Then, if you can figure out how to process the rest of the wiring at low enough temperatures so as not to destroy the lower transistors and metal layers, you can complete a transistor stack in that manner. One advantage of this second approach that should not be discounted is that you can do some of your processing in parallel, on two different wafers, which will cut down on the total cycle time. And cycle times have become quite burdensome with the increasing process complexity of advanced technology nodes.
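The cycle-time argument can be sketched with a toy model. Everything here is hypothetical: the function names, the split into lower-layer, upper-layer, bonding, and wiring steps, and especially the day counts, which are placeholders rather than real fab data.

```python
def sequential_cycle_time(lower_days: float, upper_days: float,
                          wiring_days: float) -> float:
    """Monolithic flow: both device layers are built one after the other
    on the same wafer, then the remaining wiring is processed."""
    return lower_days + upper_days + wiring_days


def parallel_cycle_time(lower_days: float, upper_days: float,
                        bond_days: float, wiring_days: float) -> float:
    """Layer-transfer flow: the two wafers are processed at the same time,
    so only the slower one sets the pace; bonding and wiring follow."""
    return max(lower_days, upper_days) + bond_days + wiring_days


if __name__ == "__main__":
    # HYPOTHETICAL durations in days, chosen only to illustrate the shape
    # of the trade-off.
    lower, upper, bond, wiring = 30.0, 20.0, 2.0, 25.0
    print("sequential:", sequential_cycle_time(lower, upper, wiring))
    print("parallel:  ", parallel_cycle_time(lower, upper, bond, wiring))
```

The parallel flow wins whenever the bonding step takes less time than the faster of the two wafer flows, which is why the advantage grows as per-layer processing gets longer.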
CEA-Leti have chosen an approach that is somewhere in between these two methods for their "CoolCube" technology. They create transistors on a first wafer and a high-quality channel material on a second wafer. Then they bond the channel material from the second wafer to the first wafer using "smart cut" technology that has been well established in the SOI community. Since they haven't patterned any features on the second wafer, they are immune to any alignment issues during this wafer bonding step. This approach allows Leti to create a high-quality second channel layer (no thermal budget constraints), just like we described with the CNT FETs used by Stanford in their 3D-SoC demonstration.
Figure 4: Leti “CoolCube” stacked transistors (Source: CEA-Leti)
This method is not 100% sequential, but it is often also described as 'monolithic', which is fair if you are describing the end result: one piece of silicon containing two or more layers of transistors. And, like the GAA-FET example above, if you can do this one time then you might be able to do it multiple times and provide a transistor density roadmap beyond simple pitch scaling. This technique opens up more options in heterogeneity between transistor layers, but there are advantages and disadvantages to either method in terms of cost, process complexity, and transistor quality. A key factor in the cost benefit will be yield, but I'm not that concerned about yield for any of these techniques, given the complexity of what we already do on wafers. In both cases, CFET and CoolCube, the early results look encouraging, and I think some form of transistor stacking will complement conventional 2D 'Moore's law' scaling in the future.
What is more concerning to me than cost is the power/performance: 3D transistor stacking will increase the already burdensome parasitic resistance and capacitance found in nanometer transistors. This will present a stiff headwind against the benefit of shrinking the semi-global and global wiring between cells and blocks, and the final benefit cannot be known before detailed physical design analysis and benchmarking. There will also be thermal issues to consider, but they probably won't be as bad as you might think, given the local wiring density and the fact that a great deal of heat dissipates through the wires. But, all said, the potential benefit of continuing transistor cost scaling is probably out there...
Leveraging the third dimension promises to be a fruitful path to advance the power, performance and cost of a wide variety of future computing systems. We are already seeing impressive results utilizing 3D stacked chips, with both interposer-based '2.5D' and true 3D techniques. While most examples to date have been in specialized high-end products such as GPUs and image sensors, we should expect both TSV density and cost-effectiveness to continue to progress, eventually allowing us to partition and stack a broader range of solutions at the SoC level in order to provide benefit in terms of speed, bandwidth, and yield. More interestingly, 3D at the SoC level will give us a new ability to build SoCs from heterogeneous parts, unencumbered by what can be co-integrated with leading-edge CMOS. However, this new capability will present significant challenges in system-technology co-optimization because in many cases the resulting systems will look nothing like a standard CMOS-based SoC. Concurrently, there is the potential that we may be able to take the underlying circuits into the third dimension, either at the cell level and/or at the transistor level, although a significant amount of design enablement would be needed to unlock this final 'layer' of 3D benefit. In most cases these 3D embodiments are not mutually exclusive: it is possible that designers of future systems could leverage four or five types of '3D' at once. Systems are already being constructed with combinations of interposer-based '2.5D' and true 3D stacked chips, and in the future the piece being stacked could be constructed with 3D-SoC, 3D-IC, and/or 3D transistors. All told, we can see the potential to both advance Moore’s Law type scaling and create entirely new computing entities using the third dimension.