Critical Area Analysis and Memory Redundancy

This week, I’m off to present a paper on Critical Area Analysis (CAA) and Memory Redundancy at the 2010 IEEE North Atlantic Test Workshop (NATW) in Hopewell Junction, NY, just up the road from Fishkill. IBM is in Fishkill. IBM invented CAA in what, the 1960s? Venturing into IBM country to speak on Critical Area Analysis is kind of like being the court jester. I just hope they don’t say, “Off with his head.” :)

But seriously, it amazes me how little is known about this topic. There have been other papers on the subject; I’m merely bringing the topic up to date. I did come up with a way of writing out the formula that appears to be new, though the underlying principle is the same.

Memory redundancy is about having repair resources available to fix an embedded RAM, typically an SRAM. Whatever structure you have repair resources for can be thought of as a unit. You have to calculate the unrepaired yield for all units, then adjust for repair resources: add up the probabilities of having all units good, one unit bad, and so on, up to the largest number of bad units the repair resources can fix. Beyond that point, there are not enough repair resources to make repairs. Redundancy can make a dramatic difference in yield if the memories are large enough or defect rates are high enough. For details, see the Conference Proceedings of the 2010 NATW when they become available.
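As a rough illustration of the summation described above, here is a minimal Python sketch. It is not the formula from the NATW paper; it is the textbook binomial form, and it rests on simplifying assumptions: every unit has the same unrepaired yield, units fail independently, each spare repairs exactly one bad unit, and the spares themselves are always good. The function and parameter names are made up for this example.

```python
from math import comb


def repaired_yield(unit_yield: float, n_units: int, n_spares: int) -> float:
    """Probability that a memory with n_units is usable after repair.

    Assumptions (illustrative only): identical, independent units;
    one spare fixes one bad unit; spares are never defective.
    """
    if not 0.0 <= unit_yield <= 1.0:
        raise ValueError("unit_yield must be between 0 and 1")
    if n_units < 0 or n_spares < 0:
        raise ValueError("n_units and n_spares must be non-negative")

    unit_fail = 1.0 - unit_yield
    # You cannot have more bad units than units, so cap the sum there.
    max_repairable = min(n_spares, n_units)

    # Sum P(exactly k bad units) for k = 0 .. max_repairable:
    # all good, one bad, two bad, ... until the spares run out.
    # Note: for very large n_units the binomial coefficient can overflow
    # a float; a log-domain version would be safer in that case.
    return sum(
        comb(n_units, k) * unit_yield ** (n_units - k) * unit_fail ** k
        for k in range(max_repairable + 1)
    )
```

Comparing `repaired_yield(0.9999, 1024, 0)` with `repaired_yield(0.9999, 1024, 2)` shows what two spares would buy for a 1024-unit array under these assumptions. The inputs are purely illustrative, not foundry data.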

There are two extreme schools of thought on memory redundancy. One says, “Why bother? I can’t fix the logic, so don’t bother with the memories. Just pressure the foundry to reduce defect rates.” The other extreme says, “Redundancy is good. Put redundancy everywhere.” In between, the designer is either making an educated guess or just following the memory IP provider’s guidelines. Those guidelines may have been published when the process was new, and may be pessimistic. The only way to know for sure whether adding redundancy helps yield significantly, or is just a waste of chip area and tester time, is to do memory redundancy analysis tied to current foundry defect rates. If a design is going to go through a re-spin, either to a half-node or to add functionality, it may be the ideal time to ask, “Is the current memory redundancy scheme adequate, or is it overkill?” Calibre YieldAnalyzer has this capability. Once the design has been analyzed with redundancy in mind, the redundancy configuration can be adjusted and the yield quickly recalculated, which makes what-if analysis practical. It’s the best way to determine the optimal redundancy scheme based on actual foundry defect rates.

The downside of overdesign in this area is very real. Let’s say a large SoC is 50% embedded SRAM. If you add 2% to the area of each SRAM for redundancy, you have just increased the chip area by 1%. That’s roughly 1% fewer die per wafer over the entire life of the design. It had better be worth doing. There’s also tester time to consider. A chip tester is a large, expensive piece of hardware. Every millisecond it spends testing chips is accounted for. If you factor in the depreciation cost of that hardware over its lifetime, every extra millisecond the tester spends testing embedded memories and applying the repair resources adds to chip cost. Again, that cost is incurred over the entire life of the chip. Designers may have some idea of how much area they are adding, but the impact on good die vs. gross die may be missed without analysis. Designers probably have much less information about how redundancy overdesign impacts tester time and back-end manufacturing cost. I hope the NATW attendees can give me some perspective on that issue.
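The area arithmetic above can be written out as a small Python sketch. It assumes, as a first-order approximation, that gross die per wafer scale inversely with die area, and it ignores tester time, wafer-edge effects and the spares’ own defects. Under that assumption, redundancy only pays off in good die per wafer if it raises yield by a larger factor than it raises area. All function names here are hypothetical.

```python
def chip_area_increase(sram_fraction: float, sram_overhead: float) -> float:
    """Fractional chip area growth when each SRAM grows by sram_overhead.

    sram_fraction: share of the chip that is embedded SRAM (0.5 = 50%).
    sram_overhead: extra SRAM area added for redundancy (0.02 = 2%).
    """
    return sram_fraction * sram_overhead


def breakeven_yield_ratio(area_increase: float) -> float:
    """Minimum (yield with redundancy) / (yield without) to break even.

    Assumes good die per wafer is proportional to yield / die area.
    """
    return 1.0 + area_increase


def redundancy_pays_off(yield_without: float, yield_with: float,
                        area_increase: float) -> bool:
    """True if the yield gain outweighs the lost die per wafer."""
    return yield_with / yield_without > breakeven_yield_ratio(area_increase)


# The example from the text: 50% SRAM, 2% overhead per SRAM.
increase = chip_area_increase(0.5, 0.02)  # 1% more chip area
```

With a 1% area increase, a yield improvement from 80% to 90% clears the break-even ratio easily, while an improvement from 95% to 95.5% does not. Those yield figures are invented for illustration.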

Posted May 11th, 2010

About Simon’s DFM Corner

Technical aspects of DFM, DFM use models, applications and solutions.


2 comments on this post

Commented on March 20, 2012 at 7:51 am
By Duncan Walker

I just noticed this article. Probably the first discussion of critical area was by Bob Dennard in an IBM tech report in 1966. He also determined that defect size distributions were roughly 1/x^3. The first detailed exposition of critical area analysis was in Charlie Stapper’s May 1980 IBM Journal of R&D article on critical area analysis and its application to yield analysis and redundancy optimization in IBM DRAM. Their approach was to randomly generate (x,y) locations, and then randomly select disks from the size distribution, place them on the layout, and determine the resulting circuit failure, categorized as single bit, double bit in row, row, column, etc. I later automated that process with my VLASIC yield simulator.

Regarding redundancy overdesign, your discussion does not include the fact that you do not have to use the redundancy if yields are high, so there is no tester penalty in that case. Also, you are trading potential area penalty against yield risk. The other thing to keep in mind is that with current memories, some fraction of cells will be bad due to process variation, and you need the redundancy for that too.

Commented on October 7, 2013 at 11:20 am
By Simon Favre

Apologies. I just noticed your post after some delay. I agree with your comments. The CAA tool at Mentor does use foundry defect statistics, so it’s fairly easy to do some what-if analysis. If you do an analysis that tells you defect densities are so low that redundancy is not needed, then you save that area and tester time. A few years ago, I worked with a customer that put redundancy EVERYWHERE, even on small register files where it was a complete waste of area. Using CAA, it was shown that if they had added redundancy to only the 4 largest memory types in the design, the yield would have been the same.

I referenced Stapper in my NATW paper. IBM also did a lot of work on defect clustering, which means that the distribution is not entirely random. The problem with clustering is getting the data to arrive at an accurate number for the clustering coefficient. Most foundries today use a simple Poisson yield model because of this.
