Simon’s DFM Corner
I posted the following reply to Daniel Nenni’s article on TSMC 28nm yield:
“I agree that design teams need to take more ownership of the yield issue. Unfortunately, yield is such a sensitive topic that people only talk about it when it’s bad! The defect density vs. die size and yield curves above represent the simplest area-based yield model, based on an average across many designs, or many measurements of test structures. The fact is, one design can yield differently from another purely due to defect-limited yield. That’s what critical area analysis is for. Design teams that do critical area analysis and take corrective action tend not to have defect- or process-limited yield issues. They actively reduce the sensitivity of their design to random defects. A design with a high sensitivity to random defects will yield poorly on the same process where a design properly tuned for critical area will yield better. In that sense, a high sensitivity to random defects is a design-limited, or design-induced, yield issue. The thing is, unless the companies that have poor yields publicly discuss the sources of yield loss, whether due to OCV, power density, voltage drop, random defects, or any of a host of other causes, the industry as a whole will not gain anything from the painful learning they went through.”
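For readers unfamiliar with it, the "simplest area-based yield model" mentioned above is commonly the Poisson model, Y = exp(-D0 * A). A minimal sketch (the defect density value is illustrative, not a real foundry figure):

```python
import math

def poisson_yield(defect_density: float, area_cm2: float) -> float:
    """Simplest area-based yield model: Y = exp(-D0 * A).

    defect_density: average fatal defects per cm^2 (D0)
    area_cm2: die area in cm^2
    """
    return math.exp(-defect_density * area_cm2)

# Two dies on the same process (same D0) but different sizes
# illustrate how yield falls off with area:
d0 = 0.5  # defects/cm^2 -- illustrative only
small = poisson_yield(d0, 0.5)
large = poisson_yield(d0, 2.0)
```

Critical area analysis in effect replaces the raw die area A with the design's critical area, the area actually sensitive to a defect of a given size, which is why two designs of the same size on the same process can yield differently.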
Main SemiWiki article:
Here we go again, that wonderful time of year. No, not Christmas, DAC! Now is the time of year when all the EDA vendors are scurrying about finishing whatever it is they plan on doing at DAC. Since the business climate seems to be improving, it would be nice to see an increase in attendance. I like San Diego; I think it’s a great place to have DAC. Some think DAC is going the way of the dodo, what with blogs, Facebook, Twitter, and whatever else seems more current. Is DAC pointless now? I say no. Sometimes you just gotta get out there and press the flesh. DAC is as much about person-to-person networking as it is about press releases and demos. DAC is a place to connect with people you might not regularly connect with, in spite of all the opportunities to connect in cyberspace. It’s just more personal to greet an old colleague or sit in on a discussion and ask a direct question than it is to view something dry and anonymous on the web. Just my opinion, FWIW.
Make sure to attend some of the DFM sessions in the Mentor booth at DAC. There will be a daily 10 AM briefing “DFM State of the Art at 20nm” by Jean-Marie Brunet Monday-Wednesday. On Monday at 11:00 AM, there will be a panel discussion “DFM Goes Mainstream … Are You Ready?” The Tessent product group is presenting “A New Method to Accelerate the Yield Ramp” Monday at 3:00 PM and Tuesday at 2:00 PM. All of the Mentor booth sessions will offer some rich content you might be interested in, whether it’s 3D IC, Electrical Rule Checking (PERC), Calibre, or Olympus. See you there.
Just back from DAC in Anaheim. The last true Denali party was a smash. We’ll see if the tradition continues under new management. Nothing amazing to report about the show. Attendance seemed OK, though not stellar.
Here’s an observation some may find interesting. I worked booth duty at both the TSMC OIP pavilion and the Global Foundries Global Solutions pavilion. At TSMC, the attendees needed to get something like 8 stamps on a card to get the nice giveaway TSMC was providing. At GF, there were only 3 stamps. It turns out that when people have to collect a large number of stamps for one item, they will not stand at your booth any longer than it takes to get the stamp and maybe fill out a contact card to be entered in another drawing. When there are fewer stamps to collect, people will actually stay for several minutes and listen to a pitch. Sure, there were some people just wanting a free introduction to a topic outside their main area, but that’s OK. It’s good to spread the word, particularly about DFM. Some people were probably just being polite to listen before collecting the stamp, but still they listened. When there are too many stamps to collect, you don’t even get that. Bottom line, if you’re giving stuff away, you can’t give it away fast enough. :=)
How would I change this? Require fewer stamps, but require attendees to stand still for five minutes and hear a brief pitch to earn each stamp. Maybe if it’s interesting and pertinent, they’ll stay longer than required, but they should have to stay at least a little while to get a stamp. The attempt to get more partners into the pavilion, and to push attendees past more partners, actually works against the partners. I had far more interested people listening and discussing at GF than at TSMC. Just my observations.
This week, I’m off to present a paper on Critical Area Analysis and Memory Redundancy. It’s at the 2010 IEEE North Atlantic Test Workshop in Hopewell Junction, NY, just up the road from Fishkill. IBM is in Fishkill. IBM invented CAA in what, the 1960s? Venturing into IBM country to speak on Critical Area Analysis is kind of like being the court jester. I just hope they don’t say, “Off with his head.”
But seriously, it amazes me how little is known about this topic. There have been other papers on the subject. I’m merely bringing the topic up to date. I did come up with a way of writing out the formula that appears new, though the underlying principle is the same. Memory redundancy is about having repair resources available to fix an embedded RAM, typically an SRAM. Whatever structure you have repair resources for can be thought of as a unit. You have to calculate the unrepaired yield for all units, then adjust for repair resources. You have to add in the probabilities of having all units good, one unit bad, and so on until there are not enough repair resources to make repairs. It can make a dramatic difference in yield if the memories are large enough or defect rates are high enough. For details, see the Conference Proceedings of the 2010 NATW when they become available.
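The unit-based calculation described above can be sketched as a binomial sum, assuming independent, identical units where each spare can repair one bad unit (a simplification of real row/column repair schemes; the numbers below are illustrative):

```python
from math import comb

def repaired_yield(unit_yield: float, n_units: int, n_repairs: int) -> float:
    """Yield of n identical units when up to n_repairs bad units can be fixed.

    Sums the probabilities of 0, 1, ..., n_repairs bad units (binomial),
    assuming independent, identically yielding units.
    """
    y, q = unit_yield, 1.0 - unit_yield
    return sum(comb(n_units, k) * y ** (n_units - k) * q ** k
               for k in range(min(n_repairs, n_units) + 1))

# 64 repairable units at 98% unrepaired yield each:
base = 0.98 ** 64                     # no repair resources
fixed = repaired_yield(0.98, 64, 2)   # two spares available
```

With no spares the product of unit yields can be dramatically low, and even a couple of repair resources recover most of it, which is the effect the paper quantifies.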
There are two extreme schools of thought on memory redundancy. One says, “Why bother? I can’t fix the logic, so don’t bother on the memories. Just pressure the foundry to reduce defect rates.” The other extreme says, “Redundancy is good. Put redundancy everywhere.” In between, the designer is either taking an educated guess, or just following the memory IP provider’s guidelines. Those guidelines may have been published when the process was new, and may be pessimistic. The only way to know for sure if adding redundancy helps yield significantly, or is just a waste of chip area and tester time, is to do memory redundancy analysis tied to current foundry defect rates. If a design is going to go through a re-spin, either to a half-node, or to add functionality, it may be the ideal time to ask, “Is the current memory redundancy scheme adequate, or is it overkill?” Calibre YieldAnalyzer has this capability. If you analyze the design with redundancy in mind, the redundancy configuration can be adjusted and the yield rapidly recalculated to facilitate a what-if analysis. It’s the best way to determine the optimal redundancy scheme based on actual foundry defect rates.
The downside of overdesign in this area is very real. Let’s say a large SOC is 50% embedded SRAM. If you add 2% to the area of each SRAM for redundancy, you just increased the chip area 1%. That’s 1% fewer die per wafer over the entire life of the design. It better be worth doing. There’s also tester time to consider. A chip tester is a large, expensive piece of hardware. Every millisecond it spends testing chips is accounted for. If you factor the depreciation cost of that hardware over its lifetime, every extra millisecond that tester spends testing embedded memories and applying the repair resources adds to chip cost. Again, that cost is over the entire life of the chip. Designers may have some idea of how much area they are adding, but the impact on good die vs. gross die may be missed without analysis. Designers probably have much less information about how redundancy overdesign impacts tester time and back-end manufacturing cost. I hope the NATW attendees can give me some perspective on that issue.
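The area arithmetic above is easy to check, and extends to a first-order estimate of the loss in gross die per wafer:

```python
def die_area_overhead(sram_fraction: float, sram_area_increase: float) -> float:
    """Fractional die-area increase from growing only the SRAM portion."""
    return sram_fraction * sram_area_increase

# The example from the text: a die that is 50% SRAM, with 2% added
# to each SRAM for redundancy, grows 1% overall.
overhead = die_area_overhead(0.50, 0.02)   # 0.01, i.e. 1%

# To first order, gross die per wafer scales inversely with die area,
# so a 1% larger die means roughly 1% fewer die per wafer:
gross_die_loss = 1 - 1 / (1 + overhead)
```

Whether that 1% is worth paying depends on how much yield the redundancy buys back, which is exactly what the what-if analysis is for.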
If attendance at the TSMC Technology Symposium in San Jose is any indicator, the economy must be improving. Official attendance was said to be 1500, but it felt like more. Of course, there were a lot of TSMC staff milling about, and they did have 2 of the booths in the vendor area for their own offerings, but there were still a lot of people there. The thing EDA vendors like about the TSMC event is that all the attendees are pre-screened. All of them have some business relationship with TSMC, which usually means they either buy silicon, or they are in the IP or services business. That means if somebody expresses interest in one of your products, there is potential business there. Of course, they may have absolutely no signature authority, but it all helps. The cost to put up a booth there is low enough that it only takes one or two meaningful discussions to make it worth doing. Things must be looking up if that many people are interested enough to attend. I would say there was a broad cross-section of people there, from big digital houses to analog/mixed-signal to small design houses. It was a good event.
The recent earthquake in Taiwan did have an impact on TSMC’s production, but probably not as great as some in the press have indicated: TSMC loses 40K wafers in quake. This blog gives a more careful reading of the announcement. What it comes down to is “1.5 days loss of wafer movement for the company in total.”
Regardless of whether or not they had to scrap any wafers, 1.5 days of downtime is 1.5 days of not producing wafers in facilities that rack up an obscene amount of depreciation whether or not they are producing. It’s not actually 1.5 days for every facility they have; they averaged it out. It was obviously worse in Tainan, with Hsinchu barely affected. 1.5 days for the company as a whole is roughly 1.6% of a quarter. That lost time is revenue delayed, though probably not lost. If they’re not already at capacity, they can probably make up the missing wafers for their quota. In the end, it comes down to, “How many wafers did you ship this quarter? How much did you get for them?” As I said, revenue delayed, not lost, but there is a cost associated with that.
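A quick check of the 1.6% figure, treating a quarter as a quarter of an average calendar year:

```python
# Average days in a quarter (365.25 / 4)
QUARTER_DAYS = 365.25 / 4

# 1.5 days of company-wide lost wafer movement as a fraction of a quarter
lost_fraction = 1.5 / QUARTER_DAYS   # roughly 0.016, i.e. ~1.6%
```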
Taking liberties with Latin and Caesar’s “Veni, vidi, vici” line, I can say “Veni, vidi, steti.” I came, I saw, I stood. :=) While the main Mentor booth seemed to be quite busy the whole time, I was elsewhere working booth duty at the TSMC OIP pavilion. It was a nice, open space kind of like the vendor area at a TSMC tech forum. The TSMC booth was very busy on Monday, with a lot of people representing that large company known as “Self” (groan), but the rest of the week was very light. People did come through in waves as general sessions ended. I was there to give demos. I talked to a number of people who wanted a 5-minute education on DFM. That’s OK, because we need more people to know about DFM, but 5 minutes is only enough to convince them that either they need to know more, or it’s something they don’t need at this time. I gave maybe 4-5 real demos to seriously interested people all week. I had 6 demo time slots per day, so yes, I stood around most of the time.
I had two good meetings with current DFM customers, one scheduled, one impromptu. Both meetings were of high value to all involved. When traffic is slow, the real value of DAC is being able to meet with people you might not otherwise have in one place and time. That’s getting quality time with quality people.
The one place at DAC I would not like to have been is the Magma booth. On Monday, as the show opened, there was buzz about the financial community raising “going-concern” doubts. The Magma booth at the TSMC pavilion was empty most of the week. It wasn’t even staffed full-time. The main Magma booth was extremely small, given their history. Remember the volcano? Those days are long gone. Will Magma survive the downturn? Only time will tell.
In contrast, TSMC’s neighbor at DAC was Apache Design Solutions, which seemed to be doing very well. Their traffic was helped by having a great giveaway (plush bear toy). They seemed to be having a very productive and busy DAC. On the other hand, they are still a private company, and people who have been there a while still haven’t collected anything on their stock options. Who’s going to go public in this climate?
Some people at DAC were wondering if DAC will survive. Some wondered if EDA will survive. One person even projected that EDA will be dead and gone in 5 years. I don’t believe that. I think EDA will surely survive. Just think of all those Calibre licenses! :=) I don’t know about DAC, though. It certainly doesn’t provide as much value as it used to. I happen to like DAC, I’ve worked quite a few. Let’s see what happens next year. It’s probably a safe bet that Anaheim in June will at least be warmer than San Francisco in July.
TSMC and Mentor Graphics recently held a joint marketing seminar (06/25/09) for mutual customers to go over the new DFM requirements at 45/40 nm. (In my first post, I mused about the implications of making some DFM analysis steps mandatory.) When the presentations at the seminar ended and the Q&A began, two customers basically asked the same question: “What do you mean by mandatory?” Of course, TSMC wasn’t going to stand over them and say, “Mandatory means mandatory, what part of mandatory don’t you understand?” :=) TSMC admitted that they haven’t changed the tape-out checklist to forcibly include the DFM checks (CMP and LPC). Which brings us back to the question: what do you mean by mandatory?
What it really comes down to is this: if you tape out without these checks at 45/40 nm, you are taking a risk. If the design has yield issues and you didn’t run these checks, TSMC might wave the design off and insist that you take ownership of the yield issues. This is a huge risk. If the part comes out and has zero yield, and they find out it’s because of a level 1 hotspot in litho, or bridging due to CMP that you didn’t check for, then you have to eat the cost of a respin. Most design teams doing large designs count on a certain number of respins before full production anyway, but having to do one before you have any functional parts is a disaster. If you get parts that yield, but yield poorly, it can be just as bad, because these things can take a long time to find using traditional Low Yield Analysis, or FA. As someone who once had to respin a custom design due to a flaw in the incoming spec, I can tell you the worst thing management can ask you before a respin is, “Are you sure that’s all that’s wrong with the design?” That one is guaranteed to cause sleepless nights.
TSMC pointed out that they have very little history on the 45/40 process at this time. This means there might be yield issues, and there is a definite need to do DFM analysis at this point in time. TSMC also pointed out that DFM analysis might not be mandatory once the process is considered stable. I think that by that time, the next process node will be in the hands of the early adopters, and DFM will be mandatory for that node, so the need for DFM won’t be going away anytime soon. According to this article and this article on the web, the yield issues are real. Considering the total cost of developing a chip in 45/40 nm and the risks of really low yields, DFM tools seem like cheap insurance.
Conclusion? Buy DFM tools from Mentor Graphics. You’ll sleep better. :=)
In my previous post about TSMC making some DFM analysis steps mandatory at 45nm, I ended with a question about why the foundries can’t just write better design rules (and rule decks) to make sure all designs yield well. This is a topic that has been discussed elsewhere, but here’s my take on it.
If we take a step back for a moment, there is something generic about DFM analysis that needs to be considered. Each type of DFM analysis has a characteristic “sphere of influence.” For CMP analysis, the analysis window size is around 20um. That’s large compared to a standard cell. For Critical Area Analysis (my favorite tool), the analysis scope is the size of the largest random particle to be considered, typically anywhere from 2um to 10um in diameter. For Lithography analysis (LFD), the scope is a little smaller, roughly 1-2um.
How does this get back to rules? What’s the scope of a generic DRC rule? A shape all by itself, or a shape within a shape, or a shape and its nearest neighbors. In common practice, that’s about it in DRC land. Mentor has eqDRC, an extension of the Calibre nmDRC product that allows you to write equations to express rules instead of using fixed values, but you still can’t easily get past the nearest neighbor in a DRC rule. Yes, you can write complex rules to go one or two shapes past the nearest neighbors, but the complexity of the rule and its runtime go up exponentially the farther you try to go from the original shape. What this means is that it will be difficult, if not impossible, to write rules that take into account enough of the context of the shape being checked. DFM tools automatically take context into account. That’s the big advantage of model-based over rule-based analysis.
At this point, DRC is “necessary but not sufficient.” The more “not sufficient” it becomes, the greater the need for DFM tools that see the extended context of all shapes in the design that are close enough to have any adverse effect. Of course, the farther upstream you find and fix a DFM issue, the easier it is to fix. That’s why I expect this trend of pushing the designers to do DFM analysis to continue, and for more foundries to follow TSMC’s lead.
By the way, the comment about DRC being necessary but not sufficient is not meant to belittle DRC. DRC is, of course, mandatory at all nodes. DRC is also absolutely necessary in a DFM flow because, for one thing, all DFM tools assume the design to be (essentially) DRC clean. If you get too far outside the allowed range of analysis in DFM because the design is not clean, the results can be inaccurate.
Ah, but Restricted Design Rules (RDRs) are going to fix everything, right? We won’t even need DRC or DFM anymore, right? Not so fast…
If you are a TSMC customer, no doubt you have heard that TSMC has announced that for 45nm (and presumably beyond), LPC and VCMP are mandatory for block/chip. What does this mean? It means that TSMC is requiring lithography and planarity analysis for all 45nm designs. Their website says customers can either run it themselves, or contract TSMC services to do it. The most cost-effective way would be for the customers to run it themselves, but some might not have the resources to do that. Of course, by the time you pay TSMC to do it 3 or 4 times, you could have bought some tools and run it yourself. That’s good for Mentor and other EDA vendors, right? Probably, but there has to be more to it than that.
So, what’s really behind this? TSMC isn’t generally known for making things easy for EDA vendors. Why make a new step mandatory like DRC is? Is it because they’re having real yield issues at 45nm, and they want the customers to find and fix issues themselves? Possibly. Is it because the DFM tools are finally mature enough to be a required part of the flow? Well, maybe they are, but that’s probably not the reason.
I think it has to do with money. As they say, “Follow the money.” Having low-yielding parts in the fab doesn’t do anybody any good. Most TSMC customers buy wafers at a pre-negotiated price. If the part yields poorly, the customer will likely have to buy more wafers to make up the volume, and will try to renegotiate the price. How is it bad for TSMC if they buy more wafers? Because that makes TSMC’s production starts more unpredictable. A small company with one product could go out of business if good die are costing them too much as a result of low yield. Orders from medium-sized companies could fluctuate wildly. That would really make things unpredictable for TSMC. Low yield would also hurt TSMC’s reputation. They like being #1 in the foundry business. They like being thought of as the best. Having lots of customers complaining about price and yield puts that at risk. Not only that, but to resolve low yields, TSMC would have to devote more resources to these problem chips, which would cost them real money. Even worse, some large customers actually buy good die. Low yield for those customers would directly impact TSMC’s bottom line, as TSMC would have to make up the difference. Follow the money. Having happy customers who sell more product, make more money, and come back for more high-yielding wafers probably makes the most sense for TSMC. The trend seems to be to make the customers more responsible for DFM. Expect other foundries to follow suit.
So why can’t the foundry just write better rules to make sure that all designs yield well? Hmmmm….
About Simon’s DFM Corner
Technical aspects of DFM, DFM use models, applications and solutions.