Dock Brown on Succeeding at Failure Analysis


Dock Brown of DfR Solutions gave a keynote speech at SMTAI, “Requirements for Both Cleaning and Coating to Building Medical Hardware.” Barry Matties and Happy Holden sat down with Dock to discuss the current trends he sees in failure analysis, the concept of “rules versus tools,” and how predictive engineering software used early in the design cycle can help predict failures in components and microvias and drive cost down.

Barry Matties: Dock, can you just tell us a little bit about what DfR Solutions does?

Dock Brown: DfR Solutions is an international reliability consulting firm. There are basically three aspects to the technical portion of our business. We do 1,200–1,300 failure analyses a year. The value proposition that we offer to our clients is that we have a whole suite of on-staff Ph.D. materials science folks. These people are experts in chemistry, physics, and metals. What we try to do first is to understand the customer's problem because most of the time, they don't just want a quick cross-section and a picture; they want us to get to the root cause of the root cause of the root cause.

We first try to thoroughly understand the problem or set of problems the customer is facing, and then when we acquire the samples, we treat those very gingerly. We make sure that we have both a damaged sample and an exemplary sample because oftentimes, you want to do an A/B comparison between the two. Before any careful disassembly or cross-sectioning, we do non-destructive kinds of things first. Usually, you can see the issue there, and even if you can't see the problem non-destructively, it will help you focus in on it. For example, if you have a 600-ball ball grid array (BGA) with an issue, we can't afford to cross-section every single one of those little guys; we have to focus on the ones that are most likely to be problematic. Once we get down to the failure in the BGA ball, the question is where that fracture is occurring: is it in the metallization or the ball itself, or is it even a pad cratering kind of issue?

The second part of our business is that we sell a software package called Sherlock that does reliability physics analysis. We're applying mathematical modeling to all the various complex material structures that are involved in any kind of electronics applications. You have, in the case of a circuit board, polymers, metals, ceramics, etc. All of these materials have very different properties, such as stress and strain properties, fracture mechanics, etc.

The Sherlock software can model all that. We combine the modeling with the failure analysis work to do what the people in the automotive industry, for example, are calling "shift left." The alternative is waiting until the last portion of the product development cycle, where you're doing low-rate initial production and then verification, validation, and qualification testing; if you discover problems there, it's very expensive. What shift left means is that they try to do pre-failure analysis and reliability physics modeling right at the schematic capture and board layout portion of the design development sequence.

Matties: Predictive engineering, basically.

Brown: Exactly. We can model all those stresses, strains, and ruptures you're looking for while it's still in software. If you're going to have problems, you want to find those early. In the early part of my career, I worked at Rocket Research Corporation, which made rocket engines for satellites and things like that. The watchword there was if you had a problem, where and when do you want to find it? Do you want to find it when it shows up or when it blows up? In the rocket business, sometimes blowing up is literal. That's an analogy for the solution set that we're trying to offer people nowadays. You can do that shift left kind of thing and find the problem when it shows up, as opposed to when it blows up in testing.

Matties: That’s a new shift in thinking though. Do you see more companies looking for this?

Brown: More and more people are looking at it because it’s a heck of a lot less expensive. You find issues earlier, and time is money. That’s particularly true in the case of automotive. The automotive industry is working on new standards, not only for the circuit board applications, but also within the integrated circuits themselves. As geometries are shrinking in the integrated circuits, as well as at the board level, the CMOS properties inside memory elements and processors are not as long-lived as they used to be.

In the old days, people would say, “It’s not a tube anymore, it’s a transistor; it’s all solid state, nothing moves, and nothing will break.” What we’re finding is that as those geometries shrink in integrated circuits, at the atomic level, things really do move. We can now model that movement mathematically, for example, electromigration in the metallization of the integrated circuit. If semiconductor manufacturers are going to have a problem with electromigration, we can help them figure that out before they go into production. In the case of end users in automotive, which is such a harsh environment, if you’re going to have those kinds of issues, you want to be able to figure that out.
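Editor's note: As a rough illustration of what modeling electromigration mathematically can look like, the sketch below uses Black's equation, a widely used empirical model in which median time to failure scales as J^-n * exp(Ea / kT). This is not taken from the interview and is not a description of how Sherlock works. The function name, the current-density exponent n = 2, and the activation energy of 0.7 eV are illustrative assumptions; real values depend on the metallization and must come from test data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def black_acceleration_factor(j_use, j_stress, t_use_c, t_stress_c,
                              n=2.0, ea_ev=0.7):
    """Ratio of median life at use conditions to median life at stress
    conditions, per Black's equation: MTTF = A * J**-n * exp(Ea / (k*T)).

    j_use, j_stress : current densities (any consistent unit)
    t_use_c, t_stress_c : conductor temperatures in degrees Celsius
    n, ea_ev : assumed exponent and activation energy (hypothetical defaults)
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    # Higher current density shortens life by (J_stress / J_use)**n.
    current_term = (j_stress / j_use) ** n
    # Higher temperature shortens life through the Arrhenius term.
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    )
    return current_term * thermal_term


if __name__ == "__main__":
    # Example: same current density, 85 C in use versus 125 C under stress.
    af = black_acceleration_factor(1.0, 1.0, 85.0, 125.0)
    print(f"Acceleration factor: {af:.1f}")
```

A factor greater than 1 means the stress condition is expected to wear out the interconnect that many times faster than the use condition, which is the kind of comparison that lets a problem be found in software rather than in qualification testing.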

Matties: Obviously, it’s driven by the cost of failure, right? The higher the cost, the more likely they are to look at it on the front end?

Brown: It’s also the consequence of failure.

Matties: That’s part of the cost, for sure. As technology changes, I would think that you’re seeing new types of failures. What trends and failures are you seeing?

Brown: One of the big surprises for all of us was a presentation at the SMTA Pan Pacific Microelectronics Symposium about three years ago.

Some scientists at RESRI and IMS in Europe were looking at what was going on with the radiation effects in COTS integrated circuits. For a long time, that wasn’t that big of a deal because most applications are terrestrial rather than up in the air, and we have the atmosphere protecting us. In this particular presentation, what the folks from Europe showed was an aircraft transiting the South Atlantic Anomaly.


The South Atlantic Anomaly is an area on the Earth’s surface where the level of cosmic radiation is quite a bit higher, and it has to do with how the Van Allen belt twists. The Van Allen belt is this ionizing belt that shields and screens Earth from cosmic radiation. There’s a small twist to the belt, and the South Atlantic just off the coast of Brazil has a much higher total ionizing radiation dose than anyplace else on the surface of the planet. There was an airplane that was transiting through that area, and there was a disruption in the flight control system; thankfully, they were able to recover, and the only thing that happened was some overhead bins popped.

To read this entire interview, which appeared in the November 2018 issue of Design007 Magazine, click here.



Copyright © 2023 I-Connect007 | IPC Publishing Group Inc. All rights reserved.