In the past 40 years, Moore's Law has grown beyond its original intentions. As a result, it is in danger of losing its meaning, and possibly its usefulness. But questions about the law can be addressed by looking at its history. In doing so, the law can be reformulated and described using learning curve theory — making it more relevant today. The coupling of technology innovations to the economic drivers of semiconductor industry growth will allow us to understand the important role of past and future lithography innovations in the continuation of Moore's Law.
The impact of the semiconductor integrated circuit (IC) on modern life is hard to overestimate. From computers to communication, entertainment to education, the growth of electronics technology, fueled by advances in semiconductor chips, has been phenomenal. The impact of these developments has been so profound that it is now often taken for granted: consumers have come to expect increasingly sophisticated electronics products at ever lower prices.
Underlying the electronics revolution has been a remarkable evolutionary trend called Moore's Law. Begun as a simple observation that the number of components integrated onto a semiconductor circuit doubled each year during the early life of the industry, Moore's Law has come to represent the amazing and seemingly inexhaustible capacity for exponential growth in electronics.
History of Moore's Law
The evolution of semiconductor technology from crude single transistors to million-transistor (and now billion-transistor) microprocessors and memory chips is a fascinating story. One of the first "reviews" of progress in the semiconductor industry was written by Gordon Moore, industry icon and a co-founder of Fairchild Semiconductor and Intel. Only six years after the introduction of the first commercial planar transistor in 1959, Moore observed an astounding trend: the number of components/chip was doubling every year, reaching ~60 components/chip in 1965 (Fig. 1). Extrapolating this trend for a decade, Moore predicted that chips with 65,000 components would be available by 1975! This observation of exponential growth in circuit density has proven to be one of the greatest examples of prescience in modern times.
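The arithmetic behind this extrapolation can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and starting from a single component in 1959 is an idealization chosen so that six annual doublings give 64 components in 1965, close to the ~60 quoted above.

```python
def extrapolate_components(start_count, start_year, end_year, doubling_period_years=1.0):
    """Project a component count forward, assuming a fixed doubling period."""
    doublings = (end_year - start_year) / doubling_period_years
    return start_count * 2 ** doublings

# Idealized starting point (an assumption): one component in 1959.
count_1965 = extrapolate_components(1, 1959, 1965)   # six doublings
count_1975 = extrapolate_components(1, 1959, 1975)   # sixteen doublings

print(f"1965: {count_1965:,.0f} components/chip")
print(f"1975: {count_1975:,.0f} components/chip")
```

Sixteen annual doublings give 2^16 = 65,536, which is the origin of the round figure of 65,000 components for 1975.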
Some important details of Moore's remarkable 1965 paper have become lost in the lore, however. First, Moore described the number of components/IC, which included resistors and capacitors, not just transistors. Later, as the digital age reduced the predominance of analog circuitry, transistor count became a more useful measure of IC complexity. The accuracy of Moore's extrapolation depends on this definition and a switch to transistor count must necessarily involve a discontinuity in Moore's original trend line. Further, Moore clearly defined the meaning of the "number of components per chip" as the number which minimized the cost/component. For any given level of manufacturing technology, one can always add more components — the problem being a reduction in yield and thus an increase in the cost/component. As any modern IC manufacturer knows, cramming more components onto ICs only makes sense if the resulting manufacturing yields allow costs that produce more commercially desirable chips. This "minimum cost per component" concept is in fact the ultimate driving force behind the economics of Moore's Law.
Figure 1. Moore's 1965 prediction of the doubling of the number of components on a chip each year, based on historical data and extrapolated to 1975.
In 1975, Moore revisited his 1965 prediction and provided some critical insights into the technological drivers of the observed trends. Checking the progress of component growth, the most advanced memory chip at Intel in 1975 had 32,000 components (but only 16,000 transistors). Thus, Moore's extrapolation by three orders of magnitude was off by only a factor of 2. More important, Moore divided the advances in circuit complexity among its three principal components: increasing chip area, decreasing feature size, and improved device and circuit designs. Minimum feature sizes were decreasing by ~10%/yr (resulting in transistors that were ~21% smaller in area, and an increase in transistors per area of 25%/yr). Chip area was increasing by ~20%/yr. These two factors alone resulted in a 50% increase in the number of transistors/chip/year. Design cleverness made up the rest of the improvement (33%). In other words, the 2× improvement ≈ (1.25)(1.20)(1.33).
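The decomposition above is simple enough to check numerically. The sketch below multiplies the three annual factors quoted in the text; the helper that converts a linear shrink into a density gain assumes that transistor area scales as the square of the feature size, which is a simplification, and all names are illustrative.

```python
def density_gain_from_shrink(linear_shrink_per_year):
    """Density multiplier if area scales as the square of the feature size."""
    linear_scale = 1.0 - linear_shrink_per_year
    return 1.0 / linear_scale ** 2

# Annual factors as quoted in the text.
density_factor = 1.25     # more transistors per unit area
area_factor = 1.20        # larger chips
cleverness_factor = 1.33  # device and circuit design improvements

litho_and_area = density_factor * area_factor
total = litho_and_area * cleverness_factor

print(f"feature size and chip area alone: {litho_and_area:.2f}x per year")
print(f"all three drivers together:       {total:.3f}x per year")
print(f"density gain from an exact 10% shrink: {density_gain_from_shrink(0.10):.3f}x")
```

An exact 10% linear shrink gives a density gain slightly below 1.25, so the quoted figures are best read as rounded values.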
Another important detail in Moore's second observation is often lost in the retelling: the definition of minimum feature size. Moore explained that both the linewidths and the spacewidths used to make the circuits are critical to density. Thus, his density-representing feature size was an average of the minimum linewidth and the minimum spacewidth used in making the circuit. Today, we use the equivalent metric, the minimum pitch divided by 2 (called the minimum half-pitch). Unfortunately, many modern forecasters express the feature size trend using features that do not represent the density of the circuit as well as the minimum half-pitch.
By breaking the density improvement into its three technology drivers, Moore was able to extrapolate each trend into the future and predict a change in the slope of his observation. He saw the progress in lithography allowing continued feature size shrinks to "1µm or less." Ongoing reductions in defect density and increases in wafer size would allow the die area trend to continue, but in looking at the "device and circuit cleverness" component of density improvement, Moore saw a limit. Although improvements in device isolation and the development of the MOS transistor had contributed to greater packing density, he saw the latest circuits as near their design limits. Predicting an end to the design cleverness trend in 4–5 years, Moore predicted a change in the slope of his trend from doubling every year, to doubling every two years (Fig. 2).
Figure 2. Moore's second observation of 1975 showing his prediction of a change in slope, from doubling the number of components each year to doubling every two years.
The prediction of a slowdown was both too pessimistic and too generous, however. The slowdown from doubling each year had already begun by 1975 with Intel's 16Kb memory chip. The 64Kb DRAM chip, which should have been introduced in 1976 according to the original trend, was not available commercially until 1979. But the prediction of a slowdown to doubling components every two years instead of every year was too pessimistic. The 50% improvement in circuit density each year due to feature size and die size was really closer to 60% (according to Moore's retelling of the story), resulting in a doubling of transistor counts/chip roughly every 18 months. Offsetting the curve to switch from component counts to transistor counts and beginning with the 64Kb DRAM in 1979, the industry followed the "new" Moore's Law trend throughout the 1980s and early 1990s.
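The link between an annual growth rate and a doubling time follows from compound growth: if the count grows by a fraction g each year, it doubles after ln 2 / ln(1 + g) years. A minimal sketch, with an illustrative function name:

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a constant fractional growth rate per year."""
    return math.log(2) / math.log(1.0 + annual_growth)

for growth in (1.00, 0.60, 0.50):
    months = 12 * doubling_time_years(growth)
    print(f"{growth:.0%} per year -> doubling every {months:.1f} months")
```

A 100% annual increase doubles every 12 months by definition, 60% gives a little under 18 months, and 50% gives a little over 20 months, which is why the move from 50% to 60% matters for the "18-month" version of the law.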
After nearly 40 years, extrapolation of Moore's Law now seems less risky. In fact, predictions of future industry performance have reached such a level of acceptance that they have been codified in an industry-sanctioned "roadmap" of the future. The National Technology Roadmap for Semiconductors (NTRS) was first developed by the Semiconductor Industry Association in 1994 to serve as an industry-standard Moore's Law. It went international in 1999, becoming the ITRS, the International Technology Roadmap for Semiconductors. The 1994 Roadmap extrapolated the current trends to the year 2010, where 70nm minimum feature sizes were predicted to enable 64Gb DRAM chip production.
Recent industry trends certainly do not show a slowdown in Moore's Law — many observers talk about an acceleration — but will it continue for the 15 years extrapolated out in the last edition of the ITRS? To answer this question, a more careful look at the drivers of Moore's Law is required.
Why does Moore's Law work?
That Moore's Law has continued to survive over 40 years begs for an explanation. Some have argued that industry momentum simply pushes semiconductor technology forward. Others describe semiconductor technology development as "fashionable" engineering, attracting the brightest minds. Most regard the law as a self-fulfilling prophecy. We all understand the economic benefits of continuing down the roadmap, and the economic consequences of falling behind our competitors, so we make Moore's Law happen because we want it to be true.
Ultimately, the drivers for technology development fall into two categories: push and pull. Push drivers are technology enablers, those things that make it possible to achieve the technical improvements. Moore described the three push drivers as increasing chip area, decreasing feature size, and design cleverness. Pull drivers are the economic drivers, those things that make it worthwhile to pursue the technical innovations. Although the two drivers are not independent, it is the economic drivers that always dominate. As Bob Noyce, co-founder of Intel, wrote in 1977, "...further miniaturization is less likely to be limited by the laws of physics than by the laws of economics."
The economic drivers for Moore's Law are extraordinarily compelling. As the dimensions of a transistor shrink, the transistor becomes smaller, lighter, faster, consumes less power, and in many cases is more reliable. All of these factors make the transistor more desirable for virtually every possible application. But there is more. Historically, the semiconductor industry has been able to manufacture silicon devices at an essentially constant cost/area of processed silicon. Thus, as the devices shrink, they enjoy a shrinking cost/transistor. As many have observed, it is a life without tradeoffs. Each step along the roadmap of Moore's Law virtually guarantees economic success.
It is interesting to note that the most compelling benefit of Moore's Law, a better transistor at a lower cost, does not fundamentally rely on increasing the number of transistors/chip. The increased memory capacity and/or functional abilities of more complex chips enable new applications that increase the demand for chips. But this high-end driver does not account for the majority of chips produced. The ability to produce moderate functionality at incredibly low prices enables new mass markets (like the microprocessor running Linux in a microwave oven, or a dishwasher that has more compute power than existed in the world in 1950). For these applications, increased functions/chip is not important. Instead, increased circuit density, which drives down costs and improves chip performance, is an enabler for all applications.
Redefining Moore's Law
A look at the trend of DRAM chip deliveries in the last five years shows that the number of transistors/chip has not been growing at anywhere near the historical pace of the past few decades. Examining the number of transistors used by each microprocessor generation also shows that the historical Moore's Law, doubling the number of transistors every 18 months, has not been matched in microprocessor manufacture for some time. Simply put, the capability to make a large number of transistors on a chip has outstripped the market demand for those chips. Why aren't 4Gb DRAM chips in mass production today, as the historical Moore's Law trend would suggest? Quite simply, there is no mass market demand for such a chip. Why don't microprocessors use as many transistors as our manufacturing capabilities would allow? Because the microprocessor designers (the customers of those transistors) can't yet design a chip to use that many transistors.
So is Moore's Law dead? Not at all. Its greatest value comes from improved circuit density and transistor performance, not increased functions/chip. The law is not about scaling up; it is about scaling down. It is the shrinking transistor that creates the compelling economic advantages of Moore's Law. As long as transistor scaling continues, so does Moore's Law. This fact is implicitly coded into the latest editions of the ITRS. While the first NTRS named each technology "node" or generation after the DRAM generation that it enabled (e.g., the 64Mb node), recent roadmaps simply label the nodes by their lithographic feature size. Transistor scaling continues, and is now the only true measure of Moore's Law.
Moore's Law as a learning curve
The economic drivers for Moore's Law are clear and compelling but they explain why it exists, not how. One possible explanation comes from learning curve theory, a basic tenet of which is that a consistent improvement in performance of some task is possible through increasing practice. The learning curve expresses a constant percent improvement in some performance metric each time the cumulative number of trials, or practice attempts, is doubled. By plotting the performance metric of interest as a function of the cumulative output of a person, factory, or industry, learning curve theory predicts a straight line on a log-log scale.
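In its usual form, a learning curve is a power law: if each doubling of cumulative output multiplies the metric by a fixed ratio r, then the metric after N cumulative units is y(N) = y(1) · N^(log2 r), which is a straight line of slope log2 r on a log-log plot. The sketch below illustrates this; the function name, the starting value of 100, and the ratio of 0.8 are hypothetical numbers chosen only to show the behavior and are not industry data.

```python
import math

def learning_curve(cumulative_output, first_unit_value, ratio_per_doubling):
    """Metric after a given cumulative output, for a fixed ratio per doubling."""
    exponent = math.log2(ratio_per_doubling)  # negative when the metric shrinks
    return first_unit_value * cumulative_output ** exponent

# Hypothetical example: the metric falls to 80% of its value at each doubling.
for n in (1, 2, 4, 8, 16):
    print(f"cumulative output {n:>2}: metric = {learning_curve(n, 100.0, 0.8):.1f}")

# On log-log axes the slope between any two points equals log2(ratio).
slope = (math.log(learning_curve(16, 100.0, 0.8)) - math.log(learning_curve(2, 100.0, 0.8))) / (
    math.log(16) - math.log(2)
)
print(f"log-log slope: {slope:.3f}")
```

Because the slope is constant, fitting a straight line to log(metric) versus log(cumulative output) is the natural way to test whether real data follow a learning curve.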
The metric of interest is the transistor size, as noted above. But what is the measure of "practice," the cumulative output of the industry? One thing is certain, time should not be used as the independent variable. Although there can be some debate, I propose cumulative area of silicon produced by the industry as the most appropriate metric of industry output. Figure 3 shows this new formulation of Moore's Law as a learning curve, compared to the traditional time-based expression.
The historical trend (Fig. 3b) shows there is a roughly linear progression (on the log-log scale) of minimum feature size as a function of cumulative area of silicon produced by the industry. There is a distinct slowdown of the learning curve in the 1990–1995 timeframe, speeding back up to the historical trend by 1996–1998. It is interesting to contrast this observation based on the learning curve version of Moore's Law with the traditional view. The time-based view of the law (Fig. 3a) saw no slowdown in the early 1990s, and has shown an acceleration of Moore's Law over the last six years or so. Changed predictions of the law found in the 1994–2001 roadmaps, while often called an "acceleration" of Moore's Law, can be seen in the learning curve formulation as a correction back to the historical trend line from the slowdown of the early 1990s.
If the learning curve theory is viable, one would expect this new formulation to be more predictive of future trends. This formulation also makes it perfectly clear that silicon volume drives continued innovation and in turn, Moore's Law.
Coupling technology to economics: The limits of Moore's Law
While the role of economics in Moore's Law is important, continuation of the IC evolution to allow smaller and smaller features is dependent on technology development, not just economics. There is a critical technology/economy cycle that rolls down the slope of Moore's Law. A technological development that enables the cost effective manufacture of smaller transistors allows the manufacturer to offer a new, desirable product to the market place (faster, smaller, cheaper). An increase in performance or a decrease in cost, or both, creates a new market for the product, which increases the volume of sales. Higher sales volumes allow a percentage of those sales to be reinvested in the development of the next technology evolution (the cause and effect relationship of an industry learning curve). Though each technology generation requires an increasing investment for development, the higher sales volume driven by the newly enabled markets justifies the investment. Technology development feeds economic growth, which allows investment in further technology development.
But, as Moore himself has said many times, no exponential is forever. There are both economic limits and technology limits (no amount of money can be used to overcome the laws of physics). The economic limits are defined by the growing demand for more silicon. If demand growth slows, so will Moore's Law. On the technology side, increasingly costly manufacturing processes required by the smaller transistors may cause an increase in the historical manufacturing cost/area of silicon. Higher cost/function limits the potential for growth of new markets, which lowers the growth of cumulative silicon area, which slows the learning.
As an example, current manufacturing techniques push the absolute physical limits of what is possible to achieve in optical lithography. As we approach the "brick wall" physical limits of the imaging technology, the cost required to get an incremental improvement in performance rises exponentially. The economic limit is always reached before the physical limit (Fig. 4a). But innovation can move the brick wall, whether perceived or real, allowing us to jump to a more beneficial capability vs. cost curve. In optical lithography, the combination of phase shifting masks, off-axis illumination, and optical proximity correction has moved the brick wall of ultimate resolution lower by a factor of two. The new cost/capability curve is initially higher cost for a given capability than the previous curve, but as the capability required for manufacturing is raised, a cross-over point is reached where the new technology is more cost effective than trying to push the old technology closer to its limit. At this point, the wise lithographer makes the leap to the new innovation.
Each generation of process technology developed to enable one more node on the ITRS roadmap requires a host of new and expensive equipment and materials. In what has sometimes been called Moore's second law, the cost of new fabrication facilities seems to rise exponentially over time, yet the economic driver of the law requires significant reductions in the cost/transistor over time. In fact, the amazing economic reality is that the cost of producing 1cm² of finished silicon has remained approximately constant (or has risen only slowly) throughout the entire history of the semiconductor industry.
There have been three main avenues to control manufacturing costs/area of silicon in the presence of dramatically rising equipment and material costs: increasing wafer sizes, increasing yields, and improved equipment effectiveness. From the one-inch wafers used 40 years ago to today's 300mm wafers, this increase in wafer size takes advantage of the fact that some processing costs are essentially per wafer rather than per unit area. An increase in wafer size can actually reduce the processing costs/unit area of silicon. As the slow transition of the industry to 300mm wafers has shown, however, there is no guarantee that larger wafers will be more cost effective, and significant development effort is required to provide improved process quality over larger wafer sizes at reduced cost/unit area. It is also unclear whether wafers larger than 300mm will prove cost effective.
The second method for improving device costs is to improve the yield of the devices. It costs about the same to build a non-working device as it does to build a working device, so, all other things being equal, a process with 50% yield will have 2× the cost per finished, saleable device of a process with 100% yield. In the 1970s, yields of 20–40% for leading edge products were not uncommon. By the 1980s, 50–70% yields were the norm. By the 1990s, chipmakers came to expect 80–90% yields during volume production. While this trend has resulted in considerable cost improvements for the industry, there is little upside left with respect to yield. The emphasis today is on increasing the ramp to high yield, that is, decreasing the time from first silicon to 90%+ yield so that the average fab throughput of good devices is nearer its theoretical maximum.
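The yield argument reduces to a single division: the cost of every device built, working or not, is carried by the devices that can be sold. A minimal sketch, in which the function name and the unit build cost of 1.0 are assumptions and the yield values are taken from the ranges quoted above:

```python
def cost_per_good_device(cost_per_device_built, yield_fraction):
    """Cost carried by each saleable device at a given yield."""
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in the interval (0, 1]")
    return cost_per_device_built / yield_fraction

# Representative yields from the decades discussed in the text.
for era, yield_fraction in (("1970s", 0.30), ("1980s", 0.60), ("1990s", 0.85), ("ideal", 1.00)):
    cost = cost_per_good_device(1.0, yield_fraction)
    print(f"{era}: yield {yield_fraction:.0%} -> relative cost per good device {cost:.2f}")
```

The same calculation shows why little upside remains: moving from 85% to 100% yield lowers cost by only about 15%, whereas moving from 30% to 60% halves it.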
Overall equipment efficiency is the final, and possibly most significant, enabler for low-cost semiconductor manufacturing. By far the most important component of equipment efficiency is throughput. For example, a stepper in 1980 cost $500K, while a scanner today may run more than $10M. However, that 1980 stepper had a maximum throughput of 40 four-inch wph, while today's scanner is capable of processing 90 300mm wph. The result is a roughly constant equipment cost/cm² of processed silicon. There is still room for improvement. Fabs typically average actual throughputs in the lithography bay that are <0.5 of the theoretical maximum.
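The stepper and scanner figures can be compared directly by dividing tool price by the silicon area processed per hour. The sketch below uses only the numbers quoted above and takes $10M as the scanner price, although the text says "more than $10M". It ignores depreciation schedules, uptime, and operating costs, so it is a rough consistency check and not a cost model; the function names are illustrative.

```python
import math

def wafer_area_cm2(diameter_mm):
    """Area of a circular wafer in square centimeters."""
    radius_cm = diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2

def price_per_hourly_area(tool_price_usd, wafers_per_hour, diameter_mm):
    """Tool price divided by the cm² of silicon processed per hour at peak throughput."""
    return tool_price_usd / (wafers_per_hour * wafer_area_cm2(diameter_mm))

stepper_1980 = price_per_hourly_area(500_000, 40, 101.6)    # four-inch wafers
scanner_today = price_per_hourly_area(10_000_000, 90, 300)  # 300mm wafers

print(f"1980 stepper:    ${stepper_1980:,.0f} per (cm²/hour)")
print(f"current scanner: ${scanner_today:,.0f} per (cm²/hour)")
print(f"ratio: {scanner_today / stepper_1980:.2f}")
```

With these inputs the two tools come out within a few percent of each other: a 20-fold increase in price is matched by a roughly 20-fold increase in area processed per hour.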
Innovations in lithography
By redefining Moore's Law as a transistor density trend, the minimum lithographic feature size takes on the dominant role as the industry's technology metric. Of course, many other important factors such as overlay capability, gate oxide thickness, junction depth, etc., must scale with minimum feature size in order to gain the benefits of the transistor shrink. While each of these factors represents great technical challenges, cost-effective lithography has traditionally been the limiter in the progress of Moore's Law. Over the years, many innovations in optical lithography have moved the physical limits and kept the costs acceptable for ever improving capabilities. These innovations have included: wavelength reduction, increasing numerical apertures, resolution enhancement technologies, improved resist performance, reduced process variations, and advanced process control. More innovations are still possible, such as: wavelength reduction to 157nm, increasing numerical apertures to 0.9, immersion lithography, real equipment productivity that approaches the theoretical, improved process control, more extensive use of phase shift masks and other "hard" resolution enhancements, polarization control, and promulgation of lithography friendly designs.
By using innovations like those above, optical lithography can continue to meet industry needs for the foreseeable future. Developing capabilities is not enough, however. These capabilities must enable the required lithographic performance at the required price point.
Conclusion
Moore's Law is a direct consequence of the incredible and unique scaling laws of semiconductor devices. By making a transistor smaller, that device becomes better in every respect: smaller, lighter, faster, lower power, and cheaper. It also becomes more difficult to make, and that means a smaller device is cheaper to make only as a result of a concerted engineering effort to make it so. Moore's Law is not a law; it is an act of will. Considerable effort is devoted to its continuation because there is a strong economic incentive to do so.
Push drivers are the technology innovations that enable low cost manufacturing of smaller transistors. Pull drivers are the new applications that these smaller, faster, cheaper, more powerful devices enable. As learning curve theory implies, the importance of pull drivers is in the creation of increasing demand and thus increasing volume of silicon area. These two drivers, push and pull, are inextricably linked due to the relationship between capability and cost for the technology push, and the relationship between cost and demand for the volume pull. Any reduction in the force of the push or the pull drivers will result in a slowdown in the time-based Moore's Law.
The economic benefits of Moore's Law come from the shrinking of the transistor. That is why Moore's Law has drifted from its historical origins as describing the number of transistors/chip to the more important metric of minimum lithographic feature size (where a proper choice of feature is made in order to properly represent the scaling potential of the transistor). While the popular press has failed to notice this shift, in the semiconductor industry there is no doubt that the technology nodes of Moore's Law are governed by the historical 0.7× shrink in minimum feature size per generation.
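The 0.7× figure is not arbitrary: a 0.7× linear shrink reduces area to about half, so each generation roughly doubles the number of transistors per unit area. The sketch below makes this explicit, assuming that area scales as the square of the feature size. The 100nm starting point and the function names are hypothetical and serve only to illustrate the sequence.

```python
def node_sequence(start_nm, generations, shrink=0.7):
    """Feature sizes for successive generations at a fixed linear shrink."""
    return [start_nm * shrink ** g for g in range(generations + 1)]

def density_gain_per_generation(shrink=0.7):
    """Transistors-per-area multiplier if area scales as feature size squared."""
    return 1.0 / shrink ** 2

# Hypothetical starting feature size of 100nm, followed for four generations.
for generation, size_nm in enumerate(node_sequence(100.0, 4)):
    print(f"generation {generation}: {size_nm:.1f}nm")

print(f"density gain per generation: {density_gain_per_generation():.2f}x")
```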
It is my opinion that Moore's Law is an example of an industry-wide learning curve. There is a constant fractional improvement in technical capability (as judged by the minimum feature size, for example) for every constant fractional increase in cumulative investment of effort. Since investment effort is generally proportional to output, Moore's Law can be formulated as a learning curve by plotting minimum feature size as a function of cumulative area of silicon produced by the industry on a log-log scale. (Alternatively, total cumulative revenue of the industry can be used as the x-axis as well with virtually no change in the curve.)
As presented here, Moore's Law has kept on a relatively constant learning curve throughout the history of the industry, with the exception of a slowdown in the early 1990s (it would be very interesting to speculate why this slowdown occurred). Current trends are on pace with historical learning rates and with this new formulation of Moore's Law, more accurate forecasting should be possible.
Chris A. Mack, KLA-Tencor, Austin, Texas
References
1. G.E. Moore, "Cramming More Components onto Integrated Circuits," Electronics, Vol. 38, No. 8, pp. 114–117, April 19, 1965.
2. G.E. Moore, "Progress in Digital Integrated Electronics," IEDM Technical Digest, Washington DC, pp. 11–13, 1975.
3. I. Tuomi, "The Lives and Death of Moore's Law," First Monday, online journal available at www.firstmonday.org/issues/issue7_11/tuomi.
4. G.E. Moore, "Lithography and the Future of Moore's Law," Optical/Laser Microlithography VIII, Proc., SPIE Vol. 2440, pp. 2–17, 1995.
5. The National Technology Roadmap for Semiconductors, Semiconductor Industry Association, San Jose, CA, 1994.
6. R.R. Schaller, "Moore's Law: Past, Present and Future," IEEE Spectrum, pp. 53–59, June 1997.
7. R. Noyce, "Microelectronics," Scientific American, Vol. 237, No. 3, pp. 63–69, Sept. 1977.
8. G.D. Hutcheson, J.D. Hutcheson, "Technology and Economics in the Semiconductor Industry," Scientific American, Vol. 274, No. 1, pp. 54–62, Jan. 1996.
For more information, contact Chris A. Mack, VP of Lithography Technology, at KLA-Tencor, 8834 N. Capital of Texas Highway, Suite 301, Austin, TX 78759.