Memory: An Analogy
One of the best illustrations of memory evolution that we've seen was created by the memory experts over at Crucial Technology, a division of Micron Technology, Inc. (http://www.crucial.com). We've modified their original inspiration, expanding it to include many of the related concepts found throughout this book. Imagine a printing business. Back when it started, there was only "the guy in charge" and a few employees. They did their printing in a small building, and things were pretty disorganized.
A motherboard is a lot like this printing business, in that the CPU is in charge of getting things done. The other components on the board have all been developed to lend a helping hand. Just as a business can make more money by choosing different growth paths, system performance has improved along different evolutionary paths. One path is getting things done faster. Speeding things up means shipping out more stuff (technical term) in a given work period. More stuff means more money, and the business grows.
Another way is to make better use of the time available. In the world of computers, events take place according to clock cycles (ticks). If it takes 10 ticks to move one byte, then moving that same byte in 5 ticks would mean faster throughput. We can either keep the byte the same size and move it in less time (multipliers and half ticks), or we can move more data bits in the same amount of time (wider buses). The difference between these two approaches underlies Rambus and DDR memory technology (discussed in this chapter).
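To put some numbers on it, here's a small Python sketch; the clock speed and bus widths are made-up examples rather than figures from any particular chipset. It simply shows that cutting the ticks per transfer in half and doubling the bus width each double the throughput:

```python
# Illustrative throughput arithmetic; the clock speed and widths below are
# hypothetical examples, not figures from a real chipset.

def throughput_bytes_per_sec(clock_hz, ticks_per_transfer, bus_width_bytes):
    """Bytes per second = transfers per second * bytes moved per transfer."""
    transfers_per_sec = clock_hz / ticks_per_transfer
    return transfers_per_sec * bus_width_bytes

CLOCK_HZ = 100_000_000  # assume a 100 MHz memory bus for the example

baseline    = throughput_bytes_per_sec(CLOCK_HZ, ticks_per_transfer=10, bus_width_bytes=1)
fewer_ticks = throughput_bytes_per_sec(CLOCK_HZ, ticks_per_transfer=5,  bus_width_bytes=1)
wider_bus   = throughput_bytes_per_sec(CLOCK_HZ, ticks_per_transfer=10, bus_width_bytes=2)

print(f"10 ticks per 1-byte move: {baseline:12,.0f} bytes/sec")
print(f"5 ticks per 1-byte move:  {fewer_ticks:12,.0f} bytes/sec")
print(f"10 ticks per 2-byte move: {wider_bus:12,.0f} bytes/sec")
```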
Memory Matrix
Back in the days of DRAM, when this printing business was just getting started, the boss (CPU) would take in a print job, then go running back to the pressman to have it printed. Think of the pressman as the memory controller, and the printing press as a RAM chip. The pressman would examine the document and then grab lead blocks carved with each individual letter (bit). He'd put each block into a form (a grid), letter by letter.
Once the form was typeset, the pressman would slop on the ink, put a piece of paper under the press, and crank down a handle, printing a copy of the document. (A bit of trivia: the space above and below a line of printing is called the leading, pronounced "led-ding." This space was the extra room on a lead block surrounding each carved letter.)
Nowadays, you can buy a toy printing kit from a shop (sort of like buying a 386 machine), where each letter is engraved on a piece of rubber that slides into the rail of a wooden stamp handle. When you've inserted a complete line of letters, you ink them with an ink pad and stamp the line onto a piece of paper. But suppose you could insert an entire line into the rail at once. Wouldn't that be a whole lot faster? That was the idea behind certain memory improvements we'll look at later in this chapter (i.e., FPM and EDO).
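To see why grabbing a whole line at once is faster, here's a rough sketch with invented cycle counts. The only point is that once a row has been set up, later accesses to that same row can skip the setup step, which is the spirit behind FPM and EDO:

```python
# Rough model with invented cycle counts: the classic approach pays the
# row-setup cost on every bit, while row-at-a-time access (the FPM/EDO idea)
# pays it once and then reads the remaining bits more cheaply.
ROW_SETUP_TICKS = 5  # hypothetical ticks to open a row of cells
COLUMN_TICKS = 3     # hypothetical ticks to read one bit from an open row

def one_bit_at_a_time(bits_in_row):
    return bits_in_row * (ROW_SETUP_TICKS + COLUMN_TICKS)

def whole_row_at_once(bits_in_row):
    return ROW_SETUP_TICKS + bits_in_row * COLUMN_TICKS

for bits in (1, 8, 64):
    print(f"{bits:3d} bits: {one_bit_at_a_time(bits):4d} ticks vs {whole_row_at_once(bits):4d} ticks")
```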
Wait States
One of the big problems with DRAM, to follow our story, was that at any given time, the boss wouldn't know what the pressman was doing, and the pressman had no idea what the boss was doing. If the boss ran in with a print job while the pressman was re-inking the press, he'd have to wait until the guy was done before they could talk. This is like the CPU waiting for the memory controller to complete a memory refresh.
Memory cells are made up of capacitors that can either hold a charge (1) or not hold a charge (0). One of the problems with capacitors is that they leak (their charge fades), much as the ink comes off each block on that old printing press. A memory refresh is when the memory controller periodically goes back over the cells, rereading them and recharging their capacitors so the data isn't lost. When a memory refresh is taking place, the CPU must wait for the controller before it can pass on new data. This is a wait state.
You can see that if there were some way to avoid the leakage, the memory controller wouldn't have to waste time constantly recharging memory cells. The CPU could transfer data more often, with fewer wait states, and throughput would improve. SRAM stores its bits in transistors rather than capacitors. Transistors don't leak their charge, so they don't need constant refreshing, but they are more expensive to put on a memory chip.
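Here's a quick sketch of the wait-state penalty, again with invented numbers: when some fraction of the CPU's requests arrive while the controller is busy refreshing, each of those requests eats extra ticks, and the effective access rate drops.

```python
# Illustrative only: how wait states drag down effective throughput.
# All of these figures are assumptions chosen for the example.
CLOCK_HZ = 66_000_000   # hypothetical 66 MHz memory bus
TICKS_PER_ACCESS = 5    # hypothetical ticks for a normal memory access
WAIT_TICKS = 3          # hypothetical extra ticks when a refresh is in progress

def effective_accesses_per_sec(fraction_hitting_refresh):
    avg_ticks = TICKS_PER_ACCESS + fraction_hitting_refresh * WAIT_TICKS
    return CLOCK_HZ / avg_ticks

print(f"no wait states:   {effective_accesses_per_sec(0.0):,.0f} accesses/sec")
print(f"10% hit a refresh: {effective_accesses_per_sec(0.10):,.0f} accesses/sec")
print(f"30% hit a refresh: {effective_accesses_per_sec(0.30):,.0f} accesses/sec")
```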
Interrupts (INT)
Another problem with DRAM was that if the pressman had a box of printed documents ready to go, he'd come running out to the front office and interrupt whatever was going on. If the boss was busy with a customer, then the pressman would stand there and shout, "Hey boss! Hey boss! Hey boss!" until eventually he was heard (or punched in the face, which is an IRQ conflict). Once in a while, just by luck, the pressman would run into the office when there were no customers, and the boss would be free to talk.
Interruptions are known as Interrupt Requests (IRQs) and, to mix metaphors, they are like a two-year-old demanding attention. One way to handle them is to repeat "not now...not now...not now" until it's a good time to listen. Another way to handle an interruption is to say, "Come back in a minute, and I'll be ready to respond then." We'll look at IRQs in Chapter 5.
Timing
One day the boss had a great idea. There was a big clock in the front office (the motherboard oscillator), and he proposed putting in a window to the press room. That way, both he and the pressman could see the clock. The boss would then be able to call out to the pressman that he had a job to run, and the pressman could holler back, "I'll be ready in a minute."
This could also work the other way around, where the pressman could finish a job and call out that he was ready to deliver the goods. The boss could shout back that he needed another minute before he could take them, because he was busy with a customer. Both of them could watch the clock for a minute to go by, doing something else until they were ready to talk.
Data transfers are more efficient when the CPU and memory controller can plan a specific time to communicate. Newer memory provided a way for the memory controller and CPU to follow the same clock ticks and adjust their actions to match each other. The controller became synchronized with the clock, and the memory chips became known as Synchronous DRAM (SDRAM).
DMA Channels
This plan worked very well. Business increased, and the company expanded. The building grew to fill the property, and new problems began cropping up. To begin with, when the press room moved to the other side of the building, the pressman couldn't see the clock in the front office anymore. That meant they had to install a separate clock in the press room. When the boss had a print job, he would call down to the press room and schedule a time to meet.
The printing department used one clock to synchronize with the boss, but it also had a separate clock for timing print jobs and improving efficiency. CPUs have an internal clock, separate from the motherboard clock. Modern components can synchronize to different clocks, depending upon processing requirements. In other words, a memory subsystem might synchronize its internal operations to the processor's bus; when the resulting information is ready to move out across the motherboard, the system clock becomes the controlling factor.
With business improving, the boss was getting busier and busier. He hired a couple of secretaries to walk print jobs over to the press room, and the pressman hired some assistants to run additional printing presses. This is similar to how memory modules came about, where a series of memory chips work together on a single IC card.
Eventually, the boss and the pressman stopped needing to consult about every single piece of paper involved in a particular job. The pressman suggested that he be given the authority to make certain decisions about how to set up the press. You can imagine that once the boss no longer had to come down to the press room every few minutes, he had more time to work on business matters. This bypass is essentially what Direct Memory Access (DMA) is all about.
When the CPU agrees, certain operations can bypass the ordinary processing channels and access memory directly. We'll examine DMA channels in Chapter 5, but the point here is that as devices took on more intelligence of their own, the CPU no longer had to waste time on simple routines. The ATA specification and UDMA are an outgrowth of this idea.
Bus Clocking
Another way the boss and the pressman sped things up was to widen the hallways and install some conveyor belts. Back when it was a small business, the CPU transferred data over a memory bus at about the same speed as the memory could handle the bits. As processor and memory speeds increased, the transfer process became limited by buses running at slower speeds.
In our story, the conveyor belts allowed for higher speed movements up and down the hallways. When the boss had a job to send to the press room, he'd hand it to the "bus boy" and tell him to run it over to the printers. The bus boy would jump on a conveyor belt and go sailing off to the other end of the building. Everyone in the building was talking about how fast they could go, and bus boys began timing each other.
A CPU processes some number of instructions per clock tick. We know that synchronous RAM means memory controllers can also be set to process events according to clock ticks. Usually the CPU ran faster than memory, though sometimes memory ran faster than the CPU. Either way, both the CPU and memory were faster than the transfer buses.
Eventually, with some new conveyor belts, a bus boy could walk three steps and be carried all the way down the hall to the other side of the building. Surely you've experienced the thrill of stepping onto a conveyor belt and boosting your speed by walking while you're being carried? This is somewhat like the concept of clock multipliers.
Modern bus technology brings together the ideas of clock ticks and multipliers, using the timing cycle itself to transfer information. Bits can be moved on both the rising and falling edges of the clock signal, making for two transfers per tick. If the clock is ticking at one million ticks per second (1MHz), we can transfer two million bits of data (2Mbps).
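Pulling the conveyor-belt ideas together, here's a sketch of the bandwidth arithmetic. The figures are generic examples rather than the ratings of any specific memory standard: take a base clock, apply a multiplier, move data on both edges of each tick, and multiply by the bus width.

```python
# Peak bandwidth = base clock * multiplier * transfers per tick * bus width.
# All figures below are generic examples, not a particular memory standard.

def peak_bandwidth_bits_per_sec(base_clock_hz, multiplier, transfers_per_tick, bus_width_bits):
    return base_clock_hz * multiplier * transfers_per_tick * bus_width_bits

# The 1MHz example from the text: two transfers per tick, one bit wide -> 2Mbps.
print(peak_bandwidth_bits_per_sec(1_000_000, 1, 2, 1))     # 2,000,000 bits/sec

# A hypothetical wider, multiplied bus moving data on both clock edges.
print(peak_bandwidth_bits_per_sec(100_000_000, 2, 2, 64))  # 25,600,000,000 bits/sec
```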
Cache Memory
As the company expanded, there was more and more paperwork, with copies of financial statements and records being sent to the accounting department and the government. For a short time, the boss used to send these jobs to the press room (after all, they were a printing company), but that was costing the company money. Finally, he bought some laser printers for the front office so he and his secretary could do these quick print jobs on their own.
Whenever the boss was working up a price quote for a customer, he could set up various calculations and have his secretary print them off. Because they didn't have to go all the way to the press room (main memory), these temporary jobs were extremely quick. The CPU uses Level 1 and Level 2 caching in a similar fashion.
Level 1 (primary) cache memory is like the boss' own personal printer, right there by his desk. Level 2 (secondary) cache memory is like the secretary's printers in the next room. It takes a bit longer for the secretary to print a job and carry it back to the boss' office, but it's still much faster than having to run the job through the entire company.
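To hang some rough numbers on the printer analogy, here's a sketch of average access time. The latencies and hit rates are assumptions chosen purely for illustration, not measurements of any real processor; the point is that when most requests are handled by the closer, faster levels, the average wait falls sharply.

```python
# Illustrative average access time for an L1/L2/main-memory hierarchy.
# Latencies and hit rates are assumed for the example, not real measurements.
L1_LATENCY_NS = 1.0       # "the printer by the boss' desk"
L2_LATENCY_NS = 5.0       # "the secretary's printers in the next room"
MEMORY_LATENCY_NS = 60.0  # "sending the job to the press room"

def average_access_time_ns(l1_hit_rate, l2_hit_rate):
    """Simplified expected latency: each request is served from exactly one level."""
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return (l1_hit_rate * L1_LATENCY_NS
            + l1_miss * l2_hit_rate * L2_LATENCY_NS
            + l1_miss * l2_miss * MEMORY_LATENCY_NS)

print(f"no caches at all:        {MEMORY_LATENCY_NS:.1f} ns")
print(f"90% L1 hits, 80% L2 hits: {average_access_time_ns(0.90, 0.80):.1f} ns")
```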