RISC Processor
What is RISC?
RISC (Reduced Instruction Set Computer) is a microprocessor design that
supports a smaller number of instruction types so that it can operate at a
higher speed (perform more millions of instructions per second, or MIPS).
Since each instruction type that a computer must perform requires
additional transistors and circuitry, a larger set of computer
instructions tends to make the microprocessor more complicated and slower in
operation.
Reduced instruction set computing, or RISC, is a CPU design strategy based on the insight that a simplified instruction set (as opposed to a complex set) provides higher performance when combined with a microprocessor architecture capable of executing those instructions using fewer cycles per instruction. A computer based on this strategy is a reduced instruction set computer, also called RISC. The opposing approach is complex instruction set computing, or CISC.
Various suggestions have been made regarding a precise definition of
RISC, but the general concept is that of a system that uses a small, highly
optimized set of instructions, rather than the more specialized set of
instructions often found in other types of architectures. Another common trait
is that RISC systems use a load/store architecture, in which memory is accessed
only through dedicated load and store instructions.
History of RISC
The first RISC projects came from IBM, Stanford, and UC Berkeley in
the late 1970s and early 1980s. The IBM 801, Stanford MIPS, and Berkeley RISC I
and RISC II were all designed with a similar philosophy, which has become known
as RISC. Certain design features have been characteristic of most RISC
processors:
- One-cycle execution time: RISC processors aim for a CPI (cycles per
instruction) of one. This is made possible by keeping each instruction
simple and by a technique called pipelining.
- Pipelining: A technique that overlaps the execution of different stages
of several instructions so that instructions are processed more
efficiently.
- Large number of registers: The RISC design philosophy generally
incorporates a larger number of registers to reduce the number of
interactions with memory.
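As a rough illustration of why pipelining pushes the CPI toward one, the sketch below counts cycles for an idealized pipeline. It assumes every stage takes exactly one cycle and that there are no stalls; the function names are made up for this example.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Without pipelining, each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """With an ideal pipeline, the first instruction fills the pipeline and
    every later instruction completes one cycle after its predecessor."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


def effective_cpi(num_instructions: int, num_stages: int) -> float:
    """Average cycles per instruction for the ideal pipelined case."""
    return pipelined_cycles(num_instructions, num_stages) / num_instructions


if __name__ == "__main__":
    # Hypothetical workload: 100 instructions on a 5-stage pipeline.
    print(unpipelined_cycles(100, 5))
    print(pipelined_cycles(100, 5))
    print(effective_cpi(100, 5))
```

The longer the instruction stream, the closer the effective CPI gets to one, since the cost of filling the pipeline is paid only once.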
The simplest way to examine the advantages and disadvantages
of RISC architecture is by contrasting it with its predecessor: CISC (Complex
Instruction Set Computer) architecture.
RISC processors come in several types, including 8-bit, 16-bit and 32-bit
architectures. These architectures are described below.
8-bit RISC Processor
In computer architecture, 8-bit integers, memory addresses, or other data units
are those that are at most 8 bits (1 octet) wide. Likewise, 8-bit CPU and
ALU architectures are those based on registers, address buses, or data
buses of that size. "8-bit" is also the term given to a generation of
microcomputers in which 8-bit microprocessors were the norm. The architecture
of an 8-bit RISC processor is shown in Figure 1.
Figure 1: Architecture of 8-bit RISC Processor
16-bit RISC Processor
The 16-bit RISC processor is an extension of the 8-bit RISC processor, with
correspondingly extended features. It has 16-bit addressing capability, and its
data path is also 16 bits wide. The ALU can perform 16-bit operations on two
16-bit data words. The register file contains 16 registers, each 16 bits long.
The architecture of the 16-bit RISC processor is shown in Figure 2.
Figure 2: Architecture of 16-bit RISC Processor
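To make the 16-bit data width concrete, here is a minimal sketch of a register file with 16 registers of 16 bits each and an ALU add that wraps at 16 bits. The class and function names are hypothetical, and the carry-out flag is an assumption added for illustration; the text does not describe the processor's flags.

```python
WORD_MASK = 0xFFFF      # a 16-bit word holds values 0..65535
NUM_REGISTERS = 16      # register count taken from the description above


class RegisterFile16:
    """Sixteen registers, each truncated to 16 bits on write."""

    def __init__(self) -> None:
        self.regs = [0] * NUM_REGISTERS

    def write(self, index: int, value: int) -> None:
        self.regs[index] = value & WORD_MASK

    def read(self, index: int) -> int:
        return self.regs[index]


def alu_add(a: int, b: int) -> tuple:
    """Add two 16-bit words; return (16-bit result, carry-out flag)."""
    total = (a & WORD_MASK) + (b & WORD_MASK)
    return total & WORD_MASK, total > WORD_MASK


if __name__ == "__main__":
    rf = RegisterFile16()
    rf.write(0, 0xFFFF)
    rf.write(1, 0x0001)
    result, carry = alu_add(rf.read(0), rf.read(1))
    rf.write(2, result)
    print(hex(rf.read(2)), carry)
```

Any result that does not fit in 16 bits is truncated, which is the practical meaning of a 16-bit-wide data path.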
Multiplying Two Numbers in Memory
Figure 3 is a diagram representing the storage scheme for
a generic computer. The main memory is divided into locations numbered from
(row) 1: (column) 1 to (row) 6: (column) 4. The execution unit is responsible
for carrying out all computations. However, the execution unit can only operate
on data that has been loaded into one of the six registers (A, B, C, D, E, or
F). Let's say we want to find the product of two numbers, one stored in
location 2:3 and another stored in location 5:2, and then store the product
back in location 2:3.
Figure 3: Multiplying two numbers in memory
The CISC Approach
The primary goal of CISC architecture is to
complete a task in as few lines of assembly as possible. This is achieved by
building processor hardware that is capable of understanding and executing a
series of operations. For this particular task, a CISC processor would come
prepared with a specific instruction. When executed, this instruction loads the
two values into separate registers, multiplies the operands in the execution
unit, and then stores the product in the appropriate memory location. Thus, the
entire task of multiplying two numbers can be completed with one instruction:
MULT 2:3, 5:2
MULT is what is known as a "complex
instruction." It operates directly on the computer's memory banks and does
not require the programmer to explicitly call any loading or storing functions.
It closely resembles a command in a higher level language. For instance, if we
let "a" represent the value of 2:3 and "b" represent the
value of 5:2, then this command is equivalent to the C statement "a = a * b."
One of the primary advantages of this system is that
the compiler has to do very little work to translate a high-level language
statement into assembly. Because the length of the code is relatively short,
very little RAM is required to store instructions. The emphasis is put on
building complex instructions directly into the hardware.
The RISC Approach
RISC processors only use simple
instructions that can be executed within one clock cycle. Thus, the
"MULT" command described above could be divided into three separate
commands: "LOAD," which moves data from the memory bank to a
register, "PROD," which finds the product of two operands located
within the registers, and "STORE," which moves data from a register
to the memory banks. In order to perform the exact series of steps described in
the CISC approach, a programmer would need to code four lines of assembly:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A
At
first, this may seem like a much less efficient way of completing the
operation. Because there are more lines of code, more RAM is needed to store
the assembly level instructions. The compiler must also perform more work to
convert a high-level language statement into code of this form.
However,
the RISC strategy also brings some very important advantages. Because each
instruction requires only one clock cycle to execute, the entire program will
execute in approximately the same amount of time as the multi-cycle
"MULT" command. These RISC "reduced instructions" require
fewer transistors than the complex instructions, leaving more
room for general purpose registers. Because all of the instructions execute in
a uniform amount of time (i.e. one clock), pipelining is possible.
Separating
the "LOAD" and "STORE" instructions actually reduces the
amount of work that the computer must perform. After a CISC-style
"MULT" command is executed, the processor automatically erases the
registers. If one of the operands needs to be used for another computation, the
processor must re-load the data from the memory bank into a register. In RISC,
the operand will remain in the register until another value is loaded in its
place.
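The four-instruction RISC sequence can be traced with a small simulation. This is a toy model only: the `ToyMachine` class, its method names, and the stored values 6 and 7 are assumptions made for illustration, not part of any real instruction set.

```python
class ToyMachine:
    """Memory addressed by (row, column) and six registers A..F."""

    def __init__(self, memory: dict) -> None:
        self.memory = dict(memory)
        self.registers = {name: None for name in "ABCDEF"}
        self.memory_reads = 0  # counts how often main memory is read

    def load(self, reg: str, location: tuple) -> None:
        """LOAD: copy a value from memory into a register."""
        self.registers[reg] = self.memory[location]
        self.memory_reads += 1

    def prod(self, dst: str, src: str) -> None:
        """PROD: multiply two registers, leaving the result in dst."""
        self.registers[dst] = self.registers[dst] * self.registers[src]

    def store(self, location: tuple, reg: str) -> None:
        """STORE: copy a register back to memory."""
        self.memory[location] = self.registers[reg]


def run_risc_multiply(machine: ToyMachine) -> None:
    machine.load("A", (2, 3))    # LOAD A, 2:3
    machine.load("B", (5, 2))    # LOAD B, 5:2
    machine.prod("A", "B")       # PROD A, B
    machine.store((2, 3), "A")   # STORE 2:3, A


# Hypothetical initial contents of locations 2:3 and 5:2.
machine = ToyMachine({(2, 3): 6, (5, 2): 7})
run_risc_multiply(machine)
print(machine.memory[(2, 3)])   # product written back to 2:3
print(machine.registers["B"])   # operand is still in register B
```

After the sequence runs, register B still holds the second operand, so a later computation that needs it can skip another memory read.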
The Performance Equation
The following equation is commonly
used for expressing a computer's performance ability:
\[ \frac{\text{time}}{\text{program}} = \frac{\text{time}}{\text{cycle}} \times \frac{\text{cycles}}{\text{instruction}} \times \frac{\text{instructions}}{\text{program}} \]
The CISC approach
attempts to minimize the number of instructions per program, sacrificing the
number of cycles per instruction. RISC does the opposite, reducing the cycles
per instruction at the cost of the number of instructions per program.
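The trade-off can be checked with a quick calculation in which program time is the product of instructions per program, cycles per instruction, and time per cycle. The numbers below are hypothetical and chosen only to mirror the multiplication example: one four-cycle complex instruction versus four one-cycle simple instructions at the same clock.

```python
def execution_time_ns(instructions: int, cycles_per_instruction: float,
                      cycle_time_ns: float) -> float:
    """time/program = instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * cycle_time_ns


# Assumed figures for illustration only.
cisc_time = execution_time_ns(instructions=1, cycles_per_instruction=4, cycle_time_ns=10)
risc_time = execution_time_ns(instructions=4, cycles_per_instruction=1, cycle_time_ns=10)

print(cisc_time, risc_time)
```

With these assumed figures the two approaches take the same time, so any real advantage has to come from the other effects discussed in the text, such as pipelining and operands that stay in registers.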
Advantages of RISC
Implementing a processor with a
simplified instruction set design provides several advantages over implementing
a comparable CISC design:
- Speed: Since a simplified instruction set allows for a pipelined,
superscalar design, RISC processors often achieve 2 to 4 times the
performance of CISC processors using comparable semiconductor technology
and the same clock rates.
- Simpler hardware: Because the instruction set of a RISC processor is so
simple, it uses up much less chip space; extra functions, such as memory
management units or floating point arithmetic units, can also be placed on
the same chip. Smaller chips allow a semiconductor manufacturer to place
more parts on a single silicon wafer, which can lower the per-chip cost
dramatically.
- Shorter design cycle: Since RISC processors are simpler than
corresponding CISC processors, they can be designed more quickly, and can
take advantage of other technological developments sooner than
corresponding CISC designs, leading to greater leaps in performance between
generations.
The hazards of RISC
The transition from a CISC design strategy to a RISC design strategy
isn't without its problems. Software engineers should be aware of the key
issues which arise when moving code from a CISC processor to a RISC processor.
Code Quality
The
performance of a RISC processor depends greatly on the code that it is
executing. If the programmer (or compiler) does a poor job of instruction
scheduling, the processor can spend quite a bit of time stalling: waiting for
the result of one instruction before it can proceed with a subsequent
instruction.
Since the scheduling rules can be complicated, most programmers use
a high level language (such as C or C++) and leave the
instruction scheduling to the compiler.
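A simplified model shows how instruction order affects stalls. The sketch assumes a single rule: an instruction that uses the result of the LOAD immediately before it costs one stall cycle. Real processors have more elaborate rules, and the instruction tuples and register names here are hypothetical.

```python
from typing import List, Tuple

# (opcode, destination register, source registers)
Instruction = Tuple[str, str, Tuple[str, ...]]


def count_load_use_stalls(program: List[Instruction]) -> int:
    """Count stalls under the assumed rule: using a value loaded by the
    immediately preceding LOAD costs one stall cycle."""
    stalls = 0
    for previous, current in zip(program, program[1:]):
        prev_op, prev_dest, _ = previous
        _, _, cur_sources = current
        if prev_op == "LOAD" and prev_dest in cur_sources:
            stalls += 1
    return stalls


# Two independent products written in the obvious order.
naive = [
    ("LOAD", "A", ()),
    ("LOAD", "B", ()),
    ("PROD", "A", ("A", "B")),  # uses B right after it is loaded
    ("LOAD", "C", ()),
    ("LOAD", "D", ()),
    ("PROD", "C", ("C", "D")),  # uses D right after it is loaded
]

# Same work, with all loads hoisted ahead of the multiplies.
scheduled = [
    ("LOAD", "A", ()),
    ("LOAD", "B", ()),
    ("LOAD", "C", ()),
    ("LOAD", "D", ()),
    ("PROD", "A", ("A", "B")),
    ("PROD", "C", ("C", "D")),
]

print(count_load_use_stalls(naive), count_load_use_stalls(scheduled))
```

Both orderings compute the same results; only the placement of the loads differs, which is the kind of rearrangement a scheduling compiler performs.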
Debugging
Unfortunately,
instruction scheduling can make debugging difficult. If scheduling (and other
optimizations) is turned off, the machine-language instructions show a clear
connection with their corresponding lines of source. However, once instruction
scheduling is turned on, the machine-language instructions for one line of
source may appear in the middle of the instructions for another line of source
code.
System Architecture of a 32-bit RISC Processor
The system architecture of a 32-bit RISC processor is shown in Figure 4. The RISC processor architecture consists of an Arithmetic Logic Unit (ALU), Control Unit (CU), Barrel Shifter, Booth's Multiplier, Register File and Accumulator. The processor is designed as a load/store architecture, meaning that all operations are performed on operands held in the processor registers and the main memory can only be accessed through the load and store instructions. It also follows the Von Neumann model: one shared memory holds both instructions (program) and data, with one data bus and one address bus between processor and memory. Instructions and data are fetched in sequential order so that the latency incurred between machine cycles can be reduced. To increase the speed of operation, the RISC processor is designed with five-stage pipelining. The pipeline stages are Instruction Fetch (IF), Instruction Decode (ID), Execution (EX), Data Memory (MEM), and Write Back (WB).
Figure 4: Architecture of 32-bit RISC processor
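The overlap between the five stages can be laid out as a table with one row per instruction and one column per clock cycle. The sketch below assumes an ideal pipeline in which every stage takes one cycle and nothing stalls; the helper names are invented for this example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions: int) -> list:
    """Return one row per instruction; row[c] names the stage that
    instruction occupies in cycle c, or '' when it is not in the pipeline."""
    if num_instructions == 0:
        return []
    total_cycles = len(STAGES) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for offset, stage in enumerate(STAGES):
            row[i + offset] = stage  # instruction i enters IF in cycle i
        rows.append(row)
    return rows


def format_schedule(rows: list) -> str:
    """Render the schedule as fixed-width text, one instruction per line."""
    return "\n".join(
        " ".join(f"{cell:<3}" for cell in row).rstrip() for row in rows
    )


if __name__ == "__main__":
    print(format_schedule(pipeline_schedule(3)))
```

Reading down any column shows which stage each instruction occupies in that cycle; once the pipeline is full, all five stages are busy at the same time.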
Internal Blocks of RISC processor
- Control Unit
- Data path Unit
  - ALU
  - Registers
  - Barrel shifter
  - Accumulator
  - Multiplier
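The multiplier block is described in this article as a Booth's Multiplier. As a software illustration of the idea behind it, the sketch below uses radix-2 Booth recoding, which scans the multiplier bit by bit and subtracts at the start of a run of 1s and adds at the end of one. This is only an arithmetic sketch under that assumption; it does not model the hardware of Figure 4, and the function name and default width are hypothetical.

```python
def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply signed integers using radix-2 Booth recoding of the multiplier.

    The multiplier must fit in `bits`-bit two's complement.
    """
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= multiplier <= high:
        raise ValueError("multiplier does not fit in the given width")

    product = 0
    previous_bit = 0  # implicit bit to the right of bit 0
    for i in range(bits):
        current_bit = (multiplier >> i) & 1  # bit i in two's complement
        if current_bit == 1 and previous_bit == 0:
            product -= multiplicand << i     # start of a run of 1s
        elif current_bit == 0 and previous_bit == 1:
            product += multiplicand << i     # end of a run of 1s
        previous_bit = current_bit
    return product


if __name__ == "__main__":
    print(booth_multiply(7, 6))
    print(booth_multiply(3, -4))
```

Because a run of consecutive 1s costs only one subtraction and one addition, the number of add/subtract steps depends on the number of runs rather than the number of 1 bits.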
The function of the instruction fetch unit is to obtain an instruction from the instruction memory using the current value of the PC and to increment the PC value for the next instruction. The main function of the instruction decode unit is to use the 32-bit instruction provided by the instruction fetch unit to index the register file and obtain the register data values. The instruction's opcode field (bits 31-26) is sent to the control unit to determine the type of instruction to execute. The type of instruction then determines which control signals are to be asserted and what function the ALU is to perform, thus decoding the instruction. The register file reads the requested addresses and outputs the data values contained in these registers.
These data values can then be operated on by the ALU, whose operation is determined by the control unit, to either compute a memory address (e.g., load or store), compute an arithmetic result (e.g., add, and, or, slt), or perform a compare (e.g., branch). The control unit has two instruction decoders that decode the instruction bits, and the decoded output of the control unit is fed as a control signal into the Arithmetic Logic Unit (ALU), the Barrel Shifter or the Booth's Multiplier. If the decoded instruction is arithmetic, the ALU result must be written to a register. If the decoded instruction is a load or a store, the ALU result is used to address the data memory. The final step writes the ALU result or memory value back to the register file.
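Extracting the opcode from bits 31-26 of a 32-bit instruction word is a shift and a mask. The sketch below shows that step plus a generic bit-field helper. Only the opcode position comes from the text; the sample word and any other field positions you might pass to `field` are assumptions for illustration.

```python
def field(instruction: int, high: int, low: int) -> int:
    """Return bits high..low (inclusive) of a 32-bit instruction word."""
    width = high - low + 1
    return (instruction >> low) & ((1 << width) - 1)


def opcode(instruction: int) -> int:
    """Opcode field: bits 31-26, as stated in the description above."""
    return field(instruction, 31, 26)


if __name__ == "__main__":
    word = 0x8C010004  # arbitrary sample word, not taken from the article
    print(bin(opcode(word)))
```

A control unit would then use this six-bit value to select which control signals to assert, as the paragraphs above describe.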