Teaching Computer Organization and Architecture Using Simulation and FPGA Applications

Abstract: This paper presents the design concepts and realization of a teaching tool for computer organization and architecture that incorporates micro-operation simulation and FPGA implementation. The tool helps computer engineering and computer science students gain practical familiarity with computer organization and architecture by developing their own instruction set and carrying out programming and interfacing experiments. A two-pass assembler has been designed and implemented for writing assembly programs in this teaching tool. In addition to the micro-operation simulation, the complete configuration can be run on a Xilinx Spartan-3 FPGA board. Such an implementation offers good code density, easy customization, easily developed software, small area, and high performance at low cost.


INTRODUCTION
Computer organization and architecture is a common course offered at universities throughout the world [1]. Traditionally, teaching such a course to computer engineering and computer science students can be insufficient if the teaching focuses solely on textbook materials [2,3]. Students often have to rely on their imaginations to understand the underlying hardware-related concepts. In most universities, students learn computer design concepts by implementing individual pieces of a computer in software. This approach has several limitations: although students can simulate their design in software, they do not have the chance to realize or run it in hardware [4]. It is also not feasible to build a laboratory that provides a variety of computer architectures for teaching computer organization and architecture. Moreover, keeping computer education up to date requires keeping in touch with the rapid evolution of computer technology and industry. Searching for an efficient way of teaching computer organization and architecture is therefore an ongoing task [2]. This paper considers an active tool for teaching computer organization and architecture that takes advantage of simulation and Field Programmable Gate Array (FPGA) technology [5,6].
FPGA technology offers the potential of designing high-performance systems at low cost. FPGAs have been used for many computational tasks [6], and this paper presents the micro-operation simulation of a basic computer and its implementation on an FPGA. Field programmable gate arrays consist of programmable logic blocks, each of which can implement a small amount of digital logic, and programmable routing, which allows the logic block inputs and outputs to be connected to form larger circuits [7,8,9].
FPGAs have become a popular technology for creating digital systems since they can lead to a shorter time-to-market than application-specific integrated circuits (ASICs) and allow design modifications to be made after system creation [10]. The primary method used for validating a design in most FPGA design flows is simulation.
This paper presents the micro-operation simulation and FPGA realization of a single-cycle computer that can be used for educational purposes. The simulation is a set of micro-operations that represent the register transfer statements of all operations that can be implemented.
The paper also covers the design of an assembler for this computer. By simulating each element, students can obtain a better understanding of the internal operation of a computer, which helps them study design and performance issues.

COMPUTER ORGANIZATION TEACHING TOOL
The design of the computer's instruction set is an important architectural issue. The processor structure and the functionality of the instructions define the computer's behavior. The objective of this work is to design a simple computer that introduces the features of a single-cycle computer to students. The internal organization of a digital system is defined by the sequence of micro-operations performed on the data stored in its registers. The computer is capable of executing various micro-operations and can be instructed to perform a sequence of operations. Once a student selects the design parameters, their values are written into a file used by the main module of the Verilog HDL code in order to build the required architecture. The architecture is then downloaded into the FPGA through a USB port using the Project Navigator package (see Fig. 1).

DESIGN EXAMPLE
Instruction codes together with data are stored in memory. The computer reads each instruction from memory and places it in a control register. The control unit then interprets the binary code of the instruction and proceeds to execute it by issuing a sequence of micro-operations.
An instruction code is a group of bits that instructs the computer to perform a specific operation. It is 16 bits long and is divided into three parts, as shown in Fig. 2: two bits (I1, I0) that specify the addressing mode, a four-bit binary code that specifies the operation, and a ten-bit address field. Table 1 lists the addressing modes of the proposed computer, which are used to:
• Reduce the number of bits in the address field of the instruction, and
• Give the user flexibility in dealing with counters, pointers, etc.
A computer needs registers for manipulating data and a register for holding a memory address. Nine registers are required for the proposed computer, as shown in Table 2. Some registers (such as AC, MAR, and MBR) may receive data from several multiplexed sources. A basic computer has eight registers, a memory unit, and a control unit. Paths must be provided to transfer information from one register to another and between memory and registers. A more efficient scheme for transferring information in a system with many registers is to use a common bus, which can be built with a multiplexer. The connection of the registers to the common bus system is shown in Fig. 3.
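The 16-bit instruction format described above can be illustrated with a short Python sketch. It is not the authors' Verilog code. It assumes that the addressing-mode bits (I1, I0) occupy bits 15 and 14, the operation code bits 13 to 10, and the address field bits 9 to 0; the names `MODE_BITS`, `encode`, and `decode` are hypothetical.

```python
# Assumed field layout: bits 15-14 = (I1, I0), bits 13-10 = opcode, bits 9-0 = address.
MODE_BITS = {"immediate": 0b00, "direct": 0b01, "indirect": 0b11}
MODE_NAMES = {bits: name for name, bits in MODE_BITS.items()}


def encode(mode, opcode, address):
    """Pack an addressing mode, a 4-bit opcode and a 10-bit address into one word."""
    if not 0 <= opcode <= 0xF:
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= address <= 0x3FF:
        raise ValueError("address must fit in 10 bits")
    return (MODE_BITS[mode] << 14) | (opcode << 10) | address


def decode(word):
    """Split a 16-bit word back into (mode name, opcode, address)."""
    mode = (word >> 14) & 0b11
    opcode = (word >> 10) & 0xF
    address = word & 0x3FF
    # Mode 10 is not listed among the addressing modes, so it is reported as undefined.
    return MODE_NAMES.get(mode, "undefined"), opcode, address
```

For example, with a made-up operation code 0010, `encode("direct", 0b0010, 900)` gives the word 0x4B84, and `decode(0x4B84)` recovers the three fields.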

Computer Instructions:
A basic computer has three basic instruction code formats. A memory-reference instruction uses ten bits to specify either an address or an operand and two bits (I1, I0) to specify the addressing mode: 00 for immediate addressing, 01 for direct addressing, and 11 for indirect addressing.

Table 1. Addressing modes
I1  I0  Addressing mode
0   0   Immediate addressing ($ address)
0   1   Direct addressing
1   1   Indirect addressing (# address)

The register-reference instructions are recognized by the operation code 1111 and 00 in the two leftmost bits of the instruction. A register-reference instruction specifies an operation on the AC register. An operand from memory is not needed; therefore, the other 10 bits are used to specify the operation to be executed. Similarly, an input-output instruction does not need a reference to memory and is recognized by the operation code 1111 and 11 in the two leftmost bits of the instruction. The remaining 10 bits are used to specify the type of input-output operation. This technique allows up to 35 different operations, as given in Table 3.
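The recognition rule for the three instruction formats can be sketched as follows. This is a minimal illustration in Python, assuming the same field layout as in the instruction format (mode in bits 15 and 14, operation code in bits 13 to 10). The function name `classify` is hypothetical, and the text does not say how operation code 1111 is treated with the remaining mode values, so those are reported as unspecified.

```python
def classify(word):
    """Return the instruction format of a 16-bit instruction word."""
    mode = (word >> 14) & 0b11      # (I1, I0), assumed to sit in the two leftmost bits
    opcode = (word >> 10) & 0xF     # assumed position of the 4-bit operation code
    if opcode != 0b1111:
        return "memory-reference"
    if mode == 0b00:
        return "register-reference"
    if mode == 0b11:
        return "input-output"
    return "unspecified"            # opcode 1111 with mode 01 or 10 is not described
```

A decoder built this way only has to inspect six bits to choose the format; the remaining ten bits are then read either as an address or as an operation selector.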

Timing and Control:
The timing for all registers in the basic computer is controlled by a master clock generator. The control signals are generated by the control unit and provide control inputs for the multiplexer in the common bus, control inputs for the processor registers, and micro-operations for the accumulator. The control unit (Fig. 3) consists of three decoders, a sequence counter, and a number of control logic gates. The control unit gets the operation code from the OPR through a 4x16 decoder. Bits 14 and 15 of the instruction code are transferred to two flip-flops designated by the symbols I0 and I1. Bits 0 through 9 are applied to the control logic gates directly from the MBR through the common bus. The outputs of the sequence counter (SC) are decoded into four timing signals, T0 through T3. The sequence counter can be incremented or cleared. Most of the time, the counter is incremented to provide the sequence of timing signals out of the 2x4 decoder. Occasionally, the counter is cleared, which makes T0 the next active timing signal. The decoder is used to determine the cycle to be performed.
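The behavior of the sequence counter and its 2x4 decoder can be modeled in a few lines. The Python sketch below is a behavioral illustration only, not the Verilog used in the tool; the class and method names are hypothetical.

```python
class SequenceCounter:
    """2-bit sequence counter whose value is decoded into timing signals T0..T3."""

    def __init__(self):
        self.value = 0  # cleared to 0, so T0 is active first

    def timing_signals(self):
        """2x4 decoder: exactly one of (T0, T1, T2, T3) is 1."""
        return tuple(int(i == self.value) for i in range(4))

    def active(self):
        """Name of the timing signal that is currently active."""
        return "T%d" % self.value

    def tick(self, clear=False):
        """One clock pulse: clear the counter, or increment it modulo 4."""
        self.value = 0 if clear else (self.value + 1) % 4


# Step through T0, T1, T2, then clear instead of advancing to T3.
sc = SequenceCounter()
trace = [sc.active()]
for clear in (False, False, True):
    sc.tick(clear=clear)
    trace.append(sc.active())
```

In this run the counter is cleared on the pulse that follows T2, so the next active signal is T0 rather than T3.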

INSTRUCTION CYCLES
A program residing in the memory unit of the computer consists of a sequence of instructions. Each instruction cycle is subdivided into a sequence of subcycles. The values of three flip-flops are entered into a decoder to determine the cycle to be served, as illustrated in Table 4. As illustrated in Fig. 4, each instruction cycle is divided into five subcycles.
Fetch and Decode Cycle: Initially, the program counter is loaded with the address of the first instruction in the program. The sequence counter is cleared to 0, providing a decoded timing signal T0. After each clock pulse, the sequence counter is incremented by one, so that the timing signals go through the sequence T0, T1, T2, and T3. The micro-operations for the fetch and decode cycle can be specified by the following statements:
The effective address of the operand may be read during two time pulses. Therefore, to avoid the delay of waiting for T3, the sequence counter may be cleared at time T2. Thus, the next time pulse will be T0 of the execution cycle and not T3.
This basic computer serves the interrupt by saving the address of the next sequential instruction in memory address 0, and then it starts execution from address 1 in the memory. The micro-operations required for this instruction are:

Register Transfer Statements:
A register transfer language is useful not only for describing the internal organization of the computer, but also for specifying the logic circuits needed for its design. The implemented computer has 35 instructions, as in Table 3. Each instruction is represented by a single statement or a set of statements. Table 5 illustrates the control functions and micro-operations for selected instructions. The obtained statements give all the information necessary for the design of the logic circuits of the computer.

ASSEMBLER DESIGN
A two-pass assembler has been designed and implemented so that assembly programs can be written and the output of the assembler used to run these programs on the basic computer. Figure 6 shows the files used as input and those generated as output by the assembler. These are:
• Source file (input): A text file containing the source program to be assembled. It has a ".asm" extension. It consists of two segments, the code segment followed by the data segment. The data segment starts with "data:".
• Binary code file (output): It contains the assembled statements represented in binary form. This file is stored in the block memory, and it has a ".dat" extension.
• Hex code file (output): It contains the assembled statements in hexadecimal form. This file is stored in the external memory, and it has a ".mem" extension.
• Listing file (output): It consists of the source file statements, the assembled code, and the Branch Vector Table (BVT). This file has a ".lst" extension.
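The difference between the binary and hexadecimal output files can be shown with a small Python sketch. The exact file layout is not given in the paper, so one 16-bit word per line is an assumption here, and the helper names `to_binary_lines` and `to_hex_lines` are hypothetical.

```python
def to_binary_lines(words):
    """Format each 16-bit word as a 16-character binary string (".dat"-style)."""
    return [format(word & 0xFFFF, "016b") for word in words]


def to_hex_lines(words):
    """Format each 16-bit word as a 4-digit hexadecimal string (".mem"-style)."""
    return [format(word & 0xFFFF, "04X") for word in words]
```

For instance, the word 0x4B84 would appear as 0100101110000100 in the binary form and as 4B84 in the hexadecimal form.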
The basic computer assembly language character set consists of the following subset of the standard ASCII character set:
• Lower-case letters (a to z).
Assembler Instructions: Any assembly program for the basic computer consists of text lines, and each line contains only one instruction and an optional comment. Table 6 shows all instructions and their appropriate operands and addressing modes.
To branch to a line of assembly code, a symbolic destination must be placed at the beginning of that line. A line that begins with the comment symbol (//) is considered a comment line. It is printed in the listing (.lst) file but is not encoded into the hex (.mem) and binary (.dat) files. Comments may also be added to lines that contain program code.
To write an assembly program for the implemented basic computer, follow these steps:
• Write an assembly program using a text editor such as Microsoft Notepad.
• Save the file as text only, with a ".asm" extension, in the same directory as the Basic Computer Assembler.
• Using Windows Explorer or a command window, start the assembler.
• When the assembler comes up, enter the name of your file with the .asm extension.
Three new files are generated in the directory: the assembled hex code file with a .mem extension, the listing file with a .lst extension (see Fig. 7), and the binary file, which is named "mem1.dat".
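The two-pass idea behind the assembler can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the authors' assembler: the mnemonics and operation codes in `OPCODES` are invented, labels are assumed to be written as `name:` at the start of a line, the `$` and `#` prefixes are assumed to mark immediate and indirect operands, the word layout is assumed to be mode, operation code, address from the most significant bit down, and the data segment is omitted.

```python
OPCODES = {"LDA": 0b0001, "ADD": 0b0010, "STA": 0b0011, "JMP": 0b0100}  # hypothetical


def parse(line):
    """Return (label, mnemonic, operand); any of them may be None."""
    text = line.split("//", 1)[0].strip()          # drop the comment part
    label = None
    if ":" in text:                                # assumed label syntax: "name:"
        label, text = (part.strip() for part in text.split(":", 1))
    fields = text.split()
    mnemonic = fields[0].upper() if fields else None
    operand = "".join(fields[1:]) or None          # tolerates "$ 5" as well as "$5"
    return label or None, mnemonic, operand


def assemble(source, origin=0):
    """Assemble the code segment; return (list of 16-bit words, symbol table)."""
    lines = [parse(line) for line in source.splitlines()]

    # Pass 1: assign an address to every label.
    symbols, location = {}, origin
    for label, mnemonic, _ in lines:
        if label:
            symbols[label] = location
        if mnemonic:
            location += 1

    # Pass 2: encode each instruction, resolving symbolic operands.
    words = []
    for _, mnemonic, operand in lines:
        if mnemonic is None:
            continue
        mode, operand = 0b01, operand or "0"       # direct addressing by default
        if operand[0] == "$":
            mode, operand = 0b00, operand[1:]      # immediate
        elif operand[0] == "#":
            mode, operand = 0b11, operand[1:]      # indirect
        address = symbols[operand] if operand in symbols else int(operand)
        words.append((mode << 14) | (OPCODES[mnemonic] << 10) | (address & 0x3FF))
    return words, symbols


SAMPLE = """
        LDA $5      // immediate operand
loop:   ADD 900     // direct address
        STA #901    // indirect address
        JMP loop    // branch to a symbolic destination
"""
words, symbols = assemble(SAMPLE)
```

The first pass is what allows a branch to refer to a label that is defined later in the file: all label addresses are known before any instruction is encoded.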

SIMULATION AND FPGA IMPLEMENTATION
Once the files defining the micro-operations are ready, the proposed computer can be simulated in its entirety with the simulator included in the Xilinx Development Environment. This is particularly important in helping students to:
• understand what is going on and why,
• check that the values obtained from simulation conform to what is expected,
• verify and follow the progression of the signals directly on the screen, since it matches the architecture layout given in Fig. 5.
Figure 8 shows the trace window for the signals generated by the assembly program given in Fig. 7. By examining the control signals and the contents of the computer registers and of memory at addresses 1 to 9 and at addresses 900, 901, 1000, and 1001, we can determine whether the prototype is functioning correctly. Once we have determined that the design is functioning correctly, we are ready to proceed to synthesis and device programming to generate a configuration that will program an FPGA device to implement the proposed computer system. Field programmable gate arrays are a class of programmable logic devices based on an array of logic cells surrounded by a periphery of input/output cells. These programmable integrated circuits can be programmed in the field to implement a specific design function. The basic computer architecture is created from the ground up as a scalable architecture covering the basic operations of the 16-bit processor domain.
The general layout of the user I/O interface of the teaching tool is given in Fig. 9. An I/O interface is connected to the FPGA board to input any command or data from the student and to monitor the current values of all registers, flags, and related memory locations.
There are eight slide switches (SW0 to SW7) in the system. Switches SW4 to SW7 are used as input data, switches SW1 and SW2 are used as external interrupts, and switch SW0 is used to select between the system clock and a user clock. Figure 10 shows the design procedure of the basic computer. Table 7 shows the characteristics of the FPGA board used in the implementation [11,12]. Tables 8 and 9 show the macro statistics and the FPGA resources, respectively.

CONCLUSION
This paper addressed the importance of using computer simulation and FPGA realization in learning computer organization and architecture. The teaching tool presented here can be considered a useful practical addition to computer engineering and computer science curricula. It helps computer engineering and computer science students gain practical familiarity with computer organization and architecture by developing their own instruction set and carrying out programming and interfacing experiments. In this paper:
• The simulation of a single-cycle basic computer and the implementation of an assembler have been presented.
• The micro-operations of the computer module and its assembler are implemented on a Xilinx Spartan-3 FPGA board, since it offers good code density, easy customization, easily developed software, high performance, and small area.
• This teaching tool has been developed and implemented using popular Xilinx boards found in many universities. It has been tested by third-year undergraduate students enrolled in the computer architecture course given at Philadelphia University, Jordan. The students performed better when they used this teaching tool.
• The code of the various modules is implemented and tested with a program that uses every instruction and exercises the critical paths of the chip.
• This FPGA application runs at a maximum frequency of 73.465 MHz.