Unit 1b - Static Scalars and Arrays.md

 
    public class Foo {
        static int a;
        static int[] b;

        public void foo() {
            a = 0;
            b[a] = a;
        }
    }

 
    int a;

Phases of Computation

Human creation: design program and describe it in high-level language
Compilation: convert high-level human description into machine-executable text
Execution: a physical machine executes the code text

Static vs Dynamic Computation

Two crucial phases of computation: compilation and execution. Execution is:
- Parameterized by input values unknown at compilation
- Producing output values that are unknowable at compilation
Anything the compiler can compute is called static
Anything that can only be discovered when executing is called dynamic

The Arithmetic Logic Unit (ALU)

The ALU is a combinational circuit similar to the adder used in 121
- In addition to the inputs and outputs of the adder, there is a third input that selects which function/operation to compute (could be +, *, etc.)
- Let's draw out the operations for $r = (x+y) * x$
  1. Load $x$ into $ALU_A$, $y$ into $ALU_B$, and $3$ into $ALU_F$ (assume $ALU_F = 3$ means add).
  2. The ALU computes $ALU_A + ALU_B$ and puts the result into $ALU_O$
  3. Copy $ALU_O$ to $ALU_B$, and load $4$ into $ALU_F$ (assume $ALU_F = 4$ means multiply)
  4. The ALU computes $ALU_A * ALU_B$ and puts the result into $ALU_O$
  5. Copy $ALU_O$ to $r$
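
The five steps above can be sketched in C. This is only a software model of the idea, not a description of real hardware: the names `alu`, `compute_r`, `ALU_ADD`, and `ALU_MUL` are made up for illustration, and the function codes 3 and 4 are the same assumption the steps make.

```c
#include <stdint.h>

/* Function-select codes, following the assumption made in the steps. */
enum { ALU_ADD = 3, ALU_MUL = 4 };

/* Combinational model: the output depends only on the inputs A, B and F. */
int32_t alu(int32_t a, int32_t b, int f) {
    switch (f) {
    case ALU_ADD: return a + b;
    case ALU_MUL: return a * b;
    default:      return 0;   /* unused codes: arbitrary choice for this sketch */
    }
}

/* r = (x + y) * x, driven through the ALU in two passes. */
int32_t compute_r(int32_t x, int32_t y) {
    int32_t alu_a = x;                           /* step 1: load A ...        */
    int32_t alu_b = y;                           /* ... and B, with F = add   */
    int32_t alu_o = alu(alu_a, alu_b, ALU_ADD);  /* step 2: O = A + B         */
    alu_b = alu_o;                               /* step 3: copy O into B     */
    alu_o = alu(alu_a, alu_b, ALU_MUL);          /* step 4: O = A * B         */
    return alu_o;                                /* step 5: copy O to r       */
}
```

For example, `compute_r(2, 3)` evaluates (2 + 3) * 2 and returns 10. Note that $ALU_A$ never has to be reloaded, because $x$ is needed by both operations.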

The Processor (CPU)

Implements a set of instructions
Each instruction is implemented using logic gates
- Built from transistors: fundamental mechanism of computation
- The fewer and simpler the instructions, the better

First Proposed Instruction: ADD

Say we propose an instruction that does: $C \leftarrow B + A$
- $A$ , $B$ , and $C$ are stored in memory.
Instruction parameters: addresses of $A, B, C$
- In our case, each address is 32 bits (modern computers: 64 bits)
Every instruction is encoded as a set of bits in memory:
- Operation name (e.g., add)
- Addresses for $A,B,C$

01|00001024|00001028|0000102c $\leftrightarrow$ operation|A|B|C

The Problem With Memory Access

Accessing memory is slow
- ~100 cycles for every memory access
- Fast programs avoid accessing memory when possible
Big instructions are costly
- Memory addresses are big (so instructions are big)
- Big instructions lead to big programs
- Reading instructions from memory is slow (fewer caching options)
- Large instructions use more CPU resources (transfer, storage)

General Purpose Registers

Register file
- Small, fast memory stored in the CPU itself
- Roughly single-cycle access
Registers
- Each register is named by a number (e.g., 0-7)
- Size of the architecture's common integer (32 or 64 bits)

Instructions Using Registers

Memory instructions
- Load data from memory into register (slow)
- Store data from register into memory (slow)
Other instructions access data in registers
- Small and fast

01|00001024|00001028|0000102c <- using memory

01|0|1|2 <- using register
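
A rough size comparison of the two encodings above, assuming an 8-bit opcode (the two hex digits `01`), 32-bit addresses, and 8 registers so that a register number fits in 3 bits. These field widths are assumptions for illustration; a real encoding may pad fields differently.

$$
\underbrace{8 + 3 \times 32}_{\text{memory operands}} = 104 \text{ bits}
\qquad \text{vs.} \qquad
\underbrace{8 + 3 \times 3}_{\text{register operands}} = 17 \text{ bits}
$$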

Instruction Set Architecture (ISA)

ISA is a formal interface to a processor implementation
- defines the instructions the processor implements
- defines the format of each instruction
Types of instructions:
- math and logic
- memory access
- control transfer: "gotos" and conditional "gotos"

Representing Instruction Semantics

Register Transfer Language (RTL): simple, convenient pseudo language to describe semantics
- easy to read/write, directly translates to machine steps
Syntax:
- each line is of the form: LHS <- RHS
- LHS is the memory location or register that receives a value
- RHS is a constant, memory location, register, or expression on two registers
- m[a] is the memory at address a
- r[i] is the register with number i

RTL Examples

Assume the value $7$ is stored at address 0x1234
What do the following instructions do?
- r[0] $\leftarrow$ 10
- r[1] $\leftarrow$ 0x1234
- r[2] $\leftarrow$ m[r[1]] // memory at the address held in register 1
- r[3] $\leftarrow$ r[0] + r[2]
- m[r[1]] $\leftarrow$ r[2]
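
One way to check the answers is to model the register file and memory as C arrays and run the five lines. This is a toy model under stated assumptions: the helper `m`, the function `run_rtl_examples`, and the memory size are hypothetical, and memory is treated as aligned 4-byte words only.

```c
#include <stdint.h>

#define MEM_BYTES 0x2000       /* toy memory size, chosen only to cover 0x1234 */

int32_t r[8];                  /* register file: r[0]..r[7] */
int32_t mem[MEM_BYTES / 4];    /* main memory, one slot per aligned 4-byte word */

/* m[a]: the 4-byte word at byte address a (a must be a multiple of 4). */
int32_t *m(uint32_t a) { return &mem[a / 4]; }

void run_rtl_examples(void) {
    *m(0x1234) = 7;        /* given: the value 7 is stored at address 0x1234 */

    r[0] = 10;             /* r[0] <- 10          load a constant              */
    r[1] = 0x1234;         /* r[1] <- 0x1234      an address is also a constant */
    r[2] = *m(r[1]);       /* r[2] <- m[r[1]]     read memory: r[2] becomes 7   */
    r[3] = r[0] + r[2];    /* r[3] <- r[0] + r[2] r[3] becomes 17               */
    *m(r[1]) = r[2];       /* m[r[1]] <- r[2]     write 7 back to 0x1234        */
}
```

After the call, `r[3]` holds 17 and the word at 0x1234 still holds 7, since the last line stores back the value that was just loaded.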

Static Variable Allocation

 
    int a;
    int b[10];

    void foo() {
        a = 0;
        b[a] = a;
    }

Allocation is
- assigning a memory location to store variable's value
- assigning the variable an address (its name for reading and writing)
Static vs. dynamic computation
- Global/static variables can exist before program starts
- Compiler allocates variables, giving them a constant address
- No dynamic computation required to allocate variables
Key observation:
- Addresses of a, b[0], b[1], … are constants known to the compiler.
Use RTL to specify the instructions needed for a = 0 (assuming a is stored at address 0x1000)
- r[0] $\leftarrow$ 0x1000
- m[r[0]] $\leftarrow$ 0

Static Array Access

Compiler doesn't know address of b[a]
- Unless it knows the value of a statically (which it might here, but not in general)
Array access is computed from base and index
- Address of element is base plus offset
- Offset is index times element size
- The base address (0x2000) and element size (4, as it is an integer) are static, but the index value is dynamic (a's value can change)
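
The base-plus-offset rule can be written out as a small C sketch. The first function uses the numbers from these notes (base 0x2000, 4-byte elements); the second repeats the computation on a real array in bytes. Both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* address = base + index * element size (all values in bytes). */
uint32_t element_address(uint32_t base, uint32_t index, uint32_t elem_size) {
    return base + index * elem_size;
}

int b[10];

/* The same computation on a real array, done in bytes through char pointers. */
const char *byte_address_of(const int *base, size_t index) {
    return (const char *)base + index * sizeof(int);
}
```

With the notes' numbers, `element_address(0x2000, 9, 4)` gives 0x2024, the address of b[9]. On a real array, `byte_address_of(b, i)` is the same address as `&b[i]`, which is what the compiler computes for `b[i]`.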

Static Variable Access

As in the previous exercise, use RTL to specify the instructions for b[a] = a
- Assume that the compiler does not know the value of a
- r[0] $\leftarrow$ 0x1000
- r[1] $\leftarrow$ m[r[0]]
- r[2] $\leftarrow$ 0x2000
- r[3] $\leftarrow$ r[2] + 0x4 * r[1]
- m[r[3]] $\leftarrow$ r[1]
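
The RTL for b[a] = a can be traced with a toy C model of registers and memory. The addresses 0x1000 (for a) and 0x2000 (for b) come from the notes; the helper `m`, the function name, and the memory size are assumptions of this sketch.

```c
#include <stdint.h>

#define MEM_BYTES 0x3000       /* toy memory covering 0x1000 and 0x2000..0x2027 */

int32_t r[8];                  /* register file */
int32_t mem[MEM_BYTES / 4];    /* main memory as aligned 4-byte words */

/* m[addr]: the 4-byte word at byte address addr (must be a multiple of 4). */
int32_t *m(uint32_t addr) { return &mem[addr / 4]; }

/* b[a] = a, with a at 0x1000 and b starting at 0x2000. */
void store_a_into_b(void) {
    r[0] = 0x1000;             /* static: address of a            */
    r[1] = *m(r[0]);           /* dynamic: current value of a     */
    r[2] = 0x2000;             /* static: base address of b       */
    r[3] = r[2] + 4 * r[1];    /* dynamic: address of b[a]        */
    *m(r[3]) = r[1];           /* store the value of a into b[a]  */
}
```

If a holds 3 when the function runs, `r[3]` becomes 0x200c and the word at 0x200c receives 3. Only the two constants are known at compile time; everything that depends on the value of a is computed while the program runs.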

What Instructions Do We Need So Far?

Generalizing and simplifying, we get:
- r[x] $\leftarrow$ constant
- m[r[x]] $\leftarrow$ r[y]
- r[y] $\leftarrow$ m[r[x]]
- m[r[x] + r[y] * 4] $\leftarrow$ r[z]
- r[z] $\leftarrow$ m[r[x] + r[y] * 4]

ISA Specification

The compiler's semantic translation
It uses these instructions to compile the program snippets

Code Snippet Translation


The Simple Machine (SM213) ISA

Architecture
- Register file: 8 general purpose registers, each 32 bits long
- CPU: one cycle per instruction (fetch and execute)
- Main memory: byte-addressed, big endian
Instruction format
- 2-byte or 6-byte instructions

Instruction Binary Format

Binary format (each character is a hex digit): x-01, xxsd, x0vv, x-sd vvvvvvvv
Where:
- x is an opcode digit (uniquely identifies the instruction)
- `-` marks an unused portion (ignored by the instruction)
- s and d are register numbers
- vv and vvvvvvvv are immediate/constant values

Memory Access Instructions

We have 4 addressing modes for operands:
- immediate: constant value stored in the instruction
- register: operand is a register number (the register stores the value)
- base + offset: operand is a register number (the register stores the memory address of the value)
- indexed: two register-number operands

The CPU Implementation: Internal State

PC (program counter): address of next instruction to fetch
Instruction: value of the current instruction
- Separated into components: insOpCode, insOp0, insOp1, insOp2, insOpImm, insOpExt

Stages

Fetch stage:
- Read instruction at PC
- Determine size
- Separate components
- Update PC
Execute stage:
- Read internal state and/or memory
- Perform specified computation
- Update internal state and/or memory
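
The fetch stage can be sketched in C as follows. The variable names mirror the simulator's internal registers, but everything else is an assumption made for illustration: the mapping of hex digits to components, the rule that only opcode 0x0 marks a 6-byte instruction, and the toy memory size. It is not the course simulator's code.

```c
#include <stdint.h>

uint8_t  mem[64];     /* toy byte-addressed main memory */
uint32_t pc;          /* address of the next instruction to fetch */

/* Components of the current instruction. */
uint8_t  insOpCode, insOp0, insOp1, insOp2, insOpImm;
uint32_t insOpExt;

/* ASSUMPTION for this sketch: only opcode 0x0 marks a 6-byte instruction. */
int is_six_byte(uint8_t opcode) { return opcode == 0x0; }

void fetch(void) {
    uint8_t hi = mem[pc], lo = mem[pc + 1];   /* read instruction at PC      */
    insOpCode = hi >> 4;                      /* separate the four hex digits */
    insOp0    = hi & 0xF;
    insOp1    = lo >> 4;
    insOp2    = lo & 0xF;
    insOpImm  = lo;                           /* low byte as 8-bit immediate  */
    pc += 2;                                  /* update PC past the 2 bytes   */
    if (is_six_byte(insOpCode)) {             /* determine size               */
        /* big endian: most significant byte comes first */
        insOpExt = ((uint32_t)mem[pc]     << 24) | ((uint32_t)mem[pc + 1] << 16)
                 | ((uint32_t)mem[pc + 2] <<  8) |  (uint32_t)mem[pc + 3];
        pc += 4;                              /* skip the 4-byte extension    */
    }
}
```

The execute stage would then switch on `insOpCode` and read or update the registers and memory. Because the PC is advanced during fetch, it already points at the next instruction when execution begins.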

Java Simulator CPU Syntax

Internal registers:
- instruction, insOpCode, insOp0, insOp1, insOp2, insOpImm, insOpExt, pc
- Read with get(), change with set(value)
General purpose registers:
- Read with reg.get(regNumber)
- Change with reg.set(regNumber, value)
Main memory:
- Read with mem.readInteger(address)
- Change with mem.writeInteger(address, value)

Global Dynamic Array

 
    public class Foo {
        static int a;
        static int[] b = new int[10];
        public void foo() {
            b[a] = a;
        }
    }

 
    #include <stdlib.h>

    int a;
    int *b;

    void foo() {
        b = malloc(10 * sizeof(int));
        b[a] = a;
    }

Arrays in Java
- Store a reference to an array allocated dynamically with a new expression
  int b[] = new int[10];
Arrays in C
- Can store static arrays
  int bst[10];
- Can store pointers to other arrays
  int *bptr = &bst[0]; // or int *bptr = bst;
- Can store pointers to arrays allocated dynamically with a call to the malloc library function
  int *bdyn = malloc(10 * sizeof(int));
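
The three C cases can be put side by side in a short sketch. The names `bst`, `bptr`, and `bdyn` follow the bullets above; `fill` and `demo_arrays` are hypothetical helpers added for illustration.

```c
#include <stdlib.h>

int bst[10];                    /* static array: storage reserved by the compiler */

/* Fill any int array through a pointer; works for all three cases. */
void fill(int *arr, int n, int value) {
    for (int i = 0; i < n; i++) arr[i] = value;
}

/* Returns 1 if the accesses behave as described, 0 otherwise. */
int demo_arrays(void) {
    int *bptr = bst;                         /* pointer to an existing array */
    int *bdyn = malloc(10 * sizeof(int));    /* pointer to dynamic storage   */
    if (bdyn == NULL) return 0;              /* malloc can fail              */

    fill(bst, 10, 1);     /* write through the array name                   */
    fill(bptr, 10, 2);    /* overwrites the same storage: bptr aliases bst  */
    fill(bdyn, 10, 3);    /* a separate block of memory                     */

    int ok = (bst[9] == 2) && (bptr[0] == 2) && (bdyn[9] == 3);
    free(bdyn);           /* release the dynamically allocated block        */
    return ok;
}
```

All three are accessed with the same `[]` syntax. Only the way the storage was obtained differs.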

C Arrays Different from Java

Terminology
- Use the term pointer instead of reference (they mean the same thing)
Declaration
- Declared as a pointer to the type of its elements, indicated with *
Allocation
- malloc allocates a block of bytes (no type, and no constructor)
Bounds checking
- C performs no array bounds checking
- Out-of-bounds access manipulates memory that isn't part of the array

Static vs Dynamic Arrays

Arrays are accessed in C the same way, but declared and allocated differently

Static allocation: the compiler allocates the whole array

    int b[10];

    0x2000: value of b[0]
    0x2004: value of b[1]
    ...
    0x2024: value of b[9]

Dynamic allocation: the compiler allocates only a pointer

    int *b = malloc(10 * sizeof(int));

    0x2000: 0x3000

When the program runs:

Static allocation: int b[10]

 
    0x2000: value of b[0]
    0x2004: value of b[1]
    ...
    0x2024: value of b[9]

Dynamic allocation: int *b = malloc(10 * sizeof(int));

Working with Pointers in C

Pointers are declared as: type *varname
Pointers may be accessed as:
- varname[0] or *varname
- varname[i] or *(varname + i)

The address of a variable can be obtained with:

 
    type a;
    type *ptr = &a;

| variable | address | value  |
|----------|---------|--------|
| a        | 0x1000  | 3      |
| ptr      | 0x2000  | 0x1000 |

 
    a == 3;
    &a == 0x1000;
    ptr == 0x1000;
    &ptr == 0x2000;
    *ptr == 3;

`*` and `&` example

 
    int a;
    int b;
    int *c;

    void foo() {
        a = 4;
        b = 7;
        c = &a;

        *c = 5;
        b = 2;
        c = &b;
    }
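
To trace the example, the same statements can be run with checks between them. The function name `foo_traced` and the `assert` calls are additions for illustration; the assignments are unchanged.

```c
#include <assert.h>

int a;
int b;
int *c;

void foo_traced(void) {
    a = 4;
    b = 7;
    c = &a;                      /* c now holds the address of a        */
    assert(*c == 4);

    *c = 5;                      /* writes through c, so a becomes 5    */
    assert(a == 5 && b == 7);    /* b is untouched                      */

    b = 2;
    c = &b;                      /* c now holds the address of b        */
    assert(*c == 2);
}
```

At the end, a is 5 (changed only through the pointer), b is 2, and c points to b.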

Code to Access The Array

C and Java Arrays and Pointers

In both languages:
- An array is a list of items of the same type
- Array elements are named by non-negative integers (start at 0)
- Syntax for accessing element i of array b is b[i]
In C
- A variable can store a pointer to the array (dynamic) or the array itself (static). For example, b[x] = 0 means one of:
  - static: m[b + x * sizeof(array-element)] <- 0
  - dynamic: m[m[b] + x * sizeof(array-element)] <- 0
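
The difference between the two forms shows up when comparing where the variable lives with where element 0 lives. This is a small sketch; the names `bst` and `bdyn` and both function names are hypothetical.

```c
#include <stdlib.h>

int  bst[10];    /* static: the variable is the array itself            */
int *bdyn;       /* dynamic: the variable holds the address of the array */

/* 1 if the variable's own storage is where element 0 lives. */
int static_array_is_its_own_storage(void) {
    return (void *)&bst == (void *)&bst[0];
}

/* 1 if the pointer variable lives somewhere other than element 0,
   -1 if malloc fails. */
int pointer_is_separate_from_elements(void) {
    bdyn = malloc(10 * sizeof(int));
    if (bdyn == NULL) return -1;
    int separate = (void *)&bdyn != (void *)&bdyn[0];
    free(bdyn);
    bdyn = NULL;
    return separate;
}
```

For the static array, the base address is a constant, so one memory access reaches the element. For the pointer, the machine must first read the variable to learn the base address, which is the extra m[b] in the second RTL line.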

Pointer Arithmetic in C

Adding an integer to a pointer:
- Integer is multiplied by type size before adding
- Example: if a is a pointer to int, then (a + i) is the address 4*i bytes past a (assuming 4-byte ints).
- Works with subtraction as well
Subtracting two pointers of the same type
- returns the number of elements of that type between the addresses
- equivalent to dividing address difference (in bytes) by type size
- Example: &a[7] - &a[2] == 5
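
Both rules can be checked with a short C sketch. The array `arr` and the two function names are hypothetical; `sizeof(int)` is used instead of a hard-coded 4 so the checks hold whatever the int size is.

```c
#include <stddef.h>

int arr[10];

/* Bytes between arr and arr + i: i elements of sizeof(int) bytes each. */
ptrdiff_t byte_offset(int i) {
    return (char *)(arr + i) - (char *)arr;
}

/* Number of int elements between &arr[lo] and &arr[hi]. */
ptrdiff_t element_distance(int lo, int hi) {
    return &arr[hi] - &arr[lo];
}
```

Here `byte_offset(3)` equals `3 * sizeof(int)`, while `element_distance(2, 7)` is 5 regardless of the element size, because pointer subtraction divides the byte difference by the type size.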