# Segmented Integers

## Introduction

I've been working on Tequila, a NASM emulator in JavaScript. One problem I ran into pretty quickly is x86's register subdivision. x86-64 has 16 general-purpose registers, four of which (`rax`, `rbx`, `rcx`, and `rdx`) can be subdivided in the way described here. Each register holds 64 bits of data, but you can also address parts of those 64 bits. For example, if the register `rax` was initialized to 64 bits of zeros, the instruction

```
MOV eax, 11
```

would place the decimal value 11 (`1011` in binary) in the lower 32 bits of `rax`. Similarly,

```
MOV al, 3
```

would move `11` (decimal 3) into the lowest 8 bits of `rax`. After modifying a part of the register, the full register may still be addressed. For example, if `rax` was initialized to all zeroes,

```
MOV al, 3
```

would move the decimal value 3 into the end of `rax`. Since all the leading bits are zero, the total value of register `rax` will also be 3. Then,

```
MOV ah, 2
```

would load `10` (decimal 2) into `ah`, the byte directly above `al`. Thus, ignoring leading zeroes, the value of `rax` would be:

```
00000010 00000011
```

yielding a value of 515.
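The arithmetic can be checked directly with a shift and an OR. This is only a quick sketch: the variable names are mine and are not part of Tequila.

```javascript
// Sub-register values from the example above.
const al = 3 // lowest byte of rax (bits 0-7)
const ah = 2 // next byte up (bits 8-15)

// ah sits 8 bits above al, so shift it left by 8 and OR the two together.
const ax = (ah << 8) | al

console.log(ax.toString(2).padStart(16, '0')) // 0000001000000011
console.log(ax)                               // 515
```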

Note that only `ax`, the lowest 16 bits, has both of its halves addressable (as `ah` and `al`). You cannot address the upper half of `eax` or the upper half of `rax` on its own. This was a design choice in x86 to save space in the instruction set (fewer addressable register names).

## Implementation

### Binary-Based Integers

First, I needed a base integer type that gets and sets integer values from a binary store. I call this `SizedInteger`, and I implement the `h`/`l` registers with it.

I chose to store these values as a pure binary string of `0`s and `1`s instead of as an array of bytes. Since all addressable sizes in x86 are a multiple of 8 bits, a byte array would likely be the better decision for performance and memory efficiency, but I chose to stick with binary as a design choice.

The implementation is as follows:

```
class SizedInteger {
  constructor (size) {
    this.size = size                       // store the size of the register, in bits
    this.contents = '0'.repeat(this.size)  // fill the value store with 0's to initialize
  }

  get () {
    return parseInt(this.contents, 2)      // parse the binary store as a binary integer
  }

  set (n) {
    // wrap n so that it fits in `size` bits, then convert it to binary
    let x = (n % (2 ** this.size)).toString(2)

    // take up the full size with binary, padding with 0, so that when this integer
    // is appended to others in memory, they take up exactly the size allocated
    let leftpad = this.size - x.length
    this.contents = '0'.repeat(leftpad) + x
  }
}
```
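To see what `set` does in isolation, here is the same wrap-and-pad step as a standalone function. `toPaddedBinary` is a hypothetical helper name, not something from Tequila, and it uses `padStart` in place of the manual `repeat`. Like the class, it assumes `n` is a non-negative integer.

```javascript
// Hypothetical helper mirroring SizedInteger.set: wrap n to `size` bits,
// then left-pad the binary string to exactly `size` characters.
// Assumes n is a non-negative integer, as the class does.
function toPaddedBinary (n, size) {
  const wrapped = n % (2 ** size)
  return wrapped.toString(2).padStart(size, '0')
}

console.log(toPaddedBinary(3, 8))   // 00000011
console.log(toPaddedBinary(259, 8)) // 00000011 (259 wraps around to 3 in 8 bits)
console.log(toPaddedBinary(9, 16))  // 0000000000001001
```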

### Segmented Integers

Now for the good stuff: segmented integers!

```
class SegmentedInteger {

  // a segmented integer is constructed with two SizedInteger objects,
  // r(egister)1 and r(egister)2
  constructor (r1, r2) {
    this.r1 = r1
    this.r2 = r2
  }

  get contents () {
    return this.r1.contents + this.r2.contents
  }

  set contents (x) {
    this.r1.contents = x.slice(0, this.r1.size)
    this.r2.contents = x.slice(this.r1.size, this.r1.size + this.r2.size)
  }

  get size () {
    return this.r1.size + this.r2.size
  }

  get () {
    return parseInt(this.r1.contents + this.r2.contents, 2)
  }

  set (n) {
    let combined_size = this.r1.size + this.r2.size
    let x = (n % (2 ** combined_size)).toString(2)
    let leftpad = combined_size - x.length
    x = '0'.repeat(leftpad) + x

    // fill r1, then r2
    this.r1.contents = x.slice(0, this.r1.size)
    this.r2.contents = x.slice(this.r1.size, this.r1.size + this.r2.size)
  }

}
```

The implementation of `SegmentedInteger` is essentially the same as that of `SizedInteger`; I simply had to split the binary store between two registers. Because the constructor keeps references to the objects it is given, those objects can still be manipulated separately from the `SegmentedInteger`.

I used JavaScript's getter and setter syntax (accessor properties date from ECMAScript 5; the `class` form used here is from ES2015) so that a `SegmentedInteger` exposes the same API as a `SizedInteger` and behaves the same way. That lets you build a `SegmentedInteger` out of another `SegmentedInteger`, and it also simplifies my code for Tequila.
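The sketch below shows both properties using the real x86 widths: writes through the 8-bit parts are visible through the composites, and a `SegmentedInteger` can itself be one of the segments. The classes are condensed restatements of the ones above so that the snippet runs on its own, and `upper16` is a made-up name for the unnamed upper half of `eax`.

```javascript
// Condensed versions of the two classes, just enough to run this snippet.
class SizedInteger {
  constructor (size) {
    this.size = size
    this.contents = '0'.repeat(size)
  }
  get () { return parseInt(this.contents, 2) }
  set (n) {
    this.contents = (n % (2 ** this.size)).toString(2).padStart(this.size, '0')
  }
}

class SegmentedInteger {
  constructor (r1, r2) {
    this.r1 = r1
    this.r2 = r2
  }
  get size () { return this.r1.size + this.r2.size }
  get contents () { return this.r1.contents + this.r2.contents }
  set contents (x) {
    // assigning to r2.contents invokes this same setter when r2 is segmented
    this.r1.contents = x.slice(0, this.r1.size)
    this.r2.contents = x.slice(this.r1.size, this.size)
  }
  get () { return parseInt(this.contents, 2) }
  set (n) {
    this.contents = (n % (2 ** this.size)).toString(2).padStart(this.size, '0')
  }
}

const ah = new SizedInteger(8)
const al = new SizedInteger(8)
const ax = new SegmentedInteger(ah, al)
const upper16 = new SizedInteger(16)          // hypothetical name, see above
const eax = new SegmentedInteger(upper16, ax) // a segment made of a segment

// Writes to the parts show up in every composite that shares them.
al.set(3)
ah.set(2)
console.log(ax.get())  // 515
console.log(eax.get()) // 515

// A write to the composite is split back down into the parts.
eax.set(0x00010001)
console.log(upper16.get(), ah.get(), al.get()) // 1 0 1
```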

## Testing

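Here's a test session. For brevity, `ah` and `al` are 16 bits wide here (they are 8 bits each on real hardware), so `eax` can be built from them directly: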

```
> ah = new SizedInteger(16)
> al = new SizedInteger(16)
> eax = new SegmentedInteger(ah, al)
> rax = new SegmentedInteger(new SizedInteger(32), eax)
> eax.set(9)
> eax.contents
'00000000000000000000000000001001'
> eax.get()
9
> al.contents
'0000000000001001'
> ah.contents
'0000000000000000'
> al.set(0)
> eax.get()
0
```
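One caveat the session doesn't exercise: `rax` is 64 bits wide, but `parseInt` returns a double-precision number, which is only exact for integers up to 2^53 - 1. A possible workaround, which is my suggestion and not something Tequila is known to do, is to parse the binary store as a `BigInt`. `readBits` is a hypothetical helper name.

```javascript
// Hypothetical helper: read a binary-string store as an exact BigInt.
// BigInt() accepts strings carrying a 0b prefix.
function readBits (contents) {
  return BigInt('0b' + contents)
}

const allOnes = '1'.repeat(64) // a 64-bit register with every bit set

console.log(parseInt(allOnes, 2)) // 18446744073709552000 (rounded up to 2^64)
console.log(readBits(allOnes))    // 18446744073709551615n (exact)
```

The trade-off is that `BigInt` values cannot be mixed with ordinary numbers in arithmetic, so callers would need to convert explicitly.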