Segmented Integers

Introduction

I've been working on Tequila, a NASM emulator in JavaScript. One problem I ran into pretty quickly is x86's register subdivision. The four general-purpose registers I'll focus on here are rax, rbx, rcx, and rdx (x86-64 has sixteen in total).

A diagram of x86 registers.

Each register can hold up to 64 bits. However, you can also address parts of those 64 bits. For example, if the register rax was initialized to 64 bits of zeros, the instruction

MOV eax, 11

would fit the decimal value 11 (1011 in binary) into the lower 32 bits of rax. Similarly,

MOV al, 3

would move the decimal value 3 (11 in binary) into the lowest 8 bits of rax. After modifying a part of the register, the full register may still be addressed. For example, if rax was initialized to all zeroes,

MOV al, 3

would move the decimal value 3 into the end of rax. Since all the leading bits are zero, the total value of register rax will also be 3. Then,

MOV ah, 2

would load the decimal value 2 (10 in binary) into ah. Thus, ignoring leading zeroes, the value of rax would be:

00000010 00000011

yielding a value of 515.
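
The arithmetic above can be checked with a couple of bit operations. This is just a sketch of the register math in plain JavaScript, not code from Tequila; the variable names are made up for illustration.

```javascript
// Hypothetical names: `al` and `ah` hold the two 8-bit values from the example.
const al = 3  // 00000011
const ah = 2  // 00000010

// ah sits 8 bits above al, so shift it left by 8 and OR in al.
const ax = (ah << 8) | al

console.log(ax.toString(2).padStart(16, '0'))  // 0000001000000011
console.log(ax)                                 // 515
```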

Note that only the lowest 16 bits of the register (ax) can be addressed as separate upper and lower halves (ah and al). You cannot address the upper half of eax or rax on its own. This was a design choice in x86 to save space in the instruction set (fewer addressable register names).

Implementation

Binary-Based Integers

First, I needed a base integer type that gets and sets integer values from a binary store. I call it SizedInteger, and I use it to implement the h/l registers.

I chose to store these values as a string of binary digits instead of as an array of bytes. Since all addressable sizes in x86 are a multiple of 8 bits, an array of bytes would probably have been the better decision for performance and memory efficiency, but I chose to stick with binary as a design choice.
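
For comparison, here is roughly what the two storage options look like for the same 16-bit value. This is only an illustration of the trade-off; the byte-array half is a hypothetical alternative, not something Tequila does.

```javascript
// The value 515 in a 16-bit register, stored two ways.

// 1. As a binary string (the approach used in this post): one character per bit.
const asString = '0000001000000011'
console.log(parseInt(asString, 2))          // 515

// 2. As an array of bytes (hypothetical alternative): one element per 8 bits,
//    most significant byte first.
const asBytes = new Uint8Array([0b00000010, 0b00000011])
console.log(asBytes[0] * 256 + asBytes[1])  // 515

// Slicing out the "low half" is a substring in one case and an index in the other.
console.log(asString.slice(8))              // 00000011
console.log(asBytes[1])                     // 3
```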

The implementation is as follows:

class SizedInteger {
    constructor (size) {
        this.size = size                       // store the size of the register, in bits
        this.contents = '0'.repeat(this.size)  // fill the value store with 0's to initialize
    }

    get () {
        // parse the binary store as a binary integer; the result is a regular
        // Number, so it is only exact up to Number.MAX_SAFE_INTEGER (2^53 - 1)
        return parseInt(this.contents, 2)
    }

    set (n) {
        // wrap n so it fits in `size` bits (n is assumed to be a non-negative
        // integer; negative values would need extra handling)
        const x = (n % (2 ** this.size)).toString(2)

        // take up the full size with binary, padding with 0, so that when this integer
        // is appended to others in memory, they take up exactly the size allocated
        const leftpad = this.size - x.length
        this.contents = '0'.repeat(leftpad) + x
    }
}
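
To see what set does with values that don't fit, here is the wrap-and-pad step pulled out into a standalone function. The name toPaddedBinary is hypothetical (it isn't part of Tequila), and the sketch assumes n is a non-negative integer.

```javascript
// Hypothetical helper: the same wrap-and-pad idea as SizedInteger.set,
// for non-negative integers only.
function toPaddedBinary (n, size) {
    const wrapped = n % (2 ** size)                 // drop anything above `size` bits
    return wrapped.toString(2).padStart(size, '0')  // left-pad to exactly `size` digits
}

console.log(toPaddedBinary(3, 8))    // 00000011
console.log(toPaddedBinary(255, 8))  // 11111111
console.log(toPaddedBinary(256, 8))  // 00000000 (wraps around, like an 8-bit overflow)
console.log(toPaddedBinary(259, 8))  // 00000011
```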

Segmented Integers

Now for the good stuff: segmented integers!

class SegmentedInteger {

    // a segmented integer is constructed with two SizedInteger objects,
    // r(egister)1 (the high bits) and r(egister)2 (the low bits)
    constructor (r1, r2) {
        this.r1 = r1
        this.r2 = r2
    }

    get contents () {
        return this.r1.contents + this.r2.contents
    }

    set contents (x) {
        // fill r1, then r2
        this.r1.contents = x.slice(0, this.r1.size)
        this.r2.contents = x.slice(this.r1.size, this.r1.size + this.r2.size)
    }

    get size () {
        return this.r1.size + this.r2.size
    }

    get () {
        return parseInt(this.contents, 2)
    }

    set (n) {
        // same wrap-and-pad as SizedInteger.set
        // (n is assumed to be a non-negative integer)
        const x = (n % (2 ** this.size)).toString(2)
        const leftpad = this.size - x.length

        // the contents setter splits the padded string between r1 and r2
        this.contents = '0'.repeat(leftpad) + x
    }

}

The implementation of SegmentedInteger is essentially the same as that of SizedInteger; I simply had to split the binary store between two registers. Because the constructor keeps references to the SizedInteger objects it is passed, those objects can still be manipulated separately from the SegmentedInteger, and any change to them shows up in the SegmentedInteger as well.

I used JavaScript's getter and setter syntax (introduced in ECMAScript 5) so that SegmentedInteger exposes the same API as SizedInteger and behaves the same way, both so that you can make a SegmentedInteger out of a SegmentedInteger and to simplify my code for Tequila.
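
As a sketch of why the shared API matters, the snippet below nests one SegmentedInteger inside another. The classes are condensed copies of the two above so that it runs on its own, and the register widths (8-bit halves forming a 16-bit ax inside a 32-bit eax) are an assumption chosen for this example.

```javascript
// Condensed copies of the two classes above, so this snippet runs on its own.
class SizedInteger {
    constructor (size) { this.size = size; this.contents = '0'.repeat(size) }
    get () { return parseInt(this.contents, 2) }
    set (n) { this.contents = (n % (2 ** this.size)).toString(2).padStart(this.size, '0') }
}

class SegmentedInteger {
    constructor (r1, r2) { this.r1 = r1; this.r2 = r2 }
    get size () { return this.r1.size + this.r2.size }
    get contents () { return this.r1.contents + this.r2.contents }
    set contents (x) {
        this.r1.contents = x.slice(0, this.r1.size)
        this.r2.contents = x.slice(this.r1.size, this.size)
    }
    get () { return parseInt(this.contents, 2) }
    set (n) { this.contents = (n % (2 ** this.size)).toString(2).padStart(this.size, '0') }
}

// Hypothetical layout for this example: two 8-bit halves form a 16-bit ax,
// and ax becomes the low half of a 32-bit eax.
const ah = new SizedInteger(8)
const al = new SizedInteger(8)
const ax = new SegmentedInteger(ah, al)
const eax = new SegmentedInteger(new SizedInteger(16), ax)

// Writing to the smallest pieces is visible at every level above them.
al.set(3)
ah.set(2)
console.log(ax.get())     // 515
console.log(eax.get())    // 515

// Writing to the outermost register trickles down to the pieces.
eax.set(0x10001)          // 65537: bit 16 and bit 0 set
console.log(ax.get())     // 1
console.log(al.contents)  // 00000001
```

Because the contents setter on the outer register just assigns to r2.contents, it doesn't need to know whether r2 is a SizedInteger or another SegmentedInteger.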

Testing

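Here's some test code. For simplicity, ah and al are each 16 bits wide here so that together they fill eax; on real hardware they are 8 bits each and make up ax.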

> ah = new SizedInteger(16)
> al = new SizedInteger(16)
> eax = new SegmentedInteger(ah, al)
> rax = new SegmentedInteger(new SizedInteger(32), eax)
> eax.set(9)
> eax.contents
'00000000000000000000000000001001'
> eax.get()
9
> al.contents
'0000000000001001'
> ah.contents
'0000000000000000'
> al.set(0)
> eax.get()
0