Parsing Lessons
Continuing on from the previous post, I spent some time trying to figure out how best to parse the variable-length instructions, and figured I'd document the failures here.
Base assumptions
These tokens will be used for the examples below:
define token opcode8(8)
op8 = (0,7)
op8_idx_bits = (2,3)
;
define token data8 (8)
imm8=(0,7)
;
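To make the bit ranges concrete, here is a small Python sketch of the field extraction. It is not SLEIGH; the helper name `field` is made up for illustration, and it assumes bit 0 is the least significant bit of the token. Under that assumption, both opcodes used in the examples below have op8_idx_bits equal to 0.

```python
def field(byte: int, lo: int, hi: int) -> int:
    """Extract bits lo..hi (inclusive) from a byte, assuming bit 0 is the LSB."""
    width = hi - lo + 1
    return (byte >> lo) & ((1 << width) - 1)


# op8 = (0,7) is the whole byte; op8_idx_bits = (2,3) is two bits inside it.
for opcode in (0x73, 0x63):
    print(hex(opcode), "op8 =", hex(field(opcode, 0, 7)),
          "op8_idx_bits =", field(opcode, 2, 3))
```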
Too much data consumed
For this example, consider the tables and constructor (i.e. instruction) below. I know these could really be done in a single line, but if you're trying to make things modular and reusable, you'll see what I'm driving at:
imm_b: "#"imm8 is imm8 { export *[const]:1 imm8; }
OP_b: is op8_idx_bits=0x00 ; imm_b { export imm_b; }
OP_b: is ...
:ADCA OP_b is (op8=0x73 | op8=0x63) ; OP_b { }
So I'd like OP_b to define what it exports based on the op8_idx_bits field from the opcode. Unfortunately, as it's defined above, the instruction ends up consuming 3 bytes, like you see on the ram 000002 line here.
It first matches the opcode8 token because of the op8 field, and consumes those 8 bits of the data. It then attempts to match the op8_idx_bits field against the next 8 bits; if it finds bits 2,3 set to 0 there, it consumes that whole byte, then attempts to export the following 8 bits as the immediate data. This is because I've joined the patterns with the ; (concatenation) operator, so each one is appended after the previous one. But just applying & here instead causes mismatches and fails to compile.
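The over-consumption can be sketched with a toy Python model. This is only an illustration of the behaviour described above, not how Ghidra actually implements pattern matching; the function names and the sample bytes are made up.

```python
def field(byte: int, lo: int, hi: int) -> int:
    """Extract bits lo..hi (inclusive) from a byte, assuming bit 0 is the LSB."""
    return (byte >> lo) & ((1 << (hi - lo + 1)) - 1)


def match_concatenated(data: bytes):
    """Toy model of `op8 ; (op8_idx_bits=0 ; imm8)`: every `;` moves on to a new byte."""
    offset = 0
    if field(data[offset], 0, 7) not in (0x73, 0x63):  # opcode8 token: byte 0
        return None
    offset += 1
    if field(data[offset], 2, 3) != 0:  # op8_idx_bits is tested on byte 1, not on the opcode
        return None
    offset += 1
    imm8 = data[offset]  # data8 token: byte 2
    offset += 1
    return {"length": offset, "imm8": imm8}


# The intended immediate is 0x42, but the model swallows it as a second "opcode" byte.
print(match_concatenated(bytes([0x73, 0x42, 0x99])))
```

With these sample bytes the model reports a 3-byte instruction whose immediate is 0x99 rather than the intended 0x42, which is the same symptom as the extra byte in the listing.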
How to fix?
I fixed this by using the & operator and a context register. I added the definition below for a context register.
define context contextreg
idx_reg=(0,2) # hold onto some data extracted
;
I was then able to adjust the definition of the instruction to say I wanted access to the op8_idx_bits field, and then set that in the idx_reg field of the context register. The table entry then defines what it needs the context register's fields to match, and can then decompile correctly. This can be expanded out to handle the variations on the arguments.
OP_b: is idx_reg=0 & imm_b { export imm_b; }
OP_b: is idx_reg=...
:ADCA OP_b is op8_idx_bits & (op8=0x73 | op8=0x63) ; OP_b [idx_reg=op8_idx_bits]{ }
The small test parses correctly after doing this.
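As a rough Python model of why this works (again a toy sketch, not Ghidra's implementation, with made-up names and sample bytes): the index bits are read from the opcode byte itself and stashed in a context value, so the operand table only has to test that value and consumes nothing extra.

```python
def field(byte: int, lo: int, hi: int) -> int:
    """Extract bits lo..hi (inclusive) from a byte, assuming bit 0 is the LSB."""
    return (byte >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_with_context(data: bytes):
    """Toy model of `op8_idx_bits & (op8=...) ; OP_b [idx_reg=op8_idx_bits]`."""
    context = {"idx_reg": 0}
    opcode = data[0]
    if field(opcode, 0, 7) not in (0x73, 0x63):
        return None
    # `&` keeps both constraints on the same byte; the bracket copies the field into context.
    context["idx_reg"] = field(opcode, 2, 3)
    offset = 1
    if context["idx_reg"] == 0:  # OP_b: idx_reg=0 & imm_b
        imm8 = data[offset]  # the only extra byte consumed
        offset += 1
        return {"length": offset, "operand": f"#0x{imm8:02x}"}
    return None  # the other idx_reg variants are elided in the post


print(decode_with_context(bytes([0x73, 0x42])))
```

Here the same opcode followed by 0x42 decodes as a 2-byte instruction with operand #0x42, matching what was intended.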
I know, I know, this is actually a really contrived example, but you get the idea ;) .