Nameless Interpreter for the Apple II+

120 Byte Interpreter that shall remain nameless

Inspired by Peter Ferrie and Michael Pohoreski.

Usenet discussion: comp.emulators.apple2 narkive.com

Here is the 6502 Assembly [1] source code:

;By: mmphosis
;License: BSD "Sharing is Caring!"
	ORG  $BF65
HGR	EQU  $F3E2
HPOSN	EQU  $F411
CLRHGR2	EQU  $F3DE
COUT	EQU  $DB5C   ; OUTDO
RDKEY	EQU  $D553   ; INCHR
CODE	EQU  $1A
DATA	EQU  $26
BALANCE	EQU  $E1
HGRPAGE EQU  $E6
VALUE   EQU  $EB
	ORG  $300
	JSR  HGR     ; the screen that shall remain blank&blink for debugging
	JSR  HPOSN   ; set up $27 $26 (data = $2000) $E2 $E1 (balance = $0000)
	JSR  CLRHGR2 ; set up $1B $1A (code = $6000) using other screen clear
	BEQ  START   ; branch always Y=00
LOOP	JSR  DIRECT
OVER	JSR  WEIGH
	LDA  BALANCE
	ORA  BALANCE+1
	BNE  LOOP
;	RTS          ; fallthru with A=00 will RTS
INSTRUCT
	LDX #DATA
	CMP #'>'     ; $3E increment DATA pointer
	BEQ  INC16
	CMP #'<'     ; $3C decrement DATA pointer
	BEQ  DEC16
	SBC #'+'     ; opcodes +,-. $2B $2C $2D $2E
	TAX          ;  become ?012 $FF $00 $01 $02
	STX  VALUE
	BEQ  INKEY   ; read key , INKEY $00 Carry is set for SUB 0
READ_	LDA (DATA),Y ; this SEGFAULT knows no bounds! careful
	BNE  NONZERO
;ZERO	CLV          ; increment direction, oVerflow is clear already
	CPX #'['-'+' ; $5B = 30+2B           because SBC #'+'
	BEQ  OVER
	BNE  RESUME
NONZERO
;       SEV          ; decrement direction
;	BIT  DONE    ;  bit 6 of #$60 is set
;	BIT  $0C     ; hopefully #$E1
	BIT  HGRPAGE ;definitely #$40 as it was set in call to CLRHGR2
	CPX #']'-'+' ; $5D = 32+2B
	BEQ  OVER
RESUME	DEX
	DEX
	BEQ ECHO
	CPX #$FD     ; Carry is set
	BCS  SUB     ; either ADD +1 (aka subtract -1) or SUB +1
	RTS
INKEY
	JSR  RDKEY
;	AND #$7F     ; INCHR does this
SUB	SBC  VALUE   ; if fallthru from INKEY then VALUE=0
WRITE_	STA (DATA),Y ; there is no SEGFAULT just spinning drive motors
	RTS
;ECHO_	ORA #$80     ; OUTDO does this
;	CMP #$8A     ; "Laugh
;	BNE  ECHO    ;  in the face
;       LDA #$8D     ;  of danger"
ECHO	JMP  COUT
WEIGH
	LDA (CODE),Y
	LDX #BALANCE
	CMP #'['     ; heavier
	BEQ  INC16
	CMP #']'     ; lighter
	BNE  DONE
DEC16	LDA  $00,X
	BNE  SKIP
	DEC  $01,X
SKIP	DEC  $00,X
	RTS
DIRECT
	LDX #CODE
	BVS  DEC16   ; "This is the way"
INC16	INC  $00,X
	BNE  DONE
	INC  $01,X
	RTS
OPCODE
	JSR  INSTRUCT
	CLV          ; increment direction
	JSR  DIRECT  ; next instruction
START  ;LDY  #$00    ; because COUT trashes Y?
	LDA (CODE),Y ; fetch instruction
	BNE  OPCODE
DONE	RTS

The DATA is stored starting at $2000. Normally this buffer is 30000 bytes, but only the first 16384 bytes are cleared and then you overrun into your program. The program CODE is stored starting at $6000. There is absolutely no bounds checking!
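The layout arithmetic can be checked with a few lines of Python. This is only a sketch; the constant names are mine, and the values are the addresses and sizes quoted in the paragraph above.

```python
DATA_START = 0x2000   # DATA pointer after HPOSN (from the article)
CODE_START = 0x6000   # CODE pointer after the $F3DE call (from the article)
TAPE_CELLS = 30000    # the "normal" buffer size mentioned in the article

cleared = CODE_START - DATA_START    # bytes zeroed by the two hi-res page clears
tape_end = DATA_START + TAPE_CELLS   # first address past a full-size buffer
overlap = tape_end - CODE_START      # cells that would land on the program text

print(cleared, hex(tape_end), overlap)
```

So a full-size buffer would run to $9530, well past the start of CODE at $6000.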

ROM calls [2] are made to save a lot of bytes. Calls to the Applesoft ROM are used, so if you are running this on the original Apple II you may need to switch to Applesoft using the Language Card.

300:
:20 E2 F3 20 11 F4 20 DE F3 F0 68 20 61 3 20 4C 3 A5 E1 5 E2 D0 F4 A2
:26 C9 3E F0 48 C9 3C F0 37 E9 2B AA 86 EB F0 19 B1 26 D0 6 E0 30 F0
:DE D0 6 24 E6 E0 32 F0 D6 CA CA F0 D E0 FD B0 4 60 20 53 D5 E5 EB 91
:26 60 4C 5C DB B1 1A A2 E1 C9 5B F0 11 C9 5D D0 1F B5 0 D0 2 D6 1 D6
:0 60 A2 1A 70 F3 F6 0 D0 E F6 1 60 20 17 3 B8 20 61 3 B1 1A D0 F5 60
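As a sanity check on the size, the monitor listing above can be parsed and counted. This is a minimal Python sketch, not part of the interpreter; `parse_monitor_dump` is a hypothetical helper name, and it assumes the listing format shown here (an `ADDR:` line followed by `:`-prefixed lines of hex bytes).

```python
DUMP = """\
300:
:20 E2 F3 20 11 F4 20 DE F3 F0 68 20 61 3 20 4C 3 A5 E1 5 E2 D0 F4 A2
:26 C9 3E F0 48 C9 3C F0 37 E9 2B AA 86 EB F0 19 B1 26 D0 6 E0 30 F0
:DE D0 6 24 E6 E0 32 F0 D6 CA CA F0 D E0 FD B0 4 60 20 53 D5 E5 EB 91
:26 60 4C 5C DB B1 1A A2 E1 C9 5B F0 11 C9 5D D0 1F B5 0 D0 2 D6 1 D6
:0 60 A2 1A 70 F3 F6 0 D0 E F6 1 60 20 17 3 B8 20 61 3 B1 1A D0 F5 60
"""

def parse_monitor_dump(text):
    """Return (start_address, data) for 'ADDR:' / ':HH HH ...' style lines."""
    start = None
    data = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        addr, _, rest = line.partition(":")
        if addr:                       # an 'ADDR:' prefix sets the load address
            start = int(addr, 16)
        data.extend(int(tok, 16) for tok in rest.split())
    return start, bytes(data)

start, code = parse_monitor_dump(DUMP)
print(hex(start), len(code))
```

The listing loads at $300 and comes to 120 bytes, ending with the RTS ($60) at DONE.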

To run the Interpreter from the Monitor, type 300G, but before you do that you will need to have a program to run.

Here is an Applesoft BASIC program to enter programs into memory.

 0  FOR A = 24576 TO 38399
 1      GET A$
 2      POKE A, 0
 3      C = ASC(A$)
 4      IF C < 32 AND P < 32 THEN END
 5      PRINT A$;
 6      LET P = C
 7      POKE A, ASC ( A$)
 8  NEXT A

Enter two control characters in a row to indicate the end of text. For example: press the Return key twice.

Here is a Hello World program for the interpreter:

++++++++[>++++[>++>+++>+++>+<<<<-]>+>->+>>+[<]<-]>>.>>---.+++++++..+++.>.<<-.>.+++.------.--------.>+.>++.++++++.

To run the Interpreter from BASIC, type CALL 768

Screenshot: the AppleWin Apple II emulator running the Hello World program [3]

START an instruction loop

Before we get to START the interpreter some initialization needs to be done. The first thing the interpreter will do is clear the HGR screen which is a big hack to clear 8192 bytes of the program’s DATA buffer, but it saves a lot of bytes in avoiding initial housekeeping. The screen will remain showing “Hi-resolution” GRaphics (HGR) and four lines of TEXT at the bottom of the screen. This is helpful for debugging as you can see bits changing on the graphics portion of the screen and see the text output at the bottom of the screen.

The second initializing call is to HPOSN which is used to set up the DATA pointer in the zero page at $26 and $27 and the BALANCE value in the zero page at $E1 and $E2. This is not what HPOSN was meant for but it is a way to save more bytes! Next, HGR2 is cleared without switching the screen to page 2 by making a call to $F3DE. This sets up the program CODE pointer in the zero page at $1A and $1B and also clears 8192 more bytes for DATA. The call to $F3DE leaves the Zero flag set so that a branch into the START of the interpreter assembly code can happen.

The Y register is already zero which saves 2 bytes! The Y Register needs to have the value zero throughout the program as Zero Page Indirect Y-Indexed addressing mode is used in four places: LDA (CODE),Y appears twice (at START and in WEIGH), alongside LDA (DATA),Y and STA (DATA),Y.

  1. LDA (CODE),Y fetches the interpreter op code which may be one of these commands:
    op   hex   dec   what the command does
    >    $3E   62    increment the DATA pointer
    <    $3C   60    decrement the DATA pointer
    +    $2B   43    increment the byte value at the DATA pointer
    -    $2D   45    decrement the byte value at the DATA pointer
    .    $2E   46    echo the character loaded from the byte at the DATA pointer
    ,    $2C   44    read a character and store it to the byte at the DATA pointer
    [    $5B   91    if the DATA byte value is zero then
                     jump to the next command after the matching ]
    ]    $5D   93    if the DATA byte value is not zero then
                     jump back to the command after the matching [
    ^@   $00   0     quit the interpreter
  2. LDA (DATA),Y gets the byte value at the DATA pointer which is used by the + - . [ and ] commands.

  3. STA (DATA),Y puts the byte value at the DATA pointer which is used by the + - and , commands.

After fetching the instruction byte if the op code is zero the interpreter quits. If the op code is non-zero, a branch is made into the OPCODE instruction loop BNE OPCODE. The first call at the start of OPCODE is to JSR INSTRUCT and when it returns it gets the next op code by incrementing the CODE pointer CLV JSR DIRECT. I'll describe the INSTRUCT and DIRECT calls in more detail.

INSTRUCT

This is the main part of the interpreter where each op code actually does what the command is supposed to do. LDX #DATA primes the X register so that either the INC16 or the DEC16 call will operate on the DATA pointer. This first part of the interpreter is pretty straightforward for the two commands. CMP #'>' BEQ INC16 means if it's the > command increment the 16 bit DATA pointer. CMP #'<' BEQ DEC16 means if it's the < command decrement the 16 bit DATA pointer.

Subtraction, the Carry (aka “not” borrow) flag and the oVerflow flag

Now things get a bit hairy. I noticed a pattern with the commands. It can be best described with a one-liner Applesoft program.

FORI=2TO5:FORJ=0TO15:?" "CHR$(I*16+J);:NEXTJ:?:NEXTI

The output from this helped me find a pattern that I used to reduce the size of the program.

   ! " # $ % & ' ( ) * + , - . /
 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
 @ A B C D E F G H I J K L M N O
 P Q R S T U V W X Y Z [ \ ] ^ _

The ordered sequence of the + , - and . commands can be put to use in greatly reducing the overall code size.

SBC #'+' makes the character code into a value that really fits well with what the instruction is going to do! Note that the Carry flag does not need to be set or cleared before the SuBtract with Carry (SBC) assembly code, which saves one byte! I think the aligned comment describes what is going on here:

	SBC #'+'     ; opcodes +,-. $2B $2C $2D $2E
	TAX          ;  become ?012 $FF $00 $01 $02

The prior CMP #'<' Cleared the Carry flag for any command value less than < ($3C). This means that for these lower values SBC will borrow and subtract by one more, so the + command becomes $FF which is $2B - $2B - 1. The , command becomes $00, the - command becomes $01 and the . command becomes $02.
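The decode step can be modelled in a few lines of Python. This is a sketch of the arithmetic only, assuming binary (non-decimal) mode; the function names `sbc` and `decode` are mine.

```python
def sbc(a, operand, carry):
    """8-bit 6502 SBC in binary mode: returns (result, carry_out)."""
    total = a - operand - (1 - carry)   # a clear Carry subtracts one more
    return total & 0xFF, 1 if total >= 0 else 0

def decode(ch):
    """Model CMP #'<' followed by SBC #'+' for one command character."""
    a = ord(ch)
    carry = 1 if a >= ord('<') else 0   # CMP sets Carry when A >= operand
    value, _ = sbc(a, ord('+'), carry)
    return value

for ch in "+,-.[]":
    print(ch, f"${decode(ch):02X}")
```

The four low commands come out as $FF $00 $01 $02, while [ and ] (which are above < and so keep the Carry set) come out as $30 and $32, the values the later CPX instructions compare against.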

The VALUE is Transferred from the Accumulator to the X Register TAX, and then stored in the zero page STX VALUE for possible later use. No comparison is needed where BEQ INKEY means if the X Register is zero then it's the , command so branch to INKEY.

LDA (DATA),Y reads the byte value from the DATA pointer which will be needed by subsequent code. BNE NONZERO means branch to NONZERO if the byte value is non-zero and fall through if the byte value is zero. If it's zero, the oVerflow flag is clear already for the [ command, and if it's non-zero we set the oVerflow flag. The 6502 does not have a SEV instruction, so we set the oVerflow flag with BIT HGRPAGE [4]: BIT copies bit 6 of its operand into the oVerflow flag, and HGRPAGE was set to $40 in the call to $F3DE. (An earlier version used BIT $0C, hoping to find $E1 there.) The oVerflow flag will be used to determine which way to read instructions for the [ and ] commands. CPX #'['-'+' or CPX #']'-'+' checks whether it's the appropriate command for the appropriate direction, and if so BEQ OVER branches to OVER. The comparison values account for the Carry flag having been set when we subtracted '+' to yield the VALUE that is now in the X Register.
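Here is a small Python model of what BIT does to the flags, to show why any operand with bit 6 set works as a stand-in for SEV. It is a sketch of the flag logic only; the function name `bit` is mine, and the three operand values are the ones mentioned in the source comments ($40 in HGRPAGE, $60 for the RTS at DONE, $E1 hoped for at $0C).

```python
def bit(a, m):
    """Flag effects of 6502 BIT (zero page or absolute); A itself is unchanged."""
    return {
        "N": (m >> 7) & 1,         # copy of bit 7 of memory
        "V": (m >> 6) & 1,         # copy of bit 6 of memory
        "Z": int((a & m) == 0),    # set when A AND memory is zero
    }

for m in (0x40, 0x60, 0xE1):
    print(f"${m:02X}", bit(0x01, m))
```

All three operands set V. Since BIT leaves the Accumulator alone, the DATA byte that was just loaded survives for the + and - commands.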

I'll go over what happens at OVER after I resume describing what happens at RESUME. I hope the names I used make sense. Good names for things are so important.

Screenshot: the KEGS Apple II emulator running a fibint test [3]

At RESUME, DEX DEX subtracts 2 from the X Register rather than doing a bunch of comparisons. BEQ ECHO branches to ECHO for the . command.

COUT EQU $DB5C is actually calling OUTDO which sets the high bit of the character to be printed. This saves 2 bytes.

The CPX #$FD checks that the X Register is $FD or $FF, and the Carry flag will be set as well. X won't be $FE because we already checked for the , command earlier. A single BCS SUB then branches to SUB for both the + and - commands. Otherwise RTS, because anything else is ignored.

+ - and , commands all do SBC VALUE. The VALUE is different for each command:

+ will subtract -1 which means add +1.
- will subtract 1.
, will JSR RDKEY before subtracting 0.

RDKEY EQU $D553 is actually calling INCHR which clears the high bit of the character that was gotten. This saves another 2 bytes.
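The shared SBC VALUE step can be sketched in Python as plain 8-bit arithmetic. This assumes the Carry flag is set on entry to SUB, as described above; the names `VALUE` (as a table) and `sub` are mine.

```python
# VALUE as stored by STX VALUE after the SBC #'+' decode.
VALUE = {'+': 0xFF, '-': 0x01, ',': 0x00}

def sub(cmd, a):
    """Model SBC VALUE with Carry set: A - VALUE, wrapped to 8 bits."""
    return (a - VALUE[cmd]) & 0xFF

# For + and -, a is the DATA byte; for , it is the key that was read.
print(sub('+', 0x41), sub('-', 0x41), sub(',', 0x41))
```

Subtracting $FF wraps around to adding 1, so one instruction covers all three commands.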

OVER

At the first entry point at OVER, JSR WEIGH returns with a new 16-bit value for BALANCE, which is then checked. If it's zero, execution falls through into INSTRUCT with A=00, which matches no command and so ends at an RTS. Otherwise, it branches to LOOP which moves the CODE pointer in the appropriate direction with JSR DIRECT and then weighs the next character.

WEIGH tips the scale heavier for each [ command and lighter for each ] command.

DIRECT increments or decrements the CODE pointer depending on the oVerflow flag.
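To show how the balance scan fits into the instruction loop, here is a small Python model of the interpreter. It is a sketch of the control flow, not a translation of the assembly: the names `run` and `seek` are mine, it assumes balanced brackets, and returning 0 when the input runs out is my assumption rather than what RDKEY does.

```python
def run(program, keys="", tape_size=30000):
    """Run a program and return the echoed text. Assumes balanced brackets."""
    code = program.encode("ascii") + b"\x00"   # a zero byte quits the interpreter
    data = bytearray(tape_size)
    keys = iter(keys.encode("ascii"))
    out = bytearray()
    pc = dp = 0

    def seek(pc, step):
        # WEIGH the current character, stop when the balance is back to zero,
        # otherwise move one step in the current direction and weigh again.
        balance = 0
        while True:
            if code[pc] == ord('['):
                balance += 1           # heavier
            elif code[pc] == ord(']'):
                balance -= 1           # lighter
            if balance == 0:
                return pc
            pc += step

    while code[pc] != 0:
        ch = chr(code[pc])
        if ch == '>':
            dp += 1
        elif ch == '<':
            dp -= 1
        elif ch == '+':
            data[dp] = (data[dp] + 1) & 0xFF
        elif ch == '-':
            data[dp] = (data[dp] - 1) & 0xFF
        elif ch == '.':
            out.append(data[dp])
        elif ch == ',':
            data[dp] = next(keys, 0)   # assumption: 0 when input runs out
        elif ch == '[' and data[dp] == 0:
            pc = seek(pc, +1)          # forward to the matching ]
        elif ch == ']' and data[dp] != 0:
            pc = seek(pc, -1)          # backward to the matching [
        pc += 1                        # next instruction
    return out.decode("latin-1")

print(run("++++++++[>++++++++<-]>+."))   # prints A
```

The same `seek` serves both directions because only the step changes, which mirrors how the oVerflow flag selects INC16 or DEC16 in DIRECT.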

Given the offset value in the X Register, INC16 increments and DEC16 decrements the 16-bit zero page value. I actually hadn't used Zero Page,X addressing mode before but it seems to have shortened up the code because these operations are called a lot for different values in the zero page.
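The two 16-bit helpers can be modelled in Python with a bytearray standing in for the zero page. This is a sketch that follows the order of operations in the listing; the helper `word` is mine and is only there to read the result back.

```python
def inc16(zp, x):
    """INC $00,X / BNE DONE / INC $01,X"""
    zp[x] = (zp[x] + 1) & 0xFF
    if zp[x] == 0:                        # low byte wrapped, carry into high byte
        zp[x + 1] = (zp[x + 1] + 1) & 0xFF

def dec16(zp, x):
    """LDA $00,X / BNE SKIP / DEC $01,X / SKIP DEC $00,X"""
    if zp[x] == 0:                        # low byte about to wrap, borrow first
        zp[x + 1] = (zp[x + 1] - 1) & 0xFF
    zp[x] = (zp[x] - 1) & 0xFF

def word(zp, x):
    """Read the little-endian 16-bit value at zero page offset x."""
    return zp[x] | (zp[x + 1] << 8)

DATA, BALANCE = 0x26, 0xE1                # zero page offsets from the listing
zp = bytearray(256)
zp[DATA], zp[DATA + 1] = 0xFF, 0x20       # DATA = $20FF
inc16(zp, DATA)                           # carries into the high byte
dec16(zp, BALANCE)                        # $0000 borrows down to $FFFF
print(hex(word(zp, DATA)), hex(word(zp, BALANCE)))
```

One pair of routines serves DATA, CODE and BALANCE just by changing the offset passed in X.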

My article is over — get me out of this Turing tar-pit. The idea here was to document some of my process in coding this. I tried to write down why I coded things the way I did. I hope you found some insight and maybe this program can still be made shorter!

footnotes

  1. Assembled with Merlin32 Cross Assembler
  2. Documented calls with faddensoft.com SourceGen Disassembly Projects
  3. Tested with AppleWin and KEGS emulators
  4. Discussion on reddit about using a “safe” value