Optimizing 6809 Assembly Code: Part 4 – Odds and Sods – More Tricks


I’ve run out of notes that I had for speeding up 6809 assembly. I’ll update this page with any more cool ideas that anyone shares with me or puts in the comments.

Cheers.


Dave Philipsen wrote with this tip:

I don’t know that it necessarily optimizes for speed, but it saves space. When printing a string, or calling any subroutine that takes a string, instead of pointing to the string and then calling the subroutine like this:

ldx   #strptr     * point to the string
jsr   prtstr      * print the string
lda   #??         * continue with the rest of the program

You can do this:

jsr   prtstr        * call the print string subroutine, which takes the
                    * address of the string from the return address
                    * that jsr just pushed onto the stack
fcs   /text string/ * the string itself, stored inline after the call
lda   #??           * continue with the rest of the program

The prtstr routine can change the return address saved on the stack so that when the routine returns, it returns to the point just past the end of the string. This optimizes for size by eliminating the need to load the pointer each time you print. It also reduces complexity because you don’t need to assign a label to the string.
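
A minimal sketch of what such a prtstr routine could look like (not Dave’s actual code). It assumes the string was assembled with fcs, so the last character has bit 7 set, and that putchr is a hypothetical character-output routine that preserves X. The routine uses A and X without saving them.

prtstr  puls  x           * X = return address, which is the start of the string
ploop   lda   ,x+         * get the next character and advance
        bmi   plast       * bit 7 set marks the last character
        jsr   putchr      * hypothetical routine: print the character in A
        bra   ploop
plast   anda  #$7f        * strip the end-of-string marker
        jsr   putchr
        jmp   ,x          * X now points just past the string, so return there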

Another example of this might be calling a subroutine which positions the cursor on the screen.

Instead of:

ldd   #$0101      * A=1 (x coord), B=1 (y coord)
jsr   curXY       * position the cursor
lda   #$??        * continue with program

do this:

jsr   curXY       * position the cursor, the coordinates are read from
                  * the inline data at the return address
fdb   $0101       * first byte = 1 (x coord), second byte = 1 (y coord)
lda   #$??        * continue with the program
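
A sketch of how curXY might fetch its inline arguments (an illustration, not Dave’s code). The part that actually moves the cursor is left out because it depends on your screen code. The routine uses X without saving it.

curXY   ldx   ,s          * X = return address, which points at the fdb data
        ldd   ,x++        * A = x coord, B = y coord, X = address past the data
        stx   ,s          * replace the return address so rts skips the data
*       ... position the cursor using A and B (not shown)
        rts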

Thanks Dave.


Art Flexser wrote with this tip: use STA instead of CLR (if you can).

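With some CoCo hardware registers, such as the SAM register at $FFDE used below, the act of writing is what matters and not the value written. For those registers STA has the same effect as CLR and is two cycles faster (5 cycles instead of 7 with extended addressing). Erik Gavriluk pointed out that using CLR does affect the Condition Codes and that should be taken into account. Also, “STA sets flags, too. It’s weird, but CLR really does write AND read from the memory address in question. There are CoCo hardware registers where this can cause a problem.”

CLR $FFDE         * Slow way: 7 cycles, reads the address before writing
STA $FFDE         * Faster way: 5 cycles, write only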

Thanks Art.


Maybe save a byte of space with this trick. While looking through some BASIC Unravelled source code I found a neat little trick that saves a byte of code. You can optimize this bit of code:

LA974     CLRA
          BRA  Skip
LA977     LDA  #8
Skip      STA ,-S
...

To this version:

LA974     CLRA
          FCB $8C
LA976     LDA  #8
          STA ,-S
...

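The FCB $8C is the opcode of a CMPX #xxxx instruction, so when execution falls through from the CLRA the CPU treats the two bytes of the LDA #8 as the immediate operand of the CMPX and then carries on at the STA. Code that enters at LA976 still executes the LDA #8 normally. The CMPX changes the condition codes, so only use this where the flags don’t matter at that point. This doesn’t make the program faster, but it does save a byte of space (if you really need it).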
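
To see why this works, here are the six bytes of the shorter version decoded from each entry point. The opcode bytes are the standard 6809 encodings. The listing is only an illustration of how the CPU reads them, not something to assemble.

* Bytes in memory starting at LA974:  4F 8C 86 08 A7 E2
*
* Entering at LA974:
*   4F          CLRA
*   8C 86 08    CMPX #$8608    swallows the LDA #8, only changes CC
*   A7 E2       STA  ,-S       stores 0
*
* Entering at LA976:
*   86 08       LDA  #8
*   A7 E2       STA  ,-S       stores 8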


Speeding up IRQ/FIRQs

I remembered one other thing you can do to speed up your IRQ/FIRQ jumps:

Normally, to jump to an IRQ or FIRQ routine you store $7E (the extended JMP opcode) as the first byte of the interrupt jump pointer, then store the routine’s address in the next two bytes. If your interrupt routine is in Direct Page memory you can instead store $0E (the direct page JMP opcode) followed by the low byte of the routine’s address. The direct page JMP takes 3 cycles instead of 4, but DP must hold the right value whenever the interrupt fires.

Set up an FIRQ that is in DP space.

Example:

The DP is set to $3E and the FIRQ routine starts in RAM at address $3ECA.

 LDA    #$0E  * Write the direct page JMP instruction
 LDB    #$CA  * Will jump to address DP + $CA = $3ECA in our example
 STD    $010F * CoCo 1 & 2 FIRQ Jump pointer
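
For comparison, a sketch of the conventional setup for the same routine, using the three-byte extended JMP. Masking interrupts while the jump pointer is being changed is my own precaution here, not part of the original notes.

 ORCC   #$50   * Mask IRQ and FIRQ while the jump pointer is patched
 LDA    #$7E   * Extended JMP opcode
 STA    $010F  * CoCo 1 & 2 FIRQ Jump pointer
 LDX    #$3ECA * Address of the FIRQ routine
 STX    $0110  * Operand bytes of the JMP
 ANDCC  #$AF   * Unmask IRQ and FIRQ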

Simon Jonassen told me about a trick you can use with the CoCo 1 & 2 FIRQ jump pointer: put your entire FIRQ routine at $010F, the address of the jump pointer itself. That way you save the few extra cycles that the usual JMP $xxxx instruction would take.
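
A rough sketch of that layout (not Simon’s code). It assumes the FIRQ comes from PIA1 side B and is acknowledged by reading $FF22, that ticks is a hypothetical counter variable, and that nothing else in your program needs the memory that follows $010F.

        ORG   $010F     * CoCo 1 & 2 FIRQ Jump pointer
FIRQ    PSHS  A         * FIRQ stacks only CC and PC, so save A ourselves
        LDA   $FF22     * Assumption: reading PIA1 port B acknowledges the FIRQ
        INC   ticks     * Hypothetical work: count the interrupt
        PULS  A
        RTI
ticks   FCB   0         * Hypothetical counter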

Thanks Simon.


L. Curtis Boyle posted this tip in the comments section:

To save memory, when doing cmpx #0 or cmpy #0, use leax ,x or leay ,y (note: leau and leas do NOT set zero flag, so this trick can’t be used with those registers). This saves 1 byte for cmpx, and 2 bytes for cmpy.
In y’s case, it’s the same speed, too.

Thanks Curtis.
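
Side by side, with the byte counts from Curtis’s tip. One thing to keep in mind (my note, not part of the tip): LEAX and LEAY only update the zero flag, so the replacement is safe when the next instruction is BEQ or BNE, not a branch that tests N, V or C.

    CMPX   #0      * 3 bytes
    LEAX   ,X      * 2 bytes, X unchanged, Z set if X is zero

    CMPY   #0      * 4 bytes
    LEAY   ,Y      * 2 bytes, Y unchanged, Z set if Y is zero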

On that same note, since LEAX and LEAY do set the zero flag, you can also test for zero right after an LEA instruction that is already part of your program, with no compare at all. For example:

    LDX    #$2000
XisNotZero:
    LEAX   -32,X
    BNE    XisNotZero

 

 
