This file is indexed.

/usr/share/doc/cc65-doc/html/customizing.html is in cc65-doc 2.16-2.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
 <LINK REL="stylesheet" TYPE="text/css" HREF="doc.css">
 <META NAME="GENERATOR" CONTENT="LinuxDoc-Tools 0.9.72">
 <TITLE>Defining a Custom cc65 Target</TITLE>
</HEAD>
<BODY>
<H1>Defining a Custom cc65 Target</H1>

<H2>Bruce Reidenbach</H2>2015-03-13
<HR>
<EM>This section provides step-by-step instructions on how to use the cc65
toolset for a custom hardware platform (a target system not currently
supported by the cc65 library set).</EM>
<HR>
<P>
<H2><A NAME="toc1">1.</A> <A HREF="customizing.html#s1">Overview</A></H2>

<P>
<H2><A NAME="toc2">2.</A> <A HREF="customizing.html#s2">System Memory Map Definition</A></H2>

<P>
<H2><A NAME="toc3">3.</A> <A HREF="customizing.html#s3">Startup Code Definition</A></H2>

<P>
<H2><A NAME="toc4">4.</A> <A HREF="customizing.html#s4">Custom Run-Time Library Creation</A></H2>

<P>
<H2><A NAME="toc5">5.</A> <A HREF="customizing.html#s5">Interrupt Service Routine Definition</A></H2>

<P>
<H2><A NAME="toc6">6.</A> <A HREF="customizing.html#s6">Adding Custom Instructions</A></H2>

<P>
<H2><A NAME="toc7">7.</A> <A HREF="customizing.html#s7">Hardware Drivers</A></H2>

<P>
<H2><A NAME="toc8">8.</A> <A HREF="customizing.html#s8">Hello World! Example</A></H2>

<P>
<H2><A NAME="toc9">9.</A> <A HREF="customizing.html#s9">Putting It All Together</A></H2>


<HR>
<H2><A NAME="s1">1.</A> <A HREF="#toc1">Overview</A></H2>


<P>The cc65 toolset provides a set of pre-defined libraries that allow the
user to target the executable image to a variety of hardware platforms.
In addition, the user can create a customized environment so that the
executable can be targeted to a custom platform.  The following
instructions provide step-by-step instructions on how to customize the
toolset for a target that is not supported by the standard cc65
installation.</P>
<P>The platform used in this example is a Xilinx Field Programmable Gate
Array (FPGA) with an embedded 65C02 core.  The processor core supports
the additional opcodes/addressing modes of the 65SC02, along with the
STP and WAI instructions.  These instructions will create a set of files
to create a custom target, named SBC, for <B>S</B>ingle <B>B</B>oard
<B>C</B>omputer.</P>

<H2><A NAME="s2">2.</A> <A HREF="#toc2">System Memory Map Definition</A></H2>


<P>The targeted system uses block RAM contained on the XILINX FPGA for the
system memory (both RAM and ROM).  The block RAMs are available in
various aspect ratios, and they will be used in this system as 2K*8
devices.  There will be two RAMs used for data storage, starting at
location $0000 and growing upwards.  There will be one ROM (realized as
initialized RAM) used code storage, starting at location $FFFF and
growing downwards.</P>
<P>The cc65 toolset requires a memory configuration file to define the
memory that is available to the cc65 run-time environment, which is
defined as follows:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
MEMORY {
    ZP:        start =    $0, size =  $100, type   = rw, define = yes;
    RAM:       start =  $200, size = $0E00, define = yes;
    ROM:       start = $F800, size = $0800, file   = %O;
}
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>ZP defines the available zero page locations, which in this case starts
at $0 and has a length of $100.  Keep in mind that certain systems may
require access to some zero page locations, so the starting address may
need to be adjusted accordingly to prevent cc65 from attempting to reuse
those locations.  Also, at a minimum, the cc65 run-time environment uses
26 zero page locations, so the smallest zero page size that can be
specified is $1A.  The usable RAM memory area begins after the 6502
stack storage in page 1, so it is defined as starting at location $200
and filling the remaining 4K of space (4096 - 2 *
256&nbsp;=&nbsp;3584&nbsp;=&nbsp;$0E00).  The 2K of ROM space begins at
address $F800 and goes to $FFFF (size&nbsp;=&nbsp;$0800).</P>
<P>Next, the memory segments within the memory devices need to be defined.
A standard segment definition is used, with one notable exception.  The
interrupt and reset vector locations need to be defined at locations
$FFFA through $FFFF.  A special segment named VECTORS is defined that
resides at these locations.  Later, the interrupt vector map will be
created and placed in the VECTORS segment, and the linker will put these
vectors at the proper memory locations.  The segment definition is:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
SEGMENTS {
    ZEROPAGE: load = ZP,  type = zp,  define   = yes;
    DATA:     load = ROM, type = rw,  define   = yes, run = RAM;
    BSS:      load = RAM, type = bss, define   = yes;
    HEAP:     load = RAM, type = bss, optional = yes;
    STARTUP:  load = ROM, type = ro;
    ONCE:     load = ROM, type = ro,  optional = yes;
    CODE:     load = ROM, type = ro;
    RODATA:   load = ROM, type = ro;
    VECTORS:  load = ROM, type = ro,  start    = $FFFA;
}
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>The meaning of each of these segments is as follows.</P>

<P><CODE>       ZEROPAGE:  </CODE>Data in page 0, defined by ZP as starting at $0 with length $100</P>
<P><CODE>       DATA:      </CODE>Initialized data that can be modified by the program, stored in RAM</P>
<P><CODE>       BSS:       </CODE>Uninitialized data stored in RAM (used for variable storage)</P>
<P><CODE>       HEAP:      </CODE>Uninitialized C-level heap storage in RAM, optional</P>
<P><CODE>       STARTUP:   </CODE>The program initialization code, stored in ROM</P>
<P><CODE>       ONCE:      </CODE>The code run once to initialize the system, stored in ROM</P>
<P><CODE>       CODE:      </CODE>The program code, stored in ROM</P>
<P><CODE>       RODATA:    </CODE>Initialized data that cannot be modified by the program, stored in ROM</P>
<P><CODE>       VECTORS:   </CODE>The interrupt vector table, stored in ROM at location $FFFA</P>
<P>A note about initialized data:  any variables that require an initial
value, such as external (global) variables, require that the initial
values be stored in the ROM code image.  However, variables stored in
ROM cannot change; therefore the data must be moved into variables that
are located in RAM.  Specifying <CODE>run&nbsp;=&nbsp;RAM</CODE> as part of
the DATA segment definition will indicate that those variables will
require their initialization value to be copied via a call to the
<CODE>copydata</CODE> routine in the startup assembly code.  In addition,
there are system level variables that will need to be initialized as
well, especially if the heap segment is used via a C-level call to
<CODE>malloc&nbsp;()</CODE>.</P>
<P>The final section of the definition file contains the data constructors
and destructors used for system startup.  In addition, if the heap is
used, the maximum C-level stack size needs to be defined in order for
the system to be able to reliably allocate blocks of memory.  The stack
size selection must be greater than the maximum amount of storage
required to run the program, keeping in mind that the C-level subroutine
call stack and all local variables are stored in this stack.  The
<CODE>FEATURES</CODE> section defines the required constructor/destructor
attributes and the <CODE>SYMBOLS</CODE> section defines the stack size.  The
constructors will be run via a call to <CODE>initlib</CODE> in the startup
assembly code and the destructors will be run via an assembly language
call to <CODE>donelib</CODE> during program termination.</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
FEATURES {
    CONDES:    segment = STARTUP,
               type    = constructor,
               label   = __CONSTRUCTOR_TABLE__,
               count   = __CONSTRUCTOR_COUNT__;
    CONDES:    segment = STARTUP,
               type    = destructor,
               label   = __DESTRUCTOR_TABLE__,
               count   = __DESTRUCTOR_COUNT__;
}

SYMBOLS {
    # Define the stack size for the application
    __STACKSIZE__:  value = $0200, weak = yes;
}
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>These definitions are placed in a file named &#34;sbc.cfg&#34;
and are referred to during the ld65 linker stage.</P>

<H2><A NAME="s3">3.</A> <A HREF="#toc3">Startup Code Definition</A></H2>


<P>In the cc65 toolset, a startup routine must be defined that is executed
when the CPU is reset.  This startup code is marked with the STARTUP
segment name, which was defined in the system configuration file as
being in read only memory.  The standard convention used in the
predefined libraries is that this code is resident in the crt0 module.
For this custom system, all that needs to be done is to perform a little
bit of 6502 housekeeping, set up the C-level stack pointer, initialize
the memory storage, and call the C-level routine <CODE>main&nbsp;()</CODE>.
The following code was used for the crt0 module, defined in the file
&#34;crt0.s&#34;:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
; ---------------------------------------------------------------------------
; crt0.s
; ---------------------------------------------------------------------------
;
; Startup code for cc65 (Single Board Computer version)

.export   _init, _exit
.import   _main

.export   __STARTUP__ : absolute = 1        ; Mark as startup
.import   __RAM_START__, __RAM_SIZE__       ; Linker generated

.import    copydata, zerobss, initlib, donelib

.include  &#34;zeropage.inc&#34;

; ---------------------------------------------------------------------------
; Place the startup code in a special segment

.segment  &#34;STARTUP&#34;

; ---------------------------------------------------------------------------
; A little light 6502 housekeeping

_init:    LDX     #$FF                 ; Initialize stack pointer to $01FF
          TXS
          CLD                          ; Clear decimal mode

; ---------------------------------------------------------------------------
; Set cc65 argument stack pointer

          LDA     #&lt;(__RAM_START__ + __RAM_SIZE__)
          STA     sp
          LDA     #&gt;(__RAM_START__ + __RAM_SIZE__)
          STA     sp+1

; ---------------------------------------------------------------------------
; Initialize memory storage

          JSR     zerobss              ; Clear BSS segment
          JSR     copydata             ; Initialize DATA segment
          JSR     initlib              ; Run constructors

; ---------------------------------------------------------------------------
; Call main()

          JSR     _main

; ---------------------------------------------------------------------------
; Back from main (this is also the _exit entry):  force a software break

_exit:    JSR     donelib              ; Run destructors
          BRK
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>The following discussion explains the purpose of several important
assembler level directives in this file.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.export   _init, _exit
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler that the symbols <CODE>_init</CODE> and
<CODE>_exit</CODE> are to be accessible from other modules.  In this
example, <CODE>_init</CODE> is the location that the CPU should jump to when
reset, and <CODE>_exit</CODE> is the location that will be called when the
program is finished.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.import   _main
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler to import the symbol <CODE>_main</CODE>
from another module.  cc65 names all C-level routines as
{underscore}{name} in assembler, thus the <CODE>main&nbsp;()</CODE> routine
in C is named <CODE>_main</CODE> in the assembler.  This is how the startup
code will link to the C-level code.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.export   __STARTUP__ : absolute = 1        ; Mark as startup
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line marks this code as startup code (code that is executed when
the processor is reset), which will then be automatically linked into
the executable code.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.import   __RAM_START__, __RAM_SIZE__       ; Linker generated
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line imports the RAM starting address and RAM size constants, which
are used to initialize the cc65 C-level argument stack pointer.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.segment  &#34;STARTUP&#34;
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler that the code is to be placed in the
STARTUP segment of memory.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
          JSR     zerobss              ; Clear BSS segment
          JSR     copydata             ; Initialize DATA segment
          JSR     initlib              ; Run constructors
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>These three lines initialize the external (global) and system
variables.  The first line sets the BSS segment -- the memory locations
used for external variables -- to 0.  The second line copies the
initialization value stored in ROM to the RAM locations used for
initialized external variables.  The last line runs the constructors
that are used to initialize the system run-time variables.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
          JSR     _main
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This is the actual call to the C-level <CODE>main&nbsp;()</CODE> routine,
which is called after the startup code completes.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
_exit:    JSR     donelib              ; Run destructors
          BRK
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This is the code that will be executed when <CODE>main ()</CODE>
terminates.  The first thing that must be done is run the destructors
via a call to <CODE>donelib</CODE>.  Then the program can terminate.  In
this example, the program is expected to run forever.  Therefore, there
needs to be a way of indicating when something has gone wrong and the
system needs to be shut down, requiring a restart only by a hard reset.
The BRK instruction will be used to indicate a software fault.  This is
advantageous because cc65 uses the BRK instruction as the fill byte in
the final binary code.  In addition, the hardware has been designed to
force the data lines to $00 for all illegal memory accesses, thereby
also forcing a BRK instruction into the CPU.</P>

<H2><A NAME="s4">4.</A> <A HREF="#toc4">Custom Run-Time Library Creation</A></H2>


<P>The next step in customizing the cc65 toolset is creating a run-time
library for the targeted hardware.  The easiest way to do this is to
modify a standard library from the cc65 distribution.  In this example,
there is no console I/O, mouse, joystick, etc. in the system, so it is
most appropriate to use the simplest library as the base, which is for
the Watara Supervision and is named &#34;supervision.lib&#34; in the
lib directory of the distribution.</P>
<P>The only modification required is to replace the <CODE>crt0</CODE> module in
the supervision.lib library with custom startup code.  This is simply
done by first copying the library and giving it a new name, compiling
the startup code with ca65, and finally using the ar65 archiver to
replace the module in the new library.  The steps are shown below:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
$ copy &#34;C:\Program Files\cc65\lib\supervision.lib&#34; sbc.lib
$ ca65 crt0.s
$ ar65 a sbc.lib crt0.o
</PRE>
</CODE></BLOCKQUOTE>
</P>

<H2><A NAME="s5">5.</A> <A HREF="#toc5">Interrupt Service Routine Definition</A></H2>


<P>For this system, the CPU is put into a wait condition prior to allowing
interrupt processing.  Therefore, the interrupt service routine is very
simple:  return from all valid interrupts.  However, as mentioned
before, the BRK instruction is used to indicate a software fault, which
will call the same interrupt service routine as the maskable interrupt
signal IRQ.  The interrupt service routine must be able to tell the
difference between the two, and act appropriately.</P>
<P>The interrupt service routine shown below includes code to detect when a
BRK instruction has occurred and stops the CPU from further processing.
The interrupt service routine is in a file named
&#34;interrupt.s&#34;.</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
; ---------------------------------------------------------------------------
; interrupt.s
; ---------------------------------------------------------------------------
;
; Interrupt handler.
;
; Checks for a BRK instruction and returns from all valid interrupts.

.import   _stop
.export   _irq_int, _nmi_int

.segment  &#34;CODE&#34;

.PC02                             ; Force 65C02 assembly mode

; ---------------------------------------------------------------------------
; Non-maskable interrupt (NMI) service routine

_nmi_int:  RTI                    ; Return from all NMI interrupts

; ---------------------------------------------------------------------------
; Maskable interrupt (IRQ) service routine

_irq_int:  PHX                    ; Save X register contents to stack
           TSX                    ; Transfer stack pointer to X
           PHA                    ; Save accumulator contents to stack
           INX                    ; Increment X so it points to the status
           INX                    ;   register value saved on the stack
           LDA $100,X             ; Load status register contents
           AND #$10               ; Isolate B status bit
           BNE break              ; If B = 1, BRK detected

; ---------------------------------------------------------------------------
; IRQ detected, return

irq:       PLA                    ; Restore accumulator contents
           PLX                    ; Restore X register contents
           RTI                    ; Return from all IRQ interrupts

; ---------------------------------------------------------------------------
; BRK detected, stop

break:     JMP _stop              ; If BRK is detected, something very bad
                                  ;   has happened, so stop running
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>The following discussion explains the purpose of several important
assembler level directives in this file.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.import   _stop
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler to import the symbol <CODE>_stop</CODE>
from another module.  This routine will be called if a BRK instruction
is encountered, signaling a software fault.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.export   _irq_int, _nmi_int
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler that the symbols <CODE>_irq_int</CODE> and
<CODE>_nmi_int</CODE> are to be accessible from other modules.  In this
example, the address of these symbols will be placed in the interrupt
vector table.</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
.segment  &#34;CODE&#34;
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>This line instructs the assembler that the code is to be placed in the
CODE segment of memory.  Note that because there are 65C02 mnemonics in
the assembly code, the assembler is forced to use the 65C02 instruction
set with the <CODE>.PC02</CODE> directive.</P>
<P>The final step is to define the interrupt vector memory locations.
Recall that a segment named VECTORS was defined in the memory
configuration file, which started at location $FFFA.  The addresses of
the interrupt service routines from &#34;interrupt.s&#34; along with
the address for the initialization code in crt0 are defined in a file
named &#34;vectors.s&#34;.  Note that these vectors will be placed in
memory in their proper little-endian format as:</P>

<P><CODE>       $FFFA&nbsp;-&nbsp;$FFFB:</CODE> NMI interrupt vector (low byte, high byte)</P>
<P><CODE>       $FFFC&nbsp;-&nbsp;$FFFD:</CODE> Reset vector (low byte, high byte)</P>
<P><CODE>       $FFFE&nbsp;-&nbsp;$FFFF:</CODE> IRQ/BRK interrupt vector (low byte, high byte)</P>
<P>using the <CODE>.addr</CODE> assembler directive.  The contents of the file are:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
; ---------------------------------------------------------------------------
; vectors.s
; ---------------------------------------------------------------------------
;
; Defines the interrupt vector table.

.import    _init
.import    _nmi_int, _irq_int

.segment  &#34;VECTORS&#34;

.addr      _nmi_int    ; NMI vector
.addr      _init       ; Reset vector
.addr      _irq_int    ; IRQ/BRK vector
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>The cc65 toolset will replace the address symbols defined here with the
actual addresses of the routines during the link process.</P>

<H2><A NAME="s6">6.</A> <A HREF="#toc6">Adding Custom Instructions</A></H2>


<P>The cc65 instruction set only supports the WAI (Wait for Interrupt) and
STP (Stop) instructions when used with the 65816 CPU (accessed via the
--cpu command line option of the ca65 macro assembler).  The 65C02 core
used in this example supports these two instructions, and in fact the
system benefits from the use of both the WAI and STP instructions.</P>
<P>In order to use the WAI instruction in this case, a C routine named
&#34;wait&#34; was created that consists of the WAI opcode followed by
a subroutine return.  It was convenient in this example to put the IRQ
interrupt enable in this subroutine as well, since interrupts should
only be enabled when the code is in this wait condition.</P>
<P>For both the WAI and STP instructions, the assembler is
&#34;fooled&#34; into placing those opcodes into memory by inserting a
single byte of data that just happens to be the opcode for those
instructions.  The assembly code routines are placed in a file, named
&#34;wait.s&#34;, which is shown below:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
; ---------------------------------------------------------------------------
; wait.s
; ---------------------------------------------------------------------------
;
; Wait for interrupt and return

.export  _wait, _stop

; ---------------------------------------------------------------------------
; Wait for interrupt:  Forces the assembler to emit a WAI opcode ($CB)
; ---------------------------------------------------------------------------

.segment  &#34;CODE&#34;

.proc _wait: near

           CLI                    ; Enable interrupts
.byte      $CB                    ; Inserts a WAI opcode
           RTS                    ; Return to caller

.endproc

; ---------------------------------------------------------------------------
; Stop:  Forces the assembler to emit a STP opcode ($DB)
; ---------------------------------------------------------------------------

.proc _stop: near

.byte      $DB                    ; Inserts a STP opcode

.endproc
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>
<P>The label <CODE>_wait</CODE>, when exported, can be called by using the
<CODE>wait&nbsp;()</CODE> subroutine call in C.  The section is marked as
code so that it will be stored in read-only memory, and the procedure is
tagged for 16-bit absolute addressing via the &#34;near&#34;
modifier.  Similarly, the <CODE>_stop</CODE> routine can be called from
within the C-level code via a call to <CODE>stop&nbsp;()</CODE>.  In
addition, the routine can be called from assembly code by calling
<CODE>_stop</CODE> (as was done in the interrupt service routine).</P>

<H2><A NAME="s7">7.</A> <A HREF="#toc7">Hardware Drivers</A></H2>


<P>Oftentimes, it can be advantageous to create small application helpers
in assembly language to decrease codespace and increase execution speed
of the overall program.  An example of this would be the transfer of
characters to a FIFO (<B>F</B>irst-<B>I</B>n,
<B>F</B>irst-<B>O</B>ut) storage buffer for transmission over a
serial port.  This simple action could be performed by an assembly
language driver which would execute much quicker than coding it in C.
The following discussion outlines a method of interfacing a C program
with an assembly language subroutine.</P>
<P>The first step in creating the assembly language code for the driver is
to determine how to pass the C arguments to the assembly language
routine.  The cc65 toolset allows the user to specify whether the data
is passed to a subroutine via the stack or by the processor registers by
using the <CODE>__fastcall__</CODE> and <CODE>__cdecl__</CODE> function qualifiers (note that
there are two underscore characters in front of and two behind each
qualifier).  <CODE>__fastcall__</CODE> is the default.  When <CODE>__cdecl__</CODE> <EM>isn't</EM>
specified, and the function isn't variadic (i.e., its prototype doesn't have
an ellipsis), the rightmost argument in the function call is passed to the
subroutine using the 6502 registers instead of the stack.  Note that if
there is only one argument in the function call, the execution overhead
required by the stack interface routines is completely avoided.</P>
<P>With <CODE>__cdecl__</CODE>, the last argument is loaded into the A and X
registers and then pushed onto the stack via a call to <CODE>pushax</CODE>.
The first thing the subroutine does is retrieve the argument from the
stack via a call to <CODE>ldax0sp</CODE>, which copies the values into the A
and X.  When the subroutine is finished, the values on the stack must be
popped off and discarded via a jump to <CODE>incsp2</CODE>, which includes
the RTS subroutine return command.  This is shown in the following code
sample.</P>
<P>Calling sequence:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
        lda     #&lt;(L0001)  ;  Load A with the high order byte
        ldx     #&gt;(L0001)  ;  Load X with the low order byte
        jsr     pushax     ;  Push A and X onto the stack
        jsr     _foo       ;  Call foo, i.e., foo (arg)
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Subroutine code:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
_foo:   jsr     ldax0sp    ;  Retrieve A and X from the stack
        sta     ptr        ;  Store A in ptr
        stx     ptr+1      ;  Store X in ptr+1
        ...                ;  (more subroutine code goes here)
        jmp     incsp2     ;  Pop A and X from the stack (includes return)
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>If <CODE>__cdecl__</CODE> isn't specified, then the argument is loaded into the A
and X registers as before, but the subroutine is then called
immediately.  The subroutine does not need to retrieve the argument
since the value is already available in the A and X registers.
Furthermore, the subroutine can be terminated with an RTS statement
since there is no stack cleanup which needs to be performed.  This is
shown in the following code sample.</P>
<P>Calling sequence:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
        lda     #&lt;(L0001)  ;  Load A with the high order byte
        ldx     #&gt;(L0001)  ;  Load X with the low order byte
        jsr     _foo       ;  Call foo, i.e., foo (arg)
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>Subroutine code:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
_foo:   sta     ptr        ;  Store A in ptr
        stx     ptr+1      ;  Store X in ptr+1
        ...                ;  (more subroutine code goes here)
        rts                ;  Return from subroutine
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>The hardware driver in this example writes a string of character data to
a hardware FIFO located at memory location $1000.  Each character is
read and is compared to the C string termination value ($00), which will
terminate the loop.  All other character data is written to the FIFO.
For convenience, a carriage return/line feed sequence is automatically
appended to the serial stream.  The driver defines a local pointer
variable which is stored in the zero page memory space in order to allow
for retrieval of each character in the string via the indirect indexed
addressing mode.</P>
<P>The assembly language routine is stored in a file names
&#34;rs232_tx.s&#34; and is shown below:</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
; ---------------------------------------------------------------------------
; rs232_tx.s
; ---------------------------------------------------------------------------
;
; Write a string to the transmit UART FIFO

.export         _rs232_tx
.exportzp       _rs232_data: near

.define         TX_FIFO $1000    ;  Transmit FIFO memory location

.zeropage

_rs232_data:    .res 2, $00      ;  Reserve a local zero page pointer

.segment  &#34;CODE&#34;

.proc _rs232_tx: near

; ---------------------------------------------------------------------------
; Store pointer to zero page memory and load first character

        sta     _rs232_data      ;  Set zero page pointer to string address
        stx     _rs232_data+1    ;    (pointer passed in via the A/X registers)
        ldy     #00              ;  Initialize Y to 0
        lda     (_rs232_data)    ;  Load first character

; ---------------------------------------------------------------------------
; Main loop:  read data and store to FIFO until \0 is encountered

loop:   sta     TX_FIFO          ;  Loop:  Store character in FIFO
        iny                      ;         Increment Y index
        lda     (_rs232_data),y  ;         Get next character
        bne     loop             ;         If character == 0, exit loop

; ---------------------------------------------------------------------------
; Append CR/LF to output stream and return

        lda     #$0D             ;  Store CR
        sta     TX_FIFO
        lda     #$0A             ;  Store LF
        sta     TX_FIFO
        rts                      ;  Return

.endproc
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>

<H2><A NAME="s8">8.</A> <A HREF="#toc8">Hello World! Example</A></H2>


<P>The following short example demonstrates programming in C using the cc65
toolset with a custom run-time environment.  In this example, a Xilinx
FPGA contains a UART which is connected to a 65c02 processor with FIFO
(<B>F</B>irst-<B>I</B>n, <B>F</B>irst-<B>O</B>ut) storage to
buffer the data.  The C program will wait for an interrupt generated by
the receive UART and then respond by transmitting the string &#34;Hello
World! &#34; every time a question mark character is received via a
call to the hardware driver <CODE>rs232_tx&nbsp;()</CODE>.  The driver
prototype uses the <CODE>__fastcall__</CODE> extension to indicate that the
driver does not use the stack.  The FIFO data interface is at address
$1000 and is defined as the symbolic constant <CODE>FIFO_DATA</CODE>.
Writing to <CODE>FIFO_DATA</CODE> transfers a byte of data into the transmit
FIFO for subsequent transmission over the serial interface.  Reading
from <CODE>FIFO_DATA</CODE> transfers a byte of previously received data out
of the receive FIFO.  The FIFO status data is at address $1001 and is
defined as the symbolic constant <CODE>FIFO_STATUS</CODE>.  For convenience,
the symbolic constants <CODE>TX_FIFO_FULL</CODE> (which isolates bit 0 of
the register) and <CODE>RX_FIFO_EMPTY</CODE> (which isolates bit 1 of the
register) have been defined to read the FIFO status.</P>
<P>The following C code is saved in the file &#34;main.c&#34;.  As this
example demonstrates, the run-time environment has been set up such that
all of the behind-the-scene work is transparent to the user.</P>
<P>
<BLOCKQUOTE><CODE>
<HR>
<PRE>
#define FIFO_DATA     (*(unsigned char *) 0x1000)
#define FIFO_STATUS   (*(unsigned char *) 0x1001)

#define TX_FIFO_FULL  (FIFO_STATUS &amp; 0x01)
#define RX_FIFO_EMPTY (FIFO_STATUS &amp; 0x02)

extern void wait ();
extern void __fastcall__ rs232_tx (char *str);

int main () {
  while (1) {                                     //  Run forever
    wait ();                                      //  Wait for an RX FIFO interrupt

    while (RX_FIFO_EMPTY == 0) {                  //  While the RX FIFO is not empty
      if (FIFO_DATA == '?') {                     //  Does the RX character = '?'
        rs232_tx (&#34;Hello World!&#34;);                //  Transmit &#34;Hello World!&#34;
      }                                           //  Discard any other RX characters
    }
  }

  return (0);                                     //  We should never get here!
}
</PRE>
<HR>
</CODE></BLOCKQUOTE>
</P>

<H2><A NAME="s9">9.</A> <A HREF="#toc9">Putting It All Together</A></H2>


<P>The following commands will create a ROM image named &#34;a.out&#34;
that can be used as the initialization data for the Xilinx Block RAM
used for code storage:</P>
<P>
<BLOCKQUOTE><CODE>
<PRE>
$ cc65 -t none -O --cpu 65sc02 main.c
$ ca65 --cpu 65sc02 main.s
$ ca65 --cpu 65sc02 rs232_tx.s
$ ca65 --cpu 65sc02 interrupt.s
$ ca65 --cpu 65sc02 vectors.s
$ ca65 --cpu 65sc02 wait.s
$ ld65 -C sbc.cfg -m main.map interrupt.o vectors.o wait.o rs232_tx.o
          main.o sbc.lib
</PRE>
</CODE></BLOCKQUOTE>
</P>
<P>During the C-level code compilation phase (<CODE>cc65</CODE>), assumptions
about the target system are disabled via the <CODE>-t none</CODE> command
line option.  During the object module linker phase (<CODE>ld65</CODE>), the
target customization is enabled via inclusion of the <CODE>sbc.lib</CODE>
file and the selection of the configuration file via the <CODE>-C
sbc.cfg</CODE> command line option.</P>
<P>The 65C02 core used most closely matches the cc65 toolset processor
named 65SC02 (the 65C02 extensions without the bit manipulation
instructions), so all the commands specify the use of that processor via
the <CODE>--cpu 65sc02</CODE> option.</P>

</BODY>
</HTML>