But what if the programmer isn't using RAMDOS? Then the speed of the RAMLink becomes an issue. The RAMLink takes about 8 cycles to transfer a byte, while the REU does it in 1 cycle. This difference is significant. However, if a user owns both a RAMLink and an REU, the transfer rate of the RAMLink can be boosted via software. The method is called Double-DMA.
Address  Description
-------  -----------
$de00    256 bytes of data (See $dfc0-$dfc3 for more information)
$df7e    write to this location to activate the RAMLink hardware
$df7f    write to this location to deactivate the RAMLink hardware
$dfa0    lo byte of requested RAMCard memory page
$dfa1    hi byte of requested RAMCard memory page
$dfc0    write to this location to show RL variable RAM at $de00 (default)
$dfc1    write to this location to show RAMCard memory at $de00
$dfc2    write to this location to show the RAM Port device $de00 page at $de00
$dfc3    write to this location to show Pass-Thru Port dev. $de00 page at $de00

For all locations that have the description "write to this location...", the program can safely write any byte to those locations, as the RAMLink hardware simply waits for an access, not for any particular byte to be written.
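As a minimal sketch of how the locations above fit together, the following routine reads one byte from a RAMCard page through the $de00 window. The label rlPage and the routine name peekRlByte are hypothetical and only serve the illustration; the php/sei bracketing mirrors what the full listing at the end of this article does and is assumed to be needed here as well.

```asm
rlPage .buf 2             ;hypothetical: 16-bit RAMCard page number

peekRlByte = *            ;( rlPage, .X=offset ) : .A=byte
   php                    ;save the interrupt status
   sei                    ;keep interrupts out while the RL is switched in
   sta $df7e              ;activate the RAMLink hardware (value is ignored)
   lda rlPage+0
   sta $dfa0              ;lo byte of requested RAMCard page
   lda rlPage+1
   sta $dfa1              ;hi byte of requested RAMCard page
   sta $dfc1              ;show RAMCard memory at $de00
   lda $de00,x            ;fetch the byte from the mapped page
   sta $dfc0              ;show RL variable RAM at $de00 again (default)
   sta $df7f              ;deactivate the RAMLink hardware
   plp                    ;restore the interrupt status
   rts
```

Since the stores to $df7e, $dfc1, $dfc0, and $df7f only need to access the location, whatever happens to be in .A can be written, which is why the fetched byte survives the last two stores.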
The following steps will realize the Double-DMA method:
Notes: P = PAGE in RAMCard RAM to be transferred to/from
       A = PAGE of RAM in main memory to be transferred to/from
       X = single page of memory in REU used as temp RAM
sta $dfa0
Now, with the RAMLink hardware enabled in this way, the REU registers are also visible, so one can do a double DMA transfer at this point. There are two choices:
Transfer A->P:
Transfer P->A
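The second choice (P->A) can be sketched as two REU operations: stash the $de00 page into REU page X, then fetch page X into main memory. This is only an illustration under stated assumptions: the labels tempX, mainA, transferPtoA, and setupReu are hypothetical, the RAMLink is assumed to be already switched in with page P showing at $de00, and the REU address control register is assumed to be in its default state (both addresses incrementing). The command values $90 and $91 are the ones the full listing at the end of this article uses.

```asm
reu        = $df00        ;REU register base
rlPageData = $de00        ;page P appears here

tempX .buf 3              ;hypothetical: 24-bit REU address of temp page X
mainA .buf 2              ;hypothetical: address of main-memory page A

transferPtoA = *
   ;** step 1: P -> X  (stash the $de00 page into the REU)
   lda #<rlPageData
   ldy #>rlPageData
   jsr setupReu
   lda #$90               ;execute now, computer -> REU
   sta reu+1
   ;** step 2: X -> A  (fetch the REU page into main memory)
   lda mainA+0
   ldy mainA+1
   jsr setupReu
   lda #$91               ;execute now, REU -> computer
   sta reu+1
   rts

setupReu = *              ;( .AY=computer address )
   sta reu+2
   sty reu+3
   lda tempX+0
   sta reu+4
   lda tempX+1
   sta reu+5
   lda tempX+2
   sta reu+6
   lda #0                 ;length = $0100 bytes (one page)
   sta reu+7
   lda #1
   sta reu+8
   rts
```

For A->P the same two steps run in the opposite order: stash page A into X with $90, then fetch X into the $de00 page with $91. All registers are reloaded before each operation because the REU leaves its address and length registers changed after a transfer.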
Now, to go on:
The answer is: Maybe. If a user has inserted an REU into the RAMLink RAM port and has the Normal/Direct switch set to Normal, RL-DOS uses REU memory as the lowest RAM in the RAMLink memory map. However, when directly accessing the RAMLink and bypassing RL-DOS, the REU is not mapped into the RAMLink memory map. So, for such a condition, the code that determines the start of the DACC partition must SUBTRACT the size of the REU from the address returned by the G-P command. It's non-utopian, but the program need only do this once. However, for such an REU configuration, one must take care to ensure that at least 256 bytes of REU RAM are available and not already in use before utilizing the Double-DMA technique.
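The subtraction described above might look like the following sketch. It assumes the partition start from the G-P command has already been reduced to a 16-bit page number (gpStartPage) and that the REU size is known in 256-byte pages (reuPages); both labels are hypothetical. The result goes into aceRamlinkStart, the two-byte page offset that the full listing at the end of this article adds to every RAMLink address.

```asm
gpStartPage .buf 2        ;hypothetical: DACC start page derived from G-P
reuPages    .buf 2        ;hypothetical: REU size in 256-byte pages

   ;** aceRamlinkStart = gpStartPage - reuPages  (16-bit subtract)
   sec
   lda gpStartPage+0
   sbc reuPages+0
   sta aceRamlinkStart+0
   lda gpStartPage+1
   sbc reuPages+1
   sta aceRamlinkStart+1
```

For a 512K REU, reuPages would hold $0800 (512 * 1024 / 256 = 2048 pages). If the REU is not part of the RL-DOS memory map, reuPages would simply be zero and the start page is used unchanged.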
Type           Bandwidth    Latency  Notes
               (bytes/sec)  (~usec)
-------------  -----------  -------  -----
REU              1,007,641     65.8  REU in Direct mode
REU thru RL      1,007,641     77.8  REU in RAM Port in Normal mode
RAMLink            105,792    199.2  Regular RAMLink access
RL with REU        372,827    319.8  Double-DMA
Internal RAM0      120,181     44.2  Zero-page
Internal RAM1       80,283     56.3  All main memory except zero-page
So, using this technique in ACE results in roughly a 3.5x increase in transfer speed (372,827 versus 105,792 bytes/sec). For some applications, that is well worth the trouble.
; Name:        Double-DMA memory transfer
; Author:      Craig Bruce
; Date:        1995-12-4
; Description: The following routines use the Double-DMA technique to transfer
;              memory to/from main RAM and the RAMLink.  If no RL is present,
;              normal CPU transfer methods are utilized.
;
; Variables:   [mp] holds the address of RAMCard memory to transfer
;              ramlinkNearPtr hold the address of main memory to transfer
;              ramlinkLength is length of data to transfer
;              ramlinkOpcode = $90: main memory -> RL
;                            = $91: RL -> main memory

reu            = $df00
rlActivate     = $df7e
rlDeactivate   = $df7f
rlSram         = $dfc0
rlPageSelect   = $dfa0
rlPageActivate = $dfc1
rlPageData     = $de00

ramlinkOpcode  .buf 1
ramlinkLength  .buf 2
ramlinkNearPtr .buf 2
ramlinkMpSave  .buf 3
ramlinkZpSave  .buf 2

ramlinkOp = *  ;( [mp]=farPtr, ramlinkNearPtr, ramlinkLength, ramlinkOpcode )
   lda mp+0
   ldy mp+1
   ldx mp+2
   sta ramlinkMpSave+0
   sty ramlinkMpSave+1
   stx ramlinkMpSave+2
   lda zp+0
   ldy zp+1
   sta ramlinkZpSave+0
   sty ramlinkZpSave+1
   lda ramlinkNearPtr+0
   ldy ramlinkNearPtr+1
   sta zp+0
   sty zp+1
   clc
   lda mp+1
   adc aceRamlinkStart+0
   sta mp+1
   lda mp+2
   adc aceRamlinkStart+1
   sta mp+2
-  lda ramlinkLength+0
   ora ramlinkLength+1
   beq +
   jsr rlTransferChunk
   jmp -
+  lda ramlinkMpSave+0
   ldy ramlinkMpSave+1
   ldx ramlinkMpSave+2
   sta mp+0
   sty mp+1
   stx mp+2
   lda ramlinkZpSave+0
   ldy ramlinkZpSave+1
   sta zp+0
   sty zp+1
   clc
   rts

rlTrSize .buf 1

rlTransferChunk = *  ;( [mp]=rlmem, (zp)=nearmem, rlLength, rlOpcode )
   ;** figure maximum page operation
   lda ramlinkLength+1
   beq +
   lda #0
   ldx mp+0
   beq rlTrDo
   sec
   sbc mp+0
   jmp rlTrDo
+  lda mp+0
   beq +
   lda #0
   sec
   sbc mp+0
   cmp ramlinkLength+0
   bcc rlTrDo
+  lda ramlinkLength+0

   ;** do the transfer
   rlTrDo = *
   tay
   sty rlTrSize
   jsr rlPageOp

   ;** update the pointers and remaining length
   clc
   lda rlTrSize
   bne +
   inc mp+1
   inc zp+1
   dec ramlinkLength+1
   rts
+  adc mp+0
   sta mp+0
   bcc +
   inc mp+1
+  clc
   lda zp+0
   adc rlTrSize
   sta zp+0
   bcc +
   inc zp+1
+  sec
   lda ramlinkLength+0
   sbc rlTrSize
   sta ramlinkLength+0
   bcs +
   dec ramlinkLength+1
+  rts

rlPageOp = *  ;( [mp]=rlmem, (zp)=nearmem, .Y=bytes, ramlinkOpcode )
   php
   sei
   sta rlActivate
   lda mp+1
   sta rlPageSelect+0
   lda mp+2
   sta rlPageSelect+1
   sta rlPageActivate
   lda aceReuRlSpeedPage+3
   bne rlPageOpReu  ;xxx dependency on aceMemNull==0

rlPageOpNonReu = *
   tya
   clc
   adc mp+0
   tax
   lda ramlinkOpcode
   cmp #$91
   bne rlPageOpWrite
   dex
   dey
   beq +
-  lda rlPageData,x
   sta (zp),y
   dex
   dey
   bne -
+  lda rlPageData,x
   sta (zp),y
   jmp rlPageOpContinue

rlPageOpWrite = *
   dex
   dey
   beq +
-  lda (zp),y
   sta rlPageData,x
   dex
   dey
   bne -
+  lda (zp),y
   sta rlPageData,x

rlPageOpContinue = *
   sta rlSram
   sta rlDeactivate
   plp
   rts

rlPageOpReu = *  ;( [mp]=rlmem, (zp)=nearmem, .Y=bytes, ramlinkOpcode )
   ;** ramlink hardware already switched in
   ldx #1
   tya
   beq +
   ldx #0
   cmp #0  ;xx cut-off value
   bcc rlPageOpNonReu
+  ldy ramlinkOpcode
   cpy #$90
   beq +
   ldy #$90  ;rl->reu->intern
   jsr rlPageOpReuRl
   ldy #$91
   jsr rlPageOpReuIntern
   jmp ++
+  ldy #$90  ;intern->reu->rl
   jsr rlPageOpReuIntern
   ldy #$91
   jsr rlPageOpReuRl
+  sta rlSram
   sta rlDeactivate
   plp
   rts

rlPageOpReuIntern = *  ;( .AX=bytes, .Y=op )
   sta reu+7  ;len
   stx reu+8
   sty temp1
   pha
   lda zp+0
   ldy zp+1
   sta reu+2
   sty reu+3
   lda aceReuRlSpeedPage+0
   ldy aceReuRlSpeedPage+1
   sta reu+4
   sty reu+5
   lda aceReuRlSpeedPage+2
   sta reu+6
.if computer-64
   ldy vic+$30
   lda #0
   sta vic+$30
.ife
   lda temp1
   sta reu+1
.if computer-64
   sty vic+$30
.ife
   pla
   rts

rlPageOpReuRl = *  ;( .AX=bytes, .Y=op )
   sta reu+7  ;len
   stx reu+8
   sty temp1
   pha
   lda mp+0
   ldy #>rlPageData
   sta reu+2
   sty reu+3
   lda aceReuRlSpeedPage+0
   ldy aceReuRlSpeedPage+1
   sta reu+4
   sty reu+5
   lda aceReuRlSpeedPage+2
   sta reu+6
.if computer-64
   ldy vic+$30
   lda #0
   sta vic+$30
.ife
   lda temp1
   sta reu+1
.if computer-64
   sty vic+$30
.ife
   pla
   rts