Internal ram upgrade to 640k

Dave
SandySuperQDave
Posts: 2808
Joined: Sat Jan 22, 2011 6:52 am
Location: Austin, TX

Re: Internal ram upgrade to 640k

Post by Dave »

I would be interested in making a few of these for sale. They would be quite cheap and easy to make.


Paul
Gold Card
Posts: 257
Joined: Mon May 21, 2012 8:50 am

Re: Internal ram upgrade to 640k

Post by Paul »

Sorry, this is a newbie question :oops:
I wonder how a QL would react if it is internally expanded to 640K and you attach an external 512K memory expansion.

What happens if you have no internal expansion, but two external ones?

For example, a processor card with 512K AND a floppy controller with 512K?
Can they detect that there is already an expansion?
Can I make sure the faster expansion (this seems to exist? At least with the 68020?) has priority?

Please help me become a bit more knowledgeable 8-)
Kind regards
Paul


tofro
Font of All Knowledge
Posts: 3132
Joined: Sun Feb 13, 2011 10:53 pm
Location: SW Germany

Re: Internal ram upgrade to 640k

Post by tofro »

Paul wrote: I wonder how a QL would react if it is internally expanded to 640K and you attach an external 512K memory expansion.

What happens if you have no internal expansion, but two external ones?

For example, a processor card with 512K AND a floppy controller with 512K?
Can they detect that there is already an expansion?
Can I make sure the faster expansion (this seems to exist? At least with the 68020?) has priority?

Please help me become a bit more knowledgeable 8-)
Kind regards
Paul
Paul,

I'm afraid the QL wouldn't even run in case of overlapping memory expansions.

The QL has built-in detection for external hardware (that is, I/O and ROM memory) only on the expansion port and only for the add-on peripherals address range $C0000 to $E0000. For memory expansions, there's no such thing as overlap tests or other negotiation of who supplies the memory and where. The designers of the QL have apparently not catered for internal expansions and multiple memory expansion cards. Memory was expensive in those days - you just weren't expected to own more than one expansion ;)

The QL memory map reserves 8 slots of 16 kBytes address range for expansion cards (I/O addresses and add-on ROM). There is a mechanism that allows external cards to allocate themselves into one of those slots, depending on the position in the daisy chain of cards connected to the expansion port and the "speed" of "grabbing" a slot (first come, first served) - but this does not apply to memory expansions.

Newer cards - like the Gold and SuperGold Cards - allocate their memory in excess of 1M completely outside the 68008 address range (well, they need to); their peripherals seem to comply with the original scheme, however.

Regards,
Tobias


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO
gertk
Hardware Hero
Posts: 41
Joined: Mon Aug 19, 2013 10:00 pm

Re: Internal ram upgrade to 640k

Post by gertk »

If the timing is no problem, overlapping memory will just work. When a byte is written, it is written into both memory chips; when read, it should be the same in both too. Not pretty, but no disaster either.
What is possible is to detect whether the DSMCL line is pulled externally and then disable the internal memory expansion.


Nasta
Gold Card
Posts: 462
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: Internal ram upgrade to 640k

Post by Nasta »

tofro wrote: I'm afraid the QL wouldn't even run in case of overlapping memory expansions.
It depends to an extent on how the memory is implemented, but in fact it will work because both RAMs get the same data written to and read from, but there will be partial contention on the bus and of course it will not be a configuration you would actually want to use.
In some cases specific timing results from the implementation (usually for dynamic RAM) which results in incompatibility with one or the other or even both RAM boards and then it does not work.
However, an internal RAM board CAN indeed be made so that, if an external RAM is connected, the external one takes precedence. This is done by correctly implementing the DSMCL pin logic: the signal has to be intercepted on its way to the 8301 ULA and some logic added. This logic drives DSMCL high to the ULA whenever the internal RAM expansion is accessed, and also whenever an external DSMCL is generated; in the latter case, the external DSMCL (from the expansion connector) additionally disables the internal expansion.
tofro wrote: The QL has built-in detection for external hardware (that is, I/O and ROM memory) only on the expansion port and only for the add-on peripherals address range $C0000 to $E0000.
Actually, it does not. This is almost entirely implemented in software and relies on certain hardware behavior when there is no expansion present. In particular, all of the on-board hardware is decoded within the first 256k of the 1M address range of the 68008 (DIP case). The remaining three 256k blocks (making up the total 1M) are just the first one repeated.
The software makes no assumptions about the address map except for the start addresses of certain areas to test for various things such as RAM or expansion ROM.
The RAM detect routine simply starts to look for RAM from 20000h onwards and proceeds in 16k blocks until it finds something that is not RAM, i.e. does not properly retain written data (**). On a standard non-expanded QL, this happens as soon as it hits address 40000h, because it finds an alias of the system ROM at this address (it is also there at 80000h and C0000h), which obviously does not behave as RAM, so it stops and assumes no more RAM is present. In other words, it assumes RAM is contiguous. Incidentally, this was a real problem on the Atari TT emulator, because it has slow and fast RAM, so special care had to be taken by adding an extra RAM detect and test routine to cater for it. The OS itself does indeed have a way (albeit using a few tricks) to represent non-contiguous RAM areas in its data structures, and manages these correctly when they are set up properly.
A side effect of this is that using a different CPU with a wider address range (the simplest being the 68008FN PLCC case version, with a 4M address range) and providing more RAM than fits in the original 1M address range will happily make the QL detect the extra RAM and use it, given the correct version of the OS. Some versions had bugs in setting up the data structures the OS uses to manage the RAM, which could not become apparent while the total address range was limited to 1M (*).
A similar philosophy is used to detect extra IO devices, more on that below.
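The RAM detection loop described above can be sketched roughly as follows. This is only an illustrative model, not the actual ROM code: `scan_ram`, `unexpanded_ql` and `with_512k_expansion` are made-up names, and the `is_ram` callback stands in for the real write-and-read-back test.

```python
BLOCK = 16 * 1024      # scan step described in the post
RAM_START = 0x20000    # first address the scan looks at


def scan_ram(is_ram, limit=0x100000):
    """Return the first address that does not behave as RAM.

    is_ram(addr) is a hypothetical probe standing in for the
    write-then-read-back test performed on each 16k block.
    """
    addr = RAM_START
    while addr < limit and is_ram(addr):
        addr += BLOCK
    return addr


def unexpanded_ql(addr):
    # The on-board decoder only looks at 256k, so every address
    # aliases into 00000h..3FFFFh; only 20000h..3FFFFh is RAM there.
    return (addr & 0x3FFFF) >= 0x20000


def with_512k_expansion(addr):
    # Hypothetical 512k card answering at 40000h..BFFFFh.
    return 0x40000 <= addr < 0xC0000 or unexpanded_ql(addr)


for name, probe in (("bare QL", unexpanded_ql), ("512k card", with_512k_expansion)):
    top = scan_ram(probe)
    print(name, hex(top), (top - RAM_START) // 1024, "k")
```

In this model the bare machine stops at 40000h (the ROM alias) and the 512k card stops at C0000h, matching the contiguous-RAM assumption described in the post.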

(*) Early Minerva revisions would correctly detect RAM past the 1M 'end of memory map' of a standard 68008, but had a bug in a related area. The OS uses free RAM to cache file reads and writes (as well as to implement RAM disks) using a system called 'slave blocks'. The buggy code used a 16-bit offset to build the slave block table as it detected additional RAM, and if there was more than 2M of RAM the offset pointer would wrap around and corrupt the start of the table, causing the system to crash immediately after the F1/F2 prompt. This was fixed, and Minerva will happily detect RAM sizes up to the 4M boundary and also look for IO peripherals from 0C0000h up to the end of the address map, even if this is 4M on a 68008FN - which is exactly how I know this works :)
It is a HUGE pity the original QL did not use the FN case 68008, if it just HAD to use an 8-bit data bus.
tofro wrote: For memory expansions, there's no such thing as overlap tests or other negotiation of who supplies the memory and where. The designers of the QL have apparently not catered for internal expansions and multiple memory expansion cards. Memory was expensive in those days - you just weren't expected to own more than one expansion ;)
Actually, it is very difficult to implement a system that would cater for everything, and the chief driving parameter in the implementation of the logic was simplicity while retaining a lot of usability, i.e. the price had to be kept low :)
It is possible to set up a system where various RAM expansions can be 'daisy chained' and added together but this sort of thing is generally discouraged because it incurs a significant speed penalty as well as complicates the hardware. In such systems it's especially difficult to cater for the fact that more boards in a daisy chain increases access time for RAM towards the end of the chain, plus reduced electrical signal integrity, because it's VERY difficult to detect marginal cases - i.e. when the delay has become too large, or the signal integrity is insufficient. Even using something like 'plug and play' to set up RAM addresses, which caters for the delay problem, does not cater for the signal integrity problem.
Regarding internal expansion, it was simply NOT designed for. One of the reasons was power consumption - the internal regulator is at the limits of its capability as it is. Yes, lower power RAM and ROM with larger capacity became available, but the normal way things would have worked out would have been the appearance of a QL II or a motherboard redesign.
tofro wrote: The QL memory map reserves 8 slots of 16 kBytes address range for expansion cards (I/O addresses and add-on ROM). There is a mechanism that allows external cards to allocate themselves into one of those slots, depending on the position in the daisy chain of cards connected to the expansion port and the "speed" of "grabbing" a slot (first come, first served) - but this does not apply to memory expansions.
Actually, the hardware caters for 16 slots of 16k. The idea was to daisy chain boards in a case with a buffered bus, and the address is automatically set up by the position in the chain, or in an alternate implementation, the position on a backplane. Nothing to do with first come, first served.
It should be noted that the OS in principle uses the same mechanism to detect IO boards as it uses for RAM - it looks at the ROM slot (0C000h) and then starts at C0000h and continues on to the end of the address map (whatever that might be) in steps of 16k. Again, this depends on the actual OS version used. In particular Minerva will do this, and in fact also looks at 10000h and 14000h.
In the case of IO expansions, the system expects every such expansion to have a driver ROM at the start of its allotted 16k slot, and looks for a ROM header in the first 4 bytes to determine if there is an IO expansion there or not. That being said, because of the way the RAM detect and test works, if the IO expansion area is populated by RAM and it is contiguous with the usual RAM area, the OS will gladly allow use of the IO expansion area to implement an extended RAM expansion area, which was in fact used by the Miracle Trumpcard.
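A rough sketch of the IO scan described above, under some stated assumptions: the 4-byte marker `4AFB0001h` is the ROM header flag as given in general QL documentation (the post itself does not quote it), and `find_driver_roms`, `read4` and `fake_bus` are hypothetical names used only for this illustration.

```python
ROM_FLAG = bytes.fromhex("4AFB0001")   # assumed ROM header marker, not quoted in the post
SLOT = 16 * 1024                       # one IO expansion slot


def find_driver_roms(read4, top=0x100000):
    """Return base addresses whose first 4 bytes look like a ROM header.

    The ROM slot at 0C000h is checked first, then every 16k slot
    from C0000h up to the end of the address map ('top').
    """
    candidates = [0x0C000] + list(range(0xC0000, top, SLOT))
    return [base for base in candidates if read4(base) == ROM_FLAG]


# Hypothetical bus contents: a ROM cartridge and one IO card.
fake_bus = {0x0C000: ROM_FLAG, 0xC4000: ROM_FLAG}


def read4(addr):
    # Unpopulated addresses read back as FFh bytes in this toy model.
    return fake_bus.get(addr, b"\xff\xff\xff\xff")


found = find_driver_roms(read4)
print([hex(a) for a in found])
```

Passing a larger `top` (for example 400000h) models the behaviour described for Minerva on a CPU with a 4M address range.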

(**) The RAM test writes and reads data at the start of each 16k block it tests, so IO circuits should be designed so that this initial test does not disturb their operation. This is why IO expansion assumes a ROM at the start - this will immediately result in the system disregarding the 16k block as possible RAM. Any special registers of the actual IO hardware are assumed to be allocated from the end of the 16k area used by each IO peripheral.
The exception to this are the registers used to control the motherboard hardware and QIMI, which are located at the start and end of the 16k block of addresses at 18000h..1BFFFh. The OS assumes these to always be there and they are not movable unless the relevant OS code is changed.

There is an admonishment in the literature to generate the DSMCL signal in a certain way, because it has to be pulled high before the DS signal is generated by the CPU. To do this the decoding circuit for DSMCL must use the address lines ONLY. It is possible on the 68008 to use ASL to validate the state on the address lines but this is NOT recommended; ASL was to be left unconnected, and used only to enable the buffers for a backplane. This can work because the address lines are guaranteed to be stable while DSL is active, in fact they are set up well before DSL goes low and held a while after it goes high.
The DSMCL signal is actually the DSL signal pin on the 8301 ULA, and connects to the DSL pin on the CPU and bus connector through a resistor. If nothing is connected to the signal externally, it will just pass through to the ULA, which uses it to activate its internal decoder, which decodes appropriate chip selects for all of the QL motherboard logic (sometimes with a bit of help from an external chip, as in the case of the system ROMs). Because the inactive level of the DSL signal is high, when a high signal is put on the DSMCL line, the DSL from the CPU to the ULA is over-ridden (hence the resistor, to limit the current from the CPU DSL pin when DSL is low but we want to make it appear high, i.e. inactive, to the 8301 ULA). Therefore, the ULA has no idea anything is happening on the bus and its decoder remains inactive. This makes it possible for external hardware to over-ride the internal hardware, by setting itself up on addresses where the internal hardware would appear if the ULA decoder was active.

For any expansion, a decoding circuit has to drive DSMCL high for every address it wants to use. For RAM (without using extra tricks), this would be addresses 40000h..BFFFFh, or in other words for any address where A19 is low and A18 is high, or A19 is high and A18 is low. For the IO area, C0000h onwards to FFFFFh, any address where A18 and A19 are high.
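The address-only decode described above can be written down as a small truth function. This is just a sketch of the A19/A18 conditions from the post; the function name and the `take_ram` / `take_io` switches are invented for the example.

```python
def dsmcl_high(addr, take_ram=True, take_io=False):
    """True when an expansion should drive DSMCL high for this address.

    Only address lines A19 and A18 are used, as the post recommends.
    """
    a19 = (addr >> 19) & 1
    a18 = (addr >> 18) & 1
    ram_area = a19 != a18            # 40000h..BFFFFh
    io_area = a19 == 1 and a18 == 1  # C0000h..FFFFFh
    return (take_ram and ram_area) or (take_io and io_area)


for a in (0x3FFFF, 0x40000, 0xBFFFF, 0xC0000):
    print(hex(a), dsmcl_high(a), dsmcl_high(a, take_io=True))
```

The RAM condition is simply an exclusive-OR of A19 and A18, which is why such a decoder needs very little logic.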

However, the simplicity of the DSMCL implementation is actually also its strength. If one looks closely at how it works, it is in fact possible to disable ANY internal hardware and replace it with external hardware. Also, there is no reason why one would not disable only part of the internal hardware and only some of the time, using extra signals and states of hardware (perhaps these can even be set up by software!), to get various interesting and useful side effects.
tofro wrote: Newer cards - like the Gold and SuperGold Cards - allocate their memory in excess of 1M completely outside the 68008 address range (well, they need to); their peripherals seem to comply with the original scheme, however.
As you can see, the basic mechanism in software caters for larger memory maps, but compliance with and implementation of the original expansion schemes and address areas depends on the actual hardware. It is possible to implement a lot of them, but that does not mean it's a good idea to do so, i.e. wanting to replace part of a fast internal RAM with slower external RAM does not really make sense. Therefore it's possible to implement an internal memory expansion so that the addresses required for external RAM do not even appear on the expansion connector; connecting a peripheral with, say, RAM and a floppy controller will then simply make the system see just the floppy controller.


Nasta
Gold Card
Posts: 462
Joined: Sun Feb 12, 2012 2:02 am
Location: Zapresic, Croatia

Re: Internal ram upgrade to 640k

Post by Nasta »

Regarding a SRAM internal expansion:
Two such RAM chips (if a PCB was made, an SMD case would be perfect for small size) or even one larger 1Mx8 chip (they do exist but are quite difficult to find) can be used to make a very clever and fast RAM expansion similar to the Trumpcard (but without the floppy controller). Using a GAL decoder, this could implement full shadowing of the on-board video RAM, which will speed it up quite a bit.
The trick is in the decoding logic for DSMCL and the two chip selects for the RAM chips.
Since two 512k RAM chips have a total capacity of 1M, which is equal to the full address space of the 68008 in the QL, some of the actual RAM will not be used as such, because parts of the address map have to be left as is to decode the on-board (EP)ROM, IO, ROM slot, and perhaps a 16k slot for IO expansion (for a floppy controller or IDE interface).

This is how the decoding logic would work:
Inputs:                    Outputs:
A19 A18 A17 A16 A15 RWL    DSMCL     RAMCS0L  RAMCS1L  EPROMCSL  Function
 0   0   0   0   X   X     high-Z    H        H        L         System ROM (or EPROM)*
 0   0   0   1   0   X     *         *        H        *         Extended ROM or emulation
 0   0   0   1   1   X     high-Z    H        H        H         QL internal IO + spare
 0   0   1   0   0   0     high-Z    L        H        H         Screen 0 write
 0   0   1   0   0   1     H         L        H        H         Screen 0 read
 0   0   1   0   1   0     high-Z**  L        H        H         Screen 1 write
 0   0   1   0   1   1     H         L        H        H         Screen 1 read
 0   0   1   1   X   X     H         L        H        H         replaces internal top 64k
 0   1   X   X   X   X     H         L        H        H         implements 256k expansion
 1   0   X   X   X   X     H         H        L        H         additional 256k expansion
 1   1   0   X   X   X     H         H        L        H         additional 128k expansion
 1   1   1   0   X   X     H         H        L        H         additional 64k expansion
 1   1   1   1   0   X     H         H        L        H         additional 32k expansion
 1   1   1   1   1   X     H***      H        H        H         top 32k of IO area is free

* can be used for ROM emulation on a Minerva
** depends on use of screen 1, see below.
*** has an interesting side effect when no IO expansion is present
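For anyone who prefers code to truth tables, here is a minimal software model of the decoder table above. It is only a sketch for checking the table, not GAL equations: the function names are invented, `'Z'` stands for high-Z, `'*'` for the option-dependent entries, and RWL is taken as 1 = read, 0 = write, as the screen rows imply.

```python
def decode(addr, rwl):
    """Return (DSMCL, RAMCS0L, RAMCS1L, EPROMCSL) for a 20-bit address."""
    a19, a18, a17, a16, a15 = ((addr >> b) & 1 for b in (19, 18, 17, 16, 15))
    if (a19, a18, a17) == (0, 0, 0):
        if a16 == 0:
            return ("Z", "H", "H", "L")   # system ROM (or EPROM)
        if a15 == 0:
            return ("*", "*", "H", "*")   # extended ROM or emulation
        return ("Z", "H", "H", "H")       # QL internal IO + spare
    if (a19, a18, a17, a16) == (0, 0, 1, 0):
        # Screen 0 and 1: writes also reach internal RAM, reads are external only.
        return ("H" if rwl else "Z", "L", "H", "H")
    if a19 == 0:
        return ("H", "L", "H", "H")       # top 64k of internal RAM + first 256k expansion
    if (a18, a17, a16, a15) == (1, 1, 1, 1):
        return ("H", "H", "H", "H")       # top 32k left free for IO
    return ("H", "H", "L", "H")           # second chip, 80000h..F7FFFh


def selected_kbytes(column):
    # Count 32k blocks (the table's finest granularity) where a chip select is low on reads.
    return sum(32 for a in range(0, 0x100000, 0x8000) if decode(a, 1)[column] == "L")


chip0_k = selected_kbytes(1)
chip1_k = selected_kbytes(2)
print(chip0_k, chip1_k, chip0_k + chip1_k)
```

Counting the selected blocks this way gives a quick cross-check of how much of each SRAM chip is actually mapped.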

Now, there are more clever ways to set-up the decoder so that one 512k and one 128k chip can be used, for instance, also address lines can be jiggled around a bit to simplify and possibly fit the logic into a smaller GAL still including more options.

What does this decoder actually do?
Well, it maps the first 512k SRAM chip into the bottom 512k of the QL's memory map (addresses 00000h..7FFFFh), and the second 512k SRAM chip into the top 512k of the QL's memory map (addresses 80000h..FFFFFh), but not in its entirety. It 'skips' the required QL internal bits and pieces, and does this with some cleverness regarding the internal RAM, to gain speed.

The second chip decoding (RAMCS1L) is easier to understand - it simply puts it in the top 512k of the entire QL address map except for the very top 32k at F8000h..FFFFFh, where the last two 16k IO expansion 'slots' are, so that something like a floppy interface and/or IDE/CF interface can be connected and mapped there. The SRAM chip is prevented from responding at those addresses by keeping its chip select signal inactive (high). However, there is a small catch regarding DSMCL which I will explain below. In any case the part of the SRAM chip that would map there is not used.

The first chip decoding (RAMCS0L) is a bit more involved, as more areas need to be left unmapped so that the required QL's internal bits appear at the proper places. Although the SRAM chip maps into the bottom 512k of the address map, some addresses are not used: the chip is prevented from responding by keeping its chip select signal inactive, and the internal QL hardware is left to respond instead in the usual manner by keeping the DSMCL signal inactive when these addresses appear. In particular, the system ROM area is left as is (for obvious reasons), although there is an EPROMCSL active low signal implemented for convenience should someone want to use an EPROM instead of the original ROMs. The EPROMCSL signal is also generated for both reading and writing so that a rewritable Flash ROM can also be used; for this an RDL (read active low) signal must be generated by the GAL by simply inverting RDWL. Extra options could be added to select use of the external ROM slot or use the top 16k of a 64k EPROM chip to emulate it - the table assumes EPROM emulation, but this is easily added into the logic.
Interesting things happen in the next 32k up. This area is normally unused on QLs but when Minerva is run, it will look for ROM images there, so it is possible to map some RAM into those addresses, and load ROM code, so the system can recognize them on the next reset as ROMs. The caveat is that the ROM images are unprotected and can be corrupted by writing data to that area. A better use perhaps would be to couple this feature with the use of a Flash ROM or EPROM, a part of it could be mapped here and indeed contain additional ROM images for the system to detect.
The next 32k up contain the QL internal IO and yet another unused 16k, this however is left as is in the QL for simplicity reasons.
The next 128k up (address 20000h and up) are occupied by the internal RAM and implement the two 32k screen RAM areas in the bottom half. Here a bit of cleverness is used to get around the problem of the internal RAM being slow. The whole top 64k of the internal RAM is replaced by the RAM in SRAM chip 1, by disabling the internal RAM via DSMCL and mapping the external SRAM instead. A bit more attention is needed for the screen, as the QL still has to write the contents of the screen RAM into the actual internal RAM, to have the 8301 ULA display it on the screen. To do this, the decoder does not disable the internal RAM for writing, but does select the external RAM, which has the consequence of writing data to both the internal and external RAM. The internal RAM dictates the speed by virtue of the ULA generating the DTACKL signal (this is not mentioned in the table but see below).
However, the internal RAM is disabled for reading so the data written will be read back from the copy in external RAM - at the increased speed of the external RAM, about twice as fast.
The only other thing left is to decide if we want to use screen 1, addresses 28000h..2FFFFh. As shown, the decoder treats both screen 0 and 1 the same, enabling use of both screen areas. The entire screen 1 area could be disabled and external RAM mapped into it instead with no ill effect except for gibberish on the screen if screen 1 is activated. The reason one would want this is that when 2 screens are not used, the screen 1 area of RAM contains quite a bit of the system's internal data structures, which are very frequently accessed, both for reads and writes. System variables and tables all start at the beginning of this area and extend to higher addresses depending on how much extra RAM there is. Using external fast RAM for this will get you the last few % in speed, but in general the shadowing mechanism used already increases speed quite a bit.
The rest of SRAM chip 1 is used to implement the first 256k of expansion RAM.

The table shows how decoding is put into terms of a decoding table - any signal with an X is not used to decode that particular area of the address map. Sometimes the notation looks a bit odd, for instance the addresses for SRAM chip 2 have to be expressed as a 256k + 128k + 64k + 32k = total 480k RAM expansion (on top of that implemented in SRAM chip 1) just to cater for the last 32k remaining free for IO expansion boards. Similarly, some 384k (128 + 256) of RAM are implemented in SRAM chip 1. The total is therefore 864k.

Three important things do not appear in the decoder table:

1) Either the CS signals for the EPROM and SRAM must only go low when DSL is low, or separate read and write active low signals must be generated by the GAL, using RDL and only active when DSL is low. In the first case the OE signals of the SRAM chips are connected to ground and WR of the SRAM chips is connected to RDWL. The EPROMCSL signal must also only appear when RDWL is high (only on read) or its OE pin must be connected to a read enable signal generated by the GAL by inverting RDWL. In the second case, the connections are more logical and the chip select signals go directly to the relevant chips, while a read enable goes to all OE pins, and write enable to all WE pins (if present - the EPROM only has OE so it will be read-only).

2) DTACKL must be handled for all addresses that the expansion hardware takes over. This is normally done by simply copying the DSL signal to DTACKL whenever the decoder generates DSMCL high (accounting for the fact that DTACKL should only pull low but remain high impedance or tri-state when it's supposed to be high, i.e. inactive). In the case of this hardware, there is an exception, and that is when the external SRAM is used to shadow a write to the screen area. In this case DSMCL is not generated, and neither is DTACKL, but the relevant chip select to the external RAM chip and its write enable are. This lets the QL's internals behave exactly as usual, enabling the external RAM to 'pick up' the write data and store a copy of it.
Using DSL to generate DTACKL with no delay results in the shortest and fastest access cycle. All SRAM chips of the required capacity are so fast that they could easily run at twice the speed of the QL or better, so running them slower than maximum significantly reduces power requirements (the chips use CMOS technology, with it power consumption is proportional to speed), which is an added bonus.

3) There is an unexpected caveat to the way the OS scans for RAM which results in RAM being correctly detected up to the very top of the address map, including the 32k IO expansion area 'snippet' left free there. The reason for this is the QL's internal decoder only decoding 256k of addresses, so it simply sees the same actual internal hardware in each block of 256k within the 1M of the complete address map. Hence, the system ROM will appear at 00000h..0BFFFh, but also at 40000h..4BFFFh, 80000h..8BFFFh, C0000h..CBFFFh. The same happens with internal RAM - 20000h..3FFFFh, also at 60000h..7FFFFh, A0000h..BFFFFh, E0000h..FFFFFh. In our particular case it's the alias of the internal RAM at E0000h..FFFFFh that is of interest, in particular its last 32k at F8000h..FFFFFh, because all of the rest has been disabled by the decoder for our RAM expansion. However, the last 32k in the address map has not been disabled, and when scanning for RAM, the system will find an alias of the on-board RAM there, and in fact initialize it and use it as RAM, increasing the total system RAM to 896k. But hang on, isn't this the same as at addresses 38000h..3FFFFh where the system expects the original 128k of RAM? Well, no - because we have disabled that particular part (as well as all other aliases) with our decoder, so this is actually the only place where the CPU can now find the top 32k of the original slow 128k RAM, and in fact it will use it if no IO expansion is installed to take over those addresses as it would do normally.
If one has read carefully, then one may notice there is a potential problem here, and indeed it would be the reason why there was never a 640k RAM expansion - except the one I made for myself and stumbled across this problem.
Here is what happens - such an expansion would use the internal RAM and then add 512k and a further 128k to make up 640k of expansion RAM or 768k total RAM. However, when the system checks for RAM, it finds it starting from the usual place at 20000h, and all the way up to DFFFFh, making 768k total. But then, it encounters an alias of the original 128k RAM at E0000h, at which point it is again testing the first 128k of RAM and writing stuff into it. This was very odd to see - the screen would fill with the test pattern, then for a while nothing would happen while the extra RAM was being tested, then all of a sudden the screen would fill with the test pattern again and then the system would crash before the F1/F2 prompt - because the RAM test would actually overwrite the system variable area being built on the fly with the test, at the start of screen 1, by writing to its alias.
So, if one wants a QL with 768k total RAM, there has to be a 'fake' or real IO expansion at E0000h just to stop the RAM test from crashing the system :)
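The aliasing problem described above can be reproduced in a toy model. This is a simplified sketch under stated assumptions: a plain external card occupying 40000h up to some top address, an on-board decoder that ignores A18 and A19, and a scan in 16k steps from 20000h. The names `owner` and `scan` are invented for the illustration, and the model only records which physical block each probe lands on; it does not simulate the crash itself.

```python
BLOCK = 16 * 1024


def owner(addr, expansion_top):
    """Which hardware answers at addr, in this simplified model."""
    if 0x40000 <= addr < expansion_top:
        return ("ext", addr)              # expansion drives DSMCL high here
    internal = addr & 0x3FFFF             # on-board decoder ignores A18 and A19
    kind = "ram" if internal >= 0x20000 else "rom_io"
    return (kind, internal)


def scan(expansion_top):
    """Walk up from 20000h until something that is not RAM answers."""
    addr = 0x20000
    touched = []
    while addr < 0x100000 and owner(addr, expansion_top)[0] != "rom_io":
        touched.append(owner(addr, expansion_top))
        addr += BLOCK
    return addr, touched


top512, t512 = scan(0xC0000)   # 512k card: scan stops at the ROM alias at C0000h
top640, t640 = scan(0xE0000)   # 640k card: scan runs on into the RAM alias at E0000h
print(hex(top512), t512.count(("ram", 0x28000)))
print(hex(top640), t640.count(("ram", 0x28000)))
```

With the 640k card the block at 28000h, where the post says the system variables start, is probed twice: once directly and once through its alias at E8000h, which is the overwrite described above.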

