
Re: Q: ZX8301 effect on memory bandwidth?

Posted: Wed Mar 26, 2014 8:07 pm
by prime
Ok, I got curious so I did some tests on my 2 QLs with the expansions I have.

The two machines are a Samsung QL (originally German) with Minerva on EPROM,
and a UK machine with JM ROMs.

The expansions I tested were:
Sandy Superdisk (no onboard RAM), plus Miracle Expanderam 512K with 80ns 21256 chips.
Sandy SuperQBoard with the 512 addon board containing 512K of TI 150ns 4254 chips.
Miracle Trump Card, with 768K (originally 512K), of 120ns 41256 chips.
Sandy Superdisk on its own with no addon RAM.

Both the Sandy boards have had the mouse upgrade done and were running ROM 1.18Y.
Trump card was running ROM 1.25.

Results were:

Samsung+Minerva

SSD+Expanderam   23 sec
SQB              26 sec
Trump            25 sec
No expansion     35 sec

UKQL+JM

SSD+Expanderam   22 sec
SQB              22 sec
Trump            23 sec
No expansion     33 sec

So what ROM version is running / model of QL also seems to make a difference.

Just for reference MESS ran the test in 10 seconds :)

Cheers.

Phill.

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Wed Mar 26, 2014 8:17 pm
by dilwyn
prime wrote:Ok, I got curious so I did some tests on my 2 QLs and with the expansions I have
The two machines are a Samsung QL (originally German) with Minerva on eprom.
A UK machine with JM roms.
The expansions I tested were:
<snip>
So what ROM version is running / model of QL also seems to make a difference.
Just for reference MESS ran the test in 10 seconds :)
Cheers.
Phill.
Something I've never tried with the MESS emulator: is it tied to a fixed speed or does it run faster on faster computer systems, e.g. would MESS run much faster on modern PCs?

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Wed Mar 26, 2014 8:18 pm
by tofro
prime wrote: So what ROM version is running / model of QL also seems to make a difference.
Phill,

Because interrupts are not disabled during the test, it is slightly influenced by the ISR performance of the ROM. Minerva is maybe a bit slower because it jumps through some more vectors (the price to pay for a bit of comfort in the OS).

I'd guess the two hardware versions should give nearly identical results if running the same ROM.

Tobias

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Thu Mar 27, 2014 12:50 pm
by Nasta
There is a slight difference between issue 5 and issue 6 boards regarding ZX8302 accesses: on issue 5 they are slowed down to the speed of the ZX8301. However, these occur so infrequently compared to memory cycles that they make only a theoretical difference.
From a hardware standpoint, the ZX8301 slows down the CPU by ~50% on average. The exact figure depends somewhat on the instruction sequence, since that determines the sequence of bus cycles.
The various RAM expansions use different strategies to handle the DRAM's need for refresh. The highest performance is attainable with a strategy known as hidden refresh (not available on all DRAM chips), which lets the DRAM work as fast as static RAM, with no wait states inserted for refresh. It is a bit more difficult to design a DRAM controller that does that, and there can be a power drain penalty too.
Static RAM requires no refresh, is normally fast enough to run at much higher speeds than the QL is capable of, and also uses very little power. But at the time these expansions were developed, its cost was almost an order of magnitude higher than that of DRAM.
It is possible to 'shadow' the internal RAM, so that only write cycles to the screen RAM addresses slow down the CPU. The GC and SGC use this approach, but it can also be used with a regular 68008. It makes the QL a bit faster, because the system variables and tables are normally located in the area used by screen 1, and the OS accesses these quite frequently.

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Thu Mar 27, 2014 6:40 pm
by ppe
Nasta wrote:From a hardware standpoint, the ZX8301 slows down the CPU by ~50% on average.
Thank you Nasta, this was the figure I was looking for. The benchmarks seem to support this figure. The ZX8301 really visits memory at a rather rapid rate. Hmmm... as a naive guesstimate the figure would be 32768 addresses at 60Hz resulting in roughly 2 million memory reads per second? That's a lot of cycles to steal from a 7.5 MHz CPU.
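For anyone who wants to check the benchmark figures against that estimate, here is a rough sketch in Python. It only divides the timings Phill posted earlier in the thread; the function name `relative_speed` is made up for this illustration, and the benchmark is not a pure memory test (interrupts and the ROM also play a part), so treat the ratios as indicative only.

```python
# Timings in seconds, copied from Phill's results earlier in the thread.
results = {
    "Samsung+Minerva": {"SSD+Expanderam": 23, "SQB": 26, "Trump": 25, "No expansion": 35},
    "UKQL+JM": {"SSD+Expanderam": 22, "SQB": 22, "Trump": 23, "No expansion": 33},
}


def relative_speed(times):
    """Speed of the unexpanded run as a fraction of each expanded run.

    A shorter time means faster code, so the fraction is
    expanded_time / unexpanded_time.
    """
    base = times["No expansion"]
    return {name: t / base for name, t in times.items() if name != "No expansion"}


for machine, times in results.items():
    for name, frac in relative_speed(times).items():
        print(f"{machine:16s} vs {name:15s}: internal RAM runs at {frac:.0%} of that speed")
```

On these numbers the code in internal RAM runs at roughly two thirds to three quarters of the speed it reaches in expansion RAM, which is the same ballpark as the hardware figure but not an exact match.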

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Sat Mar 29, 2014 9:39 am
by tofro
Interestingly, the performance of the above program also depends a lot on how you use it: you can compile it using Turbo (or QLiberator, not much difference) and it will run at the same (slow) speed regardless of expansion.

The reason is that RESPR called directly from interpreted S*BASIC allocates memory from top to bottom, while in a compiled program the allocation is done from bottom to top (because "real" RESPR will not work once jobs are running, the compilers replace that call with a common heap allocation, where the OS strategy is to grow from bottom to top). That is, in compiled form the machine code will always reside in "slow" memory (as long as there is room for it) and thus run slower.

So maybe it was not exactly the best strategy to put a program like that into the Turbo manual....
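To make the top-down versus bottom-up point concrete, here is a toy model in Python. It is only a sketch: it assumes the usual QL layout with the on-board 128K at $20000-$3FFFF and expansion RAM above that, and the helper names and the "first free heap address" are hypothetical values picked for illustration, not what QDOS actually does internally.

```python
INTERNAL_START = 0x20000  # on-board 128K begins here (assumed standard QL map)
INTERNAL_END = 0x40000    # first address above the on-board RAM


def region(addr):
    """Classify an address as contended on-board RAM or expansion RAM."""
    if INTERNAL_START <= addr < INTERNAL_END:
        return "internal (slow)"
    return "expansion (fast)"


def alloc_top_down(ramtop, size):
    """RESPR-style: carve the block off the top of RAM."""
    return ramtop - size


def alloc_bottom_up(first_free, size):
    """Common-heap-style: take the lowest free address (size kept for symmetry)."""
    return first_free


RAMTOP_512K = 0xC0000     # top of RAM with a 512K expansion fitted
HEAP_FREE = 0x30000       # hypothetical first free heap address in on-board RAM

print("interpreted RESPR:", region(alloc_top_down(RAMTOP_512K, 1024)))
print("compiled (heap)  :", region(alloc_bottom_up(HEAP_FREE, 1024)))
print("unexpanded RESPR :", region(alloc_top_down(INTERNAL_END, 1024)))
```

With an expansion fitted, the top-down block lands in fast RAM while the heap block stays in the on-board 128K, which is why the compiled version does not speed up.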

Tobias

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Sat Mar 29, 2014 1:37 pm
by Nasta
The problem is where the memory to run the program is allocated. When fast expansion memory is added, only that part of memory is fast. Running a program in the on-board 128k will give you the same results as an unexpanded QL - unless the expansion board uses some tricks :P and replaces the internal RAM (this is done by the GC and SGC).

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Wed Apr 16, 2014 12:25 am
by Nasta
ppe wrote:
Nasta wrote:From a hardware standpoint, the ZX8301 slows down the CPU by ~50% on average.
Thank you Nasta, this was the figure I was looking for. The benchmarks seem to support this figure. The ZX8301 really visits memory at a rather rapid rate. Hmmm... as a naive guesstimate the figure would be 32768 addresses at 60Hz resulting in roughly 2 million memory reads per second? That's a lot of cycles to steal from a 7.5 MHz CPU.
If that simple calculation were used, the 8301 would actually steal ALL of the cycles from the CPU, as it takes 4 cycles per byte for a 68008 to access memory, hence at 7.5MHz the absolute maximum throughput is 1.875 Mbytes/sec. The 8301 is fortunately as clever as it can be given the number of gates available to its designers, and uses an internal 4-byte buffer to fetch 4 bytes of data at a time using page mode access for the RAM. In this manner it reads the 4 bytes in about 8 cycles instead of 16, leaving half the bandwidth to the CPU.
In principle it could have been even faster with a larger buffer, but it's a game of diminishing returns: twice the buffer would give the CPU only about 70% of the bandwidth, and so would a much more complex bank-interleaved access scheme.
An interesting tidbit of information: there is a blanking bit in the 8301 screen control register, but the system continues reading screen data even when the screen is blanked.
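The arithmetic above can be written out as a short Python sketch. The numbers are the ones quoted in this thread (7.5 MHz, 4 cycles per byte, 32768 bytes at 60Hz, 8 cycles per 4-byte burst); the function `cpu_share` is a made-up helper, and it assumes the display needs one 4-byte fetch in every 16-cycle slot, which is a simplification.

```python
CPU_CLOCK_HZ = 7_500_000   # 68008 clock, as quoted above
CYCLES_PER_CPU_BYTE = 4    # one 68008 bus cycle moves one byte
SCREEN_BYTES = 32768       # size of one screen
FRAME_RATE_HZ = 60         # ppe's guess, kept as posted

# Absolute maximum CPU throughput in bytes per second.
cpu_max = CPU_CLOCK_HZ // CYCLES_PER_CPU_BYTE

# Naive display demand: every screen byte fetched once per frame.
naive_display = SCREEN_BYTES * FRAME_RATE_HZ


def cpu_share(burst_cycles, bytes_per_burst=4):
    """Fraction of bus time left to the CPU in this simplified model.

    The slot is the time a byte-at-a-time fetch would need
    (bytes_per_burst * 4 cycles); the display uses burst_cycles of it.
    """
    slot = bytes_per_burst * CYCLES_PER_CPU_BYTE
    return 1 - burst_cycles / slot


print(f"CPU maximum   : {cpu_max:,} bytes/s")
print(f"Naive display : {naive_display:,} bytes/s")
print(f"CPU share, byte-at-a-time : {cpu_share(16):.0%}")
print(f"CPU share, page-mode burst: {cpu_share(8):.0%}")
```

The naive display figure comes out above the CPU maximum, which is the "steals ALL of the cycles" case, while the 8-cycle burst leaves half of each slot free.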

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Wed Apr 16, 2014 12:14 pm
by Nasta
Brane2 wrote: But in mode 4 each pixel takes 1 1/2 cycles ! And pixel rate is 10MHz.
Given a pinch of transistors on a chip and local pizzeria at Ferranti's that is willing to sputter and burn some metal on them, could you get 10 Mpix/s out of 15 MHZ clk ? :mrgreen:
Divide by 1.5 is only a little bit more than trivial... which is what the chip does (10 MHz artifacts are visible on a scope in the RGB outputs even in mode 8 and screen blanked) :P

Re: Q: ZX8301 effect on memory bandwidth?

Posted: Sun Apr 20, 2014 1:37 am
by M68008
Did some tests on this some time ago. I wrote a benchmark with single cycle accuracy and the results just didn't seem to make sense. Eventually another user from this forum collected some signal traces and from that we figured out a good part of how it works (at some point I hope we can publish the findings). Essentially the speed of a machine code loop depends on how the distance between the memory accesses aligns with the VDA cycle of memory refresh. There is an average slow down, but the actual slow down can vary significantly depending on the alignment.
Wish I knew this at the time I was trying to extract maximum performance from my QL programs! For example, reordering the instructions of a loop without changing them could make the loop slower or faster depending on the pattern of accesses to the bus.
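The alignment effect can be illustrated with a toy bus model in Python. This is NOT the real ZX8301 timing (those findings are not published in this thread). It simply assumes a 16-cycle slot in which the display owns the first 8 cycles, in line with the "half the bandwidth" description earlier in the thread, and a CPU access that needs 4 uninterrupted free cycles. All constants and the function `run` are assumptions made for this sketch.

```python
PERIOD = 16  # assumed length of one display slot, in CPU clock cycles
BUSY = 8     # assumed cycles at the start of each slot owned by the display
ACCESS = 4   # cycles for one 68008 memory access


def run(gaps, iterations):
    """Total cycles for a loop whose accesses are followed by the given
    numbers of internal (non-bus) cycles, one entry per access."""
    t = 0
    for _ in range(iterations):
        for gap in gaps:
            phase = t % PERIOD
            if phase < BUSY:
                # Display owns the bus: wait for the free half of the slot.
                t += BUSY - phase
            elif phase + ACCESS > PERIOD:
                # Access would not fit before the next display burst:
                # wait for the free half of the following slot.
                t += PERIOD - phase + BUSY
            t += ACCESS  # the memory access itself
            t += gap     # internal cycles that do not need the bus
    return t


# Same work per iteration (2 accesses + 8 internal cycles), different order.
print("accesses back to back :", run([0, 8], 100), "cycles")
print("accesses evenly spaced:", run([4, 4], 100), "cycles")
```

In this model the evenly spaced version keeps hitting the start of the display burst and takes about twice as long, even though both loops do the same amount of work. The real chip will behave differently in detail, but it shows how reordering alone can change the speed.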