Brane2 wrote:@Dave:
WRT to CPU speed:
non-waiting 68000 should be on average 60% faster than 68008. But with QL's heavily braked 68008 this is more likely 100% or even more.
So 68000 at 32MHz might well feel 10 times faster, if not more. And with SEC version, there is a good chance that it might work even significantly faster, even on 3V...
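As a rough check of the figures above, here is a back-of-envelope sketch. It assumes the stock QL clocks its 68008 at 7.5MHz, a figure not stated in this thread, and it is only arithmetic, not a benchmark.

```python
# Back-of-envelope speed-up estimate; not a measurement.
QL_CLOCK_MHZ = 7.5     # assumption: stock QL 68008 clock (not stated above)
NEW_CLOCK_MHZ = 32.0   # 68000 clock proposed above


def speedup(per_clock_gain):
    """Overall speed-up = clock ratio x per-clock advantage of the 68000."""
    return (NEW_CLOCK_MHZ / QL_CLOCK_MHZ) * (1.0 + per_clock_gain)


nominal = speedup(0.60)  # 68000 about 60% faster than a 68008 per clock
braked = speedup(1.00)   # against the QL's slowed-down 68008: about 100%

print(f"nominal: {nominal:.1f}x, against the QL: {braked:.1f}x")
```

With the "or even more" allowance on top of the 100% figure, the result gets close to the 10 times mentioned.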
WRT to the rest of the system
(...)
This is why it makes no sense to use it for anything besides compatibility portal. I would go with this as far as support for typical cheap 17" LCD at 1280x1024 with 4bpp (and perhaps a byte) with option for a native 512/256x256 with a border.
68020 is better still, not dramatically so but sufficiently.
It has 3 advantages over a standard 68000:
1) 2x wider bus operating on a 3 cycle/transfer basis, which can dynamically be sized down to 16 or 8 bits very simply.
2) Write-back buffer and prefetch buffer that let the CPU core move on while data is written to the bus, or execute short loops directly from the prefetch buffer without actually fetching them via the external bus, which significantly increases data transfer speed if you write your routines in assembly. This is with cache disabled.
3) A small but efficient instruction cache, which is the simplest to support without problems with most existing programs (self-modifying code being the obvious exception (*)). Even though the cache size seems pathetic, it caters well for QL style programming, most of the important stuff being assembler.
It also has a disadvantage:
1) Not completely compatible due to a separate interrupt stack and slight differences in the formats of some stack frames. However, it's the simplest of the 68000+ family to get bog standard 68k compatibility on it to the largest degree, the SGC being the obvious proof to this fact.
(*) which is a no-no as far as the QL is concerned, in general.
QL program code is expected to be position independent and fully re-entrant, which completely rules out self-modifying code (and incidentally, makes it possible to run everything from ROM).
[USB, keyboard, mouse]
Polling is not a big disadvantage as far as the OS is concerned if the USB hardware support makes it fairly simple and not prone to deadlock (as in someone removed the USB device, let's wait an unreasonable amount of time in a loop with all interrupts disabled in hopes it will respond to the poll).
Almost everything on the QL is actually polled, either at scheduler loop speed (quick if the machine is just waiting for the user) or at least at polling interrupt (20ms) speed (unless something is going on that requires all interrupts to be disabled).
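To illustrate the deadlock point, here is a minimal sketch of a poll that gives up after a fixed number of attempts instead of spinning forever. `poll_device` and `FakeDevice` are made-up names for illustration, not part of any existing driver.

```python
def poll_device(is_ready, max_polls=1000):
    """Poll is_ready() at most max_polls times; True if the device answered."""
    for _ in range(max_polls):
        if is_ready():
            return True
    return False  # give up and let the scheduler try again on a later pass


class FakeDevice:
    """Stand-in for real hardware: answers on the Nth poll, or never."""

    def __init__(self, ready_after):
        self.polls = 0
        self.ready_after = ready_after  # None models an unplugged device

    def is_ready(self):
        self.polls += 1
        return self.ready_after is not None and self.polls >= self.ready_after


plugged = FakeDevice(ready_after=3)
removed = FakeDevice(ready_after=None)
print(poll_device(plugged.is_ready), poll_device(removed.is_ready, max_polls=50))
```

The unplugged device costs a bounded number of polls and control returns to the caller, which is all the OS needs from the hardware side.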
One advantage of a 68k+ CPU here is support for the full 7 interrupt levels. Not because we'd want to use vectored interrupts or some such complicated stuff, but because certain levels of interrupt can be assigned to specific things like block data transfer, and by this I mean as an OS resource. Here things are pretty much free to be defined since no such thing existed so far. Adding extra trap calls is not a problem, nor is implementing interrupt service lists (linking and unlinking in the usual manner).
An example would be a system resource to transfer X amount of data from address A to B, with routines that cater for incrementing A and/or B and linked code to test for a condition, like for instance FIFO full in some piece of hardware. The actual transfer routine is implemented by the OS, the 'user' has to link the peripheral onto the interrupt line, link in the appropriate interrupt condition test code, specify X, A, B and options. The OS then provides a low level hardware service routine that pumps data in or out of memory, maintained by higher level driver code.
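A rough model of such a transfer resource, with memory modelled as a `bytearray`. The `Transfer` record and `service_interrupt` are hypothetical names, and a real version would of course be 68k assembler inside the OS; this only shows which pieces the 'user' supplies (X, A, B, options, condition test) and which part the OS runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Transfer:
    src: int                 # address A
    dst: int                 # address B
    count: int               # X: units still to move
    inc_src: bool = True     # option: increment A after each unit
    inc_dst: bool = True     # option: increment B after each unit
    ready: Callable[[], bool] = lambda: True  # linked condition test code


def service_interrupt(mem, t):
    """OS-side pump: move data while the condition holds. True when done."""
    while t.count > 0 and t.ready():
        mem[t.dst] = mem[t.src]
        if t.inc_src:
            t.src += 1
        if t.inc_dst:
            t.dst += 1
        t.count -= 1
    return t.count == 0


mem = bytearray(32)
mem[0:4] = b"QDOS"
job = Transfer(src=0, dst=16, count=4)
finished = service_interrupt(mem, job)
print(finished, bytes(mem[16:20]))
```

Setting `inc_dst=False` models writing to a fixed hardware port, and a `ready` test that returns False (say, FIFO full) simply leaves the rest of the transfer for the next interrupt.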
Another similar use would be a fast periodic interrupt that is used for system timing, with a well known period that is of smaller granularity than the usual polling interrupt. The OS could then provide timing services via this interrupt. The point being, the interrupt structure / priority is defined by the OS, NOT by the user (to maintain things in the RTOS domain).
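A sketch of what such a timing service might look like, assuming a fixed tick period chosen by the OS. `TickTimer` and its methods are invented for illustration only.

```python
class TickTimer:
    """Timing service driven by a fast periodic interrupt of known period."""

    def __init__(self, period_us):
        self.period_us = period_us  # fixed by the OS, not by the user
        self.ticks = 0
        self.pending = []           # list of (due_tick, callback)

    def call_after(self, delay_us, callback):
        """Ask for a callback once at least delay_us has elapsed."""
        whole_ticks = -(-delay_us // self.period_us)  # round up
        self.pending.append((self.ticks + whole_ticks, callback))

    def on_tick(self):
        """Called from the periodic interrupt."""
        self.ticks += 1
        due = [p for p in self.pending if p[0] <= self.ticks]
        self.pending = [p for p in self.pending if p[0] > self.ticks]
        for _, callback in due:
            callback()


timer = TickTimer(period_us=1000)
fired = []
timer.call_after(2500, lambda: fired.append("done"))
for _ in range(3):
    timer.on_tick()
print(fired)
```

Delays are rounded up to whole ticks, so the granularity the user sees is exactly the interrupt period.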
So far I had in mind for a "mule":
- 68SEC000FU20 working on 3.3V
- cheap 3V FLASH 70ns. Something like 1 or 2 MB
- maybe 4MB of video RAM.
WRT memory organization and (mentioned later) DDR2 RAM - better stick with SDR. It is still available and will remain available because it's simpler to implement in small systems due to power and impedance control issues. Especially for this 'mule' there would only be disadvantages in using DDR because it's simply too fast and manufacturers will NOT guarantee operation at slow speeds. This is one odd fact I have found during the development of the ill-fated GoldFire. Just recently I've been involved in another project where the same scenario happened with a DDR2 part. Unlike SDR, modern DDR2 has internal PLL-like structures to compensate for internal delays and optimize setup and hold times. It took nearly a month of back-and-forth to get the manufacturer to check if this could work on a slow bus, and no - it does not.
DDR2 and even DDR have no real advantage here even if the RAM were also being used as a frame buffer, and they are not easy to get in 'small enough' (!) sizes. Remember, the 68EC020 has only 16M of address space.
It is worth mentioning here that all synchronous RAM (static or dynamic) is actually best suited for burst accesses, and the reason it was created in the first place is the emergence of CPUs that use burst accesses (usually 4 transfers per burst). This was implemented in the 68k family from the 68030 onward, so with the 68020 we are 'stuck' at emulating normal random access, i.e. a 1-transfer 'burst'. Even at the fastest official 68020 clock rate, an SDR DRAM can do this in 3 cycles, but in reality for the 68020 it would need at least 4 because of the actual time the address is supplied and data is expected. But today the slowest SDR RAM you can find operates at 133MHz, so in effect every 68020 memory cycle can last up to 12 cycles of the RAM clock, and within that one could put tons of interesting things, including say, a 4-cycle burst of reads to fetch data for a screen, and still have enough time to cater for any 68020 bus access at full speed.
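The 12-cycle figure can be checked with a little arithmetic. This sketch assumes 33.33MHz as the fastest official 68020 clock grade, which is not stated above, and uses the 3 clocks per transfer mentioned earlier.

```python
CPU_MHZ = 33.33   # assumption: fastest official 68020 clock grade
RAM_MHZ = 133.0   # slowest SDR SDRAM commonly available, per the text


def ram_clocks_per_bus_cycle(cpu_clocks):
    """How many RAM clocks fit into one 68020 bus cycle of cpu_clocks clocks."""
    bus_cycle_ns = cpu_clocks * 1000.0 / CPU_MHZ
    ram_clock_ns = 1000.0 / RAM_MHZ
    return bus_cycle_ns / ram_clock_ns


no_wait = ram_clocks_per_bus_cycle(3)   # minimum 3-clock bus cycle
one_wait = ram_clocks_per_bus_cycle(4)  # same cycle with one wait state

print(f"3-clock cycle: {no_wait:.1f} RAM clocks, 4-clock cycle: {one_wait:.1f}")
```

So a 3-clock bus cycle spans roughly 12 RAM clocks, and stretching it to 4 clocks gives about 16, which is the slack a controller could use for video fetches.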
If there is an FPGA in the system, designing such a controller should not be a great problem.
A more battery friendly but also a lot more costly version would be the use of SRAM.
Regarding Flash - this should really be thought of primarily as bulk storage. Executing code directly from it would be a LOT slower than from RAM, although it offers write protection by default.
Given the relatively small address map, I would go for something like a paged FLASH system where one small part of the FLASH ROM is permanently present somewhere in the memory map, and is basically only used to start up the system (and quite possibly this also implies actually shadowing that memory area with RAM for faster access). The rest is treated as storage. That being said, given the ability of the 68020 to dynamically size its bus, one could use a simple 8-bit FLASH at the maximum affordable capacity one could find. Since it's going to be copied into RAM the interface width is not really a problem, but an advantage - fewer lines to route and fewer chips to power. In fact, one could even use the unused FC line codes to access it in the same address map without interfering with anything else. Alternatives would be serial FLASH or perhaps even an SD card in SPI mode.
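A toy model of the paged arrangement, to make the idea concrete. The page size, the page register and the names `PagedFlash` and `shadow_boot` are all assumptions for illustration; page 0 stands in for the permanently mapped boot part.

```python
class PagedFlash:
    """8-bit FLASH seen through one window; a page register picks the page."""

    def __init__(self, image, page_size):
        self.image = image
        self.page_size = page_size
        self.page = 0  # hypothetical page register; page 0 holds boot code

    def select(self, page):
        self.page = page

    def read(self, offset):
        """One byte per access, as an 8-bit wide part would deliver it."""
        return self.image[self.page * self.page_size + offset]


def shadow_boot(flash, ram, ram_base):
    """Copy the boot page into RAM so it can later run at RAM speed."""
    flash.select(0)
    for offset in range(flash.page_size):
        ram[ram_base + offset] = flash.read(offset)


# Four 256-byte pages, page N filled with the byte value N.
image = b"".join(bytes([page]) * 256 for page in range(4))
flash = PagedFlash(image, page_size=256)
ram = bytearray(1024)
shadow_boot(flash, ram, ram_base=0)
flash.select(2)
print(ram[0], flash.read(5))
```

Everything beyond the boot page is only ever reached through `select` and `read`, i.e. it behaves like storage rather than like part of the memory map.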
To sum it up:
Implement the whole memory map as RAM, then 'punch holes' in it where you want your boot FLASH and IO to be. Things are generally much more flexible that way. Of course, a 'DSMCL' style control line may be provided to do clever things with peripherals and RAM such as shadowing.
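As a sketch of the 'punch holes' idea, here is an address decoder where RAM is the default and only the holes are listed. The hole addresses and sizes are made up for illustration; only the 16M address space comes from the text.

```python
ADDRESS_SPACE = 1 << 24  # 68EC020: 16M of address space

# Hypothetical holes: (start, end, name). Everything else is RAM.
HOLES = [
    (0x000000, 0x010000, "boot_flash"),
    (0xFF0000, 0x1000000, "io"),
]


def decode(addr, shadowed=False):
    """Return which device answers at addr; RAM unless a hole claims it."""
    addr &= ADDRESS_SPACE - 1  # only 24 address lines exist
    for start, end, name in HOLES:
        if start <= addr < end:
            if name == "boot_flash" and shadowed:
                return "ram"   # hole closed once the boot code is shadowed
            return name
    return "ram"


print(decode(0x000100), decode(0x000100, shadowed=True), decode(0x200000))
```

Closing a hole is just a matter of no longer claiming that range, which is why shadowing comes almost for free with this layout.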
- cheap Spartan6 in TQFP pack or slightly bigger smallest BGA (256 balls). Used in start just for simple logic and upgraded gradually.
BGA means multilayer board, and a good pick-and-place + oven facility. Although I know some people who have done it in the oven at home, if you want reliability, it needs to be done professionally. Also, if it doesn't solder well, you can pretty much kiss the PCB goodbye unless again you have access to professional equipment (or a good oven?). TQFP can be soldered (and unexpectedly easily) at home with a soldering iron with a tip barely smaller than a shovel.
- Microchip's MIPS with USB and Ethernet interface (and serial ports and parallel port, SPI, I2C, PWM for native "sound", etc.)
- materializing the logic for floppy interface in FPGA
- basic IDE, no DMA etc stuff - at least in start.
- USB/PS2 for mouse and keyboard
- fast SPI for QL's native keyboard with beeper and microdrives
- interface for SDD and CF for simple data moving between mule and PC
- microdrive and native net interface, if feasible.
- battery support. Not just for RTC, but whole machine.
OK, let's just forget microdrives, period. They are quite frankly a shameless and rather unique way of exploiting Murphy's law, they work only because it's the worst fate that could happen to them.
Floppy interfaces are a problem initially since most of the software is on floppies, but today the capacity is trivial and one could well solder a CF card or something onto the board FOREVER and pretty much not run out of program file storage.
Supporting a 'native' QL keyboard is actually an advantage. This sort of system, as it has been said, should not be viewed as a QL replacement, but as a 'useful box that can do many things and is simple to program'. This does on occasion reduce the thing to a literal 'box' with a simple keyboard of a few keys and, say, a small LCD. Making this simple to achieve is a bonus.
Regarding the native network, it's actually nothing more than a simple serial port at a constant bit rate, and the rest (protocol and related timing) is software. It is based on a block transfer, so in theory the same protocol could work over all sorts of different media (CAN comes to mind, as does infra-red). There was once something called FastNet, which was based on an old UART chip with a hardware FIFO added; it used a modified 'net' driver. Same author as Qubide.

Regarding support of old style IO resources, the only one that might be of use is the RTC counter. This could well be implemented in software on a second smaller microcontroller (PIC?) that offers external CPU access via some sort of shared memory or similar mechanism.
In fact, a long time ago I thought up a small board with its own 68008, SRAM, a bit of hardware dual port RAM (2k if I recall correctly) and initially a fast dual UART. Later on the idea was expanded by replacing the dual UART with a PC style multi-IO chip, primarily to implement serial ports, keyboard, mouse and parallel port. The board also had an EPROM for the QL to recognize, that would hold code to start up the board. The extra 68008 side actually had no ROM at all and the idea was to fill up the dual-port RAM with start-up code from the QL side, then start the other 68008 and load the rest of the code from the EPROM on the QL side or from a file through the dual port RAM. Once the code was loaded, the DPRAM would be used to communicate with the other 68008 as a sort of mailbox/FIFO, while it would be doing its thing sorting out the peripherals and reducing their flow of data to simple streams the QL could use. This sort of thing should be MUCH easier to do today with microcontrollers essentially having all that hardware and more on a single chip.
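To show how the mailbox/FIFO part could work, here is a minimal one-direction byte FIFO that lives entirely inside a block of shared RAM. The layout (two 16-bit indices followed by data) and the `Mailbox` class are assumptions for illustration, not the design of the original board.

```python
class Mailbox:
    """One-direction byte FIFO kept entirely inside a shared RAM block."""

    HEAD, TAIL, DATA = 0, 2, 4  # hypothetical layout: two 16-bit indices, then data

    def __init__(self, ram):
        self.ram = ram
        self.size = len(ram) - self.DATA

    def _get(self, offset):
        return int.from_bytes(self.ram[offset:offset + 2], "big")

    def _set(self, offset, value):
        self.ram[offset:offset + 2] = value.to_bytes(2, "big")

    def put(self, byte):
        """Writer side. Returns False when the FIFO is full."""
        head = self._get(self.HEAD)
        nxt = (head + 1) % self.size
        if nxt == self._get(self.TAIL):
            return False
        self.ram[self.DATA + head] = byte
        self._set(self.HEAD, nxt)  # publish only after the data is in place
        return True

    def get(self):
        """Reader side. Returns None when the FIFO is empty."""
        tail = self._get(self.TAIL)
        if tail == self._get(self.HEAD):
            return None
        byte = self.ram[self.DATA + tail]
        self._set(self.TAIL, (tail + 1) % self.size)
        return byte


dpram = bytearray(2048)  # 2k of dual-port RAM, as recalled above
box = Mailbox(dpram)
for b in b"QL":
    box.put(b)
print(box.get(), box.get(), box.get())
```

Each side only ever writes its own index, so the two CPUs need nothing beyond the shared RAM itself to stay in step. A second `Mailbox` over another part of the RAM would carry traffic in the other direction.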
- instead of discrete parallel FLASH chips use serial FLASH for FPGA configuration and have the FPGA copy the user data into RAM automatically at boot.
See note about FLASH, having an FPGA on-board opens up many new ways of booting the machine.