I'm about 90% of the way through (re)coding the NET and FSERVE drivers to work alongside the 'Message Queue server' that drives the QLUB Adapter, and I still have one fundamental design item to overcome before the original drivers are fully supported.
I can partially work around the issue with a kludgy approach, but I thought I'd reach out for any ideas you might already have developed, or could mull over, before I release the next version.
It's a bit tricky to explain, but here goes:
Both the simple NET driver and the client side of FSERVE (Nx_) rely on the ability to 'retry' an operation when OPENing or CLOSEing a remote channel. It's effective, but not multitasking-friendly: it kills job scheduling and other IO for the duration of the process, which may last up to 25 seconds.
QDOS's native IOSS retry mechanism works perfectly well for TRAP #3 operations such as byte/string Input and Output, but the same facility is not extended to TRAP #2 operations such as OPEN or CLOSE, which are rightly considered 'memory management' type operations and should therefore be atomic. I.e., you get one chance to OPEN or CLOSE before an error or success is reported - whereas TRAP #3 IO can be deferred by returning a 'Not Complete' error, in which case the IOSS will continue to reschedule/retry the operation until it times out or succeeds.
When it comes to network ops, the actual op could take several 50Hz 'ticks' to complete, with several timeouts occurring in-between before success. For normal IO this is fine thanks to the retry mechanism, as QDOS will give CPU time to other jobs and IO between attempts; but for OPEN or CLOSE, we end up stalling QDOS and anything else running in the meantime, whilst we stay in Supervisor mode throughout. The native NET and FSERVE client drivers manually retry and time out after 20-odd seconds if they haven't completed by then. To see that in action on a regular QL, start the TK2 CLOCK job then attempt to open a non-existent file (OPEN #n, "Nx_somefile" ) on another remote QL that's running FSERVE. The clock won't get updated until the OPEN times out.
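To make the contrast concrete, here's a small C sketch of the two styles. Everything here is hypothetical and mine, not the actual driver code (which is 68k assembly against the QDOS traps); ERR_NC simply stands in for QDOS's 'not complete' code, and the function names are illustrative only:

```c
#include <assert.h>

#define ERR_NC (-1)   /* stand-in for QDOS 'not complete' */

/* Toy model of a slow network op: not ready for a few 'ticks'. */
int ticks_until_ready;
int other_job_runs;

int io_attempt(void)
{
    if (ticks_until_ready-- > 0)
        return ERR_NC;         /* hardware not ready yet */
    return 0;                  /* success */
}

void run_other_jobs(void)      /* stands in for the scheduler */
{
    other_job_runs++;
}

/* TRAP #3 style: return 'not complete' and let the scheduler
   give CPU time to other jobs between attempts. */
int cooperative_op(int timeout_ticks)
{
    for (int t = 0; t < timeout_ticks; t++) {
        int r = io_attempt();
        if (r != ERR_NC)
            return r;
        run_other_jobs();      /* CLOCK keeps ticking, FSERVE keeps serving */
    }
    return ERR_NC;             /* timed out */
}

/* TRAP #2 style: spin in Supervisor mode until done or timed out;
   nothing else gets to run in the meantime. */
int blocking_op(int timeout_ticks)
{
    for (int t = 0; t < timeout_ticks; t++) {
        int r = io_attempt();
        if (r != ERR_NC)
            return r;
        /* no reschedule here: this is the stall described above */
    }
    return ERR_NC;
}
```

Both loops succeed on the same toy op; the only difference is whether other jobs get run between attempts, which is exactly the property OPEN/CLOSE is missing.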
Likewise, and a bit more irritatingly, if you are also running FSERVE on the client QL, its own file-server job will be inactive, and thus not responding to remote inbound requests, whilst it attempts to access that non-existent file at the other end.
Now, when we introduce the QLUB Adapter connected to a PC/Mac/Unix box running a QDOS/SMSQE emulator, we inherently add some latency to the NET IO: the messages/packets get queued up to pass out of the SERial/USB port, and we then wait for a reply from the QLUB to say 'all done', rescheduling jobs and other IO between polls for that reply. This doesn't hurt overall throughput much (if at all), as the QLUB takes care of the actual bit-banging down the NET line, freeing up the emulator between polls, and it can retry packets much more rapidly than a native QL anyway.
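In that polling style, the TRAP #3 side of the driver can simply defer with 'not complete' until the QLUB's reply turns up. A minimal sketch of the pattern, again with hypothetical names and a made-up per-channel record (the real driver's state layout will differ):

```c
#include <assert.h>
#include <stdbool.h>

#define ERR_NC (-1)   /* stand-in for QDOS 'not complete' */

/* Hypothetical per-channel state for a pending QLUB transaction. */
typedef struct {
    bool request_sent;     /* packet queued out of the SER/USB port? */
    bool reply_arrived;    /* 'all done' message back from the QLUB? */
} qlub_chan;

/* Called on each reschedule for a TRAP #3 op. Returning ERR_NC
   hands the CPU back to other jobs; the IOSS calls us again on a
   later tick until we succeed or it times the operation out. */
int qlub_io_pending(qlub_chan *ch)
{
    if (!ch->request_sent) {
        /* queue the packet to the serial port (not shown) */
        ch->request_sent = true;
        return ERR_NC;
    }
    if (!ch->reply_arrived)
        return ERR_NC;     /* still waiting: defer, don't spin */
    return 0;              /* QLUB says 'all done' */
}
```

The open question is precisely that OPEN/CLOSE gets no second call from the IOSS, so this deferral trick isn't available there.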
However, given that the principal aim of running FSERVE on the emulator is to host files and devices accessible to other native QL stations, it does mean that we either live with an intermittently unresponsive file-server or else stop the emulated QDOS from making its own outbound FSERVE client requests.
Furthermore, when sending a file through a simple NETO_x channel from the emulator/QLUB, the same 'manual' blocking retry mechanism kicks in whenever the very last packet/block (flagged as EOF) is delivered to the remote QL as the NET channel is CLOSEd.
Not a huge limitation and, like I say, it could be worked around to an extent.
But I don't like it.

So, any ideas on how we might effectively and safely re-enter the Scheduler from within an OPEN/CLOSE operation (which runs in Supervisor mode), and thus allow for these long-ish timeframes without stalling the other jobs/IO running concurrently in the emulator?