If SMSQ/E does not have to relocate its code, there is no need to rewrite the exception vectors. If it does have to relocate, the need to rewrite the vectors is obvious, as they have to track the relocation of the code.
As far as I can tell, a static version is possible, but it needs more than access to the exception vectors. SMSQ/E is, after all, 'QDOS compatible', and QDOS has vectored routines with word (16-bit) vector addresses. This means that for a static SMSQ/E, the first 48k of ROM must be available for SMSQ/E code. This obviously also covers the need to have the exception vectors available, since they reside in the first 256 bytes (and most are actually not used).
Modularity enables the rest of the code to reside elsewhere, for instance somewhere at the top of the memory map.
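To put numbers on the word-vector constraint, here is a rough sketch in Python. It treats a vector entry as a plain unsigned 16-bit absolute address, which is a simplifying assumption on my part (the real QDOS vector usage has more detail than that), and the function names are made up for illustration.

```python
ROM_SIZE = 48 * 1024      # classic QL ROM area, as described above
WORD_REACH = 1 << 16      # largest range a 16-bit word can express (assumed unsigned)


def vector_can_reach(target):
    """True if a 16-bit vector entry can hold this absolute address (simplified)."""
    return 0 <= target < WORD_REACH


def fits_static_rom(target):
    """True if a routine entry point lies inside the first 48k of the map."""
    return 0 <= target < ROM_SIZE
```

A routine placed at the top of the memory map fails both checks, which is why only the non-vectored rest of the code can live there.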
All that being said, this is only really useful in a system where code execution from ROM is not appreciably slower than from RAM. GC/SGC are definitely NOT that, and as far as I can tell neither is the Q40/60. The former are far worse in that respect because their ROM is only 8 bits wide, whereas on the Q40/60 it is the full 32 bits, so even at the same access speed it would be 4x faster.
But... how about something closer to the original QL hardware, based on, say, a 68008FN, so that a larger address map is available? One could provide 512k of storage, with the top 384k located at the top of the 4M address map. The bottom 128k is divided into two 64k blocks, either of which can be mapped into the first 64k of the address map. The point being: a 'backup' OS like Minerva in one block, the first 64k of SMSQ/E in the other. The backup is there in case you 'brick' the flash.
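A minimal sketch of that address decoding, in Python just to make the arithmetic visible. The layout inside the flash chip (two switchable 64k banks first, then the fixed 384k) and the `bank` select signal are my assumptions; the description above only fixes the sizes and the CPU-side addresses.

```python
KB = 1024
ADDRESS_MAP = 4 * KB * KB             # 4M address map
BANK_SIZE = 64 * KB                   # each switchable boot block
TOP_SIZE = 384 * KB                   # fixed part at the top of the map
TOP_BASE = ADDRESS_MAP - TOP_SIZE     # works out to 0x3A0000


def flash_offset(cpu_addr, bank):
    """Map a CPU address to an offset inside the 512k flash, or None.

    bank selects which 64k block (0 = backup OS, 1 = SMSQ/E start)
    appears at the bottom of the address map. Hypothetical layout.
    """
    if not 0 <= cpu_addr < ADDRESS_MAP:
        raise ValueError("address outside the 4M map")
    if bank not in (0, 1):
        raise ValueError("bank must be 0 or 1")
    if cpu_addr < BANK_SIZE:
        return bank * BANK_SIZE + cpu_addr
    if cpu_addr >= TOP_BASE:
        return 2 * BANK_SIZE + (cpu_addr - TOP_BASE)
    return None  # RAM or I/O, not flash
```

With `bank` set to 0 the machine boots the backup OS; flipping it to 1 puts the first 64k of SMSQ/E at address zero while the top 384k stays where it is.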

In such a system it would indeed be feasible to run SMSQ/E directly from ROM.
This means there is no need for relocation, so a static image can be produced. On GC and SGC, relocation is used because both execute 'ROM' code from shadow RAM rather than from real ROM: the RAM is full width and much faster than the ROM, so they actually execute any ROM from a RAM copy. Also, there is normally no ROM storage in such a system where the SMSQ/E code could fit in any form (*), so it has to be loaded by an already running OS (**), and some form of relocation is needed anyway. Fortunately, GC/SGC provides the mechanism by default through its use of ROM shadowing.
Since SMSQ/E, just like any other QL OS, requires code to reside in the first 48k of the address map (for the reasons mentioned above: CPU vectors plus OS vectored routines), it uses the same mechanism of copying itself to shadow RAM at those addresses. The remaining code runs from what the OS sees as actual RAM. No idea about Q40/60, but for GC/SGC there was already a discussion on this. There are 3 major code portions: one in the shadow of the regular QL ROM (first 48k), another in the 32k space starting at $10000 (this is where the GC/SGC ROM copies itself), and the third at the top of RAM. This way at least some of the code runs from 'emulated ROM', which increases the amount of working RAM available to the OS by that much.
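As a back-of-envelope illustration of that three-way split, here is a small Python sketch. It assumes the two low regions are filled completely before anything goes to the top of RAM, which is my simplification and not necessarily how the real loader divides the modules; the region names and the function are made up.

```python
KB = 1024

# (name, base address, capacity) of the two 'emulated ROM' areas on GC/SGC
LOW_REGIONS = [
    ("QL ROM shadow", 0x00000, 48 * KB),
    ("GC/SGC ROM shadow", 0x10000, 32 * KB),
]


def split_image(image_size):
    """Return a list of (region, base, bytes used) for an image of this size."""
    remaining = image_size
    plan = []
    for name, base, capacity in LOW_REGIONS:
        used = min(capacity, remaining)
        plan.append((name, base, used))
        remaining -= used
    # Whatever is left has to live in ordinary RAM; its base depends on
    # how much RAM is fitted, so it is left open here.
    plan.append(("top of RAM", None, remaining))
    return plan
```

Under that assumption, up to 80k of the OS sits in shadowed ROM space and does not eat into working RAM.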
Also, what is the current size of SMSQ/E? If I remember correctly it's around 256k for GC/SGC, and a version for a slower 68008 system would probably use the GC/SGC version as a template. In the above example that would leave some of the flash capacity free for other ROM images or extensions. Bare SMSQ/E code would not be recognized as extensions by a different running OS, since it does not have the required ROM header flags at modulo 16k intervals.
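For illustration, this is roughly what such a header scan looks like, sketched in Python. It assumes the usual QL extension ROM convention of the long word $4AFB0001 at the start of a header; the function name is made up, and the real scan of course happens in the OS at boot, not in a script.

```python
ROM_FLAG = bytes.fromhex("4AFB0001")   # assumed QL extension ROM header flag
STEP = 16 * 1024                       # headers are looked for at 16k intervals


def find_rom_headers(image):
    """Return the offsets (multiples of 16k) where a ROM header flag is found."""
    return [
        offset
        for offset in range(0, len(image), STEP)
        if image[offset:offset + 4] == ROM_FLAG
    ]
```

A flash region holding bare SMSQ/E code would give an empty list here unless the flag happened to sit on a 16k boundary, so another OS would simply skip over it.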
(*) Aurora provides a mechanism to fit a 512k ROM (or pre-programmed flash (***)) into its ROM socket. The contents of the entire 512k ROM can be accessed as 16 pages of 32k (using a paging register), and page 0 (the first 32k of the ROM chip) automatically appears as the first 32k of the address map, as usual. The following 16k (making a total of 48k ROM as on the original) is always the 16k of the ROM chip above the first 32k, but that is not important for this example.
A ROM image can be constructed that holds a small loader at the start, followed by the code of any OS you want (or multiple OSs for that matter). The loader would then execute instead of an actual OS and, knowing it runs on an SGC, it could easily copy a static SMSQ/E map into all the proper places on the SGC, including the shadowed areas (instead of the SGC ROM running and copying ROM to RAM etc.), establishing a copy of the SMSQ/E code where it needs to be, and then simply jump to the starting point of the code to be executed. And there you have ROM based SMSQ/E. Unfortunately, this was never supported. I suppose now that SMSQ/E is in the public domain, someone could generate an image for such a ROM.
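To show the idea of pulling an image out of the paged ROM, here is a toy model in Python. The `PagedRom` class, its `select` method and the `copy_image` helper are all hypothetical stand-ins; the actual Aurora paging register and its address are not modelled, only the 16 x 32k page arithmetic.

```python
PAGE_SIZE = 32 * 1024
ROM_SIZE = 512 * 1024
PAGE_COUNT = ROM_SIZE // PAGE_SIZE     # 16 pages


class PagedRom:
    """Toy model of a 512k ROM seen through one 32k window."""

    def __init__(self, contents):
        if len(contents) != ROM_SIZE:
            raise ValueError("expected a 512k image")
        self.contents = contents
        self.page = 0

    def select(self, page):
        """Stand-in for writing the paging register."""
        if not 0 <= page < PAGE_COUNT:
            raise ValueError("page out of range")
        self.page = page

    def window(self):
        """The 32k currently visible through the window."""
        start = self.page * PAGE_SIZE
        return self.contents[start:start + PAGE_SIZE]


def copy_image(rom, first_page, length):
    """Collect `length` bytes starting at `first_page`, one page at a time."""
    out = bytearray()
    page = first_page
    while len(out) < length:
        rom.select(page)
        out += rom.window()
        page += 1
    return bytes(out[:length])
```

A real loader would do the same walk in 68k code and write each chunk to its final place (shadow areas first, then top of RAM) before jumping to the entry point.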
(**) Alternatively, the SGC ROM itself could contain a loader and copy code either via this mechanism on Aurora or obtain it from another medium such as floppy or whatever.
(***) SGC does not generate write cycles to either of the ROM spaces it implements. Yes, there are two: one is initially the 'real' ROM at address zero, but there is also a copy always available (even after the actual ROM is replaced by shadow RAM) at $400000, if memory serves me correctly. In either case it can only be read, so there is no way a flash chip on the Aurora can be written if it is used instead of a read-only EPROM. Sadly.
