SBASIC / SuperBASIC Reference Manual - HTML

RWAP
RWAP Master
Posts: 2893
Joined: Sun Nov 28, 2010 4:51 pm
Location: Stone, United Kingdom


Post by RWAP »

I have just been reminded of another little project which I started some time ago, but which has been put on the back burner due to my ill health.

Basically, the SBASIC/SuperBASIC Reference Manual could now be made available as a series of HTML pages to be uploaded to a website somewhere.

The problem is converting the manual to HTML.

The manual was originally written in Text87 Plus 4 and printed using the ESC/P2 printer driver to an Epson printer. This output was captured using QPC2 and QPCPrint, and output to PDF files. However, because of the Text87 formatting (which adds a tiny amount of space to each letter so that the text is left- and right-justified without huge gaps between words), the PDF files cannot be searched: each word is basically seen as a series of single letters.
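
To illustrate the search problem, here is a minimal Python sketch of how individually placed letters could be regrouped into words by looking at the horizontal gap between them. Everything in it is an assumption made for illustration: the `(x_start, x_end, char)` glyph tuples, the `group_glyphs` name and the gap threshold are hypothetical, and this is not how QPCPrint or any PDF tool is known to work.

```python
def group_glyphs(glyphs, gap=2.0):
    """Group the glyphs of one text line into words.

    glyphs: list of (x_start, x_end, char) tuples, sorted left to right.
    gap:    hypothetical threshold; a larger horizontal gap starts a new word.
    """
    words = []
    current = ""
    prev_end = None
    for x_start, x_end, char in glyphs:
        # A wide gap since the previous glyph means a word boundary
        if prev_end is not None and x_start - prev_end > gap:
            words.append(current)
            current = ""
        current += char
        prev_end = x_end
    if current:
        words.append(current)
    return words


# Letters nudged apart by micro-justification, then a real word space
line = [(0, 5, "P"), (5.3, 10, "R"), (10.4, 15, "I"), (20, 25, "t"), (25.2, 30, "o")]
print(group_glyphs(line))
```

The threshold would need tuning against the real output, since justified lines stretch the word spaces as well as the letter spacing.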

Back at the end of 2012 / start of 2013, I started writing a SuperBASIC program which takes the ESC/P2 output and converts it to HTML, as close as possible to the original layout, with an attempt to form cross-referencing to each keyword!
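
As a rough idea of what such a converter involves, here is a minimal sketch in Python (not the SuperBASIC program mentioned above, which is not shown in this thread). It handles only four standard ESC/P codes: ESC E and ESC F for bold on/off, and ESC 4 and ESC 5 for italic on/off. The function name `escp_to_html`, the Latin-1 reading of each byte and the treatment of line feeds as `<br>` are assumptions for illustration; real Text87 output will contain many more sequences, including ones that carry parameters.

```python
import html

ESC = 0x1B

# Two-byte ESC/P sequences (ESC plus one character) mapped to HTML tags.
# Only bold and italic are covered in this sketch.
TAG_CODES = {
    ord("E"): "<b>",   # ESC E: select bold
    ord("F"): "</b>",  # ESC F: cancel bold
    ord("4"): "<i>",   # ESC 4: select italic
    ord("5"): "</i>",  # ESC 5: cancel italic
}


def escp_to_html(data: bytes) -> str:
    """Convert a captured printer byte stream to a crude HTML fragment."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte == ESC:
            if i + 1 < len(data):
                # Unknown sequences are dropped as if they had no parameter
                # bytes; sequences with parameters would need their own cases.
                out.append(TAG_CODES.get(data[i + 1], ""))
            i += 2
        elif byte == 0x0A:
            out.append("<br>\n")  # line feed becomes an HTML line break
            i += 1
        elif byte == 0x0D:
            i += 1                # carriage return is ignored
        else:
            # Assumption: each remaining byte is read as one Latin-1 character
            out.append(html.escape(chr(byte)))
            i += 1
    return "".join(out)


print(escp_to_html(b"\x1bEPRINT\x1bF x<1\r\n"))
```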

Is anyone willing and able to finish off that program and do the conversion?

It is not a simple task, although much of the leg work has already been done - remember the manual is well over 1000 pages! You will also ideally need Text87 Plus 4 complete with the ESC/P2 drivers as I will not be able to find the time to spend on this project.....

We can then find a nice website to upload the manual (possibly my own site) and make it freely available.


dilwyn
Mr QL
Posts: 3126
Joined: Wed Dec 01, 2010 10:39 pm

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by dilwyn »

Strangely enough, saying it's not searchable is only partly true!

Searching the paragraphs is difficult for the reason given (micro-justification), but (at least in the PDFs in the CD copy I have of the Reference Manual) you can search for keyword headings, as they were set as unjustified one-line headings.

I don't know what the original Text87 files were like, or indeed how many individual files there were. Might it be easier to go through the Text87 files and reformat them without justification, to make it easier to manipulate them elsewhere? At this stage you'll tell me it's one file per page or something like that. Eek.

(shoot me if you want if I don't know what I'm on about!)

Hope someone does step forward to help with this, it would be a fantastic online reference.


RWAP

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by RWAP »

The Text87 files are organised in the same way as the PDF files you have :D

Even removing the justification does not really help to get the files into a better format - the idea of using HTML is more that they can then be edited and updated by others, rather than using a read-only format file.

Text87 could of course export them as plain text, but that is not really much help when you are talking of 1000+ pages....


swensont
Forum Moderator
Posts: 325
Joined: Tue Dec 06, 2011 3:30 am
Location: SF Bay Area

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by swensont »

HTML is not that great of a format. It's good for web pages, but if you want to keep a document locally, then HTML has its failings. If it's something to be published (and not edited), then PDF is the way to go. If you want to publish and allow edits, then some document format is good. For portability reasons, I'd suggest a version of Word .doc (and not a .docx). Most word processors can handle .doc files (like Open Office, Libre Office, Google Docs, Abiword, etc).

Is it possible to convert the Text87 files to something like Rich Text Format (RTF)? This will keep the paragraph formatting intact. Or just save as plain text and we can go from there.

I've been thinking about writing some tool for Open Office that lets you mark a bunch of sentences and then combines them into a paragraph (just removing all of the <cr><lf>s except the last one). Back in 1990 I used an editor that had this feature. Even Word does not have this (that I am aware of).
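
The line-joining idea can be sketched in a few lines of Python. This is only an illustration under one big assumption: that paragraphs in the exported plain text are separated by a blank line. The function name `join_wrapped_lines` is made up for the example, and words hyphenated across a line break are not repaired.

```python
import re


def join_wrapped_lines(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumes a blank line marks the end of a paragraph; every other
    line break inside a paragraph is replaced by a single space.
    """
    # Normalise <cr><lf> and bare <cr> line endings to <lf>
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n\s*\n", text)
    joined = [
        " ".join(line.strip() for line in para.split("\n") if line.strip())
        for para in paragraphs
    ]
    # Drop paragraphs that turned out to be empty
    return "\n\n".join(para for para in joined if para)


sample = "PRINT sends\r\noutput to\r\na channel.\r\n\r\nSee also\r\nINPUT.\r\n"
print(join_wrapped_lines(sample))
```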

If someone can convert from Text87 (which I don't have) to text, I can take it from there. In my free time at work, it is good to have some brain-dead projects to do.


Derek_Stewart
Font of All Knowledge
Posts: 4779
Joined: Mon Dec 20, 2010 11:40 am
Location: Sunny Runcorn, Cheshire, UK

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by Derek_Stewart »

Hi,

It may be possible to convert the PDF copy of the manual to HTML.

I have some programs that say they can do this in Windows.


Regards, Derek


Ralf R.

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by Ralf R. »

I have the latest version of T87+4, so if someone needs help...

:mrgreen:


RWAP

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by RWAP »

swensont wrote:
HTML is not that great of a format. It's good for web pages, but if you want to keep a document locally, then HTML has its failings. If it's something to be published (and not edited), then PDF is the way to go. If you want to publish and allow edits, then some document format is good. For portability reasons, I'd suggest a version of Word .doc (and not a .docx). Most word processors can handle .doc files (like Open Office, Libre Office, Google Docs, Abiword, etc).

Is it possible to convert the Text87 files to something like Rich Text Format (RTF)? This will keep the paragraph formatting intact. Or just save as plain text and we can go from there.

I've been thinking about writing some tool for Open Office that lets you mark a bunch of sentences and then combines them into a paragraph (just removing all of the <cr><lf>s except the last one). Back in 1990 I used an editor that had this feature. Even Word does not have this (that I am aware of).

If someone can convert from Text87 (which I don't have) to text, I can take it from there. In my free time at work, it is good to have some brain-dead projects to do.

I agree a Word format would be good, but it makes sense to have hyperlinks of some sort, so you can quickly go to related keywords (and back).
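
For illustration only, keyword hyperlinking of this kind might look like the following Python sketch. The one-page-per-keyword file naming (`print.html`), the `link_keywords` name and the `current` parameter are all hypothetical; nothing here reflects how the existing conversion program does it.

```python
import html
import re
from typing import Iterable


def link_keywords(text: str, keywords: Iterable[str], current: str = "") -> str:
    """Escape text for HTML and wrap known keywords in links.

    keywords: the manual's keyword list (supplied by the caller).
    current:  the keyword whose page is being written; it is not linked.
    """
    escaped = html.escape(text)
    # Longest first, so a longer keyword wins over a shorter prefix of it
    ordered = sorted(keywords, key=len, reverse=True)
    if not ordered:
        return escaped
    alternatives = "|".join(re.escape(word) for word in ordered)
    # Only match whole words; $ and % count as part of a BASIC name
    pattern = re.compile(
        r"(?<![A-Za-z0-9_$%])(" + alternatives + r")(?![A-Za-z0-9_$%])"
    )

    def replace(match):
        word = match.group(1)
        if word == current:
            return word
        # Hypothetical naming scheme: one lower-case page per keyword
        return '<a href="{}.html">{}</a>'.format(word.lower(), word)

    return pattern.sub(replace, escaped)


print(link_keywords("See PRINT and INPUT; PRINTER is not a keyword.", {"PRINT", "INPUT"}))
```

Matching is case-sensitive here, on the assumption that keywords appear in upper case in the manual text.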

The conversion program could of course output to any format, if we know the codes to use!

From memory, Text87 can't output to RTF or similar; you are basically stuck with plain text or nothing...


tcat
Super Gold Card
Posts: 633
Joined: Fri Jan 18, 2013 5:27 pm
Location: Prague, Czech Republic

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by tcat »

Hi,

swensont wrote:
If someone can convert from Text87 (which I don't have) to text, I can take it from there. In my free time at work, it is good to have some brain-dead projects to do.

As I also do not have the Text87 application, could I have a look at a source Text87 document? I would like to see its structure. Is it anything like the QUILL format?

Perhaps I may help with this first step.

Tom


RWAP

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by RWAP »

The Text87 file structure is intrinsically linked to the printer driver, so it is not at all easy to work out the format....


1024MAK
Super Gold Card
Posts: 593
Joined: Sun Dec 11, 2011 1:16 am
Location: Looking forward to summer in Somerset, UK...

Re: SBASIC / SuperBASIC Reference Manual - HTML

Post by 1024MAK »

Some years ago I wrote a program to strip out control sequences, including unwanted LF, CR and spaces, from transcripts of an inquiry that I wanted to read on a Psion PDA. I wrote the program because there were hundreds of pages that needed processing. And before you ask, no, I don't know where my copies of this program are stored (and it's not SuperBASIC).

Could the same be done with the output of Text87?
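
A stripping pass of that kind could be sketched in Python roughly as follows. It is not the lost program described above, just an illustration. It assumes every escape sequence is ESC plus exactly one byte (sequences with parameters would leave stray characters behind), that the text is Latin-1, and the name `strip_controls` is invented for the example.

```python
import re

# ESC followed by one byte, or any other control character except TAB and LF
_CONTROL = re.compile(rb"\x1b.|[\x00-\x08\x0b-\x1f\x7f]", re.DOTALL)


def strip_controls(data: bytes) -> str:
    """Reduce a captured printer stream to tidy plain text."""
    # Normalise CR LF and bare CR to LF before removing control bytes
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    cleaned = _CONTROL.sub(b"", data)
    text = cleaned.decode("latin-1")  # assumption: one byte per character
    # Collapse runs of spaces and tabs, and trim each line
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    # Allow at most one blank line in a row
    return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


raw = b"\x1b@Day 1\r\n\r\n\r\n\r\nThe   witness\x07 said\x1bE yes\x1bF.\r\n"
print(strip_controls(raw))
```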

I also find that when extracting text from PDF or other file formats, it is often quicker and easier to convert everything to plain text, then add all the formatting back in afterwards. Otherwise it can become a bit of a nightmare trying to get everything to match up and look right.

Mark

