[bksvol-discuss] Re: HTML

  • From: "Monica Ballard" <MBallard1@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Fri, 3 Feb 2006 22:52:00 -0500

   I've been validating a strange file of unknown origins and am curious as
to how it was produced.  I went to investigate the HTML question and
downloaded a book (Felicity Learns a Lesson by Valerie Tripp).  This was
listed as HTML, but when I unzipped the file, it had a .tif extension.
Kurzweil thought it was gobbledygook.  Since .tif is a graphics format, I
opened it with several programs and mostly got a picture of the cover page.
While Word tried unsuccessfully to open it, I noticed a bunch of things that
appeared to be page breaks. One program, Photoshop Essentials, gave me an
error message saying the file was designed to be viewed on a video monitor.
How strange. I was finally able to open it in IrfanView and convert all the
pages to .jpg. They appear to be scanned photocopies with no OCR text. So, I'm in the
process of OCRing and cleaning up the text. I'd sure like to know how it was
made so if I run into something like this again, it won't take me so long to
figure out how to process it.
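
A quick way to tell what a mislabeled download really is, before trying it in
several programs, is to look at its first few bytes instead of its extension.
The sketch below is only an illustration: the function names `sniff_format`
and `count_tiff_pages` are made up for this example, and the page counter
assumes a classic (non-BigTIFF) file that is complete and not truncated.

```python
import struct


def sniff_format(data: bytes) -> str:
    """Guess a file's real format from its leading bytes, ignoring the extension."""
    if data[:4] in (b"II*\x00", b"MM\x00*"):  # TIFF, little- or big-endian
        return "tiff"
    if data[:3] == b"\xff\xd8\xff":  # JPEG start-of-image marker
        return "jpeg"
    if data[:4] == b"PK\x03\x04":  # ZIP local file header
        return "zip"
    head = data[:1024].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"


def count_tiff_pages(data: bytes) -> int:
    """Count pages by walking the chain of image file directories (IFDs)."""
    if data[:4] == b"II*\x00":
        order = "<"  # little-endian
    elif data[:4] == b"MM\x00*":
        order = ">"  # big-endian
    else:
        raise ValueError("not a classic TIFF file")
    (offset,) = struct.unpack(order + "I", data[4:8])  # offset of the first IFD
    pages = 0
    seen = set()  # guards against a corrupt file whose IFD chain loops
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (entries,) = struct.unpack(order + "H", data[offset:offset + 2])
        pages += 1
        next_at = offset + 2 + 12 * entries  # each IFD entry is 12 bytes
        (offset,) = struct.unpack(order + "I", data[next_at:next_at + 4])
    return pages
```

To use it, read the file with `open(path, "rb").read()` and pass the bytes in.
A result of "tiff" with a page count above one would point to a multi-page
scan, which needs OCR rather than an HTML-to-text conversion.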

Monica
--- Begin Message ---
  • From: "Gerald Hovas" <geraldhovas@xxxxxxxxxxx>
  • To: <bksvol-discuss@xxxxxxxxxxxxx>
  • Date: Wed, 1 Feb 2006 13:16:26 -0500
Mickey,

There was some discussion on the list last month by Jake, Pratik, and me
as to whether or not page breaks are even supported by the HTML standard
since page breaks aren't normally found in HTML files.  I do know Word
allows page breaks in HTML files, so you can try validating one of the books
in Word if you like.  I've written Janice that the issue needs to be looked
into since if the HTML standard doesn't support page breaks, then HTML needs
to be dropped from the acceptable formats for submission.  If you do try
validating one of the books, then please let me know if you are able to
verify whether or not the book you validate contains page breaks.
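
For anyone who would rather check a file without opening it in Word, here is a
rough sketch. HTML markup itself has no page-break element, but CSS defines
the `page-break-before` and `page-break-after` properties, and the script
below simply counts tags whose inline style sets one of them to `always`. It
assumes the breaks are written as inline styles (the way Word-saved HTML is
believed to mark them); breaks defined only in a stylesheet would be missed.
The names `PageBreakCounter` and `count_page_breaks` are invented for this
example.

```python
from html.parser import HTMLParser


class PageBreakCounter(HTMLParser):
    """Counts tags whose inline style forces a page break."""

    def __init__(self):
        super().__init__()
        self.breaks = 0

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. a bare attribute), so fall back to "".
        style = dict(attrs).get("style") or ""
        compact = style.lower().replace(" ", "")
        if ("page-break-before:always" in compact
                or "page-break-after:always" in compact):
            self.breaks += 1


def count_page_breaks(html_text: str) -> int:
    """Return how many inline-styled page breaks appear in the HTML text."""
    parser = PageBreakCounter()
    parser.feed(html_text)
    parser.close()
    return parser.breaks
```

A count of zero on a book-length file would suggest the page breaks were lost
or never present, at least in inline-style form.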

HTH

Gerald


-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx
[mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx]On Behalf Of mickey
Sent: Tuesday, January 31, 2006 8:33 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] HTML


There are several HTML files on the step-one page. Will they convert to
.rtf, retaining page breaks? I've been a little afraid to try them.

Thanks for any help.

Mickey

 To unsubscribe from this list send a blank Email to
bksvol-discuss-request@xxxxxxxxxxxxx
put the word 'unsubscribe' by itself in the subject line.  To get a list of
available commands, put the word 'help' by itself in the subject line.

--- End Message ---
