I've been validating a strange file of unknown origin and am curious as to how it was produced.

I went to investigate the HTML question and downloaded a book (Felicity Learns a Lesson by Valerie Tripp). It was listed as HTML, but when I unzipped the file, it had a .tif extension. Kurzweil thought it was gobbledygook. Since .tif is a graphics format, I opened it with several programs and mostly got a picture of the cover page. While Word tried unsuccessfully to open it, I noticed a bunch of things that appeared to be page breaks. One program, Photoshop Essentials, gave me an error message saying the file was designed to be viewed on a video monitor. How strange.

I was finally able to open it in IrfanView and convert all the pages to .jpg. They appear to be unrecognized (never OCRed) photocopies, so I'm in the process of OCRing and cleaning up the text. I'd sure like to know how it was made, so that if I run into something like this again, it won't take me so long to figure out how to process it.

Monica
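For anyone who runs into a similarly mislabeled download, here is a minimal Python sketch of one way to check what a file really is before opening it. It looks at the first few bytes rather than trusting the listing or the extension, and for a TIFF it counts the image directories chained together in the file, which usually correspond to pages. The function names `sniff_format` and `count_tiff_pages` are made up for this illustration, the list of formats is far from complete, and the page count assumes a classic (non-BigTIFF) file in which each directory is one page.

```python
import struct


def sniff_format(data: bytes) -> str:
    """Guess a file's real format from its leading bytes, ignoring its extension."""
    if data[:4] in (b"II*\x00", b"MM\x00*"):  # little- or big-endian TIFF header
        return "tiff"
    if data[:3] == b"\xff\xd8\xff":  # JPEG start-of-image marker
        return "jpeg"
    if data[:4] == b"PK\x03\x04":  # ZIP local file header
        return "zip"
    if data[:5] == b"{\\rtf":  # RTF documents start with {\rtf
        return "rtf"
    text = data[:512].lstrip().lower()
    if text.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"


def count_tiff_pages(data: bytes) -> int:
    """Count the image directories (usually one per page) in a classic TIFF."""
    if data[:4] == b"II*\x00":
        endian = "<"
    elif data[:4] == b"MM\x00*":
        endian = ">"
    else:
        raise ValueError("not a classic TIFF file")
    # Bytes 4-7 of the header hold the offset of the first directory.
    (offset,) = struct.unpack_from(endian + "I", data, 4)
    pages = 0
    seen = set()  # guards against a corrupt file whose directory chain loops
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (n_entries,) = struct.unpack_from(endian + "H", data, offset)
        pages += 1
        # Each entry is 12 bytes; the next directory's offset follows the entries.
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * n_entries)
    return pages
```

To try it, read the file in binary mode (for example `data = open(path, "rb").read()`, where `path` is whatever file was unzipped) and pass the bytes to `sniff_format`. If it reports a TIFF with more than one page, the file is most likely a stack of scanned page images that still needs OCR.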
--- Begin Message ---
- From: "Gerald Hovas" <geraldhovas@xxxxxxxxxxx>
- To: <bksvol-discuss@xxxxxxxxxxxxx>
- Date: Wed, 1 Feb 2006 13:16:26 -0500
Mickey,

There was some discussion on the list last month by Jake, Pratik, and myself as to whether or not page breaks are even supported by the HTML standard, since page breaks aren't normally found in HTML files. I do know Word allows page breaks in HTML files, so you can try validating one of the books in Word if you like.

I've written Janice that the issue needs to be looked into, since if the HTML standard doesn't support page breaks, then HTML needs to be dropped from the acceptable formats for submission.

If you do try validating one of the books, then please let me know if you are able to verify whether or not the book you validate contains page breaks.

HTH
Gerald

-----Original Message-----
From: bksvol-discuss-bounce@xxxxxxxxxxxxx [mailto:bksvol-discuss-bounce@xxxxxxxxxxxxx] On Behalf Of mickey
Sent: Tuesday, January 31, 2006 8:33 PM
To: bksvol-discuss@xxxxxxxxxxxxx
Subject: [bksvol-discuss] HTML

There are several HTML files on the step-one page. Will they convert to .rtf, retaining page breaks? I've been a little afraid to try them. Thanks for any help.

Mickey

To unsubscribe from this list send a blank Email to bksvol-discuss-request@xxxxxxxxxxxxx and put the word 'unsubscribe' by itself in the subject line. To get a list of available commands, put the word 'help' by itself in the subject line.
--- End Message ---