PDF Format on portable devices
Aug. 28th, 2010 11:33 pm
Portable Document Format (PDF) is a format many publishers use as a final layout format. I use it at work when we send publications to a printer. PDF is great for making sure that if I sent the same file to five different printing companies, each one would produce the same publication. "PDF is used for representing two-dimensional documents in a manner independent of the application software, hardware, and operating system." (Wikipedia)
The thing that makes PDF so attractive to those sending documents to publishers is the very thing that makes PDFs challenging to use on portable devices with various screen sizes.
PDF is designed to preserve the print layout, which includes margins, font size, font type, page size, and artwork layout. This is also the reason websites are not shared in PDF format. Imagine if every website had to make a separate PDF for every size of monitor screen. HTML, by contrast, re-flows the text to fit the size of the monitor screen or browser window.
To give you a general idea of the screen sizes we're talking about, portable reading devices start at 3 inches and go up to about 8 inches. The e-ink devices are generally 5, 6, or 8 inches. I read on a Sony PRS-505, which is 6 inches.
Now, when writers write a story in Microsoft Word or Open Office (or any other word processing program), the default paper size of the document is 8.5x11 inches (US Letter) with 1 inch margins on all sides. This is the case because businesses print out a lot of documents. Letterheads and reports are not generally printed on 4x6 paper; they are printed on 8.5x11 paper. Microsoft Word does give users the option to change the page size of their document, which comes in handy for my job because we sometimes have figures that are 11x17. But for fan fiction reading and writing purposes, there has not been a need for anyone to change the page size in their word processing program. When an author hits Print to PDF in Word, the PDF that is created is sized to 8.5x11.
This is important to know because PDFs do not re-flow to fit the screen. When a person transfers a regular PDF to a device with a 6 inch screen, the reader tries to preserve the original formatting, so everything gets shrunk until all of page 1 of an 8.5x11 document fits on a screen of roughly 3.6x4.8 inches. This gives the user super tiny text that's nearly impossible to read, as well as a very wide margin.
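To put rough numbers on that shrinkage, here is a minimal Python sketch. It assumes a US Letter page, a 6 inch screen with a 3:4 aspect ratio (about 3.6x4.8 inches), and 12 point body text; the function name is made up for illustration, and real devices add their own borders and scaling rules.

```python
def fit_scale(page_w, page_h, screen_w, screen_h):
    """Scale factor that makes the whole page fit on the screen."""
    return min(screen_w / page_w, screen_h / page_h)

# Assumed sizes, all in inches.
page_w, page_h = 8.5, 11.0       # US Letter page
screen_w, screen_h = 3.6, 4.8    # 6 inch diagonal at a 3:4 aspect ratio

scale = fit_scale(page_w, page_h, screen_w, screen_h)

# Assumed 12 point body text in the original document.
effective_pt = 12 * scale

print(f"Page is shown at {scale:.0%} of its printed size")
print(f"12 pt text looks like about {effective_pt:.1f} pt text")
```

Under these assumptions the page is displayed at well under half its printed size, which is why the text ends up so small.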
The Sony does have limited reflow capacity, but it does not always work right. If I'm correct, the Kindle does not reflow PDFs at all. So if you hit the magnify button to enlarge the text, you get paragraphs that look something like this:
"sentence one goes here
and then breaks
off
to start a new sentence here.
The paragraphs look ugly as
a result."
I've seen that on PDFs on my own device.
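The ragged paragraphs can be imitated with a few lines of Python. This is only a sketch of the idea, not how any particular reader works: it assumes the PDF stores each printed line separately, and that a reader without paragraph information re-wraps each stored line on its own. The widths and function names are invented for illustration.

```python
import textwrap

paragraph = ("Sentence one goes here and then keeps going for a while. "
             "A new sentence starts here and the paragraph carries on.")

# Fixed layout: the line breaks are baked in at the original page width.
page_lines = textwrap.wrap(paragraph, width=40)

def enlarge_without_reflow(lines, screen_width):
    """Re-wrap each stored line separately, leaving short leftover lines."""
    out = []
    for line in lines:
        out.extend(textwrap.wrap(line, width=screen_width))
    return out

def enlarge_with_reflow(lines, screen_width):
    """Join the lines back into one paragraph first, then wrap it."""
    return textwrap.wrap(" ".join(lines), width=screen_width)

broken = enlarge_without_reflow(page_lines, 25)
reflowed = enlarge_with_reflow(page_lines, 25)

print("\n".join(broken))
print("---")
print("\n".join(reflowed))
```

The first printout breaks off partway across the screen wherever an original line ended; the second fills each line before moving on, which is what a reflowable format does.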
So when I say I do not like PDFs and think it's a godawful format for reading on ebook devices, that is why.
The only way a PDF could possibly come close to looking good on a portable device is if the author follows the directions listed here: Conversions to PDF. Now, keeping in mind that e-ink devices have 3 main screen sizes, this means the author would have to make a 5 inch version, a 6 inch version, and, if they were really nice, an 8 inch version too. This is a lot of work for any one person to do and is not practical at all.
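As a rough idea of what "one version per screen size" involves, the sketch below works out a page size for each of the three e-ink sizes. It assumes every screen has a 3:4 aspect ratio and reports sizes in PDF points (72 points to the inch); the function name is hypothetical, and margins or device borders are ignored.

```python
import math

POINTS_PER_INCH = 72  # PDF page sizes are measured in points

def page_size_points(diagonal_in, aspect=(3, 4)):
    """Page width and height in points for a screen of the given diagonal."""
    w, h = aspect
    unit = diagonal_in / math.hypot(w, h)
    return round(w * unit * POINTS_PER_INCH), round(h * unit * POINTS_PER_INCH)

# One page size per device size the author would have to support.
sizes = {d: page_size_points(d) for d in (5, 6, 8)}

for diagonal, (width_pt, height_pt) in sizes.items():
    print(f"{diagonal} inch screen: {width_pt} x {height_pt} pt")
```

Each of those sizes would need its own page setup and its own PDF export, which is the extra work described above.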
This is why Mobipocket and EPUB are the two main formats people use to read on their ebook readers. Mobipocket and EPUB are like HTML for browsers: they reflow, so the paragraphs look good at any font size. The images fit to the screen, and the formats work on everything from 3 inch screens up to 22 inch computer monitors.
Having said all that, there are times when PDFs do have an advantage over EPUB and Mobipocket. Someone brought up a good point in my last post: some character encodings for words with diacritics do not translate well into EPUB or Mobipocket. There are ways to embed fonts onto ebook devices so those characters display properly, but let's be serious, the average internet user is not going to have time to mess with it, and it's not as simple as hitting a few buttons. I'm hoping future devices will have more language support for characters outside the English language.
However, for English-language fan fiction fandom, PDF is not my first choice of format to read on portable devices.
That authors are now more willing to share PDFs of their stories is great for readers who use them. However, as someone who reads on a Sony PRS-505 and knows the limitations of the PDF format, I doubt I'll ever download a PDF to read on my device or on future e-ink devices.
This post is partly in response to something I'd read a few weeks ago at the spnanonmeme. Someone said that I supported or encouraged authors to share PDF formatted stories for the SPN BB. I just wanted to make sure it's very clear that I do not support PDF and I will never ask an author for a PDF version of their story. I don't actually think I have that much influence on how authors share their works. Otherwise there would be a lot more single file HTML versions out on the web.
I don't recommend PDF for anyone reading on portable devices (unless it's been specifically formatted for the screen size of the device you're using). The single file download options I do prefer are HTML or a Word Document. I appreciate all authors that provide either of these with their long stories that are posted to LJ.
Adobe has made some changes in the new Adobe Acrobat software to make creating PDFs more ebook friendly like adding tags to the PDF. I don't know how to do this personally. I'm not sure if anyone else even knows what I'm talking about except for people who work with PDFs professionally.
But the tagging does make it easier for PDFs to reflow to fit a screen. The problem is that these options are not known to the majority of fan fiction writers who make PDFs. Based on the metadata I've seen in PDFs, I'm willing to bet that most authors use Print to PDF to create them.
You also need the paid version of Adobe Acrobat to edit metadata in PDFs. This is probably why most PDFs have the weirdest authors and titles when they are loaded onto my Sony PRS. Most authors do not know that the file name of their document is generally inserted into the title and that their MS Word author name is inserted into the author field (this sometimes shows the author's real name). People reading PDFs on their computers would not see this unless they open Document Properties. But portable devices use the PDF metadata to sort and navigate the ebooks.
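For the curious, here is a rough Python sketch of where that title and author live. It assumes the simplest case, where the PDF's document information is stored uncompressed as plain parenthesized strings; many real PDFs encode or compress these fields differently, so a dedicated PDF tool is the reliable way to check. The function name and the sample bytes are invented for illustration.

```python
import re

def read_info_fields(pdf_bytes):
    """Look for plain-text /Title and /Author entries in raw PDF bytes."""
    fields = {}
    for key in (b"Title", b"Author"):
        # Matches e.g. "/Title (some text)"; allows backslash-escaped characters.
        pattern = rb"/" + key + rb"\s*\(((?:\\.|[^\\)])*)\)"
        match = re.search(pattern, pdf_bytes)
        if match:
            fields[key.decode("ascii")] = match.group(1).decode("latin-1")
    return fields

# Made-up fragment resembling what a print-to-PDF export might embed.
sample = (b"1 0 obj\n"
          b"<< /Title (Microsoft Word - chapter1_draft.doc)"
          b" /Author (Jane Q. Realname) >>\n"
          b"endobj\n")

info = read_info_fields(sample)
print(info)
```

In this made-up sample, the "title" is really the working file name and the "author" is whatever name the word processor was registered under, which is exactly what ends up in a device's book list.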
PDFs are not going away because there is still an audience for them, but I wanted this post to make clear how they work on portable devices, which a growing number of fans are buying.
For those who are interested in reading about ebook file formats, I highly recommend elf's Ebook Formats guide.
For my next ebook related post I was hoping to cover the topic of quality control by sharing all the mistakes I've made when making ebooks. I'm hoping by sharing my experiences it'll give others creating ebooks an idea of what to watch out for when they create their own ebook versions of fan fiction.
Tagging
Date: 2010-08-29 04:16 pm (UTC)
Conversion--not printing, but conversion--from Word 2003 or later will automatically tag the PDF. People who have Acrobat Pro (or possibly Acrobat Standard) instead of the free reader program can use the "Advanced--> Accessibility--> Add tags to document" feature, which works on most documents but occasionally fails due to bizarre font encodings. (There are some professional ebooks I can't tag.)
Books converted from InDesign are not automatically tagged; I don't know if this is an available option. Books converted by third-party software (PDFWriter and such) are almost never tagged.
Manual tagging is possible, but nightmarish. I say this as a person who loves line-by-line proofreading. It's like line-by-line proofreading, with an annoying UI and complex program options that aren't described anywhere. Oh, and if you do too many things without saving, Acrobat will crash & lose all your work. (Acrobat's instructions about tagging are "here's the dropdown; click 'yes' to continue.")
Tagging has two purposes:
1) If it works well, it allows much better reflow; it avoids those broken-line problems. (Often does not work that way for double-spaced docs; the auto-tag reads each line as a separate paragraph, and manual fixing is, erm, nightmarish. Would have to be done for every single line in the book.)
2) Allowing read-aloud programs to read the text properly. Again, it helps if the auto-tagging is done right, but the "each line is a paragraph" thing is probably less disruptive to this function than to reflow.
Purpose #2 is fairly irrelevant for novels (I believe the read-aloud programs will work on untagged documents; they just aren't as clear about things like chapter breaks); it can be important for charts & tables that need to be read in the right order. Also, tagging allows you to add alt text to images.
Re: Tagging
Date: 2010-08-29 07:14 pm (UTC)
Thanks for the detailed explanation. :) May I add your explanation to the main post? I found it very useful and easy to understand, and I think it'll help others. I could try paraphrasing, but what took you a few paragraphs would probably take me twenty to say the same thing. You're a good teacher. :)
Re: Tagging
Date: 2010-08-29 07:24 pm (UTC)
Feel free to add any parts of my explanation that would help. :)
I used to work for a company that was all gung-ho on PDF tagging because about 8 years ago, accessibility standards for gov't documents changed, and they were all required to be accessible to screen readers; the company thought it'd get in on the ground floor of making accessible PDFs. It didn't work out that way--the tech is too weird & obscure & nonstandardized; most gov't documents just switched to "searchable, auto-tagged PDF" and completely ignored how *mangled* that was for anything based on scans.
Re: Tagging
Date: 2010-08-29 07:33 pm (UTC)
The weird thing is you'd think PDFs would be standardized to some extent...One company did come out with the program, right?
Metadata
Date: 2010-08-29 04:20 pm (UTC)
Most people just don't know these options exist.
(Someday, I will write snarky RPF about m/m publishing houses, based entirely on whose name shows up as the "author" of their books.)
Re: Metadata
Date: 2010-08-29 07:10 pm (UTC)
Re: Metadata
Date: 2010-08-29 07:20 pm (UTC)
My current software quest: A portable PDF printer driver, so friends can convert web to PDF at work, where they're not allowed to install anything. (I'm told by geekfriends this may not be possible; drivers are apparently more touchy than that.)
http://www.softpedia.com/get/Office-tools/PDF/BeCyPDFMetaEdit.shtml is the program I wave around for PDF metadata editing; it also will remove metadata from some PDFs that Acrobat won't. So far this has been some Wowio books, and other locked things with weird embedded coding.
Re: Metadata
Date: 2010-08-29 07:29 pm (UTC)
Downloading this metadata program.
I think most of this PDF knowledge isn't widely known or easy to find out. Most people deal with Adobe and I know for myself I found it difficult to find free alternatives that could do the same things that Adobe does.
Re: Metadata
Date: 2010-08-29 07:37 pm (UTC)
The main useful PDF function I haven't seen in free (or cheap) software is bookmark editing. There's some that'll remove bookmarks (which is brainless; printing to a new PDF will do that), and there's cheap programs that will split a big PDF by bookmarks; extracting a list of bookmarks and editing current ones both seem to require either Acro Pro or one of the *expensive* alternates. (Foxit Pro, maybe? I haven't worked with that one.)
Re: Metadata
Date: 2010-08-29 07:49 pm (UTC)
I can see how bookmark editing is helpful. :)
no subject
Date: 2010-08-30 03:45 am (UTC)
Ah, so this is what happened when my PDFs failed to convert in Stanza. In my first try I had done a batch conversion and I suspect that a problem with one file led to a problem with almost all of them, and unusable transfers.
"Most authors do not know that the file name of their document is generally inserted into the title and that their MS Word Author names are inserted into the author section (this sometimes shows the author's real name)."
Another "Aha" there. I'd been wondering.
no subject
Date: 2010-08-30 03:55 am (UTC)
no subject
Date: 2010-09-23 03:43 pm (UTC)
no subject
Date: 2010-09-24 03:20 am (UTC)
no subject
Date: 2010-09-26 10:45 am (UTC)
Thank you for the link to elf's post. I will look into those alternatives.
no subject
Date: 2010-09-26 07:26 pm (UTC)