22 September 2013
A major outcome of the recent AAP ePub3 conference was ePub pagebreaks that match the original source print-book from which a digital book was created. This is what we did.
Two criteria were identified:
Although there is a lobby that doesn't particularly like seeing print-book page numbers in eBooks, they are as good a linear reading reference as anything. In particular, they add a needed positional reference for tag and linked indexes.
We have used the pagebreak property since 2008 (long before ePub3) for both ePub2 index linking and for textbook WebApps. In addition to accessibility, textbook navigation in a blended learning environment (print and eBooks) becomes easy. When the teacher says "Turn to page 87" everyone gets to the correct place instantly.
In all our retrodigitization we always capture pagebreaks and have done since 1999, both in earlier XML production systems and in the current XHTML5 system. The pagebreak selector can be used in several ways inside IGP:Digital Publisher.
IGP:Digital Publisher "Insert Pagebreaks" is brought to you courtesy of MOD51, the same technology that delivers awesome typography reporting AND importing to XHTML from PDF.
IGP:Digital Publisher produces both a print/rgb PDF and various eBook packages from the same XHTML source. Inserting page-breaks back into the XHTML after the final PDF has been generated was a tedious manual process until MOD51 matured. Here is an earlier article on the marvel that MOD51 is and the digital content problems it solves.
With IGP:Digital Publisher, once you have created your perfectly kerned and tracked PDF for a print edition, you can now create pagebreaks in your IGP:FoundationXHTML (FX) (and hence in your ePub of any version).
FX always contains the pagebreak element as a span within content without exception. It is a map to a specific print book pagination and must not affect the value of the content. This is what it looks like.
<span class="pagebreak-rw">87</span>
These selectors have the epub:type="pagebreak" attribute inserted when an ePub3 format is generated. That is a little more verbose and looks like this in the final ePub3 package. It is also a link target.
<span class="pagebreak-rw" epub:type="pagebreak">87</span>
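As a rough illustration of that generation step, here is a minimal Python sketch that adds the attribute to the FX span shown above. It is not the IGP:Digital Publisher implementation: the function name `add_epub_type` is hypothetical, and matching the opening tag as an exact string is an assumption that only holds for spans written exactly as in the example. A production pipeline would more likely use an XML parser.

```python
import re

# Assumption: pagebreak spans appear exactly as in the FX example above,
# with class="pagebreak-rw" as the only attribute.
PAGEBREAK_OPEN = re.compile(r'<span class="pagebreak-rw">')


def add_epub_type(xhtml: str) -> str:
    """Add epub:type="pagebreak" to every plain pagebreak-rw span.

    Spans that already carry the attribute do not match the pattern,
    so running the function twice gives the same result as running it once.
    """
    return PAGEBREAK_OPEN.sub(
        '<span class="pagebreak-rw" epub:type="pagebreak">', xhtml
    )


fx = '<p>End of one page<span class="pagebreak-rw">87</span> and on.</p>'
epub3 = add_epub_type(fx)
```

The page number stays as the text content of the span, so the print pagination it maps to is unchanged.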
It works like this:
Why would there be missing pagebreaks? Content is infinitely complex, so there can always be conditions where any process fails with real-world digital content. It doesn't happen often, but it is a reality. The Page Break Insertion processor maintains an inventory of the page boundary matches it inserted and verified. Any missing pagebreak numbers can then be easily reported.
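The idea of reporting missing page numbers from an inventory can be sketched in a few lines of Python. This is an illustrative check over finished XHTML, not the Page Break Insertion processor itself. The function name `missing_pagebreaks` and the `first_page` and `last_page` parameters are hypothetical, and the sketch assumes Arabic page numbers only (roman-numeral front matter is out of scope).

```python
import re
from typing import List

# Matches pagebreak spans with or without the epub:type attribute and
# captures the page number. Assumes Arabic numerals only.
PAGEBREAK = re.compile(r'<span class="pagebreak-rw"[^>]*>(\d+)</span>')


def missing_pagebreaks(xhtml: str, first_page: int, last_page: int) -> List[int]:
    """Return the page numbers in [first_page, last_page] with no pagebreak span."""
    found = {int(number) for number in PAGEBREAK.findall(xhtml)}
    return [page for page in range(first_page, last_page + 1) if page not in found]


sample = (
    '<p>a<span class="pagebreak-rw">85</span>b'
    '<span class="pagebreak-rw" epub:type="pagebreak">86</span>c'
    '<span class="pagebreak-rw">88</span>d</p>'
)
report = missing_pagebreaks(sample, 85, 89)
```

For the sample above, which has spans for pages 85, 86 and 88, the report lists pages 87 and 89 as missing.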
We have handled problem areas such as columns, footnotes and floating blocks such as Figures, Illustrations and Tables. Page-breaks are inserted relative to the galley text flow rather than to any media content block items.
Because IGP:FoundationXHTML is semantically driven, when an ePub3 file is created a very rich set of epub:type properties is generated and included.
It is probably a truism that an ePub3 generated by IGP:Digital Publisher is the most specification-compliant package available from any production method or system.
Check out this article on the epub:type mapping richness in an ePub3 generated by IGP:Digital Publisher.
So pagebreaks need to be used in Reading Systems. Those that support epub:type="pagebreak" all use them differently.
AZARDI uses them very explicitly and "in your face". This is for academic and education content where referencing a page number can be very important.
AZARDI pagebreaks packaged from IGP:Digital Publisher are a little unusual in that the page numbers are grouped by book section. Regrettably, silly limitations in the ePub3 spec don't allow the section titles to be inserted in a page navigation structure. This is discussed in detail here.
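For reference, the page navigation structure in an ePub3 navigation document is a nav element with epub:type="page-list" holding a single flat ordered list, which is why section titles have no place in it. The Python sketch below builds such a list from (label, href) pairs. The function name `build_page_list`, the file name `ch05.xhtml` and the `#page87` id format are hypothetical and are not taken from IGP:Digital Publisher output.

```python
from html import escape
from typing import Iterable, Tuple


def build_page_list(pages: Iterable[Tuple[str, str]]) -> str:
    """Build a flat ePub3 page-list nav from (label, href) pairs in reading order."""
    items = "\n".join(
        f'    <li><a href="{escape(href, quote=True)}">{escape(label)}</a></li>'
        for label, href in pages
    )
    return (
        '<nav epub:type="page-list">\n'
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</nav>"
    )


# Hypothetical targets: each href points at the id of a pagebreak span.
nav = build_page_list([
    ("87", "ch05.xhtml#page87"),
    ("88", "ch05.xhtml#page88"),
])
```

Each entry links to a pagebreak span in the content, so a reading system can jump to the print page a reader asks for.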
On the AZARDI Interface there is a View Page Numbers button that toggles through four states starting with no visible page break:
Finally you can click the button one more time and there are no page breaks.
To see this in action check out any of the Guy de Maupassant ePub3 sample books available here. The ePub3 page numbers relate to the A5 Print PDF.
Our pagebreak automation journey appears to be over for the time being.
IGP:Digital Publisher automatically inserts pagebreaks into IGP:FoundationXHTML documents for many purposes, one of which is pagebreaks in ePub3 for accessibility and page navigation. We also added the Source ISBN field to the standard packaging metadata fields.
In many ways this demonstrates how a content-centric production approach provides flexibility, adaptability and productivity not available with a design-tool-driven or XML-first approach.
Posted by Richard Pipe