11 August 2013
Publishers and authors want math to be at least as straightforward to produce as it is in a print book with high-quality equation presentation. This post explains the various options for creating MathML using IGP:Digital Publisher and delivering it to all platforms and devices using AZARDI:Content Fulfilment. The concepts and approaches also apply generally, not just to our products.
Because most ePub 3 reading systems are WebKit-based, there is a voice that says MathML cannot be done. That voice is mostly right. It cannot be done easily. With Google branching Blink from WebKit and removing MathML completely, expect to see no native MathML on an Android device near you soon.
If you want to understand the state of mathematics production and browser support check out this very long discussion about MathML in Mozilla on Google Groups.
Wonderfully and wickedly the HTML5 specification supports both MathML and SVG as part of the family. No poor-cousin namespace stuff. This is very good of course, but Firefox is the only browser that has anything like reasonable MathML support.
Here is some simple MathML; the beloved and "demonstration-overworked" quadratic equation. To set the tone for this article it is presented in MathML, SVG and PNG.
Here the MathML is shown at the default page font size and at font-size 2em. If you hover over the small equation it will become big, red and bold. Firefox has the STIX fonts installed by default so there is nothing else to do except "enjoy the maths".
If you are using Firefox you will see the MathML equation. If you are using any other browser you will see whatever the browser wants you to see. Generally a string of text.
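For readers whose browser shows only a string of text, the markup behind the equation looks roughly like the fragment below. This is a minimal sketch, not the exact markup used on this page: the names `QUADRATIC_MATHML` and `MATHML_NS` are ours, and the Python wrapper only confirms that the fragment is well-formed XML in the MathML namespace.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Presentation MathML for x = (-b ± sqrt(b^2 - 4ac)) / 2a  (illustrative markup)
QUADRATIC_MATHML = f"""
<math xmlns="{MATHML_NS}" display="block">
  <mi>x</mi>
  <mo>=</mo>
  <mfrac>
    <mrow>
      <mo>-</mo><mi>b</mi>
      <mo>&#xB1;</mo>
      <msqrt>
        <msup><mi>b</mi><mn>2</mn></msup>
        <mo>-</mo>
        <mn>4</mn><mi>a</mi><mi>c</mi>
      </msqrt>
    </mrow>
    <mrow><mn>2</mn><mi>a</mi></mrow>
  </mfrac>
</math>
"""

# Parsing fails loudly if the fragment is not well-formed XML.
root = ET.fromstring(QUADRATIC_MATHML.strip())

# ElementTree reports namespaced tags as "{namespace}localname".
element_names = [el.tag.split("}")[1] for el in root.iter()]
print(element_names)
```

The same fragment can be dropped into an XHTML5 page as-is. What a given browser or reading system then does with it is the subject of the rest of this post.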
The SVG was generated from the MathML equation. It is resizable and is shown here small and large. It can use SVG outlines (which makes for large and complicated SVG files), or include font support (best for file size and rendering speed). In the latter case you have to make the font available in the SVG.
The advantage of SVG is that it can scale with a reading system. It will also work for ePub2 if the reading system supports SVG. Many earlier Android readers don't.
Any equation image has to be made larger and sized down to work even reasonably well. It is dependent on the reading device's anti-aliasing for final quality but is nearly always going to have that soft edge.
The top image is a 1:1 screen grab of a browser rendition. If this was taken from a PDF it would not appear as sharp. The next image is the same as the image above but has been scaled up to represent some sort of scaling in a reading system. You can immediately see the anti-aliasing artifacts. These will be clearly visible on 160ppi devices but may be less of a problem on 300ppi + devices.
Finally this is a PNG image that has been created at twice the standard text size and scaled proportionately with the image above. You can immediately see the clarity improvement and minimization of anti-aliasing artifacts.
If you were to package this MathML into an ePub3 and view it in a reader other than AZARDI you will see nothing coherent. But this doesn't mean excellent presentation Math cannot be done.
Finally (but certainly not least) here is native MathML rendered by MathJax. Right click on the equation to explore the various presentation options. You should be able to view this in a very wide range of browsers.
There are two challenges for the publisher who needs maths in an ePub3:
Challenge 1. Understand and implement the production options and methods to create valuable and high-quality MathML that does not have to be constantly changed into the future. In the context of IGP:Digital Publisher that MathML needs to be available in print as well as for any digital reading environment or reading system.
Challenge 2. Address the fact that AZARDI is the only ePub3 reading system that allows comprehensive presentation of MathML. That means understanding all other ePub3 reading systems and how to make content-processing delivery decisions for MathML.
We will address the e-book delivery system issues before the production issues simply because the production method has to be able to deliver maths for a wide range of reading systems.
Understanding the reading system limitations should not define the production method, but it does exactly define the deliverable format packaging requirements. (The IGP:Digital Publisher/FoundationXHTML approach is to produce to the highest tagging standards and then process down for device and channel dumbness. This saves a lot of time and money.)
With ePub2 books math equations have typically been handled as inline and block images, generally very badly (with some exceptions). This is definitely not the way to do it in 2013!
|MathML||Native MathML rendering works if a reading system substantially supports MathML. AZARDI built on Firefox is currently the only real option with a full pass in the ACID 2 test, and 60% in MathML ACID 3.||A reading system must be built on Mozilla like AZARDI Desktop. It also means Android and Apple devices are out. IOS has 15% MathML support but it is patchy and unworkable as a product delivery strategy.|
|PNG or JPG||Maths as images will work in all reading systems that support images and CSS positioning. How the images are produced is very important. Always use PNG images for text (256 colors). The files are smaller and there are no compression artifacts.||The books generally look cheap and disgusting (as illustrated above). Generally done by creating equations from screen-grabs of print PDFs. Therefore it does not work that well for digital. Done well it can be excellent when required.|
In 2013 the digital reader-franca format for math is SVG. It is a compromise. But it is better than PNG or (shudder) JPG images.
One of the main reasons AZARDI Desktop is built on Mozilla XULRunner (Firefox) is the excellent MathML support. Fortunately the MathJax project is up to steam and we are able to give similar presentation to MathML in the mobile versions of AZARDI (which are currently based on WebKit). But we cannot do anything very interactive with the mobile math, and accessibility is compromised.
The combo native/MathJax MathML distribution and delivery approach eliminates a significant set of production problems. It's as good as it gets. This is also a fully working strategy if you are publishing additional resources to online webpages.
MathJax is a great project that compensates for the lack of MathML support in WebKit (Safari, Chrome et al.) and Internet Explorer. The big plus is that your content is still delivering MathML.
MathJax must be installed and controlled by you. This is done by either:
Some independent Android ePub3 reading systems have MathJax built in. AZARDI Mobile for Android and iOS, for example, uses MathJax internally.
The obvious downside for using MathJax as part of an e-book package is a lot of overhead files and packaging when using MathJax just to create what are ultimately SVG equations. MathJax is also still a dynamic project and is constantly updated. A book package becomes a snapshot of the current version and has to be maintained for packaging.
Ideally reading systems based on Webkit will incorporate MathJax directly into the reader and then use the manifest math property to apply MathJax as required. That means MathJax can be updated with the Reading system software rather than on a book by book basis.
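The manifest side of this can be sketched as follows. ePub 3 asks for `properties="mathml"` on the manifest item of any content document that contains MathML, and a packaging step can work that out by scanning each XHTML file. The function names and the sample page below are hypothetical illustrations, not part of IGP:Digital Publisher or any reading system.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def needs_mathml_property(xhtml_text: str) -> bool:
    """Return True if the XHTML content document contains any MathML element."""
    root = ET.fromstring(xhtml_text)
    return any(el.tag.startswith("{" + MATHML_NS + "}") for el in root.iter())


def manifest_item(item_id: str, href: str, xhtml_text: str) -> str:
    """Build an OPF manifest <item>, adding properties="mathml" when needed."""
    item = ET.Element("item", {
        "id": item_id,
        "href": href,
        "media-type": "application/xhtml+xml",
    })
    if needs_mathml_property(xhtml_text):
        item.set("properties", "mathml")
    return ET.tostring(item, encoding="unicode")


# Hypothetical one-paragraph content document with a single inline equation.
page = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>Area: <math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>A</mi><mo>=</mo><mi>&#x3C0;</mi><msup><mi>r</mi><mn>2</mn></msup>"
    "</math></p></body></html>"
)
print(manifest_item("ch01", "ch01.xhtml", page))
```

A reading system that carries its own MathJax could read this property and load MathJax only for the pages that need it, which is the arrangement described above.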
The MathJax site has a useful list of MathML support and capabilities in various ePub reading systems. This further highlights the problem for production and delivery to specific reading systems.
When a reading system uses MathML native rendering it is possible to introduce interactivity into equations for the insertion of values, etc. However this is highly dangerous and uncharted waters. Proceed with caution all ye who go there!
There is a lot of instructional interactive value additions that can be added to natively rendered MathML. We are of course too cowardly to walk down this dark untrodden path without our hand being held, or a lot more experience than we currently have. Just thinking out loud!
There are two types of MathML: Presentation and Content. All of the MathML we see today is presentation MathML. Content MathML allows equations to be accessible, with reading and dyslexic syntax highlighting being two of the accessibility options opened up.
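The difference is easiest to see on a tiny expression. The sketch below writes x squared both ways: the presentation form describes layout (a base with a superscript), while the content form describes meaning (the power operator applied to x and 2). The variable and function names are ours, chosen for illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Presentation MathML: says how x squared should look.
PRESENTATION = f'<math xmlns="{MATHML_NS}"><msup><mi>x</mi><mn>2</mn></msup></math>'

# Content MathML: says what x squared means (apply "power" to x and 2).
CONTENT = f'<math xmlns="{MATHML_NS}"><apply><power/><ci>x</ci><cn>2</cn></apply></math>'


def local_names(markup: str) -> list[str]:
    """List element names in document order, without the namespace part."""
    return [el.tag.split("}")[1] for el in ET.fromstring(markup).iter()]


print(local_names(PRESENTATION))  # ['math', 'msup', 'mi', 'mn']
print(local_names(CONTENT))       # ['math', 'apply', 'power', 'ci', 'cn']
```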
Moving forward we hope to have to consider content MathML in e-book reading systems. That appears to be some time off. Meanwhile there are Mozilla plug-ins that let it happen now.
It is possible to XSL process presentation MathML to content MathML (and vice-versa) which means what we create today using presentation MathML will be usable into the future with a little processing frenzy.
The ePub 3 specification has a switch mechanism designed to allow experimenting with the insertion of XML in content and ostensibly to support ePub 2 backward compatibility. With it, both the image and the MathML can theoretically be included. Sounds good? Read on!
The problem with this definition is it doubles the production challenges for a publisher. Now MathML is not the only challenge. You also have to know what reading systems support MathML AND if they support Switch and package and test accordingly. Your production costs and complexity escalate horribly and needlessly.
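To make the doubled testing burden concrete, here is roughly what the mechanism looks like, with a small simulation of the choice a reading system is supposed to make. The `epub:switch`, `epub:case`, `epub:default` and `required-namespace` names come from the ePub 3 specification. The image path and the `select_branch` function are hypothetical, and the selection logic is our simplified reading of the intended behaviour, not code from any real reading system.

```python
import xml.etree.ElementTree as ET

EPUB_NS = "http://www.idpf.org/2007/ops"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# One equation, packaged twice: MathML for capable readers, an image for the rest.
SWITCH = f"""
<epub:switch xmlns:epub="{EPUB_NS}" xmlns="http://www.w3.org/1999/xhtml">
  <epub:case required-namespace="{MATHML_NS}">
    <math xmlns="{MATHML_NS}"><mi>x</mi><mo>=</mo><mn>1</mn></math>
  </epub:case>
  <epub:default>
    <img src="images/x-equals-1.png" alt="x = 1"/>
  </epub:default>
</epub:switch>
"""


def select_branch(switch_markup: str, supported_namespaces: set[str]) -> str:
    """Return the local name of the element a reader would end up rendering."""
    switch = ET.fromstring(switch_markup.strip())
    # Take the first case whose required namespace the reader claims to support.
    for case in switch.findall(f"{{{EPUB_NS}}}case"):
        if case.get("required-namespace") in supported_namespaces:
            return case[0].tag.split("}")[1]
    # Otherwise fall back to the default branch.
    default = switch.find(f"{{{EPUB_NS}}}default")
    return default[0].tag.split("}")[1]


print(select_branch(SWITCH, {MATHML_NS}))  # math
print(select_branch(SWITCH, set()))        # img
```

Every equation now exists in two forms, and each form has to be tested on every reading system that might pick it.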
"The epub:switch element allows an XML fragment to be conditionally inserted into the content model of an XHTML Content Document." IDPF ePub 3 Specification.
The ePub 3 switch "strategy" is not a working or practical solution for education textbooks (or probably any content). Do not use XML in a book unless you want to guarantee it will not work in most reading systems, or you are distributing content with your own reading system. In 2013 there are always alternatives to arcane XML and namespaces at the delivery and presentation step of digital content publishing.
In 2013 Education textbook publishers can never require ePub 2 backward compatibility because neither the format nor available reading systems will provide the features required.
Just forget the epub:switch element. If your consultant recommends it. Fire them.
Since no ePub 3 reading systems support epub:switch, even thinking about it takes us back to ground zero. While the IDPF will keep talking about what is happening in the reading systems of the future, (feet on the ground, team) today there is no reading system support for this. It is also an unnecessary feature and is addressed adequately by forward-looking HTML5 techniques.
AZARDI doesn't support switch because it supports ePub2 and ePub3 natively and ignores arbitrary non-HTML5 features. Backward compatibility is not required for education content because the ePub2 reading systems in the marketplace don't offer the presentation options needed for real ePub3 textbooks that will outperform Inkling, iPublish, etc.
Ideally the maths is created, maintained and available as MathML. There are two primary types of content that publishers will probably have to deal with: 1) Backlist-retrodigitization and 2) frontlist-new production.
Typically publishers are mostly challenged by the conversion of backlist content. The perceived costs and complexities are what stop many digital content strategies when it comes to maths textbooks.
In this digital content world the maths needs to be MathML, especially if HTML5 is your target distribution format (whether directly to the Internet or via a reading system package such as ePub3 or E0). The exception to this is basic arithmetic that needs MathML extensions to work. As amazing as it seems, MathML does not easily let us create long-anything-arithmetic.
It is highly unlikely any existing backlist maths textbooks were created with MathML. Typically they would have been created using an MS Word Math plugin, or in a scripted environment like LaTeX. Publishers even go straight from word-processor manuscripts to print PDF because typesetting is so difficult. On the desktop maths production is specialist plug-ins all the way.
It is highly likely there is a processor to convert your word-processor file or LaTeX file to something that can be used directly as an XHTML input, or for extraction of MathML from the processed output. There has been a lot of work done around the world to address these issues.
Depending on how your maths has been produced, you have a number of different paths to convert the source content into MathML. It may have been sourced from a print production environment, or from one of the math plug-ins in MS Word or, even better, LibreOffice, which has an XHTML + MathML 2.0 export option. These generally allow the export of math as XHTML.
Word has an excellent plug-in to export Daisy accessible-ready XHTML that converts proprietary plug-in math stuff in a word-processor to MathML. This can then be directly imported into IGP:Digital Publisher (there is a Daisy XHTML import option).
The problem is, depending on how the document was authored, you may see things like the transposition of inline math and block math (the danger of working in a WYSIWYG tool). There is usually some work to sort out word-processor XHTML/MathML imports unless the source documents/manuscripts have been edited with professional controls.
If your production has been outsourced and the PDFs created using tools about which you as a publisher have no knowledge, you may have the "create the equations again" challenge. It's a fact of digital content life!
There is another issue with MathML in the production environment that is not largely considered from a print perspective. It should be a very serious consideration for education publishers. MathML has high reuse potential in many education areas. After all, how many times do you need to create the quadratic equation? Make sure you change your thinking on MathML and see it as an asset, not just a production by-product.
We are bundling a large library of pre-defined MathML equations covering K-12 Maths and Physics in the Q4 release of IGP:Digital Publisher for Education Publishers, along with the AIE (AZARDI Interactive Engine) and the ALL-IN (AZARDI Learning Library-Interactive Now).
In IGP:Writer (the IGP:Digital Publisher editing interface) MathML template equations will be inserted directly into a document. Variables and operators can be modified and edited directly in the interface.
For textbooks that need math in general and MathML in particular, this suite of complementary tools will considerably reduce the cost of textbook production and time to market, while making compelling products and quality assurance a lot easier.
Because IGP:Digital Publisher can process: MathML, Math as SVG, Math as PNG (shudder) or package MathJax; all as format packaging/generation options, the production cost and complexity threats and problems of using MathML are eliminated.
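The packaging decision itself is simple once the capabilities of the target reading system are known. The sketch below is a hypothetical illustration of the "produce high, process down" idea, not the IGP:Digital Publisher implementation: `ReaderProfile`, `choose_math_format` and the example profiles are assumptions made up for this example, and the capability flags are simplified from the discussion in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReaderProfile:
    """Hypothetical capability flags for one target reading system."""
    name: str
    native_mathml: bool = False
    mathjax: bool = False
    svg: bool = False


def choose_math_format(reader: ReaderProfile) -> str:
    """Pick the richest math format the target can render, falling back in order."""
    if reader.native_mathml:
        return "mathml"   # deliver the source MathML untouched
    if reader.mathjax:
        return "mathjax"  # still MathML in the content, rendered by MathJax
    if reader.svg:
        return "svg"      # scalable compromise
    return "png"          # last resort: images, produced large and sized down


# Illustrative profiles only; real capabilities must be tested per reader.
targets = [
    ReaderProfile("AZARDI Desktop", native_mathml=True, svg=True),
    ReaderProfile("AZARDI Mobile", mathjax=True, svg=True),
    ReaderProfile("Generic WebKit ePub3 reader", svg=True),
    ReaderProfile("Early Android ePub2 reader"),
]
for target in targets:
    print(target.name, "->", choose_math_format(target))
```

The point of the ordering is that the master content stays MathML in every case. Only the packaged output changes per channel.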
The issue is then about processing/packaging and getting it into the ePub3 distribution format in a way that is going to be usable within the limitations of the reading system. This may change in the future and the digital content reading system diaspora that ePub 3 has created will mean specialist content will inevitably have to be handled in different ways for different distribution channels.
The same content should also be available for print. It is in fact considerably easier to get MathML to print, Online and to ePub 3 or E0 from XHTML source content than through desktop programs such as InDesign.
IGP:Digital Publisher uses PrinceXML for XHTML/CSS3 to print production. While PrinceXML does not support MathML, the MathML is converted to SVG by the Format On Demand processor for the output PDF.
Textbook, academic and other publishers who need mathematics in their books need to ensure the mathematics production is carried out in an environment that makes it reusable, processable and flexible. That means processing it to, or creating it as, MathML; making sure it is available for reuse and editing at any time.
How you get to MathML doesn't matter as much as having maths tagged in a reliable way that can be instantly processed to a deliverable format, and is valuable into the future. The faster you make the transition the better. Education publishers can put it off, but it is going to be a business imperative sooner rather than later.
The release of the ALL-IN Mathematics library in Q4 will make a world of difference to education publishers for both their print and reading systems content, especially for front-list textbook and resource-book production. We will be releasing a demonstration long-arithmetic book shortly to show how this all comes together easily. Maths books always need more work than most other genres, but with the right approach and tools it can be even easier than yesterday's desktop production challenges.
In IGP:Digital Publisher we focus on using MathML as the only real serious production, archiving and presentation format for education maths digital content.
Posted by Richard Pipe