Unraveling the mysteries of an EPUB — warning, still under construction!

An EPUB primer

What is an e-book?

E-books come in a number of formats: PDF, EPUB, MOBI, and AZW.

  • The most common format is EPUB (the final format for most e-book readers)
  • Amazon’s MOBI and AZW formats can be converted from an EPUB
    • I upload EPUBs at the KDP site and retrieve MOBIs to proof
  • PDF (Portable Document Format) is a standard format developed by Adobe. It’s been around since 1993.
  • EPUB, MOBI and AZW are based on HTML (XHTML), the same language used for Web sites. The chapters of the book are bundled with a number of other files that provide the parameters—such as font type and size, table of contents, paragraph styles.

The EPUB

EPUB is short for “electronic publication.” It’s a standard file format used in the production of e-books. Unlike a fixed digital format like a PDF, an EPUB allows content reflow based on screen size or font size.

An EPUB is a ZIP file that bundles content. Instead of the .zip extension, it uses the file extension .epub. It utilizes markup in HTML (XHTML), XML, and CSS, and may contain images, audio, and video files.
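
Because an EPUB is just a ZIP file under another name, any ZIP library can peek inside it. Here is a minimal Python sketch using the standard zipfile module. The archive it builds is a stand-in with made-up entries, not a valid book; with a real EPUB you would pass its path to the (hypothetical) list_epub_entries helper instead.

```python
import io
import zipfile


def list_epub_entries(epub_file):
    """Return the names of the files stored inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


# Build a tiny stand-in archive in memory so the sketch runs without a real book.
# These entry names are illustrative assumptions; a real EPUB will have its own.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
    archive.writestr("OEBPS/content.opf", "<package/>")

entries = list_epub_entries(sample)
for name in entries:
    print(name)
```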

EPUB grew out of an earlier standard, the Open eBook Publication Structure (OEBPS), a name that survives in the OEBPS folder found in many EPUBs. Most EPUBs are version 2.0.1 or 3.0. The latter allows more precise layouts and specialized formatting. Keep in mind, however, that older devices may not support the advanced features of EPUB 3.0.

The information in this document will focus on EPUB 2.0.1.

There are three specifications for EPUBs:

  • The Open Container Format (OCF) specifies how the files are packaged inside the EPUB’s ZIP container
  • The Open Packaging Format (OPF) defines the package file (content.opf), which lists the contents of the e-book and its metadata
  • The Open Publication Structure (OPS) specifies the markup (XHTML, CSS) used for the content of the e-book

Tools of the trade

Calibre

  • Free
  • E-book reader—provides “look and feel”
  • E-book converter
  • E-book editor
  • I use Calibre frequently to convert from one e-book type to another, and I like to use the e-book reader when I’m checking my formatting

Jutoh

  • E-book converter
  • E-book creator
  • You can format your book without worrying about the XHTML, XML, etc. Extremely user-friendly.

Sigil

  • Free
  • E-book editor
  • This is my “go-to” tool for XHTML editing.

Adobe Digital Editions

  • Free
  • E-book reader—provides “look and feel”

Kindlegen

  • Free
  • A program you can download to convert your EPUB into a MOBI
  • I use this tool to build MOBIs that I then load onto my Kindle and iPad to check formatting

Kindle Previewer

  • Free
  • E-book reader—simulates “look and feel” for a variety of devices
  • Handles EPUBs, MOBIs, HTML
  • I also use this tool to evaluate my format over a variety of e-book readers

Editors

In theory, you can use Notepad to edit all of the components in your EPUB, but it’s a primitive approach. There are other free/inexpensive options:

 

Tag basics

To understand what’s going on in your EPUB, you need to understand tags.

  • Tags are surrounded by angle brackets: < >
  • There’s usually a start tag <tag> and an end tag </tag>
  • Tags must be properly nested; they can’t overlap

Correct: <i>I like to live <b>boldly.</b> How about you?</i>


Hot mess: <i><h1>Your</i> piece of pie? I don’t think so!</h1>

 


Sigil’s attempt to fix:

[Images: Sigil code, Sigil HTML]

But the first set of italic tags isn’t necessary.

 

<h1><i>Your</i> piece of pie? I don’t think so!</h1>
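
If you want to test nesting without opening an e-book editor, an XML parser will do it for you, since XHTML has to be well-formed XML. This is a rough sketch using Python's standard xml.etree.ElementTree; the is_well_formed helper is my own name for illustration, and it only checks that tags are closed and nested, not that the markup is valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the markup parses as XML (tags closed and properly nested)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The three snippets from the text above (with plain apostrophes).
correct = "<i>I like to live <b>boldly.</b> How about you?</i>"
hot_mess = "<i><h1>Your</i> piece of pie? I don't think so!</h1>"
fixed = "<h1><i>Your</i> piece of pie? I don't think so!</h1>"

print(is_well_formed(correct))   # True
print(is_well_formed(hot_mess))  # False
print(is_well_formed(fixed))     # True
```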

 


Getting to the heart of the matter

Always make a back-up copy before opening your EPUB!

  • Change the extension from .epub to .zip and—voila!—you can open it!
  • You can use one of the editors listed above (or Notepad) to edit the various files.
  • To turn it back into an EPUB:
    • Highlight all of the components except the mimetype file and zip them
    • Move the mimetype file into the new ZIP file and then change the extension back to .epub.
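
The re-zipping step can also be scripted. Below is a minimal sketch, assuming Python's standard zipfile module; build_epub and the sample file contents are hypothetical names and placeholders, not a real book. It writes the mimetype entry first and uncompressed (the container format expects it to be the first entry in the archive), then compresses everything else.

```python
import io
import zipfile


def build_epub(files, output):
    """Write `files` (archive path -> bytes) to `output` as an EPUB-style ZIP.

    The mimetype entry goes in first and uncompressed; the rest is deflated.
    """
    with zipfile.ZipFile(output, "w") as archive:
        archive.writestr("mimetype", files["mimetype"],
                         compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            if name != "mimetype":
                archive.writestr(name, data,
                                 compress_type=zipfile.ZIP_DEFLATED)


# Placeholder contents for illustration only; these are not valid EPUB files.
book = {
    "META-INF/container.xml": b"<container/>",
    "OEBPS/content.opf": b"<package/>",
    "mimetype": b"application/epub+zip",
}

buffer = io.BytesIO()
build_epub(book, buffer)

# Read the archive back to confirm the order and compression of each entry.
with zipfile.ZipFile(buffer) as archive:
    infos = archive.infolist()

for info in infos:
    kind = "stored" if info.compress_type == zipfile.ZIP_STORED else "deflated"
    print(info.filename, kind)
```

To write a file on disk instead of an in-memory buffer, pass a path such as "mybook.epub" as the output argument.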

 

Dissecting an EPUB

Mimetype (mandatory)

A plain-text file at the top level of the archive that identifies the file as an EPUB. It cannot be compressed.
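
The mimetype file holds a single line of text: application/epub+zip. If an e-book reader rejects your EPUB after you have re-zipped it, this file is a good first suspect. Here is a hedged sketch of a quick check in Python; check_mimetype and make_archive are hypothetical helpers, and the test archives contain nothing but the mimetype entry.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype(epub_file):
    """Return a list of problems found with the archive's mimetype entry."""
    problems = []
    with zipfile.ZipFile(epub_file) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            return ["mimetype is not the first entry"]
        if entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        if archive.read("mimetype") != EPUB_MIMETYPE:
            problems.append("mimetype has unexpected content")
    return problems


def make_archive(compress_type):
    """Build an in-memory test archive holding only a mimetype entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", EPUB_MIMETYPE, compress_type=compress_type)
    return buffer


good = check_mimetype(make_archive(zipfile.ZIP_STORED))
bad = check_mimetype(make_archive(zipfile.ZIP_DEFLATED))
print(good)  # []
print(bad)   # ['mimetype is compressed']
```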
 

META-INF (mandatory)

 

  • Folder at the top level of the archive
  • The file container.xml is required
  • Other files are optional:
    • encryption.xml
    • manifest.xml (manifest for container contents)
    • metadata.xml (for container-level metadata)
    • rights.xml (reserved for digital rights management (DRM) information)
    • signatures.xml (holds digital signatures of the container and its contents)

 

container.xml (mandatory)

 

  • Located in the META-INF folder; directs e-book readers to the location of the content.opf file.
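
To see how a reader follows that pointer, here is a small Python sketch that pulls the .opf location out of a container.xml. The sample text is what a typical container.xml looks like (minus the XML declaration on the first line); the path in yours may differ, and find_opf_path is a name made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml in the Open Container Format.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_opf_path(container_xml):
    """Return the full-path attribute of the first rootfile entry."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


# A typical container.xml; the full-path value is an assumption for illustration.
sample_container = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

opf_path = find_opf_path(sample_container)
print(opf_path)  # OEBPS/content.opf
```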

 

encryption.xml (optional)


Located in the META-INF folder, it works with some embedded fonts.

  • You may need to select different fonts if you remove it
  • Nook doesn’t allow it
  • I had trouble using it in my Smashwords EPUB, so I removed it

 

com.apple.ibooks.display-options.xml (optional)


  • Located in the META-INF folder; works with embedded fonts for an iBooks EPUB
  • Consists of a set of display options that tell iOS devices how to present the content
  • This may be a deprecated file (included for older iBooks, but not used by newer ones)

 

OEBPS (mandatory)

 

  • Folder, at the highest level
  • Named after the overall set of specifications: Open eBook Publication Structure (OEBPS). It contains all of the e-book’s content (text, images, etc.).

Inside OEBPS:

You can have folders such as:

  • CSS or Styles (contains Cascading Stylesheets)
  • Fonts (location for embedded fonts)
  • Images (such as author picture)
    • Acceptable image formats: JPEG, PNG, GIF, SVG
  • Text (XHTML/HTML files)
  • Audio
  • Video
  • content.opf (mandatory): file that describes all of the contents in the EPUB
  • toc.ncx (mandatory for EPUB 2.0.1): navigation control file (special TOC in e-book reader)

 

Fonts (optional)

 

Folder that contains the fonts you want to embed into your EPUB. This is required for fonts not readily available on e-book readers. Keep in mind that not all e-book readers will utilize embedded fonts.
 

Styles (optional)

 

  • Folder for Cascading Stylesheets
  • This folder is optional; you can insert the stylesheets directly into the OEBPS folder.
  • Cascading Stylesheets have a file extension of .css
  • Stylesheets are optional, can have various names, and you may choose to include more than one.

 

Here’s an example with multiple stylesheets that reside in OEBPS. The developer decided to separate the overall page style from the paragraph styles. I’ve found numerous examples of this in books converted with Calibre.
 
[Image: multiple stylesheets in the OEBPS folder]

Text (optional)

 
 

  • Folder that holds all of the XHTML/HTML files
  • I recommend that you break your manuscript into a separate file for each chapter. This can save a lot of headaches.
  • I put each front matter and back matter component into its own file.
  • Avoid spaces in your filenames

Again, I like to use descriptive file names:
 
[Image: Text folder with descriptive file names]

Sample of converted XHTML/HTML with generic filenames:
 
[Image: Text folder with generic file names]

content.opf (mandatory)

 

Metadata

Contains the information about your book, such as:

  • Unique ID
  • Title
  • Author (Creator)
  • Description
  • Publisher
  • ISBN
  • Language
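
In EPUB 2.0.1 most of these fields are stored as Dublin Core elements (dc:title, dc:creator, and so on) inside the metadata section of content.opf. The sketch below reads them with Python's standard XML parser. The sample book, its author, and the read_metadata helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

FIELDS = ["identifier", "title", "creator", "description", "publisher", "language"]


def read_metadata(opf_xml):
    """Return a dict of the Dublin Core fields present in the OPF metadata."""
    metadata = ET.fromstring(opf_xml).find("opf:metadata", NS)
    result = {}
    for field in FIELDS:
        element = metadata.find(f"dc:{field}", NS)
        if element is not None:
            result[field] = element.text
    return result


# Hypothetical, stripped-down content.opf; every value here is made up.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
    unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="BookId">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:creator>Jane Author</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest/>
  <spine/>
</package>"""

info = read_metadata(sample_opf)
for field, value in info.items():
    print(f"{field}: {value}")
```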

 

Manifest

Lists every component in the OEBPS folder except the .opf file.

  • Navigation control file (toc.ncx)
  • Stylesheets
  • Images
  • XHTML/HTML files
  • Font files

 


Spine

Lists the order of the XHTML/HTML files
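
The spine doesn't name files directly. Each itemref points to the id of a manifest item, and the manifest item supplies the actual file name. Here is a rough Python sketch of that lookup; the file names and the reading_order helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def reading_order(opf_xml):
    """Return the manifest hrefs in the order given by the spine."""
    root = ET.fromstring(opf_xml)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    return [
        hrefs[ref.get("idref")]
        for ref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


# Hypothetical OPF fragment. The manifest is deliberately out of order to show
# that only the spine decides the reading sequence.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter02" href="Text/chapter02.xhtml" media-type="application/xhtml+xml"/>
    <item id="title" href="Text/title.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter01" href="Text/chapter01.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="title"/>
    <itemref idref="chapter01"/>
    <itemref idref="chapter02"/>
  </spine>
</package>"""

order = reading_order(sample_opf)
for href in order:
    print(href)
```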
 

Guide

 
Lists special files used by e-book readers

  • Cover (if you bundle it in your EPUB)
  • Table of contents
  • Landing spot (where the book opens the first time it’s loaded in the e-book reader)

 


toc.ncx

 
The navigation control file creates the special table of contents found on e-book readers.

Each navPoint indicates a landing spot.
 
[Image: Kindle sample]

The playOrder indicates the sequence. You must keep your entries in order and without gaps (right: 1, 2, 3, 4, 5; wrong: 1, 2, 4, 5, 7).
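
Gaps are easy to introduce when you delete or move a chapter by hand, so a quick check can help. Here is a minimal Python sketch; has_sequential_play_order and make_ncx are hypothetical helpers, and the generated NCX is a bare-bones stand-in for a real toc.ncx.

```python
import xml.etree.ElementTree as ET

NCX_URI = "http://www.daisy.org/z3986/2005/ncx/"


def play_orders(ncx_xml):
    """Return the playOrder values of all navPoints, in document order."""
    root = ET.fromstring(ncx_xml)
    return [int(p.get("playOrder")) for p in root.iter(f"{{{NCX_URI}}}navPoint")]


def has_sequential_play_order(ncx_xml):
    """True if the playOrder values run 1, 2, 3, ... with no gaps."""
    orders = play_orders(ncx_xml)
    return orders == list(range(1, len(orders) + 1))


def make_ncx(orders):
    """Build a minimal NCX string with one navPoint per playOrder value."""
    points = "".join(
        f'<navPoint id="navPoint-{n}" playOrder="{n}">'
        f"<navLabel><text>Chapter {n}</text></navLabel>"
        f'<content src="Text/chapter{n:02d}.xhtml"/></navPoint>'
        for n in orders
    )
    return f'<ncx xmlns="{NCX_URI}" version="2005-1"><navMap>{points}</navMap></ncx>'


right = has_sequential_play_order(make_ncx([1, 2, 3, 4, 5]))
wrong = has_sequential_play_order(make_ncx([1, 2, 4, 5, 7]))
print(right)  # True
print(wrong)  # False
```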
 

Cascading Stylesheets

The best way to understand this concept is to correlate a CSS paragraph style to a Word style.

Word           CSS
Normal         body
Header1        h1

Conversion tools usually convert into generic labels. I prefer to use descriptive ones.

 

Sample of styles after a conversion:
 
[Image: stylesheet with generic style names after a conversion]

Sample from Tourist Trapped:
 
[Image: stylesheet from Tourist Trapped]

If you have embedded fonts, you define them at the top of your stylesheet, or in the page stylesheet (if you choose to have one).
 
 
[Image: embedded font definitions in a stylesheet]

You can also add styles for spans of text—for example if you want to change the font for a sentence in the middle of a paragraph.
 
[Image: span style example]

Resource: w3schools.com CSS tutorial
 

My e-book is broken!

Always make a copy before proceeding!

Run it through an EPUB validator

  • A number of tools have an EPUB check embedded in them
  • The International Digital Publishing Forum provides an online EPUB Validator at: idpf.org
  • This may provide some clues, as it will give you errors and associated line numbers

Find the bad spot and compare it to code that’s working

  • Find the point where it isn’t working and open it up in an editor.
    • In a tool like Sigil, you can go to the “funky spot” and switch to the XHTML and it’ll take you to that point in the code.
  • Compare the malfunctioning code to a similar spot that’s working. Change the code to match and then test.

The Internet is your friend!

  • Other useful sites
  • Other resources
    • Zen of eBook Formatting by Guido Henkel
    • E-Book Formatting for Novelists: A step-by-step guide for the independent novelist or small press by K.C. May (free at Smashwords and Barnes & Noble)
    • APE How to Publish a Book by Guy Kawasaki
    • eBook Formatting and Publishing Guide for Epub & Kindle Mobi Books using Sigil ebook editor by Suzanne Fyhrie Parrott
    • Ebook Formatting: KF8, Mobi & EPUB by Matt Harrison (gives a variety of coding examples—rather techie)
    • EPUB From the Ground Up: A Hands-On Guide to EPUB 2 and EPUB 3 by Jarret Buse (another rather techie resource)
    • EPUB Straight to the Point: Creating ebooks for the Apple iPad and other ereaders (One-Off) by Elizabeth Castro (pretty in-depth resource)
    • Aaron Shepard’s Kindle series
