EPUB Parser
The EPUB Parser gem parses EPUB 3 books loosely.
Installation
gem install epub-parser
Usage
As command-line tools
epub-open
The epub-open tool provides an interactive shell (IRB) that helps you explore an EPUB book.
See EpubOpen.
As a library
First, parse an EPUB file with EPUB::Parser.parse, which returns a book object.
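A minimal sketch of this first step follows. The helper name parse_book and the EPUB_PATH environment variable are assumptions made for illustration and are not part of the gem; only EPUB::Parser.parse comes from this README. The gem is required lazily so that the snippet loads even where epub-parser is not installed.

```ruby
# parse_book is a hypothetical helper name used only in this sketch.
def parse_book(path)
  require 'epub/parser' # provided by the epub-parser gem, loaded on first use
  EPUB::Parser.parse(path)
end

# EPUB_PATH is an assumed environment variable pointing at a real EPUB file.
path = ENV['EPUB_PATH']
if path && File.file?(path)
  book = parse_book(path)
  puts book.class
end
```

Run it as `EPUB_PATH=/path/to/book.epub ruby sketch.rb` (the script name is arbitrary). Without EPUB_PATH the script only defines the helper and does nothing else.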
This book object can yield pages in spine order (the spine defines the reading order chosen by the author):
book.each_page_on_spine do |page|
  # do something...
end
page above is an EPUB::Publication::Package::Manifest::Item object, and you can call #href to see where the page file is:
book.each_page_on_spine do |page|
  file = page.href # => path/to/page/in/zip/archive
  html = Zip::Archive.open('/path/to/book.epub') {|zip|
    zip.fopen(file.to_s) {|f| f.read} # f is the opened archive entry
  }
end
Item also provides #read as a shortcut for the above:
html = page.read
doc = Nokogiri.HTML(html)
# do something with Nokogiri as always
For other utilities that Item provides, see the Item page.
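To make "spine order" concrete, here is a small sketch that is independent of EPUB Parser's own API. It uses REXML on a hand-written, simplified package document and resolves each spine itemref to the href of the manifest item it points to. The helper names sample_opf and hrefs_in_spine_order, and all ids and file names, are made up for illustration.

```ruby
require 'rexml/document'

# A simplified, hypothetical package document; real ones also carry metadata.
def sample_opf
  <<-XML
    <package xmlns="http://www.idpf.org/2007/opf" version="3.0">
      <manifest>
        <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml"/>
        <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
        <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
      </manifest>
      <spine>
        <itemref idref="ch1"/>
        <itemref idref="ch2"/>
      </spine>
    </package>
  XML
end

# Returns manifest hrefs in the order given by the spine, not by the manifest.
def hrefs_in_spine_order(opf_xml)
  root = REXML::Document.new(opf_xml).root
  hrefs_by_id = {}
  idrefs = []
  root.each_recursive do |el|
    case el.name
    when 'item'    then hrefs_by_id[el.attributes['id']] = el.attributes['href']
    when 'itemref' then idrefs << el.attributes['idref']
    end
  end
  idrefs.map { |id| hrefs_by_id[id] }
end

puts hrefs_in_spine_order(sample_opf)
```

In this sample the manifest lists chapter 2 before chapter 1 and also contains a navigation file, but only the spine decides which pages are read and in what order. This is the ordering that each_page_on_spine follows.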
Although book above is an EPUB::Book object, all of its features are provided by the EPUB::Book::Features module. Your own class (YourBook here) can therefore include EPUB::Book::Features to gain the same features:
require 'epub'

class YourBook < ActiveRecord::Base
  include EPUB::Book::Features
end

book = EPUB::Parser.parse(
  'uploaded-book.epub',
  class: YourBook # pass the YourBook class
)
book.instance_of? YourBook # => true
book.required = 'value for required field'
book.save!
book.each_page_on_spine do |epage|
  page = YourBookPage.create(
    :some_attr    => 'some attr',
    :content      => epage.read,
    :another_attr => 'another attr'
  )
  book.pages << page
end
You can also find a YourBook object first and pass it to the parser.
Switching XML Library
EPUB Parser first tries to load Nokogiri, Ruby bindings for libxml2, libxslt and more. If Nokogiri is not available, it tries Oga, a fast XML parser. If neither is available, it falls back to REXML, a library bundled with Ruby. You can also select REXML explicitly:
EPUB::Parser::XMLDocument.backend = :REXML
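The fallback order described above can be illustrated with a short, generic sketch. This is not EPUB Parser's actual loading code; the method name first_loadable and the candidate list are assumptions made for illustration. It tries each library in turn and returns the first one that can be required.

```ruby
# first_loadable is a hypothetical helper, not part of EPUB Parser.
# It returns the first name in candidates that `require` can load,
# or nil when none of them is installed.
def first_loadable(candidates)
  candidates.find do |name|
    begin
      require name
      true
    rescue LoadError
      false
    end
  end
end

# Preference order taken from the paragraph above.
backend = first_loadable(%w[nokogiri oga rexml/document])
puts "XML backend: #{backend || 'none available'}"
```

The printed name depends on which gems are installed on your machine, which is the point of the fallback: the same code keeps working with whatever XML library is present.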
Switching ZIP library
EPUB Parser uses Archive::Zip, a pure Ruby ZIP library, by default. You can use Zip/Ruby, Ruby bindings for libzip, instead if you have already installed the Zip/Ruby gem via RubyGems or Bundler.
Globally:
For each EPUB book:
Documentation
APIs
More documentation is available in:
- Publication includes the document's metadata, file list and so on.
- Item represents a file in an EPUB package.
- FixedLayout provides APIs to declare how an EPUB reader should render content, such as in reflowable or fixed layout.
- Navigation describes how to use the Navigation Document.
- Searcher introduces APIs to search EPUB documents for words and elements, and to search by EPUB CFI (a position pointer for EPUB).
- UnpackedArchive describes how to handle directories produced by unzipping EPUB files, instead of the EPUB files themselves.
- MultipleRenditions describes EPUB Multiple-Rendition Publications and the APIs for them.
Examples
Example usages are listed on the Examples page.
Building documentation
If you installed EPUB Parser via the gem command, you can also generate the documentation yourself (the rubygems-yardoc gem is required):
$ gem install epub-parser
$ gem yardoc epub-parser
...
Files: 33
Modules: 20 ( 20 undocumented)
Classes: 45 ( 44 undocumented)
Constants: 31 ( 31 undocumented)
Methods: 292 ( 88 undocumented)
52.84% documented
YARD documentation is generated to:
/path/to/gempath/ruby/2.2.0/doc/epub-parser-0.2.0/yardoc
The command prints the path to the generated documentation at the end (/path/to/gempath/ruby/2.2.0/doc/epub-parser-0.2.0/yardoc here).
Alternatively, you can generate the documentation from a clone of the repository:
$ git clone https://gitlab.com/KitaitiMakoto/epub-parser.git
$ cd epub-parser
$ bundle install --path=deps
$ bundle exec rake doc:yard
...
Files: 33
Modules: 20 ( 20 undocumented)
Classes: 45 ( 44 undocumented)
Constants: 31 ( 31 undocumented)
Methods: 292 ( 88 undocumented)
52.84% documented
The documentation will then be available in the doc directory.
Requirements
- Ruby 2.2.0 or later
History
See CHANGELOG.
Note
This library is still a work in progress. Only a few features are implemented, and the APIs may change in the future.
Currently implemented:
- container.xml of EPUB Open Container Format (OCF) 3.0
- EPUB Navigation Documents of EPUB Content Documents 3.0
- metadata.xml of EPUB Multiple-Rendition Publications
License
This library is distributed under the terms of the MIT License. See the MIT-LICENSE file for more information.