Installing the Document Library
===============================

Introduction
------------

The Document Library is a Zope 3 application. If you're used to
installing Zope 2 software, pay special attention, as installing Zope 3
software can be quite different.

Zope 3 software is installed much like any Python package. A package
to be used in Zope 3 simply needs to be available on the Python path
(`sys.path`) somewhere, so that it can be imported from Python.

There are several ways to accomplish this. In this document we will
demonstrate the approach using the `path` directive in
`zope.conf`. Alternatives are manipulation of the PYTHONPATH
environment variable, and symlinking libraries into a Zope 3 instance
home's `lib.python` directory. 
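
For comparison, the `PYTHONPATH` alternative could look like the sketch
below. The instance location is the example directory used later in this
document, and `hurry-0.8/src` stands in for any one package source
directory; both are assumptions to adjust. The variable has to be set in
the shell from which Zope is started.

```shell
# Hypothetical instance location; adjust to your own instance directory.
INSTANCE="/home/myhome/DocumentLibraryZope"
# One package source directory, chosen here purely as an example.
PKG_SRC="$INSTANCE/pkg/hurry-0.8/src"

# Prepend the directory, keeping any entries already in PYTHONPATH.
PYTHONPATH="$PKG_SRC${PYTHONPATH:+:$PYTHONPATH}"
export PYTHONPATH

echo "$PYTHONPATH"
```

The rest of this document uses the `path` directive in `zope.conf`
instead, which keeps the setting inside the instance.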

Note that instead of explaining per package how to install it on the
Python path, we will give a single installation instruction at the
end, so all the needed libraries are placed on the Python path at
once.

Zope 3 packages typically have an additional requirement beyond being
installed on the Python path: Zope also needs to be told to include the
package's `configure.zcml` file and, if available, its `meta.zcml`
file. This is done by placing special ZCML snippets in the
`etc/package-includes` directory of your Zope instance.
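
Such a snippet is normally a one-line file containing a ZCML `include`
directive. The sketch below writes one for a hypothetical package called
`mypackage`; the helper name `write_slug` and the scratch directory are
illustrative assumptions and not part of Zope.

```shell
# write_slug DIR PACKAGE: create DIR/PACKAGE-configure.zcml, which asks
# Zope to load PACKAGE's configure.zcml at startup.
write_slug() {
    printf '<include package="%s" />\n' "$2" > "$1/$2-configure.zcml"
}

# Demonstration in a scratch directory; in a real instance DIR would be
# etc/package-includes. "mypackage" is a hypothetical package name.
scratch=$(mktemp -d)
write_slug "$scratch" mypackage
cat "$scratch/mypackage-configure.zcml"
```

A `meta.zcml` file is included the same way, with an additional
`file="meta.zcml"` attribute on the directive.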

Again, instead of explaining per package how to install its ZCML
snippet into `package-includes`, the documentlibrary itself ships with
all ZCML snippets needed, so at the end we will explain how to install
all of them at once.

Installing Zope 3
-----------------

The Document Library is written to work with Zope 3.2 and Python
2.4. Assuming you already have Python 2.4 installed on your system, a
Zope 3.2 instance is installed as follows on a Unixy system such as
Linux:

* download Zope 3.2 at http://www.zope.org/Products/Zope3 (at the time
  of writing, Zope 3.2.1 was the latest release).

* unpack the .tgz file. On Linux this is done this way::
  
    $ tar xvzf Zope-3.2.1.tgz

* Go to the directory that was just unpacked, and do::

    $ ./configure --prefix /path/to/z321

  Where `/path/to/z321` is a new directory in which you would like
  the Zope software to be installed. You could for instance place it
  in `/home/myhome/z321`.

* Type `make`::

   $ make

  This will build Zope 3.

* Then type `make install`::

   $ make install

  This will install Zope 3 in the path you chose with `configure`.

You have now completed installing the Zope 3 software. The next step
is to make a Zope 3 instance from it. You can make as many different
Zope 3 instances from the Zope 3 software as you like. The idea is
that an actual Zope 3 package, such as the Document Library, is
installed in a Zope 3 instance.

* Go to the Zope 3 software directory (such as `/path/to/z321`).

* There type::

    $ bin/mkzopeinstance

  This will start up a script that guides you through the creation of
  a Zope 3 instance. It first asks for the directory in which you want
  to install the instance. Again, let's give a new directory, such as
  `/home/myhome/DocumentLibraryZope`.

  Next, it will ask for a username for the administrator account,
  which password storage scheme you want to use (pick one), and for
  the password.

  The Zope instance will then be generated in the provided directory.

Let's test whether Zope runs at all:

* Go to the Zope 3 instance directory (such as
  `/home/myhome/DocumentLibraryZope`).

* Type `bin/zopectl fg` to start Zope::

    $ bin/zopectl fg

* You can now go to the Zope 3 web user interface at
  `http://yourhost:8080`. There is a `login` link to log in using your
  administrator account.

Document Library dependencies
-----------------------------

The Document Library makes use of a number of libraries beyond Zope 3
in its implementation, and these will need to be installed before the
Document Library can run.

libxml2/libxslt and lxml are assumed to be installed separately, at the
system level. See the installation instructions for these packages for
more information, and also consult the sections below.

We will install all the other libraries in a directory called `pkg` in
your Zope 3 instance. Create a directory named `pkg` in your Zope 3
instance directory; you should now have a directory such as
`/home/myhome/DocumentLibraryZope/pkg`.

Before going into installation details we will give a quick summary 
of what gets installed where:

* libxml2/libxslt, lxml - system level

* pyoai - pure python extension, `pkg` directory

* zc.catalog, hurry - zope extensions, `pkg` directory

* documentlibrary - the application, `pkg` directory

If you have any trouble with installation, please also consult the
Troubleshooting section below.

libxml2/libxslt
~~~~~~~~~~~~~~~

libxml2 is an XML processing library implemented in C used by the
Document Library in its pyoai component. It can be found at
http://xmlsoft.org.

If you are on Linux, your Linux distribution often already has a
libxml2 version recent enough to work. You need at least libxml2
version 2.6.16 and libxslt version 1.1.12. Windows installers are also
available.

You do not need to install the libxml2-python or libxslt-python
bindings, as the lxml library is used for this instead.
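
One way to check the installed versions is to ask the `xml2-config` and
`xslt-config` tools, which usually come with the development packages of
the two libraries. This is a sketch under that assumption; the helper
names `version_ge` and `check_lib` are made up for this example, and on
systems without these tools the package manager is the place to look.

```shell
# version_ge A B: succeed when dotted version A is at least B.
# Compares up to three numeric components using POSIX sort.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" \
        | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$lowest" = "$2" ]
}

# check_lib CONFIG_TOOL MINIMUM: compare the version printed by
# CONFIG_TOOL against MINIMUM and report the result.
check_lib() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1 not found; check your package manager instead"
        return 0
    fi
    found=$("$1" --version)
    if version_ge "$found" "$2"; then
        echo "$1 reports $found (at least $2): ok"
    else
        echo "$1 reports $found, older than the required $2"
    fi
}

check_lib xml2-config 2.6.16   # libxml2 minimum stated above
check_lib xslt-config 1.1.12   # libxslt minimum stated above
```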

lxml
~~~~

lxml is a Python binding for libxml2 and libxslt. It can be found at
http://codespeak.net/lxml

lxml needs libxml2 and libxslt installed. lxml itself can be installed
as an egg if you have `easy_install` on your system::

  $ easy_install lxml

It can also be installed from source. The lxml website provides
installation instructions and versions to download for various
platforms.

We are assuming here you will install lxml as a system-level library
so it is available from Python 2.4 everywhere, but you could also
install it locally (see instructions below).

pyoai
~~~~~

The pyoai library is a Python library that can be used to implement
OAI-PMH (http://www.openarchives.org) compliant services. It's not
Zope specific but can be used in Zope. It needs lxml (and thus
libxml2/libxslt, see above).

You can download pyoai at the infrae.com site, here:

http://www.infrae.com/download/oaipmh

You need pyoai 2.1.3 or later.

zc.catalog
~~~~~~~~~~

zc.catalog is an extension to the Zope 3 catalog which includes a few
new index types among other things. 

Since it hasn't had a release yet, we've packaged it along with the
documentlibrary for download. It's used by the Document Library to
enable set-based indexing and queries.

You need zc.catalog 0.1 or later.

hurry library
~~~~~~~~~~~~~

The hurry library is a Zope 3 extension which contains various generic
components that the Document Library uses, such as a workflow engine,
a query layer for Zope 3 and an advanced file upload widget.

You can download the hurry library here:

http://www.infrae.com/download/documentlibrary

You need hurry 0.8 or later.

Document Library
~~~~~~~~~~~~~~~~

Finally we are going to install the Document Library itself. You can
find it here:

http://www.infrae.com/download/documentlibrary

Unpack the documentlibrary package into the `pkg` directory like the
other extension packages.

Now we are going to edit `etc/zope.conf` so that the various packages
can be imported by Python when the Zope 3 instance starts. 

Go to the `etc` directory in the Zope instance directory, and edit
`zope.conf`. Add the following lines somewhere below the line that
starts with `%define INSTANCE`::

  path $INSTANCE/pkg/pyoai-2.1.3/src
  path $INSTANCE/pkg/zc.catalog-0.1/src
  path $INSTANCE/pkg/hurry-0.8/src
  path $INSTANCE/pkg/documentlibrary-1.1b/src
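
Before restarting Zope it can help to confirm that each `path` directive
points at a directory that really exists. The following is a minimal
sketch; the function name `check_pkg_paths` is hypothetical, and the
directory names are simply the ones from the directives above, so adjust
them if your unpacked versions differ.

```shell
# check_pkg_paths INSTANCE_DIR: report whether each directory named by a
# `path` directive exists; returns non-zero if any of them is missing.
check_pkg_paths() {
    missing=0
    for d in pyoai-2.1.3 zc.catalog-0.1 hurry-0.8 documentlibrary-1.1b; do
        if [ -d "$1/pkg/$d/src" ]; then
            echo "ok:      $1/pkg/$d/src"
        else
            echo "missing: $1/pkg/$d/src"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (the instance path is the example used in this document):
# check_pkg_paths /home/myhome/DocumentLibraryZope
```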

Now we need to add ZCML snippets to the `etc/package-includes`
directory of the Zope 3 instance to make sure that these packages are
loaded by Zope 3. In order to do this, copy all ZCML files from the
`documentlibrary` directory into the Zope instance's
`etc/package-includes` directory. To be precise, you need to copy
the following files:

* zc.catalog-configure.zcml

* hurry-configure.zcml

* documentlibrary-configure.zcml

`pyoai` does not need such a file as it is a standalone Python
library.
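
The copy step can be sketched as a small shell function. The name
`install_snippets` is hypothetical, and the source directory is passed in
as an argument because it depends on where you unpacked the
documentlibrary package; the example paths in the comment are assumptions.

```shell
# install_snippets SRC_DIR INSTANCE_DIR: copy the three ZCML snippets
# listed above from SRC_DIR into INSTANCE_DIR/etc/package-includes.
install_snippets() {
    dest="$2/etc/package-includes"
    for f in zc.catalog-configure.zcml hurry-configure.zcml \
             documentlibrary-configure.zcml; do
        cp "$1/$f" "$dest/" || return 1
        echo "installed $dest/$f"
    done
}

# Example call (both paths are assumptions; adjust them to your layout):
# install_snippets /home/myhome/DocumentLibraryZope/pkg/documentlibrary-1.1b \
#                  /home/myhome/DocumentLibraryZope
```

The function stops at the first file it cannot copy, so a missing snippet
is noticed before Zope is restarted.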

Creating the Document Library root
----------------------------------

Now start Zope 3, go to `http://localhost:8080` (or wherever you are
installing) and click 'login'. You will now log into Zope 3, which is
necessary in order to install a Document Library application.

In the "Add" sidebar to the left you see the option 'Document
Library'. Click on this. You will now have to fill in the name of the
Document Library root. Call it `doclib` (or anything you like).

After this, you can go to `http://localhost:8080/doclib`. The Document
Library UI is now presented to you.

Conversions
-----------

Unless you have the right conversion software installed, the only
combination of checkboxes on the "add document" screen that works is
`file downloadable` checked and the other checkboxes (`generate PDF`
and `generate plain text`) unchecked. `generate plain text` is checked
by default, so please uncheck it.

PDF to Text Conversions
~~~~~~~~~~~~~~~~~~~~~~~

The Document Library uses the `pdftotext` command (part of the xpdf
package) to convert PDF documents to plain text. You need this installed
on your system in order to enable PDF to text conversion (that is, to
upload a PDF file with the `generate plain text` checkbox checked).
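
A quick way to see whether the command is available is sketched below.
The helper name `have_cmd` is made up for this example, and the check
only covers the `PATH` of the current shell, which is assumed to match
the environment Zope is started from.

```shell
# have_cmd NAME: succeed when the shell can find a command called NAME.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pdftotext; then
    echo "pdftotext found: $(command -v pdftotext)"
else
    echo "pdftotext not found; leave 'generate plain text' unchecked for PDFs"
fi
```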

Word Document conversions
~~~~~~~~~~~~~~~~~~~~~~~~~

Document Library uses Open Office to do conversions from Word
Documents to PDF and plain text. Open Office needs to be installed in
server mode to make use of it. Future versions of this document will
include instructions on how to install Open Office this way.

In addition, Document Library uses a system called BlueDCS to interface
with Open Office. This is a separate piece of software. It can be
checked out from SVN here:

https://svn.bluedynamics.net/svn/public/BlueDCS/trunk/BlueDCS/

Silva integration
-----------------

In this version, we have not documented the installation of the Silva
OAI component that ships in the `Products` directory of the Document
Library, nor the harvesting code to do full-text harvesting from the
Document Library. You may need unreleased versions of OAICore and
SilvaOAI to make it work; stay tuned or contact Infrae directly.

Troubleshooting
---------------

conversion errors when submitting a document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the conversion section above.

does pyoai work correctly?
~~~~~~~~~~~~~~~~~~~~~~~~~~

pyoai ships with an automated test runner. If the tests work, then
you can be sure lxml and pyoai both work as intended. Go into the unpacked
`pyoai` directory and type the following::

  $ python test.py -v

If you see a lot of dots and no messages about failures or errors,
pyoai can find lxml and works correctly.

want to install more than one `zc` package
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are a number of packages in the `zc` namespace that can be
installed into Zope 3. The Document Library only needs `zc.catalog` in
this release, which is why a simple approach to installation was
chosen.

If you have other packages that you'd like to install that are in the
`zc` namespace package, such as `zc.resourcelibrary` or `zc.table`,
adding a separate `zope.conf` path directive for each of them, like the
one given for `zc.catalog`, does not work: Python only imports the first
`zc` package it finds on the path.

You will need to place all `zc` packages in the same `zc` namespace
package first, then make sure that `zc` package can be imported by
Python.
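
As a sketch of that merge step, the function below copies the
sub-packages of several unpacked distributions into one shared `zc`
directory. It assumes every distribution uses the same `src/zc/<name>`
layout as `zc.catalog`; the function name `merge_zc` and the directory
names in the example call are hypothetical.

```shell
# merge_zc DEST SRC...: copy every sub-package found under SRC/zc into
# DEST/zc, so that all of them live in a single `zc` package.
merge_zc() {
    dest=$1
    shift
    mkdir -p "$dest/zc" || return 1
    # An empty __init__.py makes `zc` an ordinary importable package.
    [ -f "$dest/zc/__init__.py" ] || : > "$dest/zc/__init__.py"
    for src in "$@"; do
        for sub in "$src"/zc/*/; do
            [ -d "$sub" ] || continue
            # Strip the trailing slash so the directory itself is copied.
            cp -R "${sub%/}" "$dest/zc/" || return 1
        done
    done
}

# Example call (zc-all and zc.table-x.y are hypothetical names):
# merge_zc "$INSTANCE/pkg/zc-all" \
#          "$INSTANCE/pkg/zc.catalog-0.1/src" \
#          "$INSTANCE/pkg/zc.table-x.y/src"
```

With such a layout, a single `path` directive pointing at the merged
directory would take the place of the one given for `zc.catalog`.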

In the future Zope 3 packages are expected to be distributed as Python
eggs, which makes installation of sub-packages of the same namespace
package (such as `zc`) easier.
