    Cloudooo

    Cloudooo is a simple file conversion server that can be used and provided as a service to convert supported input formats into a number of different output formats. Although Cloudooo was originally created to convert OpenOffice.org documents into HTML and vice versa, it has grown to include several other converters for document, image, audio and video formats. It is developed and maintained by Nexedi and is used in several applications (simple cloudooo converter) and within ERP5.

    Features

    Cloudooo has multiple converters including:

    • LibreOffice document conversion
    • OnlyOffice conversion
    • PDF conversion
    • Image conversion
    • Video/Audio conversion

    Why use Cloudooo?

    Because Cloudooo is a web service, deploying and using it takes little effort. A single server installation gives access to all features.

    Additionally, Cloudooo was built and tested to handle a high volume of conversions.

    Getting Started

    Source Code

    You can find the source code in the following Git repository: https://lab.nexedi.com/nexedi/cloudooo.git (Github mirror) or browse it online.

    Requirements

    • Python >= 2.7

    Installing Cloudooo

    $ python setup.py install

    Once cloudooo is installed, you can install the following software:

    Warning: Except for LibreOffice or OpenOffice, all other software is optional.

    Installing LibreOffice

    To install LibreOffice, you can download the package directly from the official web site:

    $ wget http://downloadarchive.documentfoundation.org/libreoffice/old/5.2.4.2/deb/x86_64/LibreOffice_5.2.4.2_Linux_x86-64_deb.tar.gz

    Extract it:

    $ tar zxvf LibreOffice_5.2.4.2_Linux_x86-64_deb.tar.gz 

    And install all Debian packages:

    $ cd LibreOffice_5.2.4.2_Linux_x86-64_deb/DEBS
    $ sudo dpkg -i *.deb

    Warning: Make sure you download the right package for your system.

    Creating Configuration File

    A configuration file is used to start the application using paster. A sample configuration can be found at https://lab.nexedi.com/nexedi/cloudooo/blob/master/cloudooo/sample/sample.conf.

    Copy this file to the current folder

    $ cp ./cloudooo/sample/sample.conf ./cloudooo.conf # Copy to current folder

    Next, define the required attributes in the configuration file:

    • working_path - folder in which the application runs. This folder needs to be created.
    • uno_path - folder where UNO library is installed (ex. /opt/libreoffice/basis-link/program/)
    • soffice_binary_path - folder where soffice.bin is installed (ex. /opt/libreoffice/program/)
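
    As a quick sanity check before starting the server, the three attributes above can be verified with a few lines of Python. This is only a sketch under stated assumptions: it is written for Python 3, the section name `app:main` is a guess at where the sample configuration stores these keys, and `missing_keys` and `SAMPLE` are hypothetical names used for illustration.

```python
import configparser

# The three attributes listed above as required.
REQUIRED_KEYS = ("working_path", "uno_path", "soffice_binary_path")


def missing_keys(conf_text, section="app:main"):
    """Return the required keys that are absent or empty in `section`.

    The section name is an assumption; adjust it to match your file.
    """
    # RawConfigParser avoids interpolation errors on values such as %(here)s.
    parser = configparser.RawConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS
            if not parser.get(section, key, fallback="").strip()]


# Illustrative configuration text with one required key left out.
SAMPLE = """
[app:main]
working_path = /tmp/cloudooo
uno_path = /opt/libreoffice/basis-link/program/
"""

print(missing_keys(SAMPLE))
```

    An empty list means all three attributes are present; the check does not confirm that the folders actually exist.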

    Run Cloudooo

    $ paster serve ./cloudooo.conf

    or run as a daemon

    $ paster serve ./cloudooo.conf --daemon

    Stop Cloudooo

    $ kill -1 PASTER_PID

    Warning: Always use SIGHUP, because this is the only signal that stops all processes correctly.
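
    The same stop command can be issued from a script. The sketch below sends SIGHUP with the standard library on a Unix-like system; `stop_cloudooo` and `read_pid` are hypothetical helpers, and the PID file location is an assumption that depends on how paster was started.

```python
import os
import signal


def read_pid(pid_file):
    """Read a process id from a PID file (the path is an assumption)."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def stop_cloudooo(pid, send_signal=os.kill):
    """Send SIGHUP to the paster process, the equivalent of `kill -1 PID`.

    `send_signal` is injectable so the function can be exercised
    without touching a real process.
    """
    send_signal(pid, signal.SIGHUP)


# Example (not executed here): stop_cloudooo(read_pid("paster.pid"))
```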

    How To Use Cloudooo

    soffice (LibreOffice/OpenOffice.org)

    PyUno is used to connect to LibreOffice/OpenOffice.org through an open socket, with only one process having access at a time. All clients receive the same object (proxy) when connecting to the XMLRPC server. When it receives a file, the XMLRPC server connects to 'soffice.bin' using PyUno, which opens a new document, writes to it, adds metadata and returns the edited or converted document to the XMLRPC server, which passes it back to the user and finalizes the call.

    This way, the XMLRPC interface can convert documents with or without metadata, return only metadata, convert metadata or convert the file into another format.
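
    The "one process at a time" rule can be pictured as a lock around a shared backend. The following is an illustration of that idea only, not Cloudooo's actual code: `SingleAccessProxy` and its `backend` callable are hypothetical names.

```python
import threading


class SingleAccessProxy(object):
    """Hypothetical shared proxy that lets one caller through at a time."""

    def __init__(self, backend):
        self._backend = backend        # callable that does the real work
        self._lock = threading.Lock()  # guards access to the backend

    def convert(self, data, source_format, destination_format):
        # Every client shares this object; the lock serializes their calls.
        with self._lock:
            return self._backend(data, source_format, destination_format)


# A stand-in backend that simply returns its input unchanged.
proxy = SingleAccessProxy(lambda data, src, dst: data)
print(proxy.convert("Hello", "txt", "pdf"))
```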

    OnlyOffice

    OnlyOffice is used internally through X2T Handler to convert Microsoft Office 2007 documents to OnlyOffice documents.

    ImageMagick

    Cloudooo uses the ImageMagick convert command to handle images, for example to convert png to jpg.
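
    For reference, the equivalent stand-alone call to ImageMagick can be sketched as below. This is not how the Cloudooo handler is implemented; `build_convert_command` and `convert_image` are hypothetical helpers, and the sketch assumes the `convert` binary is on the PATH.

```python
import subprocess


def build_convert_command(source_path, destination_path):
    """Build the argument list for ImageMagick's `convert` command.

    ImageMagick infers the output format from the destination extension.
    """
    return ["convert", source_path, destination_path]


def convert_image(source_path, destination_path):
    """Run `convert`; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_convert_command(source_path, destination_path))


print(build_convert_command("picture.png", "picture.jpg"))
```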

    WKHtmlToPdf

    wkhtmltopdf is an open source (LGPLv3) command line tool that renders HTML into PDF. While looking for new features and better results in converted PDF documents, we found that wkhtmltopdf has more features and is faster at rendering HTML documents to PDF.

    FFMPEG

    FFMPEGHandler is a cloudooo handler for developing GUI conversion applications using the cross-platform FFmpeg. The FFMPEGHandler package defines a single class, Handler, which is the interface for audio and video conversion in cloudooo. FFMPEGHandler was developed with Python 2.6 and ffmpeg 0.6.1.

    Example:

    File Conversion

    >>> from cloudooo.handler.ffmpeg import Handler
    >>> handler = Handler('my_path_data', open("test.ogv").read(), 'ogv')
    >>> converted_data = handler.convert('mpeg')

    Getting information of file

    >>> from cloudooo.handler.ffmpeg import Handler
    >>> handler = Handler('my_path_data', open("test.ogv").read(), 'ogv')
    >>> metadata = handler.getMetadata()
    >>> metadata
    { 'ENCODER': 'Lavf52.64.2'}

    Note: When using the FFMPEGHandler library, it is required to import its dependencies as well and set the environment for the handler.

    Example

    >>> from cloudooo.handler.ffmpeg import Handler
    >>> kw = dict(env=dict(PATH="../software/parts/ffmpeg/bin"))
    >>> handler = Handler('my_path_data', open("test.ogv").read(), 'ogv', **kw)
    >>> converted_data = handler.convert('mpeg')
    

    How to call Cloudooo as a Webservice

    XMLRPC and WSGI are used as a bridge to access all handlers. Cloudooo implements an XMLRPC server inside WSGI (using paster).

    >>> from xmlrpclib import ServerProxy
    >>> input_data = "Hello World"
    >>> proxy = ServerProxy("http://localhost:8011")
    >>> pdf = proxy.convertFile(input_data, 'txt', 'pdf')
    >>> print(pdf)
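
    Binary files usually need to be encoded before they travel over XML-RPC. The sketch below wraps the call in small helpers, assuming the server expects and returns base64-encoded strings; that transport detail is an assumption to verify against your Cloudooo version, and `encode_payload`, `decode_payload` and `convert_bytes` are hypothetical names.

```python
import base64

try:
    from xmlrpc.client import ServerProxy  # Python 3
except ImportError:
    from xmlrpclib import ServerProxy      # Python 2


def encode_payload(raw_bytes):
    """Encode raw bytes as an ASCII base64 string for transport."""
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_payload(text):
    """Decode a base64 string returned by the server back into bytes."""
    return base64.b64decode(text)


def convert_bytes(proxy, raw_bytes, source_format, destination_format):
    """Call convertFile on `proxy` and return the converted bytes.

    Assumes base64 in both directions (an assumption, see above).
    """
    encoded = proxy.convertFile(encode_payload(raw_bytes),
                                source_format, destination_format)
    return decode_payload(encoded)


# Example against a running server (not executed here):
#   proxy = ServerProxy("http://localhost:8011")
#   pdf = convert_bytes(proxy, b"Hello World", "txt", "pdf")
#   with open("hello.pdf", "wb") as handle:
#       handle.write(pdf)
```

    Passing the proxy in as an argument keeps the helper independent of the server address and makes it easy to try with a stand-in object.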

    Documentation

    Tips and Tricks

    soffice (LibreOffice/OpenOffice.org)

    • soffice.bin stalled - finalize the process, start 'soffice.bin' and resubmit the document (without restarting cloudooo).
    • soffice.bin crashed - finalize the process, verify that all processes are killed, restart soffice.bin and resubmit the document (without restarting cloudooo).
    • soffice.bin received the document, then stalled - kill the process and restart it.
    • Document sent is corrupted - write the error to the log and verify that the processes are not kept in memory.
    • Loss of socket connection - cloudooo will kill and restart the process and resubmit the file.
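
    The recovery pattern shared by these cases (restart the process, then resubmit the document) can be sketched as a small retry loop. This is an illustration under assumptions, not Cloudooo's implementation: `convert` and `restart` are hypothetical callables supplied by the caller, and catching every `Exception` is a simplification.

```python
def convert_with_recovery(convert, restart, document, max_attempts=3):
    """Try a conversion, restarting the backend after each failure.

    `convert(document)` performs the conversion and `restart()` kills and
    relaunches the backend process; both are hypothetical callables.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return convert(document)
        except Exception as error:  # simplification: any failure triggers a restart
            last_error = error
            restart()
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

    Keeping the number of attempts bounded matters here: a corrupted document would otherwise be resubmitted forever.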

    Tests

    Automated test results are published on www.erp5.com.

    Licence

    Cloudooo is Free Software, licensed under GPLv3+ with wide exception for Free and Open-Source software. Please see Nexedi licensing for rationale and options.