Metadata-Version: 1.1
Name: cld2-cffi
Version: 0.1.4
Summary: CFFI bindings around Google Chromium's embedded compact language detection library (CLD2)
Home-page: http://github.com/GregBowyer/cld2-cffi/
Author: Michael McCandless & Greg Bowyer
Author-email: mail@mikemccandless.com & gbowyer@fastmail.co.uk
License: Apache2
Description: CLD2-CFFI - Python (CFFI) Bindings for Compact Language Detector 2
        ==================================================================
        
        `CFFI <cffi.readthedocs.org>`_ bindings for CLD2
        
        -----
        
        |pypi| |build| |win-build| |coverage| |lint|
        
        -----
        
        
        This package contains the CLD (Compact Language Detection) library as
        maintained by Dick Sites (https://code.google.com/p/cld2/). The first
        fork was done at revision r161. It also contains python bindings that
        were originally created by `Mike
        McCandless <http://code.google.com/p/chromium-compact-language-detector>`_.
        The bindings have gone through several hands, with the latest changes being made
        to rework the bindings for `CFFI <cffi.readthedocs.org>`_.
        
        These bindings are identical in API to the original cld2 bindings, and as a
        result can be used as a drop in replacement.
        
        The LICENSE_ is the same as Chromium's LICENSE and is included in the
        LICENSE_ file for reference.
        
        ==========
        Installing
        ==========
        
        Should be as simple as
        
        .. code-block:: bash
        
           $ pip install cld2-cffi
        
        -------------------
        Development Version
        -------------------
        
        The **latest development version** can be installed directly from GitHub:
        
        .. code-block:: bash
        
            $ pip install --upgrade 'git+https://github.com/GregBowyer/cld2-cffi.git'
        
        =====
        Usage
        =====
        
        .. code-block:: python
        
            import cld2
        
            isReliable, textBytesFound, details = cld2.detect("This is my sample text")
            print('  reliable: %s' % (isReliable != 0))
            print('  textBytes: %s' % textBytesFound)
            print('  details: %s' % str(details))
        
            # The output looks like so:
            #  reliable: True
            #  textBytes: 24
            #  details: (('ENGLISH', 'en', 95, 1736.0), ('Unknown', 'un', 0, 0.0), ('Unknown', 'un', 0, 0.0))
        
        =============
        Documentation
        =============
        
        First, you must get your content (plain text or HTML) encoded into UTF8
        bytes. Then, detect like this:
        
        .. code-block:: python
        
            isReliable, textBytesFound, details = cld2.detect(bytes)
        
        ``isReliable`` 
            is True if the top language is much better than 2nd best language.
        
        ``textBytesFound`` 
            tells you how many actual bytes CLD analyzed (after removing HTML tags,
            collapsing areas of too-many-spaces, etc.).  
        
        ``details`` 
            has an entry per top 3 languages that matched, that includes the percent
            confidence of the match as well as a separate normalized score.
        
        The module exports these global constants:
        
        ``cld2.ENCODINGS``
            list of the encoding names CLD recognizes (if you provide hintEncoding, it
            must be one of these names).
        
        ``cld2.LANGUAGES``
            list of languages and their codes (if you provide hintLanguageCode, it must
            be one of the codes from these codes).
        
        ``cld2.EXTERNAL_LANGUAGES``
            list of external languages and their codes. Note that external languages
            cannot be hinted, but may be matched if you pass
            ``includeExtendedLanguages=True`` (the default).
        
        ``cld2.DETECTED_LANGUAGES``
            list of all detectable languages, as best I can determine (this was reverse
            engineered from a unit test, ie it contains a language X if that language
            was tested and passes for at least one example text).
        
        
        =======
        Authors
        =======
        
        Please see `AUTHORS <https://github.com/GregBowyer/cld2-cffi/blob/master/BUG_REPORTS.rst>`_.
        
        
        ==============
        Reporting bugs
        ==============
        Please see `BUG_REPORTS <https://github.com/GregBowyer/cld2-cffi/blob/master/BUG_REPORTS.rst>`_.
        
        
        ==========
        Contribute
        ==========
        
        Please see `CONTRIBUTING <https://github.com/GregBowyer/cld2-cffi/blob/master/CONTRIBUTING.rst>`_.
        
        
        =======
        Licence
        =======
        
        Please see LICENSE_.
        
        .. _LICENSE: https://github.com/GregBowyer/cld2-cffi/blob/master/LICENSE
        
        .. |pypi| image:: https://img.shields.io/pypi/v/cld2-cffi.svg?style=flat-square&label=latest%20version
            :target: https://pypi.python.org/pypi/cld2-cffi
            :alt: Latest version released on PyPi
        
        .. |build| image:: https://img.shields.io/travis/GregBowyer/cld2-cffi/master.svg?style=flat-square&label=OSX%20Linux%20build
            :target: http://travis-ci.org/GregBowyer/cld2-cffi
            :alt: Build status 
        
        .. |win-build| image:: https://img.shields.io/appveyor/ci/GregBowyer/cld2-cffi.svg?maxAge=2592000&style=flat-square&label=Windows%20Build
            :target: https://ci.appveyor.com/project/GregBowyer/cld2-cffi
            :alt: Windows Build Status::
        
        .. |coverage| image:: https://img.shields.io/codecov/c/github/GregBowyer/cld2-cffi.svg?style=flat-square
            :target: https://codecov.io/github/GregBowyer/cld2-cffi
            :alt: Coverage
        
        .. |lint| image:: https://landscape.io/github/GregBowyer/cld2-cffi/master/landscape.svg?style=flat-square
           :target: https://landscape.io/github/GregBowyer/cld2-cffi/master
           :alt: Code Health
        
Keywords: cld2,cffi
Platform: UNKNOWN
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Text Processing :: Linguistic
