Metadata-Version: 1.1
Name: datapackage
Version: 0.8.9
Summary: Utilities to work with Data Packages as defined on dataprotocols.org
Home-page: https://github.com/frictionlessdata/datapackage-py
Author: Open Knowledge Foundation
Author-email: info@okfn.org
License: MIT
Description: # DataPackage.py
        
        [![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
        [![Build Status](https://travis-ci.org/frictionlessdata/datapackage-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/datapackage-py)
        [![Windows Build Status](https://ci.appveyor.com/api/projects/status/github/frictionlessdata/datapackage-py?branch=master&svg=true)](https://ci.appveyor.com/project/vitorbaptista/datapackage-py)
        [![Test Coverage](https://coveralls.io/repos/frictionlessdata/datapackage-py/badge.svg?branch=master&service=github)](https://coveralls.io/github/frictionlessdata/datapackage-py)
        ![Support Python versions 2.7, 3.3, 3.4 and 3.5](https://img.shields.io/badge/python-2.7%2C%203.3%2C%203.4%2C%203.5-blue.svg)
        
        A model for working with [Data Packages].
        
          [Data Packages]: http://dataprotocols.org/data-packages/
        
        ## Install
        
        ```
        pip install datapackage
        ```
        
        ## Examples
        
        
        ### Reading a Data Package and its resource
        
        ```python
        import datapackage
        
        dp = datapackage.DataPackage('http://data.okfn.org/data/core/gdp/datapackage.json')
        brazil_gdp = [{'Year': int(row['Year']), 'Value': float(row['Value'])}
                      for row in dp.resources[0].data if row['Country Code'] == 'BRA']
        
        max_gdp = max(brazil_gdp, key=lambda x: x['Value'])
        min_gdp = min(brazil_gdp, key=lambda x: x['Value'])
        gdp_ratio = max_gdp['Value'] / min_gdp['Value']
        
        msg = (
            'The highest Brazilian GDP occurred in {max_gdp_year}, when it peaked at US$ '
            '{max_gdp:1,.0f}. This was {gdp_ratio:1,.2f} times its '
            'minimum GDP in {min_gdp_year}.'
        ).format(max_gdp_year=max_gdp['Year'],
                 max_gdp=max_gdp['Value'],
                 gdp_ratio=gdp_ratio,
                 min_gdp_year=min_gdp['Year'])
        
        print(msg)
        # The highest Brazilian GDP occurred in 2011, when it peaked at US$ 2,615,189,973,181. This was 172.44 times its minimum GDP in 1960.
        ```
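        
        The filtering and comparison logic above can also be tried offline. The
        following sketch uses a handful of made-up rows in place of
        `dp.resources[0].data`. The values are illustrative assumptions, not real
        GDP figures, and representing every field as a string is an assumption too.
        
        ```python
        # Made-up sample rows, for illustration only (not real GDP figures).
        rows = [
            {'Country Code': 'BRA', 'Year': '1960', 'Value': '10.0'},
            {'Country Code': 'BRA', 'Year': '1985', 'Value': '20.0'},
            {'Country Code': 'BRA', 'Year': '2011', 'Value': '40.0'},
            {'Country Code': 'USA', 'Year': '2011', 'Value': '100.0'},
        ]
        
        # Keep only Brazil's rows, converting the fields to numbers.
        brazil = [{'Year': int(row['Year']), 'Value': float(row['Value'])}
                  for row in rows if row['Country Code'] == 'BRA']
        
        highest = max(brazil, key=lambda x: x['Value'])
        lowest = min(brazil, key=lambda x: x['Value'])
        
        # Ratio of the highest value to the lowest one.
        ratio = highest['Value'] / lowest['Value']
        
        print(highest['Year'], lowest['Year'], ratio)
        ```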
        
        ### Validating a Data Package
        
        ```python
        import datapackage
        
        dp = datapackage.DataPackage('http://data.okfn.org/data/core/gdp/datapackage.json')
        try:
            dp.validate()
        except datapackage.exceptions.ValidationError as e:
            # Handle the ValidationError
            pass
        ```
        
        ### Retrieving all validation errors from a Data Package
        
        ```python
        import datapackage
        
        # This descriptor has two errors:
        #   * It has no "name", which is required;
        #   * Its resource has no "data", "path" or "url".
        descriptor = {
            'resources': [
                {},
            ]
        }
        
        dp = datapackage.DataPackage(descriptor)
        
        for error in dp.iter_errors():
            # Handle each error; here we just print it
            print(error)
        ```
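        
        To make the two errors above concrete, here is a minimal standard-library
        sketch that checks the same two rules by hand. It is not how `datapackage`
        validates a descriptor (the library validates against a schema). The helper
        name `find_descriptor_problems` and its messages are hypothetical.
        
        ```python
        def find_descriptor_problems(descriptor):
            """Return a list of readable problems (hypothetical helper)."""
            problems = []
            # Rule 1: a descriptor needs a "name".
            if 'name' not in descriptor:
                problems.append('descriptor has no "name"')
            # Rule 2: each resource needs one of "data", "path" or "url".
            for index, resource in enumerate(descriptor.get('resources', [])):
                if not any(key in resource for key in ('data', 'path', 'url')):
                    problems.append(
                        'resource {} has no "data", "path" or "url"'.format(index))
            return problems
        
        
        descriptor = {'resources': [{}]}
        problems = find_descriptor_problems(descriptor)
        for problem in problems:
            print(problem)
        ```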
        
        ### Creating a Data Package
        
        ```python
        import datapackage
        
        dp = datapackage.DataPackage()
        dp.descriptor['name'] = 'my_sleep_duration'
        dp.descriptor['resources'] = [
            {'name': 'data'}
        ]
        
        resource = dp.resources[0]
        resource.descriptor['data'] = [
            7, 8, 5, 6, 9, 7, 8
        ]
        
        with open('datapackage.json', 'w') as f:
            f.write(dp.to_json())
        # {"name": "my_sleep_duration", "resources": [{"data": [7, 8, 5, 6, 9, 7, 8], "name": "data"}]}
        ```
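        
        Because the descriptor is written as plain JSON, it can be read back with
        the standard library alone. The sketch below builds the same descriptor as
        a plain dict instead of going through `DataPackage` (an assumption made so
        that it runs without the package installed) and writes it to a temporary
        directory rather than the current one.
        
        ```python
        import json
        import os
        import tempfile
        
        # Same content as the descriptor built above, as a plain dict.
        descriptor = {
            'name': 'my_sleep_duration',
            'resources': [{'name': 'data', 'data': [7, 8, 5, 6, 9, 7, 8]}],
        }
        
        # Write the descriptor to a temporary location.
        path = os.path.join(tempfile.mkdtemp(), 'datapackage.json')
        with open(path, 'w') as f:
            json.dump(descriptor, f, sort_keys=True)
        
        # Read it back and use the inline data.
        with open(path) as f:
            loaded = json.load(f)
        
        nights = loaded['resources'][0]['data']
        average = sum(nights) / float(len(nights))
        print('Average sleep: {:.2f} hours'.format(average))
        ```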
        
        ### Using a schema that's not in the local cache
        
        ```python
        import datapackage
        import datapackage.registry
        
        # This constant points to the official registry URL
        # You can use any URL or path that points to a registry CSV
        registry_url = datapackage.registry.Registry.DEFAULT_REGISTRY_URL
        registry = datapackage.registry.Registry(registry_url)
        
        descriptor = {}  # The datapackage.json file
        schema = registry.get('tabular')  # Change to your schema ID
        
        dp = datapackage.DataPackage(descriptor, schema)
        ```
        
        ### Push/pull Data Package to storage
        
        The package provides the `push_datapackage` and `pull_datapackage` utilities
        to push a Data Package to, and pull it from, a storage backend.
        
        This functionality requires a `jsontableschema` storage plugin to be
        installed. See the
        [plugins](https://github.com/frictionlessdata/jsontableschema-py#plugins)
        section of the `jsontableschema` docs for more information. Suppose we
        have installed a plugin called `jsontableschema-mystorage` (not a real name).
        
        We could then push and pull a Data Package to and from that storage:
        
        > All parameters should be used as keyword arguments.
        
        ```python
        from datapackage import push_datapackage, pull_datapackage
        
        # Placeholder: fill in the options your storage plugin expects
        mystorage_options = {}
        
        # Push
        push_datapackage(
            descriptor='descriptor_path',
            backend='mystorage', **mystorage_options)
        
        # Pull
        pull_datapackage(
            descriptor='descriptor_path', name='datapackage_name',
            backend='mystorage', **mystorage_options)
        ```
        
        The options could be, for example, a SQLAlchemy engine or a BigQuery project
        and dataset name. See the documentation of the specific plugin for details.
        
        Concrete examples are in the
        [plugins](https://github.com/frictionlessdata/jsontableschema-py#plugins)
        section of the `jsontableschema` docs.
        
        ## Developer notes
        
        These notes are intended to help people who want to contribute to this
        package itself. If you just want to use it, you can safely ignore them.
        
        ### Updating the local schemas cache
        
        We cache the schemas from <https://github.com/dataprotocols/schemas>
        using git-subtree. To update it, use:
        
            git subtree pull --prefix datapackage/schemas https://github.com/dataprotocols/schemas.git master --squash
        
Keywords: data dataprotocols jsontableschema frictionlessdata datascience
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Information Technology
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
