graphit.graph_storage_drivers package

graphit.graph_storage_drivers.graph_arraystorage_driver module

file: graph_arraystorage_driver.py

Classes that store nodes, edges and their attributes as numpy arrays using the Pandas package

class graphit.graph_storage_drivers.graph_arraystorage_driver.ArrayStorage(*args, **kwargs)

Bases: graphit.graph_storage_drivers.graph_driver_baseclass.GraphDriverBaseClass

ArrayStorage class

Provides pandas DataFrame based storage for nodes and edges. The class supports weak referencing of the internal DataFrame (_storage) using the weakref module to reduce the memory footprint and enable truly synchronized views across different instances of the ArrayStorage class.

copy()

Return a deep copy of the storage class with the same view as the parent instance.

Returns

deep copy of storage instance

Return type

ArrayStorage

property dataframe
Returns

the original pandas DataFrame object

Return type

:pandas:DataFrame

del_data_reference(target)

Implements GraphDriverBaseClass abstract method.

The array storage does not support reference pointers

get(k[, d])

D[k] if k in D, else d. d defaults to None.

get_data_reference(target, default=None)

Implements GraphDriverBaseClass abstract method.

The array storage does not support reference pointers

items()

A set-like object providing a view on D's items.

iteritems()

D.items() -> a set-like object providing a view on D’s items

iterkeys()

Implements a Python dict style ‘keys’ method

Returns the keys of the DataFrame, which are its columns. pandas provides the columns as an Index object, which is converted to a plain list before it is returned.

Returns

dataframe column indexes (keys)

Return type

:py:list

itervalues()

D.values() -> an object providing a view on D’s values

keys()

Implements a Python dict style ‘keys’ method

Returns the keys of the DataFrame, which are its columns. pandas provides the columns as an Index object, which is converted to a plain list before it is returned.

Returns

dataframe column indexes (keys)

Return type

:py:list

set(key, value)

Implement dictionary setter

Note

Do not use this method directly to add new nodes or edges to the graph. Use the graph add_node or add_edge methods for this purpose instead.

Parameters
  • key – dictionary key to add or update

  • value – key value

set_data_reference(source, target)

The array storage class does not support data referencing and will simply copy the data stored in source to target

Parameters
  • source – source key having the data

  • target – target key referring to data of source

to_dict(return_full=False)

Return a shallow copy of the full dictionary.

If the current ArrayStorage represents a selective view on the parent dictionary, then only a dictionary with a shallow copy of the keys in the selective view is returned.

Parameters

return_full (bool) – ignore is_view and return the full dictionary

Return type

:py:dict

values()

An object providing a view on D's values.

viewitems()

D.items() -> a set-like object providing a view on D’s items

viewkeys()

Implements a Python dict style ‘keys’ method

Returns the keys of the DataFrame, which are its columns. pandas provides the columns as an Index object, which is converted to a plain list before it is returned.

Returns

dataframe column indexes (keys)

Return type

:py:list

viewvalues()

D.values() -> an object providing a view on D’s values

graphit.graph_storage_drivers.graph_arraystorage_driver.init_arraystorage_driver(nodes, edges, data)

ArrayStorage specific driver initiation method

Returns an ArrayStorage instance for nodes and edges and an AdjacencyView for adjacency based on the initiated nodes and edges stores.

Parameters
  • nodes (:py:list, :py:dict, :graphit:graph_arraystorage_driver:DictStorage) – nodes used to initialize the nodes ArrayStorage instance

  • edges (:py:list, :py:dict, :graphit:graph_arraystorage_driver:DictStorage) – edges used to initialize the edges ArrayStorage instance

  • data (:py:list, :py:dict, :graphit:graph_dictstorage_driver:DictStorage) – graph data attributes used to initialize the data DictStorage instance

Returns

Nodes and edges storage instances and Adjacency view.

graphit.graph_storage_drivers.graph_dictstorage_driver module

file: graph_dictstorage_driver.py

Unified view based dictionary class used by the Graph class to store node, edge and adjacency information.

Based on GraphDriverBaseClass, it implements the keys, values and items abstract methods using the Python 3.x concept of view-based representations, with the added feature of defining views as a data mask at the storage level.

class graphit.graph_storage_drivers.graph_dictstorage_driver.DictStorage(*args, **kwargs)

Bases: graphit.graph_storage_drivers.graph_driver_baseclass.GraphDriverBaseClass

DictStorage class

Provides a native Python dict-like class with unified keys, values and items dictionary views across Python versions. The class supports weak referencing of the internal dictionary (_storage) using the weakref module to reduce the memory footprint and enable truly synchronized views across different instances of the DictStorage class.

del_data_reference(target)

Remove JSON $ref data reference in target

Parameters

target – key of target to remove $ref from

get_data_reference(target, default=None)

Check if the key defines a reference to the data of another key using the $ref pointer.

Parameters
  • target – key to check

  • default – default to return if $ref pointer not found

Returns

referred key or None

items()

Implement Python 3 dictionary like ‘items’ method that returns a DictView class.

Returns

dictionary items as tuple of key/value pairs

Return type

ItemsView instance
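The view behaviour that these methods model can be seen with a plain Python 3 dict. The sketch below uses only the built-in dict type, not DictStorage itself, and the variable names are illustrative.

```python
node_data = {1: {'name': 'a'}, 2: {'name': 'b'}}

keys = node_data.keys()    # a dynamic view, not a snapshot
items = node_data.items()

node_data[3] = {'name': 'c'}  # existing views reflect the new key

shared = keys & {2, 3, 4}     # keys views support set operations
```

Because a view is bound to the underlying mapping, it never needs to be refreshed after the data changes.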

iteritems()

Implement Python 3 dictionary like ‘items’ method that returns a DictView class.

Returns

dictionary items as tuple of key/value pairs

Return type

ItemsView instance

iterkeys()

Implement Python 3 dictionary like ‘keys’ method that returns a DictView class.

Returns

dictionary keys

Return type

KeysView instance

itervalues()

Implement Python 3 dictionary like ‘values’ method that returns a DictView class.

Returns

dictionary values

Return type

ValuesView instance

keys()

Implement Python 3 dictionary like ‘keys’ method that returns a DictView class.

Returns

dictionary keys

Return type

KeysView instance

values()

Implement Python 3 dictionary like ‘values’ method that returns a DictView class.

Returns

dictionary values

Return type

ValuesView instance

graphit.graph_storage_drivers.graph_dictstorage_driver.init_dictstorage_driver(nodes, edges, data)

DictStorage specific driver initiation method

Returns a DictStorage instance for nodes and edges and an AdjacencyView for adjacency based on the initiated nodes and edges stores.

Parameters
  • nodes (:py:list, :py:dict, :graphit:graph_dictstorage_driver:DictStorage) – nodes used to initialize the nodes DictStorage instance

  • edges (:py:list, :py:dict, :graphit:graph_dictstorage_driver:DictStorage) – edges used to initialize the edges DictStorage instance

  • data (:py:list, :py:dict, :graphit:graph_dictstorage_driver:DictStorage) – graph data attributes used to initialize the data DictStorage instance

Returns

Nodes and edges storage instances and Adjacency view.

graphit.graph_storage_drivers.graph_driver_baseclass module

Storage driver abstract base class

Graphit uses dedicated data stores for node and edge data. Access is enabled using an API that resembles a Python dictionary with a node or edge ID as ‘key’ and a node/edge data dictionary as ‘value’.

The API is formalized in the GraphDriverBaseClass base class, which defines the abstract and derived methods required for the dict-like API, similar to the ‘MutableMapping’ abstract base class from the collections.abc module. The data stores are decoupled from the main graph object structure by weak referencing through Python’s weakref module.

The GraphDriverBaseClass uses data ‘views’ to allow instances of the driver class to represent a subset of nodes/edges while still having the same weak reference to the full data storage object. A ‘view’ is a simple list (_view) storing the node/edge primary keys.
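The relation between one shared data store and several view instances can be sketched as follows. This is a minimal illustration under stated assumptions, not graphit's implementation: the `Store` and `ViewSketch` classes are hypothetical, and only the `_storage` and `_view` attribute names are taken from the text above.

```python
import weakref


class Store(dict):
    """Plain dict subclass; unlike dict itself it can be weakly referenced."""


class ViewSketch:
    """Hypothetical driver-like object holding a weak reference to a store."""

    def __init__(self, storage, view=None):
        self._storage = weakref.proxy(storage)  # no strong reference is kept
        self._view = list(view or [])           # primary keys in the view

    def keys(self):
        # Assumption: an empty _view means no selective view is active.
        if not self._view:
            return list(self._storage.keys())
        return [key for key in self._view if key in self._storage]


full_store = Store({1: {'name': 'a'}, 2: {'name': 'b'}, 3: {'name': 'c'}})
everything = ViewSketch(full_store)
subset = ViewSketch(full_store, view=[1, 3])

full_store[1]['name'] = 'changed'  # visible through both instances
```

Both instances read from the same object, so a change to the store is immediately visible in every view without copying data.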

The GraphDriverBaseClass facilitates easy creation of different storage backends that can be transparently used as Python dictionaries in graphit graphs.

Note

The developer of a new storage driver is responsible for initializing an empty ‘_view’ list and providing support for ‘views’ in the implementation of the abstract methods.

class graphit.graph_storage_drivers.graph_driver_baseclass.GraphDriverBaseClass

Bases: collections.abc.MutableMapping

Abstract base class for graph data storage drivers

This is the boilerplate class for the implementation of node and edge storage driver classes in graphit. It exposes a Python dict-like API by using the ‘MutableMapping’ abstract base class from the ‘collections.abc’ module. In addition, it implements several list-like methods and rich set-like comparison methods.

The following abstract methods from the ‘MutableMapping’ class are required:

  • __getitem__, __setitem__, __delitem__, __iter__, __len__

The following abstract methods for dictionary ‘view’ like iteration are required:

  • iterkeys, itervalues, iteritems

All other methods are derived from the abstract methods but may be overridden, for instance when that improves efficiency for a given storage backend.

Note

Please note that the ‘GraphDriverBaseClass’ uses data ‘views’ to allow instances of the driver class to represent a subset of nodes/edges while still holding the same weak reference to the full data storage object. A ‘view’ is a simple list (_view) storing the node/edge primary keys. The storage driver needs to initialize the empty ‘_view’ list and implement support for ‘views’ in the abstract methods.
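As a rough sketch of what view support in the five ‘MutableMapping’ methods could look like, the hypothetical `MinimalStorage` class below subclasses `collections.abc.MutableMapping` directly rather than GraphDriverBaseClass. It omits the iterkeys, itervalues, iteritems and data reference methods that a real driver also has to provide, and it assumes that an empty `_view` list means that no selective view is active.

```python
from collections.abc import MutableMapping


class MinimalStorage(MutableMapping):
    """Hypothetical storage class; only illustrates the five required methods."""

    def __init__(self, *args, **kwargs):
        self._storage = dict(*args, **kwargs)
        self._view = []  # assumption: empty list means no selective view

    def _visible(self, key):
        return key in self._storage and (not self._view or key in self._view)

    def __getitem__(self, key):
        if not self._visible(key):
            raise KeyError(key)
        return self._storage[key]

    def __setitem__(self, key, value):
        self._storage[key] = value
        if self._view and key not in self._view:
            self._view.append(key)

    def __delitem__(self, key):
        if not self._visible(key):
            raise KeyError(key)
        del self._storage[key]
        if key in self._view:
            self._view.remove(key)

    def __iter__(self):
        return (key for key in self._storage if self._visible(key))

    def __len__(self):
        return sum(1 for _ in self)


store = MinimalStorage({1: {'w': 1.0}, 2: {'w': 2.5}, 3: {'w': 4.0}})
store._view = [1, 2]  # restrict this instance to two keys
```

Methods such as get, keys, items and the `in` operator are derived by ‘MutableMapping’ from these five, so they respect the view without further code.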

copy()

Return a deep copy of the storage class with the same view as the parent instance.

Returns

deep copy of storage instance

Return type

DictStorage

abstract del_data_reference(target)

Remove JSON $ref data reference in target

Parameters

target – key of target to remove $ref from

difference(other)

Return the difference between the key set of self and other

Return type

:py:class:set

fromkeys(keys, value=None)

Return a shallow copy of the dictionary for selected keys.

If the DictStorage instance represents a selective view of the main dictionary, only the keys in that view are considered.

Parameters
  • keys – keys to return dictionary copy for

  • value – default value for the keys

abstract get_data_reference(target, default=None)

Check if the key defines a reference to the data of another key using the $ref pointer.

Parameters
  • target – key to check

  • default – default to return if $ref pointer not found

Returns

referred key or None

has_data_reference(target)

Check if the target key defines a JSON $ref pointer to the data of another (source) key/value pair.

Parameters

target – target key to check reference for

Return type

:py:bool

intersection(other)

Return the intersection between the key set of self and other

Parameters

other (:py:dict) – object to compare to

Return type

:py:class:set

property is_view

Does the current DictStorage represent a selective view on the parent dictionary?

Return type

bool

isdisjoint(other)

Return a Boolean stating whether the key set of self has no keys in common with the specified key set or iterable in other.

Parameters

other (:py:dict) – object to compare to

Return type

:py:bool

issubset(other, propper=True)

Test whether all keys in self are also in other. With propper = True, other must in addition contain keys that are not in self.

Parameters
  • other (:py:dict) – object to compare to

  • propper (:py:bool) – ensure that both key lists are not the same.

Return type

:py:bool

issuperset(other, propper=True)

Test whether all keys in other are also in self. With propper = True, self must in addition contain keys that are not in other.

Parameters
  • other (:py:dict) – object to compare to

  • propper (:py:bool) – ensure that both key lists are not the same.

Return type

:py:bool
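The distinction made by the propper argument matches the one between Python's built-in set comparison operators. The example below uses plain sets as stand-ins for the key sets of two storage instances; it does not call the storage methods themselves.

```python
self_keys = {1, 2}
other_keys = {1, 2, 3}

subset = self_keys <= other_keys           # every key of self is in other
proper_subset = self_keys < other_keys     # and other has additional keys
same_is_subset = self_keys <= self_keys    # identical key sets still qualify
same_is_proper = self_keys < self_keys     # but not as a proper subset
proper_superset = other_keys > self_keys   # mirror image of proper_subset
```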

abstract iteritems()

Implement a Python 3.x dictionary-like ‘items’ iterator method that returns a view on the items in the data store. The viewitems method is equivalent to iteritems and serves as the Python 2.7 equivalent.

If the storage instance does not support view based iterations then iteritems should default to the items method

Returns

data items as tuple of key/value pairs

Return type

items view instance

abstract iterkeys()

Implement a Python 3.x dictionary-like ‘keys’ iterator method that returns a view on the keys in the data store. The viewkeys method is equivalent to iterkeys and serves as the Python 2.7 equivalent.

If the storage instance does not support view based iterations then iterkeys should default to the keys method

Returns

data keys

Return type

keys view instance

abstract itervalues()

Implement a Python 3.x dictionary-like ‘values’ iterator method that returns a view on the values in the data store. The viewvalues method is equivalent to itervalues and serves as the Python 2.7 equivalent.

If the storage instance does not support view based iterations then itervalues should default to the values method

Returns

data values

Return type

values view instance

query(match_func)

Storage query method

Use Python lambda functions to query the storage based on primary storage keys (node ID/edge ID) and values (attribute dictionary). Other functions are allowed as well, as long as they match the function signature described below.

The lambda function accepts two arguments in the following order: the primary key first and the attribute dictionary (value) second.

Example: lambda k, v: v['weight'] > 2 and k == 13

Parameters

match_func (:py:lambda) – lambda query function

Returns

list of primary storage identifiers (keys) matching the lambda query

Return type

:py:list
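The behaviour described for match_func can be approximated with a plain dictionary. The `query_sketch` function below is a hypothetical stand-in, not the graphit method, and the edge data is made up for illustration.

```python
def query_sketch(storage, match_func):
    """Return the keys for which match_func(key, value) is truthy."""
    return [key for key, value in storage.items() if match_func(key, value)]


edges = {
    (1, 2): {'weight': 1.5},
    (2, 3): {'weight': 3.0},
    (3, 4): {'weight': 4.2},
}

# Key first, attribute dictionary second, as described above.
heavy = query_sketch(edges, lambda k, v: v['weight'] > 2)
heavy_from_3 = query_sketch(edges, lambda k, v: v['weight'] > 2 and k[0] == 3)
```

A match function that indexes an attribute directly raises KeyError for entries lacking it; `v.get('weight', 0)` is a more forgiving alternative.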

remove(key)

Implements list-like key removal

Parameters

key – key to remove

reset_view()

Reset the selective view on the DataFrame

set(key, value)

Implement dictionary setter

Note

Do not use this method directly to add new nodes or edges to the graph. Use the graph add_node or add_edge methods for this purpose instead.

Parameters
  • key – dictionary key to add or update

  • value – key value

set_data_reference(source, target)

Defines a reference in the target key to the data attributes (value) of the source key using a JSON $ref pointer

This method is used, for instance, to set up the edge pair that defines an undirected edge, where the data of the second edge in the pair refers to that of the first.

Parameters
  • source – source key having the data

  • target – target key referring to data of source
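How a ‘$ref’ pointer lets two keys share one attribute dictionary can be sketched with plain dictionaries. The layout used here, a ‘$ref’ entry in the target value that holds the source key, is an assumption made for illustration, and the helper names `set_reference` and `resolve` are hypothetical.

```python
def set_reference(store, source, target):
    """Make target refer to the data of source through a '$ref' entry."""
    store[target] = {'$ref': source}


def resolve(store, key):
    """Return the attribute dict for key, following a '$ref' if present."""
    value = store[key]
    if '$ref' in value:
        return store[value['$ref']]
    return value


# Two directed edges that together represent one undirected edge.
edges = {(1, 2): {'weight': 2.0}, (2, 1): {}}
set_reference(edges, source=(1, 2), target=(2, 1))

edges[(1, 2)]['weight'] = 5.0  # update through the source edge only
```

Because the target holds a pointer instead of a copy, both edges of the pair always report the same attributes.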

set_view(keys)

Register keys to represent a selective view on the dictionary

Parameters

keys (list or tuple) – keys to set

symmetric_difference(other)

Return the symmetric difference between the key set of self and other

Return type

:py:class:set

to_dict(return_full=False)

Return a shallow copy of the full dictionary.

If the current DictStorage represents a selective view on the parent dictionary, then only a dictionary with a shallow copy of the keys in the selective view is returned.

Parameters

return_full (bool) – ignore is_view and return the full dictionary

Return type

:py:dict
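The effect of return_full and of the shallow copy can be sketched with a free function over a plain dictionary and a key list. The name `to_dict_sketch` and its explicit `view` argument are hypothetical; the real method reads the view from the storage instance.

```python
def to_dict_sketch(storage, view, return_full=False):
    """Return a shallow copy of storage, limited to view unless return_full."""
    if return_full or not view:
        return dict(storage)
    return {key: storage[key] for key in view if key in storage}


storage = {1: {'name': 'a'}, 2: {'name': 'b'}, 3: {'name': 'c'}}
view = [1, 3]

partial = to_dict_sketch(storage, view)
full = to_dict_sketch(storage, view, return_full=True)

# Shallow copy: the attribute dictionaries are shared with the storage.
partial[1]['name'] = 'changed'
```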

union(other)

Return the union between the key set of self and other

Return type

:py:class:set

viewitems()

Implement Python 2.7 dictionary like ‘items’ iterator method that returns a view on the items in the data store. It redirects the call to the iteritems method which is the Python 3.x equivalent and abstract method.

Returns

data items as tuple of key/value pairs

Return type

items view instance

viewkeys()

Implement Python 2.7 dictionary like ‘keys’ iterator method that returns a view on the keys in the data store. It redirects the call to the iterkeys method which is the Python 3.x equivalent and abstract method.

Returns

data keys

Return type

keys view instance

viewvalues()

Implement Python 2.7 dictionary like ‘values’ iterator method that returns a view on the values in the data store. It redirects the call to the itervalues method which is the Python 3.x equivalent and abstract method.

Returns

data values

Return type

values view instance

graphit.graph_storage_drivers.graph_storage_views module

file: graph_storage_views.py

Contains classes that provide various types of ‘views’ on the data in the main node and edge storage.

These classes use the common storage driver API for nodes and edges and should therefore work for every storage driver. A driver can implement its own version of a view, for instance for performance reasons.

class graphit.graph_storage_drivers.graph_storage_views.AdjacencyView(nodes, edges)

Bases: object

Adjacency View class

Makes adjacency information available through a dict-like API. Adjacency is based on the graph’s nodes and edges. When the latter two are ‘views’, the AdjacencyView also behaves as a view.

An adjacency view is based on nodes and edges. The view is rebuilt on each call to a method of the AdjacencyView, which is inefficient if the nodes and edges did not change in the meantime. To avoid this, either request the full adjacency dictionary once by calling the class (__call__), or use the ‘with graph.adjacency as adj’ construct, which reuses the same adjacency dictionary for all calls made to the class inside the ‘with’ block.
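The idea of deriving adjacency from nodes and edges, and of building it once for repeated lookups, can be sketched without graphit. The `build_adjacency` function is a hypothetical helper, and edges are assumed to be (source, target) tuples.

```python
def build_adjacency(nodes, edges):
    """Map every node to the list of nodes it has an outgoing edge to."""
    adjacency = {node: [] for node in nodes}
    for source, target in edges:
        # Skip edges that point outside the given node set (e.g. a view).
        if source in adjacency and target in adjacency:
            adjacency[source].append(target)
    return adjacency


nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 1), (2, 3), (3, 2)]

adjacency = build_adjacency(nodes, edges)  # build once, reuse for lookups
neighbours_of_2 = adjacency[2]
isolated = [node for node, nbrs in adjacency.items() if not nbrs]
```

Holding on to the built dictionary for a batch of lookups avoids paying the construction cost on every call.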

degree(nodes=None, method='degree', weight=None)

Return the degree of nodes in the graph

Returns a dictionary that reports, for every node in the graph, the number of connected edges of the following type:

  • indegree: edges from others to self

  • outdegree: edges from self to others

  • degree: indegree and outdegree combined

Edges that connect a node to itself are counted twice when using ‘degree’ as method.

Parameters
  • nodes (:py:list) – Nodes to return degree for

  • method (:py:str) – degree type: ‘indegree’, ‘outdegree’ or ‘degree’ for both combined

  • weight (:py:str) – Name of edge weight attribute. If None, then each edge has weight 1. The degree is the sum of the edge weights adjacent to the node.

Returns

Degree

Return type

:py:dict
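The three degree types and the effect of the weight argument can be sketched over a plain edge dictionary. `degree_sketch` is a hypothetical function, not the AdjacencyView method, and treating a missing weight attribute as 1 is an assumption.

```python
def degree_sketch(nodes, edges, method='degree', weight=None):
    """Sum edge weights per node; edges maps (source, target) to attributes."""
    result = {node: 0 for node in nodes}
    for (source, target), attributes in edges.items():
        # Assumption: edges lacking the weight attribute count as 1.
        w = 1 if weight is None else attributes.get(weight, 1)
        if method in ('outdegree', 'degree') and source in result:
            result[source] += w
        if method in ('indegree', 'degree') and target in result:
            result[target] += w
    return result


nodes = [1, 2, 3]
edges = {
    (1, 2): {'weight': 2.0},
    (2, 3): {'weight': 0.5},
    (3, 3): {'weight': 1.0},  # self-loop: counted as both in and out
}

total = degree_sketch(nodes, edges)
incoming = degree_sketch(nodes, edges, method='indegree')
weighted = degree_sketch(nodes, edges, weight='weight')
```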

get(node, default=None)

Implements dict-like get method

Parameters
  • node – Node to get adjacency for

  • default – Default return value if node not in adjacency dict

Returns

adjacency

Return type

:py:list

property is_view

items()

Implements dict-like items method

Returns

Adjacency dict items tuples

Return type

:py:list

keys()

Implements dict-like keys method

Returns

Adjacency dict keys

Return type

:py:list

predecessors(node)

Return a list of all predecessor nodes of ‘node’.

These are all nodes from which there exists an edge to ‘node’.

Returns

predecessor nodes

Return type

:py:list

successors(node)

Return a list for all successor nodes of ‘node’.

These are all nodes to which there exists an edge from ‘node’. This is the same as the neighbours of ‘node’ or self[node].

Returns

successor nodes

Return type

:py:list
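The difference between the two methods comes down to the edge direction that is followed. The functions below are hypothetical stand-ins that work on a list of (source, target) tuples rather than on an AdjacencyView.

```python
def successors_sketch(edges, node):
    """Nodes reached by an edge that starts at node."""
    return [target for source, target in edges if source == node]


def predecessors_sketch(edges, node):
    """Nodes from which an edge arrives at node."""
    return [source for source, target in edges if target == node]


edges = [(1, 2), (2, 3), (4, 2)]

after_2 = successors_sketch(edges, 2)
before_2 = predecessors_sketch(edges, 2)
```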

values()

Implements dict-like values method

Returns

Adjacency dict values

Return type

:py:list

class graphit.graph_storage_drivers.graph_storage_views.DataView(storage, data, default=None)

Bases: collections.abc.Set

A read-only DataView class on node/edge data

This class allows iteration over the data store primary keys (nodes or edges) and data attributes in a read-only fashion. The full data dictionary is considered by default. If the ‘data’ attribute is set to something other than a boolean, that value is used as lookup key for the data dictionary, returning only the corresponding value or ‘default’ if it is not found.

This class is available to provide compatibility with the NodeDataView and EdgeDataView classes from the NetworkX library. The same functionality can be achieved using the ‘iteritems’ or ‘itervalues’ methods of the default storage driver class that provide iterators over the attribute data in the node/edge data store.
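The lookup rule for the ‘data’ attribute can be sketched as a generator over a plain dictionary. `data_view_sketch` is a hypothetical function that mirrors only the rule described above; it is not the DataView class and does not cover its set-like behaviour.

```python
def data_view_sketch(storage, data, default=None):
    """Yield (key, attribute dict) pairs or (key, single attribute) pairs."""
    for key, attributes in storage.items():
        if isinstance(data, bool):
            yield key, attributes  # boolean: full data dictionary
        else:
            yield key, attributes.get(data, default)  # lookup key


nodes = {1: {'color': 'red'}, 2: {}}

full = list(data_view_sketch(nodes, data=True))
colors = list(data_view_sketch(nodes, data='color', default='none'))
```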

get(item, default=None)

Implement dict-like ‘get’ method

Return the node/edge attribute dictionary or a specific value (_data). If the _data key is not found, return the default value.

Parameters
  • item – node/edge primary key

  • default – default value to return

Returns

data dictionary or attribute value

Module contents