Commit 2bf3fae

Merge branch 'master' of github.com:python-streamz/streamz into 287

2 parents: 37caec9 + 9c8d3bb

18 files changed: 1021 additions & 130 deletions

.coveragerc

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ omit =
 
 exclude_lines =
     if __name__ == '__main__':
+    pragma: no cover

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ log
 *.swo
 .cache/
 .ipynb_checkpoints/
+.vscode

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ language: python
 
 matrix:
   include:
-  - python: 3.6
+  - python: 3.7
 
     env:
     - STREAMZ_LAUNCH_KAFKA=true

README.rst

Lines changed: 4 additions & 2 deletions
@@ -1,11 +1,11 @@
 Streamz
 =======
 
-|Build Status| |Doc Status| |Version Status|
+|Build Status| |Doc Status| |Version Status| |RAPIDS custreamz gpuCI|
 
 Streamz helps you build pipelines to manage continuous streams of data. It is simple to use in simple cases, but also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on.
 
-Optionally, Streamz can also work with Pandas dataframes to provide sensible streaming operations on continuous tabular data.
+Optionally, Streamz can also work with both `Pandas <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html>`_ and `cuDF <https://docs.rapids.ai/api/cudf/stable/>`_ dataframes, to provide sensible streaming operations on continuous tabular data.
 
 To learn more about how to use Streamz see documentation at `streamz.readthedocs.org <https://streamz.readthedocs.org>`_.
 
@@ -21,3 +21,5 @@ BSD-3 Clause
    :alt: Documentation Status
 .. |Version Status| image:: https://img.shields.io/pypi/v/streamz.svg
    :target: https://pypi.python.org/pypi/streamz/
+.. |RAPIDS custreamz gpuCI| image:: https://img.shields.io/badge/gpuCI-custreamz-green
+   :target: https://github.com/jdye64/cudf/blob/kratos/python/custreamz/custreamz/kafka.py

docs/source/api.rst

Lines changed: 1 addition & 0 deletions
@@ -95,6 +95,7 @@ Definitions
 
 .. autofunction:: filenames
 .. autofunction:: from_kafka
+.. autofunction:: from_kafka_batched
 .. autofunction:: from_textfile
 
 .. currentmodule:: streamz.dask

docs/source/core.rst

Lines changed: 18 additions & 1 deletion
@@ -156,7 +156,8 @@ Branching and Joining
    zip_latest
 
 You can branch multiple streams off of a single stream. Elements that go into
-the input will pass through to both output streams.
+the input will pass through to both output streams. Note: ``graphviz`` and
+``networkx`` need to be installed to visualize the stream graph.
 
 .. code-block:: python
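The doc change above notes that elements emitted into a stream pass through to every branch. A toy model of that behavior (a simplified sketch for illustration, not the actual streamz API):

```python
# Toy model of stream branching: one source, several downstream branches;
# every emitted element reaches all of them. Not the real streamz classes.
class ToyStream:
    def __init__(self, fn=lambda x: x):
        self.fn = fn
        self.children = []

    def map(self, fn):
        # Create a downstream node that transforms each element.
        child = ToyStream(fn)
        self.children.append(child)
        return child

    def sink(self, collect):
        # Terminal node: hand each element to a plain callback.
        child = ToyStream()
        child.emit = collect
        self.children.append(child)
        return child

    def emit(self, x):
        # Push one element through this node and into every branch.
        y = self.fn(x)
        for child in self.children:
            child.emit(y)


source = ToyStream()
plus_one, doubled = [], []
source.map(lambda x: x + 1).sink(plus_one.append)
source.map(lambda x: x * 2).sink(doubled.append)

for i in range(3):
    source.emit(i)

# Both branches saw every element:
# plus_one == [1, 2, 3], doubled == [0, 2, 4]
```

In real streamz the same shape is written with `Stream()`, `.map`, and `.sink`; the point here is only that a single `emit` fans out to all branches.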
@@ -377,3 +378,19 @@ For operations like this Streamz adds virtually no overhead.
 
 Streams provides higher level APIs for situations just like this one. You may
 want to read further about :doc:`collections <collections>`
+
+
+Metadata
+--------
+
+Metadata can be emitted into the pipeline to accompany the data as a list of dictionaries. Most functions pass the metadata to the downstream function without making any changes. However, functions that make the pipeline asynchronous require logic that dictates how and when the metadata is passed downstream. Synchronous functions, and asynchronous functions with a 1:1 ratio of input values to output values, emit the metadata collection without modification. Functions that have multiple input streams, or that emit collections of data, emit the metadata associated with the emitted data as a collection.
+
+
+Reference Counting and Checkpointing
+------------------------------------
+
+Checkpointing is achieved in Streamz through reference counting. With this method, a checkpoint can be saved when, and only when, data has progressed through all of the pipeline without any issues. This prevents data loss and guarantees at-least-once semantics.
+
+Any node that caches or holds data after it returns increments the reference counter associated with that data by one. When a node is no longer holding the data, it releases it by decrementing the counter by one. When the counter reaches zero, a callback associated with the data is triggered.
+
+References are passed in the metadata as the value of the ``ref`` keyword. Each metadata object contains only one reference counter object.
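The reference-counting scheme the new section describes can be sketched as follows; the `RefCounter` name and its API here are hypothetical illustrations, not streamz internals:

```python
# Illustrative sketch of reference counting for checkpointing: holders
# retain, finished holders release, and reaching zero fires a callback.
# The class and method names are hypothetical, not streamz's own.
class RefCounter:
    def __init__(self, on_zero):
        self.count = 0
        self.on_zero = on_zero  # e.g. a callback that saves a checkpoint

    def retain(self):
        # A node that caches or holds the data increments the counter.
        self.count += 1

    def release(self):
        # A node done with the data decrements; zero fires the callback.
        self.count -= 1
        if self.count == 0:
            self.on_zero()


checkpoints = []
counter = RefCounter(lambda: checkpoints.append("checkpoint saved"))
metadata = [{"ref": counter}]  # references travel under the ``ref`` key

counter.retain()   # a buffering node holds the element
counter.retain()   # a second branch holds it too
counter.release()  # first holder finishes
counter.release()  # last holder finishes -> checkpoint callback fires
```

Until every holder releases, the counter stays positive and no checkpoint is saved, which is what gives the at-least-once guarantee described above.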

docs/source/dataframes.rst

Lines changed: 2 additions & 2 deletions
@@ -103,8 +103,8 @@ following:
     def on_old(self, state, old):
         total, count = state
-        total = total - new.sum()    # switch + for - here
-        count = count - new.count()  # switch + for - here
+        total = total - old.sum()    # switch + for - here
+        count = count - old.count()  # switch + for - here
         new_state = (total, count)
         new_value = total / count
         return new_state, new_value
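The fix above corrects the documented windowed-mean example to subtract the outgoing (`old`) chunk rather than the incoming one. A minimal pandas-free sketch of that incremental mean, with plain lists standing in for dataframe chunks:

```python
# Incremental windowed mean: on_new adds the arriving chunk, on_old
# subtracts the chunk leaving the window. The names mirror the doc
# example; this is a sketch, not streamz's actual aggregation code.
class WindowedMean:
    def on_new(self, state, new):
        total, count = state
        total = total + sum(new)
        count = count + len(new)
        return (total, count), total / count

    def on_old(self, state, old):
        total, count = state
        total = total - sum(old)  # switch + for - here
        count = count - len(old)  # switch + for - here
        return (total, count), total / count


agg = WindowedMean()
state = (0, 0)
state, value = agg.on_new(state, [1, 2, 3])  # window [1, 2, 3] -> mean 2.0
state, value = agg.on_new(state, [4, 5])     # window [1..5]   -> mean 3.0
state, value = agg.on_old(state, [1, 2, 3])  # window [4, 5]   -> mean 4.5
```

Subtracting `new` instead of `old` (the bug being fixed) would remove the chunk that just arrived, leaving the window's running total permanently wrong.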

docs/source/index.rst

Lines changed: 1 addition & 2 deletions
@@ -5,8 +5,7 @@ Streamz helps you build pipelines to manage continuous streams of data. It is
 simple to use in simple cases, but also supports complex pipelines that involve
 branching, joining, flow control, feedback, back pressure, and so on.
 
-Optionally, Streamz can also work with Pandas dataframes to provide sensible
-streaming operations on continuous tabular data.
+Optionally, Streamz can also work with both `Pandas <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html>`_ and `cuDF <https://docs.rapids.ai/api/cudf/stable/>`_ dataframes, to provide sensible streaming operations on continuous tabular data.
 
 To learn more about how to use streams, visit :doc:`Core documentation <core>`.
1211

setup.py

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@
 
 
 setup(name='streamz',
-      version='0.5.2',
+      version='0.5.3',
       description='Streams',
       url='http://github.com/python-streamz/streamz/',
       maintainer='Matthew Rocklin',

streamz/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@
 except ImportError:
     pass
 
-__version__ = '0.5.2'
+__version__ = '0.5.3'
