docs/source/core.rst
+32 -8 (32 additions & 8 deletions)
@@ -15,7 +15,9 @@ Map, emit, and sink
 map
 sink

-You can create a basic pipeline by instantiating the ``Streamz`` object and then using methods like ``map``, ``accumulate``, and ``sink``.
+You can create a basic pipeline by instantiating the ``Streamz``
+object and then using methods like ``map``, ``accumulate``, and
+``sink``.

 .. code-block:: python

@@ -27,7 +29,10 @@ You can create a basic pipeline by instantiating the ``Streamz`` object and then
 source = Stream()
 source.map(increment).sink(print)

-The ``map`` and ``sink`` methods both take a function and apply that function to every element in the stream. The ``map`` method returns a new stream with the modified elements while ``sink`` is typically used at the end of a stream for final actions.
+The ``map`` and ``sink`` methods both take a function and apply that
+function to every element in the stream. The ``map`` method returns a
+new stream with the modified elements while ``sink`` is typically used
+at the end of a stream for final actions.

 To push data through our pipeline we call ``emit``

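Editor's illustration (not part of the diff): the push-based flow that the hunk above describes for ``map``, ``sink``, and ``emit`` can be sketched in a few lines of plain Python. ``MiniStream`` is a hypothetical stand-in written for this note, not the streamz ``Stream`` class, and it only mirrors the three calls named in the text.

```python
class MiniStream:
    """Toy push-based stream; a hypothetical stand-in, not the streamz class."""

    def __init__(self):
        # Callables that receive every element pushed into this node.
        self.downstreams = []

    def map(self, func):
        # Return a NEW stream that receives func(x) for every x emitted here.
        child = MiniStream()
        self.downstreams.append(lambda x: child.emit(func(x)))
        return child

    def sink(self, func):
        # Terminal action: call func on every element, return nothing.
        self.downstreams.append(func)

    def emit(self, x):
        # Push one element through everything attached to this node.
        for push in self.downstreams:
            push(x)


def increment(x):
    return x + 1


results = []
source = MiniStream()
source.map(increment).sink(results.append)

source.emit(1)
source.emit(10)
```

Nothing happens when the pipeline is built; work is only done when ``emit`` pushes a value in, which is why ``sink`` sits at the end of the chain.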
@@ -383,14 +388,33 @@ want to read further about :doc:`collections <collections>`
 Metadata
 --------

-Metadata can be emitted into the pipeline to accompany the data as a list of dictionaries. Most functions will pass the metadata to the downstream function without making any changes. However, functions that make the pipeline asynchronous require logic that dictates how and when the metadata will be passed downstream. Synchronous functions and asynchronous functions that have a 1:1 ratio of the number of values on the input to the number of values on the output will emit the metadata collection without any modification. However, functions that have multiple input streams or emit collections of data will emit the metadata associated with the emitted data as a collection.
+Metadata can be emitted into the pipeline to accompany the data as a
+list of dictionaries. Most functions will pass the metadata to the
+downstream function without making any changes. However, functions
+that make the pipeline asynchronous require logic that dictates how
+and when the metadata will be passed downstream. Synchronous functions
+and asynchronous functions that have a 1:1 ratio of the number of
+values on the input to the number of values on the output will emit
+the metadata collection without any modification. However, functions
+that have multiple input streams or emit collections of data will emit
+the metadata associated with the emitted data as a collection.


 Reference Counting and Checkpointing
 ------------------------------------

-Checkpointing is achieved in Streamz through the use of reference counting. With this method, a checkpoint can be saved when and only when data has progressed through all of the pipeline without any issues. This prevents data loss and guarantees at-least-once semantics.
-
-Any node that caches or holds data after it returns increments the reference counter associated with the given data by one. When a node is no longer holding the data, it will release it by decrementing the counter by one. When the counter changes to zero, a callback associated with the data is triggered.
-
-References are passed in the metadata as a value of the `ref` keyword. Each metadata object contains only one reference counter object.
+Checkpointing is achieved in Streamz through the use of reference
+counting. With this method, a checkpoint can be saved when and only
+when data has progressed through all of the pipeline without any
+issues. This prevents data loss and guarantees at-least-once
+semantics.
+
+Any node that caches or holds data after it returns increments the
+reference counter associated with the given data by one. When a node
+is no longer holding the data, it will release it by decrementing the
+counter by one. When the counter changes to zero, a callback
+associated with the data is triggered.
+
+References are passed in the metadata as a value of the `ref`
+keyword. Each metadata object contains only one reference counter
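Editor's illustration (not part of the diff): the retain/release cycle described in the hunk above can be sketched without any library. ``ToyRefCounter`` and the ``"offset-42"`` checkpoint value are hypothetical names made up for this note; the real streamz counter may differ in its constructor arguments and in when the callback runs. The sketch only assumes what the text states: metadata is a list of dictionaries, each carrying one counter under the ``ref`` key, and a callback fires when the count returns to zero.

```python
class ToyRefCounter:
    """Hypothetical reference counter for illustration; not the streamz class."""

    def __init__(self, cb=None):
        self.count = 0
        self.cb = cb  # called once nobody holds the data any more

    def retain(self, n=1):
        # A node that caches or buffers the data takes a reference.
        self.count += n

    def release(self, n=1):
        # A node that is done with the data gives its reference back.
        self.count -= n
        if self.count == 0 and self.cb is not None:
            self.cb()


checkpoints = []
ref = ToyRefCounter(cb=lambda: checkpoints.append("offset-42"))
metadata = [{"ref": ref}]  # one counter per metadata dictionary

# Two nodes hold on to the same element.
for _ in range(2):
    for m in metadata:
        m["ref"].retain()

# The first node finishes: one holder remains, so no checkpoint yet.
metadata[0]["ref"].release()
pending_after_first_release = list(checkpoints)

# The second node finishes: the count reaches zero and the callback fires.
metadata[0]["ref"].release()
```

Because the callback only runs after every holder has released the data, a crash before that point leaves the checkpoint unsaved and the element is replayed, which is where the at-least-once guarantee comes from.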
-A variety of tools are available to help you understand, debug, and visualize your streaming objects:
+A variety of tools are available to help you understand, debug,
+and visualize your streaming objects:

-- Most Streamz objects automatically display themselves in Jupyter notebooks, periodically updating their visual representation as text or tables by registering events with the Tornado IOLoop used by Jupyter
-- The network graph underlying a stream can be visualized using `dot` to render a PNG using `Stream.visualize(filename)`
-- Streaming data can be visualized using the optional separate packages hvPlot, HoloViews, and Panel (see below)
+- Most Streamz objects automatically display themselves in Jupyter
+  notebooks, periodically updating their visual representation as text
+  or tables by registering events with the Tornado IOLoop used by Jupyter
+- The network graph underlying a stream can be visualized using `dot` to
+  render a PNG using `Stream.visualize(filename)`
+- Streaming data can be visualized using the optional separate packages
+  hvPlot, HoloViews, and Panel (see below)


 hvplot.streamz
 --------------

-hvPlot is a separate plotting library providing Bokeh-based plots for Pandas dataframes and a variety of other object types, including streamz DataFrame and Series objects.
+hvPlot is a separate plotting library providing Bokeh-based plots for
+Pandas dataframes and a variety of other object types, including
+streamz DataFrame and Series objects.

-See `hvplot.holoviz.org <https://hvplot.holoviz.org>`_ for instructions on how to install hvplot. Once it is installed, you can use the Pandas .plot() API to get a dynamically updating plot in Jupyter or in Bokeh/Panel Server:
+See `hvplot.holoviz.org <https://hvplot.holoviz.org>`_ for
+instructions on how to install hvplot. Once it is installed, you can
+use the Pandas .plot() API to get a dynamically updating plot in
+Jupyter or in Bokeh/Panel Server:

 .. code-block:: python

@@ -23,15 +33,29 @@ See `hvplot.holoviz.org <https://hvplot.holoviz.org>`_ for instructions on how t
 df = Random()
 df.hvplot(backlog=100)

-See the `streaming section <https://hvplot.holoviz.org/user_guide/Streaming.html>`_ of the hvPlot user guide for more details, and the
-`dataframes.ipynb` example that comes with streamz for a simple runnable example.
+See the `streaming section
+<https://hvplot.holoviz.org/user_guide/Streaming.html>`_ of the hvPlot
+user guide for more details, and the `dataframes.ipynb` example that
+comes with streamz for a simple runnable example.


 HoloViews
 ---------
-hvPlot is built on HoloViews, and you can also use HoloViews directly if you want more control over events and how they are processed. See the `HoloViews user guide <http://holoviews.org/user_guide/Streaming_Data.html>`_ for more details.
+
+hvPlot is built on HoloViews, and you can also use HoloViews directly
+if you want more control over events and how they are processed. See
+the `HoloViews user guide
+<http://holoviews.org/user_guide/Streaming_Data.html>`_ for more
+details.


 Panel
 -----
-Panel is a general purpose dashboard and app framework, supporting a wide variety of displayable objects as "Panes". Panel provides a `streamz Pane <https://panel.holoviz.org/reference/panes/Streamz.html>`_ for rendering arbitrary streamz objects, and streamz DataFrames are handled by the Panel `DataFrame Pane <https://panel.holoviz.org/reference/panes/DataFrame.html>`_.
+
+Panel is a general purpose dashboard and app framework, supporting a
+wide variety of displayable objects as "Panes". Panel provides a
+`streamz Pane
+<https://panel.holoviz.org/reference/panes/Streamz.html>`_ for
+rendering arbitrary streamz objects, and streamz DataFrames are