API
This part of the documentation covers Streamly’s public code interface.
Stream
class streamly.Stream(stream, length)
Provide a simple object to represent a stream resource with a known length, for use with Streamly. If the length is unknown, pass the raw stream object directly to Streamly instead.
Parameters:
- stream – the file-like object
- length (int) – the length of the stream
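The distinction between a raw stream and a container with a known length can be sketched in plain Python. This is only an illustration of the duck-typed contract described in this documentation (a stream attribute and an optional length attribute); SizedStream and describe are hypothetical names, not part of Streamly.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for a container exposing `stream` and `length`;
# it is not the library's own class.
SizedStream = namedtuple("SizedStream", ["stream", "length"])


def describe(obj):
    """Return (stream, length) for either a raw stream or a container object."""
    # A container exposes the real stream via `.stream`; a raw stream is used as-is.
    stream = getattr(obj, "stream", obj)
    # The length is optional; None means "unknown".
    length = getattr(obj, "length", None)
    return stream, length


payload = b"header\nrow1\nrow2\n"
wrapped = SizedStream(io.BytesIO(payload), len(payload))
raw = io.BytesIO(payload)

print(describe(wrapped)[1])  # 17
print(describe(raw)[1])      # None
```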
Streamly
class streamly.Streamly(*streams, binary=True, header_row_identifier=<streamly._Sentinel object>, header_row_end_identifier=<streamly._Sentinel object>, footer_identifier=None, retain_first_header_row=True)
Provide a wrapper for streams (also known as file-like objects).
Parameters:
- streams – one or more stream objects to be read. Each object can be either a stream object or a container object that implements a stream attribute and, optionally, a length attribute, such as streamly.Stream.
- binary (bool) – whether the underlying streams return bytes when read. If they return text, set this to False. Defaults to True.
- header_row_identifier – the value used to identify where the header row starts. If reading the stream returns bytes, this should be a byte string. If there is no header, explicitly pass None. Defaults to an empty byte string or an empty string, depending on the value of binary; that is, the header row is expected at the very start of the stream.
- header_row_end_identifier – the value used to identify where the header row ends. If reading the stream returns bytes, this should be a byte string. Defaults to a line feed byte string or a line feed character, depending on the value of binary.
- footer_identifier – the value used to identify where the footer starts. Defaults to None, i.e. no footer.
- retain_first_header_row (bool) – whether the read method should retain the header row of the first stream. Headers are removed from the second stream onwards regardless.
Raises: ValueError if no streams are passed.
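To make the effect of the header and footer parameters concrete, here is a minimal pure-Python sketch of the behaviour they describe. It is not Streamly's implementation: merge_streams is a hypothetical helper that reads each stream fully into memory, assumes the header starts at the very beginning of each stream (the documented default), and assumes the footer is dropped from every stream.

```python
import io


def merge_streams(streams, header_row_end_identifier=b"\n",
                  footer_identifier=None, retain_first_header_row=True):
    """Concatenate byte streams, dropping repeated headers and any footers."""
    parts = []
    for index, stream in enumerate(streams):
        data = stream.read()
        # The header runs from the start up to and including the end identifier.
        end = data.find(header_row_end_identifier)
        if end != -1 and (index > 0 or not retain_first_header_row):
            data = data[end + len(header_row_end_identifier):]
        # Assumption: everything from the footer identifier onwards is discarded.
        if footer_identifier is not None:
            start = data.find(footer_identifier)
            if start != -1:
                data = data[:start]
        parts.append(data)
    return b"".join(parts)


first = b"id,name\n1,a\nTotal: 1\n"
second = b"id,name\n2,b\nTotal: 1\n"
merged = merge_streams(
    [io.BytesIO(first), io.BytesIO(second)],
    footer_identifier=b"Total",
)
print(merged)  # b'id,name\n1,a\n2,b\n'
```

The real class works incrementally rather than loading whole streams, so treat this only as a picture of the expected output.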
Variables:
- binary (bool) – see Parameters.
- contains_header_row (bool) – True if header_row_identifier is not None.
- contains_footer (bool) – True if footer_identifier is not None.
- current_stream (dict) – the stream details that will be referenced on the next read operation.
- current_stream_index (int) – the index of the current stream that will be referenced on the next read operation.
- end_reached (bool) – True if the final underlying stream has been exhausted.
- footer_identifier – see Parameters.
- header_row_identifier – see Parameters.
- header_row_end_identifier – see Parameters.
- is_first_stream (bool) – True if the current stream is the first stream.
- is_last_stream (bool) – True if the current stream is the last stream.
- retain_first_header_row (bool) – see Parameters.
- streams (list) – the list of streams passed on instantiation, stored as dicts whose items are used to track progress.
- total_length (int) – the total length of all the streams. If any stream’s length is unknown, this value will be None.
- total_length_read (int) – the total length read across all the streams.
- total_streams (int) – the number of underlying streams.
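The rule stated for total_length (a sum when every length is known, None otherwise) can be written out as a short sketch. The function name combined_length is hypothetical and this is not the library's code.

```python
def combined_length(lengths):
    """Sum the stream lengths, or return None if any length is unknown."""
    if any(length is None for length in lengths):
        return None
    return sum(lengths)


print(combined_length([10, 20, 5]))    # 35
print(combined_length([10, None, 5]))  # None
```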
read(size=8192)
Read incrementally from the underlying streams.
Automatically handle the removal of headers and footers based on the instance properties, and iterate through the underlying stream objects where there is more than one. Always return data of length size unless the underlying streams are exhausted; the read after that returns an empty byte string or an empty string, depending on self.binary.
Parameters: size (int) – the length to return
Returns: either a byte string or a string, depending on what the underlying streams return when read
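The "always return size unless exhausted" behaviour can be pictured with a small stand-alone sketch that keeps pulling from the next stream until the request is filled. ChainedReader is a hypothetical class written for illustration; it borrows the attribute names current_stream_index and end_reached from the list above but does no header or footer handling and is not Streamly's implementation.

```python
import io


class ChainedReader:
    """Hypothetical reader that fills each read across several byte streams."""

    def __init__(self, *streams):
        if not streams:
            raise ValueError("at least one stream is required")
        self.streams = list(streams)
        self.current_stream_index = 0

    @property
    def end_reached(self):
        # True once every underlying stream has been exhausted.
        return self.current_stream_index >= len(self.streams)

    def read(self, size=8192):
        chunks = []
        remaining = size
        while remaining > 0 and not self.end_reached:
            chunk = self.streams[self.current_stream_index].read(remaining)
            if chunk:
                chunks.append(chunk)
                remaining -= len(chunk)
            else:
                # The current stream is exhausted; move on to the next one.
                self.current_stream_index += 1
        return b"".join(chunks)


reader = ChainedReader(io.BytesIO(b"abc"), io.BytesIO(b"defgh"))
print(reader.read(4))  # b'abcd'
print(reader.read(4))  # b'efgh'
print(reader.read(4))  # b''
```

A caller would typically loop on read until it returns an empty value, writing each chunk to the destination as it goes.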