API

This part of the documentation covers Streamly’s public code interface.

Stream

class streamly.Stream(stream, length)

Provide a simple object to represent a stream resource with a known length for use with Streamly.

If the length is unknown, the user should just pass the raw stream object directly to Streamly.

Parameters:
  • stream – the file-like object
  • length (int) – the length of the stream
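
Because Stream needs the length up front, it can help to measure a seekable stream before wrapping it. The following is a minimal sketch rather than part of Streamly: measure_length is a hypothetical helper, and it assumes the stream supports seek and tell (as io.BytesIO and regular files do).

```python
import io
import os


def measure_length(stream):
    """Return how many bytes remain in a seekable stream (hypothetical helper)."""
    start = stream.tell()
    # seek() returns the new absolute position, i.e. the end offset.
    end = stream.seek(0, os.SEEK_END)
    # Restore the original position so the stream can still be read in full.
    stream.seek(start)
    return end - start


payload = io.BytesIO(b"id,name\n1,alice\n")
length = measure_length(payload)
```

The resulting pair could then be passed on as streamly.Stream(payload, length). For streams that cannot seek, the length is unknown and the raw stream object should be passed instead, as described above.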

Streamly

class streamly.Streamly(*streams, binary=True, header_row_identifier=<streamly._Sentinel object>, header_row_end_identifier=<streamly._Sentinel object>, footer_identifier=None, retain_first_header_row=True)

Provide a wrapper for streams (aka file-like objects).

Parameters:
  • streams – one or more objects to be read. Each can be either a raw stream object or a container object that exposes a stream attribute and, optionally, a length attribute, e.g. streamly.Stream.
  • binary (bool) – whether or not the underlying streams return bytes when read. If they return text, set this to False. Defaults to True.
  • header_row_identifier – the value used to identify where the header row starts. If reading the stream returns bytes, this should be a byte string. If there is no header, explicitly pass None. Defaults to an empty byte string or an empty string depending on the value of binary, i.e. the header row is expected at the very start of the stream.
  • header_row_end_identifier – the value used to identify where the header row ends. If reading the stream returns bytes, this should be a byte string. Defaults to a line feed, as a byte string or as a string depending on the value of binary.
  • footer_identifier – the value to use to identify where the footer starts. Defaults to None, i.e. no footer.
  • retain_first_header_row (bool) – whether or not the read method should retain the header row of the first stream. Headers are removed from the second stream onwards regardless.
Raises:
  • ValueError – if no streams are passed.
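
The sentinel defaults in the signature let an explicit None ("no header") be told apart from "argument not passed". The sketch below illustrates that general pattern under the defaults documented above. It is not Streamly's source: _UNSET and resolve_identifiers are hypothetical names.

```python
# Hypothetical stand-in for a private sentinel object.
_UNSET = object()


def resolve_identifiers(binary=True,
                        header_row_identifier=_UNSET,
                        header_row_end_identifier=_UNSET):
    """Fill in defaults that depend on whether the streams return bytes."""
    empty, line_feed = (b"", b"\n") if binary else ("", "\n")
    if header_row_identifier is _UNSET:
        # Not passed: the header is assumed to start at the very beginning.
        header_row_identifier = empty
    if header_row_end_identifier is _UNSET:
        # Not passed: the header is assumed to end at the first line feed.
        header_row_end_identifier = line_feed
    return header_row_identifier, header_row_end_identifier


binary_defaults = resolve_identifiers()
text_defaults = resolve_identifiers(binary=False)
no_header = resolve_identifiers(header_row_identifier=None)
```

An explicit None passes through untouched, which is what makes "no header" expressible when None cannot serve as the default.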

Variables:
  • binary (bool) – see Parameters.
  • contains_header_row (bool) – True if header_row_identifier is not None.
  • contains_footer (bool) – True if footer_identifier is not None.
  • current_stream (dict) – The stream details that will be referenced on the next read operation.
  • current_stream_index (int) – The index of the current stream that will be referenced on the next read operation.
  • end_reached (bool) – True if the final underlying stream has been exhausted.
  • footer_identifier – See Parameters.
  • header_row_identifier – See Parameters.
  • header_row_end_identifier – See Parameters.
  • is_first_stream (bool) – True if the current stream is the first stream.
  • is_last_stream (bool) – True if the current stream is the last stream.
  • retain_first_header_row (bool) – See Parameters.
  • streams (list) – the list of streams passed on instantiation but as dicts with items that are used to track progress.
  • total_length (int) – The total length of all the streams. If any stream’s length is unknown, this value will be None.
  • total_length_read (int) – The total length read across all the streams.
  • total_streams (int) – The number of underlying streams.
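
The rule for total_length (None as soon as any stream's length is unknown) can be sketched as follows. This is only an illustration of the documented behaviour; combined_length is a hypothetical name, and None is assumed to mark an unknown length.

```python
def combined_length(lengths):
    """Sum the stream lengths, or return None if any length is unknown."""
    if any(length is None for length in lengths):
        return None
    return sum(lengths)


all_known = combined_length([16, 24])
one_unknown = combined_length([16, None])
```

Callers that use total_length for progress reporting should therefore check for None before comparing it with total_length_read.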

read(size=8192)

Read incrementally from the underlying streams.

Automatically remove headers and footers according to the instance attributes, and advance through the underlying stream objects when there is more than one. Always return data of length size unless the underlying streams are exhausted. The subsequent read then returns an empty byte string or an empty string, depending on self.binary.

Parameters:
  • size (int) – the length to return
Returns:
  • either a byte string or a string, depending on what the underlying streams return when read
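
Since an exhausted source yields an empty byte string or empty string, a typical consumer loops until read returns something falsy. The sketch below shows that loop; iter_chunks is a hypothetical helper, and io.BytesIO stands in for a Streamly instance so the example runs without the library. Any object exposing read(size) would work the same way.

```python
import io


def iter_chunks(readable, size=8192):
    """Yield chunks from any object exposing read(size) until it is exhausted."""
    while True:
        chunk = readable.read(size)
        if not chunk:  # b"" or "" signals that nothing is left
            return
        yield chunk


# Stand-in source: 20000 bytes read in 8192-byte chunks.
source = io.BytesIO(b"a" * 20000)
sizes = [len(chunk) for chunk in iter_chunks(source)]
```

Only the last chunk is shorter than size, which matches the behaviour described for read above.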