Add file_read_backwards library

Commit 03eff2b0, authored Feb 25, 2019 by Дмитрий Никулин; committed by Никита Ефремов on Apr 24, 2019
10 changed files
--- a/vendor/file_read_backwards-2.0.0.dist-info/DESCRIPTION.rst
+++ b/vendor/file_read_backwards-2.0.0.dist-info/DESCRIPTION.rst
+===============================
+file_read_backwards
+===============================
+.. image:: https://img.shields.io/pypi/v/file_read_backwards.svg
+        :target: https://pypi.python.org/pypi/file_read_backwards
+.. image:: https://img.shields.io/travis/RobinNil/file_read_backwards.svg?branch=master
+        :target: https://travis-ci.org/RobinNil/file_read_backwards.svg?branch=master
+.. image:: https://readthedocs.org/projects/file-read-backwards/badge/?version=latest
+        :target: https://file-read-backwards.readthedocs.io/en/latest/?badge=latest
+        :alt: Documentation Status
+.. image:: https://pyup.io/repos/github/RobinNil/file_read_backwards/shield.svg
+     :target: https://pyup.io/repos/github/RobinNil/file_read_backwards/
+     :alt: Updates
+Memory efficient way of reading files line-by-line from the end of file
+* Free software: MIT license
+* Documentation: https://file-read-backwards.readthedocs.io.
+Features
+--------
+This package reads a file backward, line by line, as unicode in a memory-efficient manner, for both Python 2.7 and Python 3.
+It currently supports ascii, latin-1, and utf-8 encodings.
+It supports "\\r", "\\r\\n", and "\\n" as new lines.
+Usage Examples
+--------------
+An example of using `file_read_backwards` for `python2.7`::
+    #!/usr/bin/env python2.7
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        # iterate over the lines, starting from the last line and moving up
+        for l in frb:
+            print l
+Another example using `python3.3`::
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        # iterate over the lines, starting from the last line and moving up
+        for l in frb:
+            print(l)
+Another way to consume the file is via `readline()`, in `python3.3`::
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        while True:
+            l = frb.readline()
+            if not l:
+                break
+            print(l, end="")
+Credits
+---------
+This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.
+.. _Cookiecutter: https://github.com/audreyr/cookiecutter
+.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage
+=======
+History
+=======
+1.0.0 (2016-12-18)
+------------------
+* First release on PyPI.
+1.1.0 (2016-12-31)
+------------------
+* Added support for "latin-1".
+* Marked the package "Production/Stable".
+1.1.1 (2017-01-09)
+------------------
+* Updated README.rst for more clarity around encoding support and Python 2.7 and 3 support.
+1.1.2 (2017-01-11)
+------------------
+* Documentation re-arrangement. Usage examples are now in README.rst
+* Minor refactoring
+1.2.0 (2017-09-01)
+------------------
+* Include context manager style as it provides cleaner/automatic close functionality
+1.2.1 (2017-09-02)
+------------------
+* Made doc strings consistent to Google style and some code linting
+1.2.2 (2017-11-19)
+------------------
+* Re-release of 1.2.1 for ease of updating pypi page for updated travis & pyup.
+2.0.0 (2018-03-23)
+------------------
+Mimicking Python file object behavior.
+* FileReadBackwards no longer creates multiple iterators (a change of behavior from 1.x.y version)
+* Added a readline() function that returns one line at a time with a trailing new line, and an empty string when it reaches the end of the file.
+  The fine print: the trailing new line will be `os.linesep` (rather than whichever new line type in the file).
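The two 2.0.0 behaviors described above (a single shared iterator, and `readline()` appending `os.linesep`) can be illustrated with a small stand-in. This is only a sketch: `SharedIteratorDemo` is a hypothetical class that works on an in-memory list rather than a file, and it mirrors the shape of `FileReadBackwards.__iter__` and `readline()` from this commit without being the library itself.

```python
import os


class SharedIteratorDemo:
    """Hypothetical stand-in that mirrors how FileReadBackwards shares one iterator."""

    def __init__(self, lines):
        # A single iterator is created up front, like self.iterator in the vendored class.
        self.iterator = iter(reversed(lines))

    def __iter__(self):
        # Every call hands back the same iterator object, not a fresh one.
        return self.iterator

    def readline(self):
        # Append os.linesep to the line, or return "" once the iterator is exhausted.
        try:
            return next(self.iterator) + os.linesep
        except StopIteration:
            return ""


demo = SharedIteratorDemo(["first", "second", "third"])
last_line = demo.readline()   # consumes "third"
remaining = list(demo)        # continues from where readline() stopped
second_pass = list(demo)      # the shared iterator is already exhausted
at_end = demo.readline()      # empty string signals the end
```

Because the iterator is shared, mixing `readline()` with a `for` loop consumes the same stream, and a second loop over the same object yields nothing.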
--- a/vendor/file_read_backwards-2.0.0.dist-info/INSTALLER
+++ b/vendor/file_read_backwards-2.0.0.dist-info/INSTALLER
+pip
--- a/vendor/file_read_backwards-2.0.0.dist-info/METADATA
+++ b/vendor/file_read_backwards-2.0.0.dist-info/METADATA
+Metadata-Version: 2.0
+Name: file-read-backwards
+Version: 2.0.0
+Summary: Memory efficient way of reading files line-by-line from the end of file
+Home-page: https://github.com/RobinNil/file_read_backwards
+Author: Robin Robin
+Author-email: robinsquare42@gmail.com
+License: MIT license
+Keywords: file_read_backwards
+Platform: UNKNOWN
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Natural Language :: English
+Classifier: Programming Language :: Python :: 2.7
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.3
+Classifier: Programming Language :: Python :: 3.4
+Classifier: Programming Language :: Python :: 3.5
+===============================
+file_read_backwards
+===============================
+.. image:: https://img.shields.io/pypi/v/file_read_backwards.svg
+        :target: https://pypi.python.org/pypi/file_read_backwards
+.. image:: https://img.shields.io/travis/RobinNil/file_read_backwards.svg?branch=master
+        :target: https://travis-ci.org/RobinNil/file_read_backwards.svg?branch=master
+.. image:: https://readthedocs.org/projects/file-read-backwards/badge/?version=latest
+        :target: https://file-read-backwards.readthedocs.io/en/latest/?badge=latest
+        :alt: Documentation Status
+.. image:: https://pyup.io/repos/github/RobinNil/file_read_backwards/shield.svg
+     :target: https://pyup.io/repos/github/RobinNil/file_read_backwards/
+     :alt: Updates
+Memory efficient way of reading files line-by-line from the end of file
+* Free software: MIT license
+* Documentation: https://file-read-backwards.readthedocs.io.
+Features
+--------
+This package reads a file backward, line by line, as unicode in a memory-efficient manner, for both Python 2.7 and Python 3.
+It currently supports ascii, latin-1, and utf-8 encodings.
+It supports "\\r", "\\r\\n", and "\\n" as new lines.
+Usage Examples
+--------------
+An example of using `file_read_backwards` for `python2.7`::
+    #!/usr/bin/env python2.7
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        # iterate over the lines, starting from the last line and moving up
+        for l in frb:
+            print l
+Another example using `python3.3`::
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        # iterate over the lines, starting from the last line and moving up
+        for l in frb:
+            print(l)
+Another way to consume the file is via `readline()`, in `python3.3`::
+    from file_read_backwards import FileReadBackwards
+    with FileReadBackwards("/tmp/file", encoding="utf-8") as frb:
+        while True:
+            l = frb.readline()
+            if not l:
+                break
+            print(l, end="")
+Credits
+---------
+This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.
+.. _Cookiecutter: https://github.com/audreyr/cookiecutter
+.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage
+=======
+History
+=======
+1.0.0 (2016-12-18)
+------------------
+* First release on PyPI.
+1.1.0 (2016-12-31)
+------------------
+* Added support for "latin-1".
+* Marked the package "Production/Stable".
+1.1.1 (2017-01-09)
+------------------
+* Updated README.rst for more clarity around encoding support and Python 2.7 and 3 support.
+1.1.2 (2017-01-11)
+------------------
+* Documentation re-arrangement. Usage examples are now in README.rst
+* Minor refactoring
+1.2.0 (2017-09-01)
+------------------
+* Include context manager style as it provides cleaner/automatic close functionality
+1.2.1 (2017-09-02)
+------------------
+* Made doc strings consistent to Google style and some code linting
+1.2.2 (2017-11-19)
+------------------
+* Re-release of 1.2.1 for ease of updating pypi page for updated travis & pyup.
+2.0.0 (2018-03-23)
+------------------
+Mimicking Python file object behavior.
+* FileReadBackwards no longer creates multiple iterators (a change of behavior from 1.x.y version)
+* Added a readline() function that returns one line at a time with a trailing new line, and an empty string when it reaches the end of the file.
+  The fine print: the trailing new line will be `os.linesep` (rather than whichever new line type in the file).
--- a/vendor/file_read_backwards-2.0.0.dist-info/RECORD
+++ b/vendor/file_read_backwards-2.0.0.dist-info/RECORD
+file_read_backwards-2.0.0.dist-info/DESCRIPTION.rst,sha256=UXNL9zcu_H5XjeCfnxqhADk3kQg5WS8qx8_GkyKDnv0,3647
+file_read_backwards-2.0.0.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
+file_read_backwards-2.0.0.dist-info/METADATA,sha256=SNqkrzocPrhWxro5NSVc6IVjEuy_ivz9UkP5XoTAQ_w,4417
+file_read_backwards-2.0.0.dist-info/RECORD,,
+file_read_backwards-2.0.0.dist-info/WHEEL,sha256=kdsN-5OJAZIiHN-iO4Rhl82KyS0bDWf4uBwMbkNafr8,110
+file_read_backwards-2.0.0.dist-info/metadata.json,sha256=J2rLVwakld4LYHi1CVSoZMAhxTAK-T5i0gbCDienb38,942
+file_read_backwards-2.0.0.dist-info/top_level.txt,sha256=J0c-zzN9i4B3noENqqGllyULovoXYowT-_VsvP5obD8,20
+file_read_backwards/__init__.py,sha256=EgTdw29vRAhhLjqLt6AIH-trsQOcv9w843hhm43x1tA,182
+file_read_backwards/__pycache__/__init__.cpython-36.pyc,,
+file_read_backwards/__pycache__/buffer_work_space.cpython-36.pyc,,
+file_read_backwards/__pycache__/file_read_backwards.cpython-36.pyc,,
+file_read_backwards/buffer_work_space.py,sha256=7OW2fFMeEB_HRamzOQigEabkFiCmLO50_byO9D1E6oM,6446
+file_read_backwards/file_read_backwards.py,sha256=Gi-P6vNTWtlR9J_2o0OnWEsDkZdjaJcQGkxstvIICvA,4069
--- a/vendor/file_read_backwards-2.0.0.dist-info/WHEEL
+++ b/vendor/file_read_backwards-2.0.0.dist-info/WHEEL
+Wheel-Version: 1.0
+Generator: bdist_wheel (0.30.0)
+Root-Is-Purelib: true
+Tag: py2-none-any
+Tag: py3-none-any
--- a/vendor/file_read_backwards-2.0.0.dist-info/metadata.json
+++ b/vendor/file_read_backwards-2.0.0.dist-info/metadata.json
+{"classifiers": ["Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Natural Language :: English", "Programming Language :: Python :: 2.7", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.3", "Programming Language :: Python :: 3.4", "Programming Language :: Python :: 3.5"], "extensions": {"python.details": {"contacts": [{"email": "robinsquare42@gmail.com", "name": "Robin Robin", "role": "author"}], "document_names": {"description": "DESCRIPTION.rst"}, "project_urls": {"Home": "https://github.com/RobinNil/file_read_backwards"}}}, "generator": "bdist_wheel (0.30.0)", "keywords": ["file_read_backwards"], "license": "MIT license", "metadata_version": "2.0", "name": "file-read-backwards", "summary": "Memory efficient way of reading files line-by-line from the end of file", "test_requires": [{"requires": ["mock"]}], "version": "2.0.0"}
\ No newline at end of file
--- a/vendor/file_read_backwards-2.0.0.dist-info/top_level.txt
+++ b/vendor/file_read_backwards-2.0.0.dist-info/top_level.txt
+file_read_backwards
--- a/vendor/file_read_backwards/__init__.py
+++ b/vendor/file_read_backwards/__init__.py
+# -*- coding: utf-8 -*-
+from .file_read_backwards import FileReadBackwards  # noqa: F401
+__author__ = """Robin Robin"""
+__email__ = 'robinsquare42@gmail.com'
+__version__ = '2.0.0'
--- a/vendor/file_read_backwards/buffer_work_space.py
+++ b/vendor/file_read_backwards/buffer_work_space.py
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+"""BufferWorkSpace module."""
+import os
+new_lines = ["\r\n", "\n", "\r"]
+new_lines_bytes = [n.encode("ascii") for n in new_lines]  # we only support encodings that are backward compatible with ascii
+class BufferWorkSpace:
+    """Helper class for FileReadBackwards."""
+    def __init__(self, fp, chunk_size):
+        """Convention for the data.
+        When read_buffer is not None, it represents contents of the file from `read_position` onwards
+            that has not been processed/returned.
+        read_position represents the file pointer position that has been read into read_buffer
+            initialized to be just past the end of file.
+        """
+        self.fp = fp
+        self.read_position = _get_file_size(self.fp)  # set the previously read position to just past the end of file
+        self.read_buffer = None
+        self.chunk_size = chunk_size
+    def add_to_buffer(self, content, read_position):
+        """Add additional bytes content as read from the read_position.
+        Args:
+            content (bytes): data to be added to the BufferWorkSpace read buffer.
+            read_position (int): where in the file pointer the data was read from.
+        """
+        self.read_position = read_position
+        if self.read_buffer is None:
+            self.read_buffer = content
+        else:
+            self.read_buffer = content + self.read_buffer
+    def yieldable(self):
+        """Return True if there is a line that the buffer can return, False otherwise."""
+        if self.read_buffer is None:
+            return False
+        t = _remove_trailing_new_line(self.read_buffer)
+        n = _find_furthest_new_line(t)
+        if n >= 0:
+            return True
+        # we have read in entire file and have some unprocessed lines
+        if self.read_position == 0 and self.read_buffer is not None:
+            return True
+        return False
+    def return_line(self):
+        """Return a new line if it is available.
+        Precondition: self.yieldable() must be True
+        """
+        assert(self.yieldable())
+        t = _remove_trailing_new_line(self.read_buffer)
+        i = _find_furthest_new_line(t)
+        if i >= 0:
+            l = i + 1
+            after_new_line = slice(l, None)
+            up_to_include_new_line = slice(0, l)
+            r = t[after_new_line]
+            self.read_buffer = t[up_to_include_new_line]
+        else:  # the case where we have read in entire file and at the "last" line
+            r = t
+            self.read_buffer = None
+        return r
+    def read_until_yieldable(self):
+        """Read in additional chunks until it is yieldable."""
+        while not self.yieldable():
+            read_content, read_position = _get_next_chunk(self.fp, self.read_position, self.chunk_size)
+            self.add_to_buffer(read_content, read_position)
+    def has_returned_every_line(self):
+        """Return True if every single line in the file has been returned, False otherwise."""
+        if self.read_position == 0 and self.read_buffer is None:
+            return True
+        return False
+def _get_file_size(fp):
+    return os.fstat(fp.fileno()).st_size
+def _get_next_chunk(fp, previously_read_position, chunk_size):
+    """Return the next chunk of data that we would read from the file pointer.
+    Args:
+        fp: file-like object
+        previously_read_position: file pointer position that we have read from
+        chunk_size: desired read chunk_size
+    Returns:
+        (bytestring, int): data that has been read in, the file pointer position where the data has been read from
+    """
+    seek_position, read_size = _get_what_to_read_next(fp, previously_read_position, chunk_size)
+    fp.seek(seek_position)
+    read_content = fp.read(read_size)
+    read_position = seek_position
+    return read_content, read_position
+def _get_what_to_read_next(fp, previously_read_position, chunk_size):
+    """Return information on which file pointer position to read from and how many bytes.
+    Args:
+        fp: file-like object
+        previously_read_position (int): The file pointer position that has been read previously
+        chunk_size (int): ideal io chunk_size
+    Returns:
+        (int, int): The next seek position, how many bytes to read next
+    """
+    seek_position = max(previously_read_position - chunk_size, 0)
+    read_size = chunk_size
+    # examples: say, our new_lines are potentially "\r\n", "\n", "\r"
+    # find a reading point where it is not "\n", rewind further if necessary
+    # if we have "\r\n" and we read in "\n",
+    # the next iteration would treat "\r" as a different new line.
+    # Q: why don't I just check if it is b"\n", but use a function ?
+    # A: so that we can potentially expand this into generic sets of separators, later on.
+    while seek_position > 0:
+        fp.seek(seek_position)
+        if _is_partially_read_new_line(fp.read(1)):
+            seek_position -= 1
+            read_size += 1  # as we rewind further, let's make sure we read more to compensate
+        else:
+            break
+    # take care of the special case when we are back at the beginning of the file
+    read_size = min(previously_read_position - seek_position, read_size)
+    return seek_position, read_size
+def _remove_trailing_new_line(l):
+    """Remove a single instance of new line at the end of l if it exists.
+    Returns:
+        bytestring
+    """
+    # replace only 1 instance of newline
+    # match longest line first (hence the reverse=True), we want to match "\r\n" rather than "\n" if we can
+    for n in sorted(new_lines_bytes, key=lambda x: len(x), reverse=True):
+        if l.endswith(n):
+            remove_new_line = slice(None, -len(n))
+            return l[remove_new_line]
+    return l
+def _find_furthest_new_line(read_buffer):
+    """Return -1 if read_buffer does not contain new line otherwise the position of the rightmost newline.
+    Args:
+        read_buffer (bytestring)
+    Returns:
+        int: The right most position of new line character in read_buffer if found, else -1
+    """
+    new_line_positions = [read_buffer.rfind(n) for n in new_lines_bytes]
+    return max(new_line_positions)
+def _is_partially_read_new_line(b):
+    """Return True when b is part of a new line separator found at index >= 1, False otherwise.
+    Args:
+        b (bytestring)
+    Returns:
+        bool
+    """
+    for n in new_lines_bytes:
+        if n.find(b) >= 1:
+            return True
+    return False
--- a/vendor/file_read_backwards/file_read_backwards.py
+++ b/vendor/file_read_backwards/file_read_backwards.py
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+"""FileReadBackwards module."""
+import io
+import os
+from .buffer_work_space import BufferWorkSpace
+supported_encodings = ["utf-8", "ascii", "latin-1"]  # any encodings that are backward compatible with ascii should work
+class FileReadBackwards:
+    """Class definition for `FileReadBackwards`.
+    A `FileReadBackwards` will spawn a `FileReadBackwardsIterator` and keep an opened file handler.
+    It can be used as a Context Manager. If done so, when exited, it will close its file handler.
+    In any mode, `close()` can be called to close the file handler.
+    """
+    def __init__(self, path, encoding="utf-8", chunk_size=io.DEFAULT_BUFFER_SIZE):
+        """Constructor for FileReadBackwards.
+        Args:
+            path: Path to the file to be read
+            encoding (str): Encoding
+            chunk_size (int): How many bytes to read at a time
+        """
+        if encoding.lower() not in supported_encodings:
+            error_message = "{0} encoding was not supported/tested. ".format(encoding)
+            error_message += "Supported encodings are '{0}'".format(", ".join(supported_encodings))
+            raise NotImplementedError(error_message)
+        self.path = path
+        self.encoding = encoding.lower()
+        self.chunk_size = chunk_size
+        self.iterator = FileReadBackwardsIterator(io.open(self.path, mode="rb"), self.encoding, self.chunk_size)
+    def __iter__(self):
+        """Return its iterator."""
+        return self.iterator
+    def __enter__(self):
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Close the opened file handler and propagate any exception on exit."""
+        self.close()
+        return False
+    def close(self):
+        """Close the opened file handler."""
+        self.iterator.close()
+    def readline(self):
+        """Return a line of content (with a trailing newline) if there is content. Return '' otherwise."""
+        try:
+            r = next(self.iterator) + os.linesep
+            return r
+        except StopIteration:
+            return ""
+class FileReadBackwardsIterator:
+    """Iterator for `FileReadBackwards`.
+    This will read backwards line by line a file. It holds an opened file handler.
+    """
+    def __init__(self, fp, encoding, chunk_size):
+        """Constructor for FileReadBackwardsIterator
+        Args:
+            fp (File): A file that we wish to start reading backwards from
+            encoding (str): Encoding of the file
+            chunk_size (int): How many bytes to read at a time
+        """
+        self.path = fp.name
+        self.encoding = encoding
+        self.chunk_size = chunk_size
+        self.__fp = fp
+        self.__buf = BufferWorkSpace(self.__fp, self.chunk_size)
+    def __iter__(self):
+        return self
+    def next(self):
+        """Returns unicode string from the last line until the beginning of file.
+        Gets exhausted if::
+            * already reached the beginning of the file on previous iteration
+            * the file got closed
+        When it gets exhausted, it closes the file handler.
+        """
+        # Using binary mode, because some encodings such as "utf-8" use variable number of
+        # bytes to encode different Unicode points.
+        # Without using binary mode, we would probably need to understand each encoding more
+        # and do the seek operations to find the proper boundary before issuing read
+        if self.closed:
+            raise StopIteration
+        if self.__buf.has_returned_every_line():
+            self.close()
+            raise StopIteration
+        self.__buf.read_until_yieldable()
+        r = self.__buf.return_line()
+        return r.decode(self.encoding)
+    __next__ = next
+    @property
+    def closed(self):
+        """The status of the file handler.
+        :return: True if the file handler is closed. False otherwise.
+        """
+        return self.__fp.closed
+    def close(self):
+        """Closes the file handler."""
+        self.__fp.close()
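The comments on `supported_encodings` and `new_lines_bytes` restrict the library to ASCII-compatible encodings. A likely reason, stated here as an assumption rather than the author's documented rationale, is that the buffer splits raw bytes on `b"\n"` and `b"\r"` before decoding, which is only safe when those byte values never occur inside another character. The sketch below contrasts UTF-8 with UTF-16-LE, which is not in the supported list.

```python
NEWLINE = b"\n"

# UTF-8: bytes of a multi-byte character are all >= 0x80, so 0x0A is always a real newline.
utf8_data = "héllo\nwörld".encode("utf-8")
utf8_lines = [part.decode("utf-8") for part in utf8_data.split(NEWLINE)]

# UTF-16-LE: the single character U+010A encodes as the bytes 0A 01,
# so a byte-level search finds a "newline" that is not one.
utf16_data = "\u010a".encode("utf-16-le")
false_newline_found = NEWLINE in utf16_data
```

Splitting the UTF-8 bytes yields two lines that decode cleanly, while the UTF-16-LE bytes contain `0x0A` even though the text has no line break.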