Replace reflink dependency with an implementation using Python stdlib primitives #483

@eskultety

With Yarn package manager support, we introduced another dependency to the project to speed up copies of large artifacts: reflink (commit 2937416). However, that library was created as a hobby project to fill the gap at a time when Python had no native support for copy-on-write (COW) copies. The project appears to have been abandoned since, with zero activity apart from a note that Python now implements the functionality natively.

That said, while it is true that Python has since added a way to achieve the same thing through os.copy_file_range, a mapping of the copy_file_range syscall in the os module, proper high-level primitives haven't been introduced to shutil yet. Instead of copy_file_range, the reflink library used an alternative low-level C implementation based on ioctl with the FICLONE request, because back then the copy_file_range syscall wasn't considered stable or production ready. That has changed in the meantime, so we should be able to come up with a fairly straightforward implementation based on copy_file_range that covers what we need until high-level support lands in the shutil module, and drop the dependency on an abandoned project.

Implementation-wise the above could be simplified to the following pseudocode snippet:

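# core/utils.py
import errno
import os

def reflink_copy(src, dst):
    # src and dst are paths; os.copy_file_range itself works on file descriptors
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        remaining = os.fstat(fsrc.fileno()).st_size
        try:
            while remaining > 0:  # a single call may copy fewer bytes than requested
                copied = os.copy_file_range(fsrc.fileno(), fdst.fileno(), remaining)
                if copied == 0:
                    break
                remaining -= copied
        except OSError as e:
            if e.errno in (errno.EXDEV, errno.ENOSYS):
                raise Cachi2Error("reflinks not supported") from e
            raise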
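
To make the idea easier to try outside the project, here is a self-contained sketch of the same approach. It is only an illustration under a few assumptions: `ReflinkNotSupported` is a hypothetical stand-in for `Cachi2Error`, `copy_with_file_range` and `copy_file` are made-up names, and the fallback to `shutil.copyfile` is one possible policy rather than something this issue prescribes. Note also that `copy_file_range` lets the kernel use a reflink where the filesystem supports it, but it may succeed with an ordinary in-kernel copy instead, so a successful call does not prove that a reflink was created.

```python
import errno
import os
import shutil


class ReflinkNotSupported(Exception):
    """Hypothetical stand-in for the project's Cachi2Error."""


def copy_with_file_range(src, dst):
    """Copy the file at path src to path dst using os.copy_file_range."""
    # os.copy_file_range only exists on Linux with Python 3.8+.
    copy_range = getattr(os, "copy_file_range", None)
    if copy_range is None:
        raise ReflinkNotSupported("os.copy_file_range is not available")

    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        remaining = os.fstat(fsrc.fileno()).st_size
        while remaining > 0:
            try:
                copied = copy_range(fsrc.fileno(), fdst.fileno(), remaining)
            except OSError as e:
                # EXDEV and ENOSYS come from the issue's pseudocode;
                # EOPNOTSUPP is an extra assumption for filesystems that
                # refuse the operation.
                if e.errno in (errno.EXDEV, errno.ENOSYS, errno.EOPNOTSUPP):
                    raise ReflinkNotSupported(str(e)) from e
                raise
            if copied == 0:
                # Source ended earlier than its reported size.
                break
            remaining -= copied


def copy_file(src, dst):
    """Try copy_file_range first, then fall back to a plain copy.

    Returns the name of the method that produced dst.
    """
    try:
        copy_with_file_range(src, dst)
        return "copy_file_range"
    except ReflinkNotSupported:
        shutil.copyfile(src, dst)
        return "shutil.copyfile"
```

A caller would use `copy_file("big-artifact.tgz", "cache/big-artifact.tgz")` (paths are placeholders) and could log the returned method name to see which code path was taken on a given system.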
