Description
With Yarn package manager support we introduced another dependency to the project, reflink (commit 2937416), to make copies of large artifacts faster. However, that library was created merely as a hobby attempt to solve the problem in Python, since Python had no native support for copy-on-write (COW) copies at the time. The project appears to have been abandoned since: it shows zero activity and carries a note that Python now implements the functionality natively.
That said, while it is true that Python has since gained a way to achieve the same thing via a new `os` syscall mapping, `os.copy_file_range`, proper high-level primitives haven't been introduced to `shutil` yet. Instead of the `copy_file_range` syscall, the reflink library used an alternative low-level C implementation relying on `ioctl` with the `FICLONE` operation, because back then the `copy_file_range` syscall wasn't considered stable or production ready. That has changed in the meantime, so we should be able to come up with a fairly straightforward implementation based on `copy_file_range` for what we need until high-level support lands in the `shutil` module, and drop the dependency on an abandoned project.
Implementation-wise the above could be simplified to the following pseudocode snippet:
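# core/utils.py
import errno
import os

def reflink_copy(src, dst):
    # os.copy_file_range works on file descriptors, so open both paths first
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        count_bytes = os.fstat(fsrc.fileno()).st_size
        try:
            # may copy fewer bytes than requested; a full version would loop
            os.copy_file_range(fsrc.fileno(), fdst.fileno(), count_bytes)
        except OSError as e:
            if e.errno in (errno.EXDEV, errno.ENOSYS):
                raise Cachi2Error("reflinks not supported") from e
            raise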
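For discussion, here is a slightly fuller, self-contained sketch of the same idea. It is not a proposed final implementation. It assumes Linux and Python 3.8+, where `os.copy_file_range` is available. `ReflinkNotSupportedError` is a hypothetical stand-in for `Cachi2Error` so the snippet runs on its own, and treating `EOPNOTSUPP` as "not supported" is an assumption added on top of the two errno values above. Also note that `copy_file_range` only gives the filesystem the opportunity to reflink; the kernel may still perform an ordinary in-kernel copy.

```python
import errno
import os


class ReflinkNotSupportedError(Exception):
    """Hypothetical stand-in for the project's Cachi2Error."""


# errno values treated as "this kernel or filesystem cannot do it".
# EXDEV and ENOSYS come from the pseudocode; EOPNOTSUPP is an added assumption.
_UNSUPPORTED = {errno.EXDEV, errno.ENOSYS, errno.EOPNOTSUPP}


def reflink_copy(src, dst):
    """Copy the file at path src to path dst via os.copy_file_range.

    Returns the number of bytes copied.
    """
    if not hasattr(os, "copy_file_range"):
        # The function only exists on Linux builds of Python 3.8+.
        raise ReflinkNotSupportedError("os.copy_file_range is not available")

    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        remaining = os.fstat(fsrc.fileno()).st_size
        total = 0
        try:
            # A single call may copy fewer bytes than requested, so loop
            # until the whole source file has been transferred.
            while remaining > 0:
                copied = os.copy_file_range(fsrc.fileno(), fdst.fileno(), remaining)
                if copied == 0:  # unexpected end of file
                    break
                remaining -= copied
                total += copied
        except OSError as e:
            if e.errno in _UNSUPPORTED:
                raise ReflinkNotSupportedError("reflinks not supported") from e
            raise  # anything else is a genuine I/O error
    return total
```

A caller would likely catch the "not supported" error and fall back to a regular copy such as `shutil.copyfile`. The sketch leaves a partially written `dst` behind on failure, so a real implementation would probably want to clean that up.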
References:
- https://bugs.python.org/issue37157
- https://bugs.python.org/issue37159
- shutil: add reflink=False to file copy functions to control clone/CoW copies (use copy_file_range) python/cpython#81338 (more importantly this comment)
- gh-81340: Use `copy_file_range` in `shutil.copyfile` copy functions python/cpython#93152
- https://man7.org/linux/man-pages/man2/copy_file_range.2.html
- https://lwn.net/Articles/659523/