Threads / Max Open Files & Cleanup Best Practice #336

@SoundsSerious

After successfully using diskcache for years, some of my projects are running into the max-open-files limit; this has been documented here before in #133. To give others a sense of scale, I have maybe 200 different disk caches in my project for storing results.

Using twisted (an async framework), I found that the high number of threads needed to access diskcache in a non-blocking way exacerbated this by creating an open file per thread.

So the maximum number of open files = num_procs x num_threads_per_proc x num_disk_caches, assuming each thread in each process opens its own handle to each diskcache.
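As a rough illustration of that bound, here is a minimal sketch. The 200 caches come from the description above; the process and thread counts are made-up values, and `max_open_files` is a hypothetical helper name, not part of diskcache.

```python
def max_open_files(num_procs, num_threads_per_proc, num_disk_caches):
    """Worst case: every thread in every process opens every cache."""
    return num_procs * num_threads_per_proc * num_disk_caches


# 200 caches is the figure from this issue; 4 processes and 16 threads
# per process are assumed values for illustration only.
worst_case = max_open_files(num_procs=4, num_threads_per_proc=16, num_disk_caches=200)
print(f"worst-case open files: {worst_case}")

try:
    import resource  # Unix only

    # RLIMIT_NOFILE is a per-process limit: (soft, hard).
    soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"per-process soft limit: {soft_limit}, hard limit: {hard_limit}")
except ImportError:
    print("resource module is not available on this platform")
```

Even modest thread counts multiply quickly against a few hundred caches, which is why the per-thread handles matter here.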

Ideally diskcache would reuse its connections per thread (it may already; I'm not sure) and then close the connection once the thread is no longer alive. Perhaps this could be accomplished by storing weak references to the thread objects that have called diskcache: when a thread is garbage collected and drops out of the weakref tracking, that would trigger its per-thread cache instance to close and clean up after itself. This might be challenging to do on the fly via only __getitem__ calls.
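To make the weakref idea concrete, here is a minimal sketch using only the standard library. `PerThreadHandles` is a hypothetical helper, not diskcache API, and it assumes the handle's `close()` may be called from a different thread than the one that opened it, which I have not verified for diskcache's connections.

```python
import threading
import weakref


class PerThreadHandles:
    """Hypothetical helper: one handle per thread, closed when the
    Thread object is garbage collected."""

    def __init__(self, factory):
        self._factory = factory        # callable returning an object with .close()
        self._handles = {}             # id(thread object) -> handle
        self._lock = threading.RLock() # re-entrant: finalizers may run while held

    def get(self):
        thread = threading.current_thread()
        key = id(thread)
        with self._lock:
            handle = self._handles.get(key)
            if handle is None:
                handle = self._factory()
                self._handles[key] = handle
                # Runs self._release(key) once the Thread object is collected.
                # The callback must not reference the thread itself.
                weakref.finalize(thread, self._release, key)
        return handle

    def _release(self, key):
        with self._lock:
            handle = self._handles.pop(key, None)
        if handle is not None:
            handle.close()
```

Cleanup here is tied to garbage collection of the Thread object, not to the moment the thread stops running, so a thread pool that keeps its workers alive would never trigger it.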

I'm not sure what scope of use cases diskcache is aiming for here; however, the best practice for me was to create access methods that close the connection after their thread-pool operation is done, and additionally to close files at the end of the process to make sure nothing affects a restart.

The first suggestion is simple enough to implement in the application:

db.set(....)
db.close()
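One way to make sure the close always happens, even if the write raises, is to wrap the pair in a small access helper. This is only a sketch: `store_result` and `open_cache` are hypothetical names, and the helper assumes nothing beyond an object with `set()` and `close()` methods.

```python
from contextlib import closing


def store_result(open_cache, key, value):
    """Open a cache, write one value, and close it again no matter what."""
    # closing() calls cache.close() on exit, including when set() raises.
    with closing(open_cache()) as cache:
        cache.set(key, value)


# Usage with diskcache (the directory is a made-up example path):
#   import diskcache
#   store_result(lambda: diskcache.Cache("/tmp/results"), "run-1", 42)
```

Running the helper inside the thread-pool task keeps the open and the close in the same thread.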

Secondly, I found that this method worked well for closing all open files in the process and its subprocesses:

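    import os
    import psutil

    def close_all_open_files(self):
        """Close regular files that psutil reports open for this process and its children."""
        p = psutil.Process()
        children = p.children(recursive=True)
        self.info(f'closing all open files for {os.getpid()} and {len(children)} children')
        errors = set()
        for proc in [p] + children:
            for handler in proc.open_files():
                try:
                    self.info(f'closing {handler}')
                    # Caveat: os.close() acts on the calling process's descriptor
                    # table, so an fd number reported for a child is closed here,
                    # in this process, not inside the child.
                    os.close(handler.fd)
                except Exception as e:
                    # Collect distinct messages instead of aborting the sweep.
                    errors.add(str(e))

        if errors:
            self.warning(f"{len(errors)} x Errors closing file| {str(errors)[:1000]}")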
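A narrower alternative for the end-of-process cleanup, sketched under the assumption that the application keeps references to the caches it opens: register them once and close them at interpreter exit with `atexit`. The names `track` and `close_tracked` are hypothetical, and the tracked objects only need a `close()` method.

```python
import atexit

_open_caches = []  # hypothetical module-level registry of opened caches


def track(cache):
    """Remember a cache so it can be closed at exit; returns the cache."""
    _open_caches.append(cache)
    return cache


def close_tracked():
    """Close every tracked cache, collecting error messages instead of raising."""
    errors = []
    while _open_caches:
        cache = _open_caches.pop()
        try:
            cache.close()
        except Exception as exc:
            errors.append(str(exc))
    return errors


# Run the cleanup automatically on normal interpreter shutdown.
atexit.register(close_tracked)
```

Unlike closing raw descriptors, this only touches objects the application opened itself, and each subprocess would need to register its own caches.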
