Skip to content

Optimization of argmin/argmax/argsort #122863

@dg-pb

Description

@dg-pb

Feature or enhancement

Proposal:

I have been thinking of new ways to signal C so that the above can be avoided by special casing certain conditions to default to more efficient subroutines.

Various techniques have already been used for this, such as special casing itertools.count(start=int, step=1), so that incrementing is done with Py_ssize_t. However, new integer creation still occurs, even if it is passed to another C function.

A quick example to see the performance difference:

a = range(50_000)
b = list(a)
%timeit sum(a)    # 1.38 ms
%timeit sum(b)    # 0.38 ms

So this is the potential performance benefit when new Python integers do not need to be created.

By looking at https://discuss.python.org/t/add-index-bool-keyword-to-min-max-sorted/60257 I got an idea of potential signals that could be useful for certain cases.

  1. enumerate object. It is very simple object and is a subset of zip(count(), obj). Its input clearly signals that input iterable is an iterator that returns pairs, where first element is incremented by 1.
  2. operator.itemgetter & operator.attrgetter can be checked to determine what key/index is to be applied.

I think combination of these 2 could help optimise some of more common routines.

So I start with argmax/argmin/argsort. Although, these are not as frequently used as min/max/sort, but are standard routines nevertheless.

Pseudocode:

def min(a, key=None):
    if isinstance(a, enumerate) and isinstance(a, itemgetter) and len(a._items) == 1 and a._items[0] in {0, 1}:
        return min_enumerate(a, a._items[0])
    else:
        # do the usual

To make it happen in C, the following steps would be needed:

  1. enumerate
    a) Add PyEnumerate_Check
    b) Provide internal access to enumobject counter. To retrieve and re-set after.
  2. itemgetter
    a) Add PyItemgetter_Check
    b) Provide internal access to itemgetter._items
  3. Implement special cases in
    a) bltinmodule.c:min_max
    b) listobject.c:list_sort_impl

Performance before and after

a = list(range(100_000))
# Before
%timeit min(enumerate(a), key=opr.itemgetter(0))    # 7.5 ms
%timeit max(enumerate(a), key=opr.itemgetter(0))    # 6.1 ms

%timeit min(enumerate(a), key=opr.itemgetter(1))    # 7.8 ms
%timeit max(enumerate(a), key=opr.itemgetter(1))    # 6.1 ms

# After
%timeit min(enumerate(a), key=opr.itemgetter(0))    # 0.28 ms
%timeit max(enumerate(a), key=opr.itemgetter(0))    # 0.28 ms

%timeit min(enumerate(a), key=opr.itemgetter(1))    # 1.2 ms
%timeit max(enumerate(a), key=opr.itemgetter(1))    # 1.2 ms

# Current most efficient method
%timeit a.index(min(a))    # 1.4 (finds in first index)
%timeit a.index(max(a))    # 2.6 (finds in last index)

See discourse for discussions and use case analysis

Has this already been discussed elsewhere?

I have already discussed this feature proposal on Discourse

Links to previous discussion of this feature:

https://discuss.python.org/t/add-index-bool-keyword-to-min-max-sorted/60257/36

Metadata

Metadata

Assignees

No one assigned

    Labels

    interpreter-core(Objects, Python, Grammar, and Parser dirs)performancePerformance or resource usagetype-featureA feature request or enhancement

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions