py/objint_*: Convert back to small int after binary op. #9531

jimmo · 2022-10-06T04:08:00Z

Inspired by #9516 (comment) -- this behaviour was a surprise to me.

Before this change, long/mpz ints propagated into all future calculations.

For example, a relatively common operation like x = a * b // c where a,b,c all small ints would always result in a long/mpz int, even if it didn't need to, and then this would impact all future calculations with x.

This adds +24 bytes on PYBV11 but avoids heap allocations and potential surprises (e.g. big-big is now a small 0, and can safely be accessed with MP_OBJ_SMALL_INT_VALUE).

There are several places in the int_big* tests that result in small integers (especially zero). I don't think this change makes those tests less valid -- it still correctly checks that they evaluate to the right thing. What we're missing is a way to check that you can print a small-valued mpz (as all these tests will now be going via small int printing).

The longlong implementation doesn't have tests, but I adapted the new int_big_to_small.py to run on a 32-bit Linux build.

This work was funded through GitHub Sponsors.

codecov-commenter · 2022-10-06T04:28:03Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.42%. Comparing base (0600e4f) to head (557d31e).

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #9531   +/-   ##
=======================================
  Coverage   98.42%   98.42%           
=======================================
  Files         161      161           
  Lines       21249    21251    +2     
=======================================
+ Hits        20915    20917    +2     
  Misses        334      334

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jimmo · 2022-10-06T04:45:28Z

The coverage results are pretty much as expected:

mpz_not_inpl (no longer tests ~0)
mpz_as_bytes (fill rest of buffer with sign-ext or zero)
mp_obj_int_get_checked no longer ever hits the "is actually a small int" path
mp_obj_get_int_maybe (always small int)
mp_obj_id (always small int)

I guess this can all be addressed through coverage.c. @dpgeorge if you want to go ahead with this PR, then I can do that.

dpgeorge · 2022-10-11T11:13:31Z

Well... maybe we can delete all that code that's no longer "reachable"...? If we can't write tests for it in Python, it's not strictly needed.

dpgeorge · 2022-10-11T11:14:21Z

Also would be worth running the performance benchmarks, to see if they change.

jimmo · 2022-10-11T12:54:47Z

Well... maybe we can delete all that code that's no longer "reachable"...? If we can't write tests for it in Python, it's not strictly needed.

You make a very good point. I'm not sure if I'm brave enough to remove them though and it is a very small number of lines.

I realised that pow3 can be used to make "small big ints" still, but we should probably also add the same logic there too.

dpgeorge · 2023-05-24T01:00:08Z

py/objint_mpz.c

@@ -308,6 +308,13 @@ mp_obj_t mp_obj_int_binary_op(mp_binary_op_t op, mp_obj_t lhs_in, mp_obj_t rhs_i
            }
        }

+        mp_int_t res_small;
+        if (mpz_as_int_checked(&res->mpz, &res_small)) {


My main concern with this PR is that this may impact performance, because now all operations try to convert to a small int.

We should run the benchmark suite to check.

Performance testing has been done, see below.

dpgeorge · 2024-07-01T02:23:29Z

I ran benchmark tests on this PR, on PYBV10:

diff of scores (higher is better)
N=100 M=100                  baseline ->    this-PR         diff      diff% (error%)
bm_chaos.py                    354.08 ->     355.34 :      +1.26 =  +0.356% (+/-0.03%)
bm_fannkuch.py                  73.86 ->      73.45 :      -0.41 =  -0.555% (+/-0.00%)
bm_fft.py                     2308.99 ->    2299.43 :      -9.56 =  -0.414% (+/-0.00%)
bm_float.py                   5735.70 ->    5753.49 :     +17.79 =  +0.310% (+/-0.01%)
bm_hexiom.py                    46.49 ->      46.53 :      +0.04 =  +0.086% (+/-0.00%)
bm_nqueens.py                 4222.19 ->    4202.58 :     -19.61 =  -0.464% (+/-0.00%)
bm_pidigits.py                 623.05 ->     676.10 :     +53.05 =  +8.515% (+/-0.21%)
bm_wordcount.py                 46.95 ->      46.47 :      -0.48 =  -1.022% (+/-0.00%)
core_import_mpy_multi.py       599.15 ->     599.81 :      +0.66 =  +0.110% (+/-0.01%)
core_import_mpy_single.py       97.62 ->      97.71 :      +0.09 =  +0.092% (+/-0.01%)
core_locals.py                  41.47 ->      41.13 :      -0.34 =  -0.820% (+/-0.00%)
core_qstr.py                   201.69 ->     202.16 :      +0.47 =  +0.233% (+/-0.00%)
core_str.py                     27.28 ->      27.34 :      +0.06 =  +0.220% (+/-0.00%)
core_yield_from.py             359.09 ->     358.79 :      -0.30 =  -0.084% (+/-0.00%)
misc_aes.py                    399.52 ->     404.30 :      +4.78 =  +1.196% (+/-0.00%)
misc_mandel.py                2973.05 ->    2991.61 :     +18.56 =  +0.624% (+/-0.01%)
misc_pystone.py               2348.74 ->    2371.03 :     +22.29 =  +0.949% (+/-0.00%)
misc_raytrace.py               369.76 ->     371.55 :      +1.79 =  +0.484% (+/-0.01%)
viper_call0.py                 503.99 ->     503.97 :      -0.02 =  -0.004% (+/-0.01%)
viper_call1a.py                527.68 ->     527.68 :      +0.00 =  +0.000% (+/-0.00%)
viper_call1b.py                380.63 ->     380.63 :      +0.00 =  +0.000% (+/-0.00%)
viper_call1c.py                384.99 ->     384.98 :      -0.01 =  -0.003% (+/-0.01%)
viper_call2a.py                514.71 ->     514.73 :      +0.02 =  +0.004% (+/-0.01%)
viper_call2b.py                335.73 ->     335.73 :      +0.00 =  +0.000% (+/-0.00%)

The main test is bm_pidigits.py which makes heavy use of big-integer arithmetic. And that test runs fasater! I guess because there's now less memory allocations, and smaller integers are used where possible which are faster.

github-actions · 2024-07-01T02:40:42Z

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:   +48 +0.006% standard
      stm32:   +32 +0.008% PYBV10
     mimxrt:   +24 +0.007% TEENSY40
        rp2:   +32 +0.004% RPI_PICO_W
       samd:   +24 +0.009% ADAFRUIT_ITSYBITSY_M4_EXPRESS

dpgeorge · 2024-07-01T03:39:39Z

The coverage results are pretty much as expected
...
I guess this can all be addressed through coverage.c

Visiting this PR 20 months later, there's a lot less uncovered code now. There are only two small blocks that are now uncovered due to this PR: in mpz_not_inpl() when the argument is zero, and mp_obj_int_get_checked() when the argument is an mpz which is a small-int. It's not worth removing these small bits of code, in case they are needed in the future. So they have been left as-is and unix coverage tests added for them.

Before this change, long/mpz ints propagated into all future calculations, even if their value could fit in a small-int object. With this change, the result of a big-int binary op will now be converted to a small-int object if the value fits in a small-int. For example, a relatively common operation like `x = a * b // c` where a,b,c all small ints would always result in a long/mpz int, even if it didn't need to, and then this would impact all future calculations with x. This adds +24 bytes on PYBV11 but avoids heap allocations and potential surprises (e.g. `big-big` is now a small `0`, and can safely be accessed with MP_OBJ_SMALL_INT_VALUE). Performance tests are unchanged on PYBV10, except for `bm_pidigits.py` which makes heavy use of big-ints and gains about 8% in speed. Unix coverage tests have been updated to cover mpz code that is now unreachable by normal Python code (removing the unreachable code would lead to some surprising gaps in the internal C functions and the functionality may be needed in the future, so it is kept because it has minimal overhead). This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <[email protected]>

Add four Pimoroni RP2350 boards

jimmo force-pushed the mpz-to-small-int branch from 77e1ada to 06c09d3 Compare October 6, 2022 04:10

dpgeorge added ports Relates to multiple ports, or a new/proposed port py-core Relates to py/ directory in source and removed ports Relates to multiple ports, or a new/proposed port labels Oct 11, 2022

jimmo mentioned this pull request Jan 17, 2023

docs/reference/speed_python: Performance of large integers. #10486

Open

dpgeorge reviewed May 24, 2023

View reviewed changes

jimmo mentioned this pull request Oct 4, 2023

Something funny going on rp2 with ints and ticks_diff #12586

Closed

dpgeorge force-pushed the mpz-to-small-int branch from 06c09d3 to 4e995ee Compare July 1, 2024 02:30

dpgeorge force-pushed the mpz-to-small-int branch from 4e995ee to 6c0d72c Compare July 1, 2024 03:36

dpgeorge force-pushed the mpz-to-small-int branch from 6c0d72c to 557d31e Compare July 1, 2024 03:55

dpgeorge merged commit 557d31e into micropython:master Jul 1, 2024
63 of 64 checks passed

tannewt pushed a commit to tannewt/circuitpython that referenced this pull request Aug 27, 2024

Merge pull request micropython#9531 from tannewt/pimoroni_rp2350

dfc3e99

Add four Pimoroni RP2350 boards

dpgeorge mentioned this pull request Jul 21, 2025

code paths that return large ints where small ints would suffice #17730

Open

jepler mentioned this pull request Jul 29, 2025

Fix more operations that can return small ints #17785

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

py/objint_*: Convert back to small int after binary op. #9531

py/objint_*: Convert back to small int after binary op. #9531

Uh oh!

jimmo commented Oct 6, 2022

Uh oh!

codecov-commenter commented Oct 6, 2022 •

edited by codecov bot

Loading

Uh oh!

jimmo commented Oct 6, 2022

Uh oh!

dpgeorge commented Oct 11, 2022

Uh oh!

dpgeorge commented Oct 11, 2022

Uh oh!

jimmo commented Oct 11, 2022

Uh oh!

dpgeorge May 24, 2023

Uh oh!

dpgeorge Jul 1, 2024

Uh oh!

dpgeorge commented Jul 1, 2024

Uh oh!

github-actions bot commented Jul 1, 2024

Uh oh!

dpgeorge commented Jul 1, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

py/objint_*: Convert back to small int after binary op. #9531

py/objint_*: Convert back to small int after binary op. #9531

Uh oh!

Conversation

jimmo commented Oct 6, 2022

Uh oh!

codecov-commenter commented Oct 6, 2022 • edited by codecov bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jimmo commented Oct 6, 2022

Uh oh!

dpgeorge commented Oct 11, 2022

Uh oh!

dpgeorge commented Oct 11, 2022

Uh oh!

jimmo commented Oct 11, 2022

Uh oh!

dpgeorge May 24, 2023

Choose a reason for hiding this comment

Uh oh!

dpgeorge Jul 1, 2024

Choose a reason for hiding this comment

Uh oh!

dpgeorge commented Jul 1, 2024

Uh oh!

github-actions bot commented Jul 1, 2024

Uh oh!

dpgeorge commented Jul 1, 2024

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Oct 6, 2022 •

edited by codecov bot

Loading