Description
Describe the bug
The test_unsorted_indices function occasionally fails on CI when comparing the coefficients of SVC(kernel="linear", probability=True, random_state=0) trained on dense vs sparse data.
I suspect this is due to additional randomness introduced by the internal cross-validation and Platt scaling when probability=True is set. See the SVC documentation for reference.
Steps/Code to Reproduce
Unfortunately, I haven't been able to reproduce the failure reliably. I've only seen it fail three times when creating or reviewing PRs, but the error disappears after re-running CI.
I've also tried looping through various random_state values without triggering a failure locally.
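For anyone who wants to retry this locally, the kind of loop described above could look roughly like the sketch below. It is not the exact script that was run: the helper name max_coef_gap and the seed range are hypothetical, and scipy.sparse.csr_matrix stands in for the csr_container fixture used by the test.

```python
import numpy as np
from scipy import sparse
from sklearn import svm
from sklearn.datasets import load_digits


def max_coef_gap(random_state):
    """Largest absolute dense-vs-sparse coefficient difference for one seed.

    Hypothetical helper mirroring the setup of test_unsorted_indices.
    """
    X, y = load_digits(return_X_y=True)
    X, y = X[:50], y[:50]
    # Assumption: csr_matrix stands in for the csr_container fixture.
    X_sparse = sparse.csr_matrix(X)

    dense_svc = svm.SVC(
        kernel="linear", probability=True, random_state=random_state
    ).fit(X, y)
    sparse_svc = svm.SVC(
        kernel="linear", probability=True, random_state=random_state
    ).fit(X_sparse, y)

    # coef_ is a sparse matrix when the model was fitted on sparse input.
    diff = dense_svc.coef_ - sparse_svc.coef_.toarray()
    return float(np.max(np.abs(diff)))


# Seed range chosen arbitrarily for illustration.
gaps = {seed: max_coef_gap(seed) for seed in range(5)}
for seed, gap in gaps.items():
    print(f"random_state={seed}: max |dense - sparse| = {gap:.3e}")
```

Printing the gap instead of asserting makes it easier to spot seeds where the difference is nonzero but still below the test tolerance.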
For now, I'm labelling this with "Hard" and "Needs Reproducible Code."
Expected Results
def test_unsorted_indices(csr_container):
    # test that the result with sorted and unsorted indices in csr is the same
    # we use a subset of digits as iris, blobs or make_classification didn't
    # show the problem
    X, y = load_digits(return_X_y=True)
    X_test = csr_container(X[50:100])
    X, y = X[:50], y[:50]
    tols = dict(rtol=1e-12, atol=1e-14)

    X_sparse = csr_container(X)
    coef_dense = (
        svm.SVC(kernel="linear", probability=True, random_state=0).fit(X, y).coef_
    )
    sparse_svc = svm.SVC(kernel="linear", probability=True, random_state=0).fit(
        X_sparse, y
    )
    coef_sorted = sparse_svc.coef_
    # make sure dense and sparse SVM give the same result
    assert_allclose(coef_dense, coef_sorted.toarray(), **tols)
should consistently pass.
Actual Results
In rare cases, the assertion fails:
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
Mismatched elements: 2 / 2880 (0.0694%)
Max absolute difference among violations: 3.46944695e-18
Max relative difference among violations: inf
ACTUAL: array([[ 0. , 0. , 0.001342, ..., -0.002487, 0. ,
0. ],
[ 0. , 0. , -0.000967, ..., -0.010943, -0.012946,...
DESIRED: array([[ 0. , 0. , 0.001342, ..., -0.002487, 0. ,
0. ],
[ 0. , 0. , -0.000967, ..., -0.010943, -0.012946,...
However, re-running CI causes the test to pass. This suggests the failure is non-deterministic, likely coming from subtle differences in how sparse and dense inputs interact with the internal cross-validation and Platt scaling.
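As a side note on reading the error message: a relative difference of inf together with an absolute difference around 3.5e-18 is what assert_allclose reports when an entry is exactly zero on the desired side, tiny but nonzero on the actual side, and atol=0. The traceback shows rtol=1e-07, atol=0, whereas the quoted test defines tols with atol=1e-14. The sketch below uses made-up two-element arrays (only the 3.46944695e-18 value is taken from the message) to show how the two tolerance settings behave; it says nothing about where the difference originates.

```python
import numpy as np
from numpy.testing import assert_allclose

# Illustrative values only: one entry is exactly 0 in `desired`
# and differs by the reported max absolute difference in `actual`.
desired = np.array([0.0, 0.001342])
actual = np.array([3.46944695e-18, 0.001342])


def passes(**tols):
    """Return True if assert_allclose accepts the arrays with these tolerances."""
    try:
        assert_allclose(actual, desired, **tols)
        return True
    except AssertionError:
        return False


# Tolerances shown in the traceback: any nonzero gap against 0 fails.
strict = passes(rtol=1e-7, atol=0)
# Tolerances defined as `tols` in the quoted test: the gap is absorbed by atol.
loose = passes(rtol=1e-12, atol=1e-14)

print("rtol=1e-7, atol=0      ->", "pass" if strict else "fail")
print("rtol=1e-12, atol=1e-14 ->", "pass" if loose else "fail")
```

assert_allclose checks |actual - desired| <= atol + rtol * |desired|, so with atol=0 the bound is zero wherever desired is zero.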
Versions
System:
    python: 3.12.6 (tags/v3.12.6:a4a2d2b, Sep 6 2024, 20:11:23) [MSC v.1940 64 bit (AMD64)]
    executable: D:\PycharmProjects\scikit-learn\venv1\Scripts\python.exe
    machine: Windows-11-10.0.22621-SP0

Python dependencies:
    sklearn: 1.8.dev0
    pip: 25.0.1
    setuptools: 70.2.0
    numpy: 2.2.3
    scipy: 1.15.2
    Cython: 3.0.12
    pandas: 2.2.3
    matplotlib: 3.10.0
    joblib: 1.4.2
    threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
    user_api: blas
    internal_api: openblas
    num_threads: 20
    prefix: libscipy_openblas
    filepath: D:\PycharmProjects\scikit-learn\venv1\Lib\site-packages\numpy.libs\libscipy_openblas64_-43e11ff0749b8cbe0a615c9cf6737e0e.dll
    version: 0.3.28
    threading_layer: pthreads
    architecture: Haswell

    user_api: blas
    internal_api: openblas
    num_threads: 20
    prefix: libscipy_openblas
    filepath: D:\PycharmProjects\scikit-learn\venv1\Lib\site-packages\scipy.libs\libscipy_openblas-f07f5a5d207a3a47104dca54d6d0c86a.dll
    version: 0.3.28
    threading_layer: pthreads
    architecture: Haswell

    user_api: openmp
    internal_api: openmp
    num_threads: 20
    prefix: vcomp
    filepath: C:\Windows\System32\vcomp140.dll
    version: None