Skip to content

MNT: Refactor default violin KDE estimator #30387

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

timhoffm
Copy link
Member

@timhoffm timhoffm commented Aug 4, 2025

Move the default KDE estimator from a private definition in violinplot() into violin_stats(). This makes it easier to test and debug violin_stats() as we don't have to explicitly provide a KDE method.

It also becomes logically simpler, because violinplot() is now only violin_stats() + violin().

This will make it easier to resolve #30355.

Move the default KDE estimator from a private definition
in `violinplot()` into `violin_stats()`. This makes it easier
to test and debug violin_stats() as we don't have to
explicitly provide a KDE method.

It also becomes logically simpler, because `violinplot()`
is now only `violin_stats()` + `violin()`.
@jklymak
Copy link
Member

jklymak commented Aug 7, 2025

Meta comments:

Is scipy really our only way to get this functionality?

I do wonder, in 2025, about our reluctance to make scipy a dependency. It would make most of mlab obsolete and has various standardized stats routines that we could use without duplicated effort. I think the view in the past was that scipy was a heavy dependency, and when folks had to compile locally was a pretty slow and tool intensive task. Now that most folks are covered by wheels, is this really the case?

I appreciate adding more dependencies is a slippery slope, but I bet some large fraction of folks using Matplotlib are installing it beside scipy in their environments.

@timhoffm
Copy link
Member Author

timhoffm commented Aug 7, 2025

I would be reluctant to add scipy it as a hard dependency. An optional dependency may be ok, in which case violinplot() would error out if scipy is not there. For the conda-world we would also have the option to do it analogous to pyqt: There is matplotlib-base which only has the minimal dependencies, and is typically good for CI and other packages to depend on matplotlib. And there is matplotlib, which includes the "heavy" dependencies on pyqt and tornado to give better out-of-the box experience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: violinplot with nan values fails silently
3 participants