Refactor of metrics.roc_curve method #350

Merged on Sep 14, 2011 (6 commits)

@ohe (Contributor) commented Sep 13, 2011

Although the new implementation looks very naive, it reduces the time
needed to compute the fpr/tpr vectors.
The computation time of roc_curve now depends only on the length of
y_score, not on its number of unique values.
For comparison, here are timings of the old and the new implementation
on the following vectors:

  • 10^6 length vector (y_score has 1000 unique values):
    • old impl.: 28.29 seconds
    • new impl.: 3.14 seconds
  • 10^6 length vector (y_score has 10000 unique values):
    • old impl.: 267.61 seconds
    • new impl.: 3.64 seconds
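
To illustrate why the number of unique scores can stop mattering, here is a minimal sketch contrasting two ways of building the fpr/tpr vectors. Neither function is the code from this pull request: `roc_points_per_threshold` and `roc_points_single_pass` are hypothetical names, and the sketch assumes binary labels where 1 marks the positive class and both classes are present. The first function rescans the data once per unique threshold, so its cost grows with the number of unique values; the second sorts once and uses cumulative sums, so its cost depends only on the length of y_score.

```python
import numpy as np


def roc_points_per_threshold(y_true, y_score):
    """Hypothetical baseline: rescan the data once per unique threshold."""
    y_true = np.asarray(y_true).ravel()
    y_score = np.asarray(y_score).ravel()
    thresholds = np.unique(y_score)[::-1]  # unique scores, descending
    is_pos = y_true == 1  # assumption: label 1 is the positive class
    n_pos = np.sum(is_pos)
    n_neg = y_true.size - n_pos
    tpr = np.empty(thresholds.size)  # True positive rate
    fpr = np.empty(thresholds.size)  # False positive rate
    for i, t in enumerate(thresholds):
        predicted_pos = y_score >= t
        tpr[i] = np.sum(predicted_pos & is_pos) / n_pos
        fpr[i] = np.sum(predicted_pos & ~is_pos) / n_neg
    return fpr, tpr, thresholds


def roc_points_single_pass(y_true, y_score):
    """Hypothetical alternative: one sort plus cumulative sums."""
    y_true = np.asarray(y_true).ravel()
    y_score = np.asarray(y_score).ravel()
    order = np.argsort(y_score)[::-1]  # indices by descending score
    sorted_score = y_score[order]
    is_pos = y_true[order] == 1
    cum_tp = np.cumsum(is_pos)   # true positives among the top-k scores
    cum_fp = np.cumsum(~is_pos)  # false positives among the top-k scores
    # Keep only the last index of each run of equal scores, so tied
    # scores produce a single point on the curve.
    last = np.r_[np.flatnonzero(np.diff(sorted_score)), sorted_score.size - 1]
    tpr = cum_tp[last] / cum_tp[-1]
    fpr = cum_fp[last] / cum_fp[-1]
    return fpr, tpr, sorted_score[last]


# Usage: both sketches should agree on random data with many tied scores.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)
y_score = rng.integers(0, 100, size=10_000) / 100.0
slow = roc_points_per_threshold(y_true, y_score)
fast = roc_points_single_pass(y_true, y_score)
print(all(np.allclose(a, b) for a, b in zip(slow, fast)))
```

The per-threshold version does work proportional to the vector length times the number of unique scores, while the single-pass version is dominated by one sort. That is consistent with the timings reported above, although the actual old and new implementations may differ from these sketches.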

tpr = np.empty(thresholds.size) # True positive rate
fpr = np.empty(thresholds.size) # False positive rate

# Buid tpr/fpr vector
Member commented:

s/Buid/Build/ ?

Contributor (PR author) commented:

fixed in 0c25d43

@ogrisel (Member) commented Sep 13, 2011

Also, please merge the upstream master into your branch: you are still using the old package names, and this branch is not mergeable as it is.

@ogrisel (Member) commented Sep 13, 2011

ping @agramfort (this PR review is for you :)

@@ -121,16 +121,39 @@ def roc_curve(y_true, y_score):

y_score = y_score.ravel()
thresholds = np.sort(np.unique(y_score))[::-1]
Member commented:

This "thresholds" is not used later.

Contributor (PR author) commented:

Oops... a remnant of the old implementation :-/.
Fixed in 52cc103

agramfort added a commit that referenced this pull request Sep 14, 2011
Refactor and speed up of metrics.roc_curve method
@agramfort merged commit 643e559 into scikit-learn:master Sep 14, 2011