What are Precision, Recall and F1 Score?

Precision, recall and F1 score are metrics used to measure the performance of a classification model.

For example, if we train a Logistic Regression model on an imbalanced dataset, we would want to calculate precision, recall and F1 score, because accuracy may not be a good measure of model performance in this case. A model that always predicts the majority class can score high accuracy while never identifying a single minority-class case.

Precision: Precision measures how often the model is correct when it predicts the positive class, for example when it predicts disease for a group of people. Precision may not be the best measure of model performance for disease prediction. However, consider a situation where we want to identify a bank's credit card customers whose credit limit should be decreased. In this problem we want False Positive cases (predicted that the credit limit should be decreased when it actually should not be) to be as few as possible, to keep customer satisfaction high, so precision is the metric to watch.

Precision formula = True Positive / (True Positive + False Positive)
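As a small illustration of the formula above, here is a minimal sketch that computes precision from raw counts. The function name, the counts, and the choice to return 0.0 when the model made no positive predictions are all assumptions made for this example, not part of any library.

```python
def precision(true_positive: int, false_positive: int) -> float:
    """Precision = TP / (TP + FP).

    Returns 0.0 when there are no positive predictions at all
    (an assumed convention for this sketch, to avoid dividing by zero).
    """
    predicted_positive = true_positive + false_positive
    if predicted_positive == 0:
        return 0.0
    return true_positive / predicted_positive


# Hypothetical counts: 40 customers correctly flagged for a lower credit limit,
# 10 customers flagged by mistake.
example_precision = precision(true_positive=40, false_positive=10)
print(example_precision)  # 0.8
```

In this made-up case, 8 out of every 10 customers the model flags really should have their limit decreased.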

Recall: Recall measures, out of all the people who actually have the disease (for example), how many the model correctly identifies. Note that in this case we want to predict the disease even when we are not very sure, to be on the safe side. This means we want to account for False Negatives (predicted no disease but actually has the disease) in our calculation. For disease prediction, recall (not precision) is a good measure of model performance. We want recall to be high, since high recall means few False Negatives, which is what we want.

Recall formula = True Positive / (True Positive + False Negative)
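The recall formula can be sketched the same way. The function name and the patient counts below are hypothetical, and returning 0.0 when there are no actual positive cases is an assumed convention for this example.

```python
def recall(true_positive: int, false_negative: int) -> float:
    """Recall = TP / (TP + FN).

    Returns 0.0 when there are no actual positive cases
    (an assumed convention for this sketch, to avoid dividing by zero).
    """
    actual_positive = true_positive + false_negative
    if actual_positive == 0:
        return 0.0
    return true_positive / actual_positive


# Hypothetical counts: 30 patients with the disease were detected,
# 10 patients with the disease were missed.
example_recall = recall(true_positive=30, false_negative=10)
print(example_recall)  # 0.75
```

Here the model finds 75% of the people who actually have the disease; the 10 missed patients are the False Negatives that pull recall down.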

To see how precision and recall can be calculated on a given dataset, please check this post.

F1 Score: The F1 score is the harmonic mean of precision and recall. It is calculated as

F1 Score formula = 2 * (Precision * Recall) / (Precision + Recall)

The F1 score should be used for model evaluation when precision and recall are equally important for the model.
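To see how the harmonic mean behaves, here is a minimal sketch of the F1 formula. The function name and the input values are made up for illustration, and returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1_score_from(precision: float, recall: float) -> float:
    """F1 = 2 * (precision * recall) / (precision + recall).

    Returns 0.0 when both inputs are zero (assumed convention for this sketch).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Balanced case: F1 lands between the two values.
balanced_f1 = f1_score_from(precision=0.8, recall=0.75)
print(round(balanced_f1, 3))

# Lopsided case: perfect precision cannot hide poor recall.
lopsided_f1 = f1_score_from(precision=1.0, recall=0.1)
print(round(lopsided_f1, 3))
```

With precision 1.0 and recall 0.1, the plain average would be 0.55, but the F1 score comes out at roughly 0.18. The harmonic mean stays close to the weaker of the two values, which is why a high F1 score needs both precision and recall to be high.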

To calculate precision, recall and F1 score in a Jupyter notebook, we first need to create the Logistic Regression model and predict the output with the usual code below.

y_pred = model.predict(x_test)

Then use the code below to print the values of precision, recall and F1 score.

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
Classification Report for Logistic Regression
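For readers who want to run the whole flow end to end, here is a minimal sketch. It assumes a synthetic imbalanced dataset generated with scikit-learn's make_classification (roughly 90% negative, 10% positive); the dataset, the split settings and max_iter are illustrative choices, not the data or settings used for the report above. The variable names model, x_test, y_test and y_pred match the snippets in this post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    classification_report,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (assumption for this sketch): about 10% positives.
x, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Stratify so the train and test sets keep the same class ratio.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(x_train, y_train)
y_pred = model.predict(x_test)

# Individual metrics for the positive class (label 1).
p = precision_score(y_test, y_pred, zero_division=0)
r = recall_score(y_test, y_pred, zero_division=0)
f1 = f1_score(y_test, y_pred, zero_division=0)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Full per-class report, as in the snippet above.
print(classification_report(y_test, y_pred, zero_division=0))
```

The individual score functions are handy when you need a single number, for example to compare models, while classification_report gives the per-class breakdown in one table.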

You may also want to read about the Confusion Matrix, which is a related concept.

Happy coding !!
