
Lesson 154: Precision


Recap: Accuracy

In the previous lesson, we discussed Accuracy, the proportion of correctly predicted instances out of the entire dataset. Accuracy serves as a simple, general indicator of a machine learning model’s performance, but relying on it alone can be misleading, particularly when the dataset is imbalanced. This is why it is important to evaluate models with other metrics such as Precision (the topic of this lesson) and Recall, which will be covered in the next lesson.
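
To make this concrete, here is a small, purely hypothetical sketch in Python. The counts are invented, but they illustrate how a model can look good on accuracy while its positive predictions remain unreliable on an imbalanced dataset (precision is defined in the next section).

```python
# Hypothetical confusion-matrix counts on an imbalanced dataset
# (100 samples, only 10 of which are actually positive).
tp, fp, fn, tn = 8, 12, 2, 78

accuracy = (tp + tn) / (tp + fp + fn + tn)   # 0.86 -- looks reasonable
precision = tp / (tp + fp)                   # 0.40 -- positive predictions are unreliable

print(f"accuracy={accuracy:.2f}, precision={precision:.2f}")
```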


What is Precision?

Precision measures the proportion of instances that are correctly classified as positive out of all instances predicted as positive by the model. Precision focuses on the correctness of positive predictions and indicates how trustworthy those predictions are.

The formula for calculating precision is:

\[
\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}}
\]
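
As a minimal sketch of this formula in plain Python (the `precision` helper and the counts passed to it are hypothetical, for illustration only):

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); returns 0.0 when there are no positive predictions."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

print(precision(tp=7, fp=3))  # 0.7
```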

Example: Understanding Precision

An analogy for understanding precision is to think of it as the “success rate of finding an item.” Imagine you are helping a friend find their lost keys. You collect 10 keys, but only 7 of them actually belong to your friend. The precision is 7/10 = 0.7, or 70%: the percentage of the keys you picked up that were genuinely your friend’s. Picking up many keys that are not your friend’s lowers the precision, just as false positives lower a model’s precision.


Example Calculation of Precision

To understand precision better, let’s consider a specific example.

Example: Spam Email Filter

Suppose you build a model to filter spam emails. The filter examines 100 emails and classifies 20 of them as spam. Of those 20, 15 are actual spam emails, while 5 are not (false positives).

In this case:

  • True Positives (TP): Emails correctly identified as spam = 15
  • False Positives (FP): Non-spam emails incorrectly identified as spam = 5

\[
\text{Precision} = \frac{15}{15 + 5} = 0.75
\]

The precision of this spam filter is 75%, meaning 75% of the emails it flagged as spam were indeed spam. In other words, precision reflects how reliable the model’s positive predictions are.
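
The same numbers can be reproduced with scikit-learn’s precision_score. This is a sketch that assumes scikit-learn and NumPy are installed; the label arrays are invented to match the counts above, and the 80 unflagged emails are treated as non-spam for simplicity (their actual labels do not affect precision).

```python
import numpy as np
from sklearn.metrics import precision_score

# 1 = spam, 0 = not spam
y_pred = np.array([1] * 20 + [0] * 80)            # 20 emails flagged as spam
y_true = np.array([1] * 15 + [0] * 5 + [0] * 80)  # only 15 of the flagged emails are real spam

print(precision_score(y_true, y_pred))  # 0.75
```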


When is High Precision Important?

High precision is critical in scenarios where the impact of false positives is significant. For example, in the medical field, an incorrect diagnosis indicating a disease when it is not present (false positive) could cause unnecessary anxiety and lead to further unneeded tests. Therefore, precision becomes an important metric in diagnostic models.

In cases where precision is high, the model’s positive predictions are very reliable, making it a valuable tool for specific applications. For instance, when false positives carry high risks or costs, ensuring a high precision rate helps maintain the credibility and reliability of the model’s outputs.


Precision and Its Relationship to Other Metrics

Precision is closely related to Recall and the F1 Score, each of which offers a different perspective on evaluating model performance. Balancing these metrics is essential:

  • Precision: The proportion of true positives among all instances predicted as positive.
  • Recall: The proportion of actual positives that the model correctly identified. (This will be explained in the next lesson.)
  • F1 Score: The harmonic mean of precision and recall, which balances both metrics.

A model with high precision means its predictions are highly reliable, but if recall is low, it may be missing many actual positives. Therefore, it is crucial to strike a balance based on the model’s purpose.
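
For a concrete comparison, here is a short sketch that computes all three metrics on the same made-up labels using scikit-learn’s standard metric functions:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]  # actual labels (hypothetical)
y_pred = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]  # model predictions (hypothetical)

print("Precision:", precision_score(y_true, y_pred))  # 3 / (3 + 1) = 0.75
print("Recall:   ", recall_score(y_true, y_pred))     # 3 / (3 + 2) = 0.60
print("F1 score: ", f1_score(y_true, y_pred))         # ~0.67
```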

Example: Balancing Precision and Recall

An analogy for the balance between precision and recall is the police catching thieves. If the police try to catch every possible thief (prioritizing recall), they will also arrest innocent people (false positives), which lowers precision. Conversely, if they arrest someone only when they are absolutely certain (prioritizing precision), they will let some actual thieves slip away (false negatives), which lowers recall. Hence, the two must be balanced according to the purpose of the model.


Summary

In this lesson, we covered Precision, a key metric in evaluating the reliability of a model’s positive predictions. Precision plays a critical role in situations where false positives have serious consequences. While precision is essential, it is also important to consider other metrics like recall and the F1 score to achieve a comprehensive evaluation of model performance.


Next Topic: Recall

In the next lesson, we will discuss Recall, which indicates how well the model identifies the actual positives within the dataset. Stay tuned!


Notes

  1. True Positive (TP): The number of correctly predicted positive instances.
  2. False Positive (FP): The number of instances incorrectly predicted as positive.
  3. Precision: The ratio of true positives to all instances predicted as positive.
  4. Recall: The ratio of true positives to all actual positives in the dataset.
  5. F1 Score: A metric that balances precision and recall.

Author of this article

PROMPT Inc. provides a variety of information related to generative AI.
If there is a topic you would like us to write an article about or research, please contact us using the inquiry form.
