5.1 A data mining routine has been applied to a transaction dataset and has classified 88 records as fraudulent (30 correctly so) and 952 as non-fraudulent (920 correctly so). Construct the confusion matrix and calculate the overall error rate.

Answer:

The confusion matrix and the overall error rate are worked out below.

Explanation:

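Note that the overall error rate does not depend on which class is treated as the class of interest. Prioritizing non-fraudulent instead of fraudulent only relabels the cells of the confusion matrix; the number of misclassified records, and so the error rate, stays the same.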

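There are 88 + 952 = 1040 records in total. Of the 88 records classified as fraudulent, 30 are actually fraudulent and 88 - 30 = 58 are not. Of the 952 records classified as non-fraudulent, 920 are actually non-fraudulent and 952 - 920 = 32 are actually fraudulent. The confusion matrix is:

                     Predicted fraud    Predicted non-fraud
Actual fraud               30                   32
Actual non-fraud           58                  920

The overall error rate is the share of misclassified records: ((88 - 30) + (952 - 920)) / 1040 = (58 + 32) / 1040 = 90/1040 = 8.65%.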
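The same arithmetic can be checked with a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the four input counts are the ones given in the problem statement.

```python
# Counts given in the problem statement.
predicted_fraud = 88       # records classified as fraudulent
correct_fraud = 30         # ...of which actually fraudulent
predicted_nonfraud = 952   # records classified as non-fraudulent
correct_nonfraud = 920     # ...of which actually non-fraudulent

# Cells of the confusion matrix (fraud is treated as the positive class).
true_pos = correct_fraud
false_pos = predicted_fraud - correct_fraud         # non-fraud flagged as fraud
false_neg = predicted_nonfraud - correct_nonfraud   # fraud that was missed
true_neg = correct_nonfraud

total = predicted_fraud + predicted_nonfraud
error_rate = (false_pos + false_neg) / total

print(f"{'':18}{'Pred fraud':>12}{'Pred non-fraud':>16}")
print(f"{'Actual fraud':18}{true_pos:>12}{false_neg:>16}")
print(f"{'Actual non-fraud':18}{false_pos:>12}{true_neg:>16}")
print(f"Overall error rate: {error_rate:.2%}")  # 8.65%
```

Swapping which class is called "positive" only renames the cells; `false_pos + false_neg` is unchanged, so the error rate is the same either way.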

The analyst suggests that the accuracy could be improved by using a different model: simply classify every record as non-fraudulent.

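The number of records that are actually fraudulent is 30 + (952 - 920) = 62. Under this rule every fraudulent record is misclassified and every non-fraudulent record is classified correctly, so the error rate is simply the number of frauds divided by the number of records: 62 / 1040 = 5.96%. That is lower than the 8.65% of the original routine, although this rule never detects a single fraud.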
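As a quick check, the comparison between the two rules can be sketched in Python. The variable names are illustrative assumptions; the counts again come straight from the problem statement.

```python
# Counts given in the problem statement.
predicted_fraud, correct_fraud = 88, 30
predicted_nonfraud, correct_nonfraud = 952, 920
total = predicted_fraud + predicted_nonfraud

# Actual frauds = frauds that were caught + frauds that were missed.
actual_fraud = correct_fraud + (predicted_nonfraud - correct_nonfraud)

# Original routine: every misclassified record counts as an error.
model_errors = (predicted_fraud - correct_fraud) + (predicted_nonfraud - correct_nonfraud)
model_error_rate = model_errors / total

# Naive rule: label everything non-fraudulent, so only the actual frauds are wrong.
naive_error_rate = actual_fraud / total

print(f"Actual frauds:     {actual_fraud}")          # 62
print(f"Model error rate:  {model_error_rate:.2%}")  # 8.65%
print(f"Naive error rate:  {naive_error_rate:.2%}")  # 5.96%
```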