A Summary of Evaluation Metrics for Multi-class Classification Models

Kappa Coefficient

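The kappa coefficient is a model evaluation metric computed from the confusion matrix: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement between predictions and true labels (the accuracy) and p_e is the agreement expected by chance.

Its value lies between -1 and 1. A value of 0 means the predictions agree with the true labels no better than chance, and a value below 0 means they agree even less than chance would.
Python implementation: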

from sklearn.metrics import cohen_kappa_score
kappa = cohen_kappa_score(y_true, y_pred, labels=None)  # set labels only if you want kappa over a subset of the classes
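To make the definition concrete, here is a minimal sketch that computes kappa by hand from the confusion matrix and compares it with `cohen_kappa_score`. The label arrays are made up for illustration and are not from the case study.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Toy three-class labels, invented for this example.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

cm = confusion_matrix(y_true, y_pred)
n = cm.sum()

# Observed agreement: fraction of samples on the diagonal.
p_o = np.trace(cm) / n
# Chance agreement: product of row and column marginals, summed over classes.
p_e = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2

kappa_manual = (p_o - p_e) / (1 - p_e)
kappa_sklearn = cohen_kappa_score(y_true, y_pred)
print(kappa_manual, kappa_sklearn)
```

The two numbers should agree, which is a quick way to check that you are reading the confusion matrix the right way round.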

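Hamming Distance

Hamming distance also applies to multi-class problems. Put simply, it measures the distance between the predicted labels and the true labels, and takes values between 0 and 1. A distance of 0 means the predictions match the true labels exactly; a distance of 1 means every prediction is wrong. I'll skip the formula (forgive my laziness) and go straight to the Python example.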

from sklearn.metrics import hamming_loss
ham_distance = hamming_loss(y_true,y_pred)
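Since the formula was skipped, a small sketch may help: for single-label multi-class data, the Hamming loss is simply the fraction of samples whose predicted label differs from the true one, so it equals 1 minus the accuracy. The arrays below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss

# Toy labels, invented for this example.
y_true = np.array([0, 2, 1, 1, 2, 0])
y_pred = np.array([0, 1, 1, 2, 2, 0])

ham_manual = np.mean(y_true != y_pred)       # fraction of mismatched labels
ham_sklearn = hamming_loss(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)
print(ham_manual, ham_sklearn, 1 - acc)
```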

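Jaccard Similarity Coefficient

It differs from Hamming distance in the denominator. When the prediction matches the true labels exactly, the coefficient is 1; when the prediction has nothing in common with the true labels, the coefficient is 0; when the prediction is a proper subset or a proper superset of the true labels, the coefficient lies between 0 and 1.
Averaging the coefficient over all samples gives the algorithm's overall performance on the test set.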

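from sklearn.metrics import jaccard_score
jaccard = jaccard_score(y_true, y_pred, average='macro')
# jaccard_similarity_score was removed in scikit-learn 0.23; jaccard_score replaces it.
# average='macro' averages the coefficient over the classes; average=None returns the coefficient of each class separately.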
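As a rough check of what the per-class coefficient means, the sketch below recomputes it from the confusion matrix as TP / (TP + FP + FN) and compares it with `jaccard_score`. It assumes scikit-learn 0.21 or later (where `jaccard_score` exists), and the labels are made up for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, jaccard_score

# Toy three-class labels, invented for this example.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

per_class = jaccard_score(y_true, y_pred, average=None)   # one value per class
macro = jaccard_score(y_true, y_pred, average="macro")    # unweighted mean over classes

cm = confusion_matrix(y_true, y_pred)
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp   # predicted as the class but actually another class
fn = cm.sum(axis=1) - tp   # actually the class but predicted as another class
manual = tp / (tp + fp + fn)
print(per_class, manual, macro)
```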

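Hinge Loss

Hinge loss is generally used for maximum-margin classification. The loss is non-negative and has no fixed upper bound: a value of 0 means every sample is classified correctly with a margin of at least 1, and the larger the value, the worse the classifier is doing.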

from sklearn.metrics import hinge_loss
hinge = hinge_loss(y_true, pred_decision)  # pred_decision is the output of decision_function, not the predicted labels
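A minimal sketch of what `hinge_loss` computes, using the binary case for simplicity (the multi-class case takes a decision matrix with one column per class instead). The labels and decision values are made up; note that a single badly misclassified sample already contributes more than 1.

```python
import numpy as np
from sklearn.metrics import hinge_loss

# Binary labels in {-1, +1} and made-up decision-function values.
y_true = np.array([-1, 1, 1, -1])
pred_decision = np.array([-2.0, 0.5, 1.5, 0.3])

# Per-sample hinge loss: max(0, 1 - y * f(x)); then average over samples.
margins = y_true * pred_decision
per_sample = np.maximum(0.0, 1.0 - margins)
hinge_manual = per_sample.mean()

hinge_sklearn = hinge_loss(y_true, pred_decision)
print(per_sample, hinge_manual, hinge_sklearn)
```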

Case Study

import pandas as pd
df = pd.read_csv('Consumer_Complaints.csv')
df.head()
df = df[pd.notnull(df['Consumer complaint narrative'])]
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4569 entries, 1 to 21662
Data columns (total 18 columns):
Date received                   4569 non-null object
Product                         4569 non-null object
Sub-product                     3106 non-null object
Issue                           4569 non-null object
Sub-issue                       2294 non-null object
Consumer complaint narrative    4569 non-null object
Company public response         2220 non-null object
Company                         4569 non-null object
State                           4556 non-null object
ZIP code                        4556 non-null object
Tags                            770 non-null object
Consumer consent provided?      4569 non-null object
Submitted via                   4569 non-null object
Date sent to company            4569 non-null object
Company response to consumer    4569 non-null object
Timely response?                4569 non-null object
Consumer disputed?              4568 non-null object
Complaint ID                    4569 non-null float64
dtypes: float64(1), object(17)
memory usage: 678.2+ KB
col = ['Product', 'Consumer complaint narrative']
df = df[col]
df.columns
Index(['Product', 'Consumer complaint narrative'], dtype='object')
df.columns = ['Product', 'Consumer_complaint_narrative']
df['category_id'] = df['Product'].factorize()[0]
from io import StringIO
category_id_df = df[['Product', 'category_id']].drop_duplicates().sort_values('category_id')
category_to_id = dict(category_id_df.values)
id_to_category = dict(category_id_df[['category_id', 'Product']].values)
df.head()

Product Consumer_complaint_narrative category_id
1 Credit reporting I have outdated information on my credit repor... 0
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(8,6))
df.groupby('Product').Consumer_complaint_narrative.count().plot.bar(ylim=0)
plt.show()

[Figure: bar chart of the number of complaint narratives per product]

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer(sublinear_tf=True, min_df=5, norm='l2', encoding='latin-1', ngram_range=(1, 2), stop_words='english')

features = tfidf.fit_transform(df.Consumer_complaint_narrative).toarray()
labels = df.category_id
features.shape
(4569, 12633)
from sklearn.feature_selection import chi2
import numpy as np

N = 2
for Product, category_id in sorted(category_to_id.items()):
    # Chi-squared score of every feature against "is this sample in the current category?"
    features_chi2 = chi2(features, labels == category_id)
    indices = np.argsort(features_chi2[0])
    # get_feature_names_out() needs scikit-learn >= 1.0; older versions use get_feature_names()
    feature_names = np.array(tfidf.get_feature_names_out())[indices]
    unigrams = [v for v in feature_names if len(v.split(' ')) == 1]
    bigrams = [v for v in feature_names if len(v.split(' ')) == 2]
    print("# '{}':".format(Product))
    print(" . Most correlated unigrams:\n . {}".format('\n . '.join(unigrams[-N:])))
    print(" . Most correlated bigrams:\n . {}".format('\n . '.join(bigrams[-N:])))
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB

X_train, X_test, y_train, y_test = train_test_split(df['Consumer_complaint_narrative'], df['Product'], random_state = 0)
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(X_train)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)

clf = MultinomialNB().fit(X_train_tfidf, y_train)
print(clf.predict(count_vect.transform(["This company refuses to provide me verification and validation of debt per my right under the FDCPA. I do not believe this debt is mine."])))
['Debt collection']
print(clf.predict(count_vect.transform(["I am disputing the inaccurate information the Chex-Systems has on my credit report. I initially submitted a police report on XXXX/XXXX/16 and Chex Systems only deleted the items that I mentioned in the letter and not all the items that were actually listed on the police report. In other words they wanted me to say word for word to them what items were fraudulent. The total disregard of the police report and what accounts that it states that are fraudulent. If they just had paid a little closer attention to the police report I would not been in this position now and they would n't have to research once again. I would like the reported information to be removed : XXXX XXXX XXXX"])))
['Credit reporting']
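One thing to watch in the two predictions above: the classifier was fit on TF-IDF weights, but the new texts go through `count_vect.transform` only, so it sees raw counts at prediction time. A possible way to keep training and prediction in sync is to wrap the steps in a `Pipeline`. The sketch below is only an illustration on a tiny made-up corpus; the step names and texts are my own, not from the original notebook.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny invented training set, for illustration only.
train_texts = [
    "the debt collector keeps calling about a debt that is not mine",
    "collection agency refuses to validate the debt",
    "my credit report shows an account that is not mine",
    "the credit bureau will not correct my credit report",
]
train_labels = ["Debt collection", "Debt collection", "Credit reporting", "Credit reporting"]

# The pipeline applies counts -> TF-IDF -> classifier, both in fit and in predict.
text_clf = Pipeline([
    ("counts", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("nb", MultinomialNB()),
])
text_clf.fit(train_texts, train_labels)

pred = text_clf.predict(["please fix the error on my credit report"])
print(pred)
```

With the pipeline, `predict` takes raw strings, so there is no way to forget a transformation step.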
df[df['Consumer_complaint_narrative'] == "This company refuses to provide me verification and validation of debt per my right under the FDCPA. I do not believe this debt is mine."]

Product Consumer_complaint_narrative category_id
12 Debt collection This company refuses to provide me verificatio... 2
df[df['Consumer_complaint_narrative'] == "I am disputing the inaccurate information the Chex-Systems has on my credit report. I initially submitted a police report on XXXX/XXXX/16 and Chex Systems only deleted the items that I mentioned in the letter and not all the items that were actually listed on the police report. In other words they wanted me to say word for word to them what items were fraudulent. The total disregard of the police report and what accounts that it states that are fraudulent. If they just had paid a little closer attention to the police report I would not been in this position now and they would n't have to research once again. I would like the reported information to be removed : XXXX XXXX XXXX"]

Product Consumer_complaint_narrative category_id
61 Credit reporting I am disputing the inaccurate information the ... 0
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

from sklearn.model_selection import cross_val_score


models = [
    RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0),
    LinearSVC(),
    MultinomialNB(),
    LogisticRegression(random_state=0),
]
CV = 5
cv_df = pd.DataFrame(index=range(CV * len(models)))
entries = []
for model in models:
    model_name = model.__class__.__name__
    accuracies = cross_val_score(model, features, labels, scoring='accuracy', cv=CV)
    for fold_idx, accuracy in enumerate(accuracies):
        entries.append((model_name, fold_idx, accuracy))
cv_df = pd.DataFrame(entries, columns=['model_name', 'fold_idx', 'accuracy'])
import seaborn as sns

sns.boxplot(x='model_name', y='accuracy', data=cv_df)
sns.stripplot(x='model_name', y='accuracy', data=cv_df,
              size=8, jitter=True, edgecolor="gray", linewidth=2)
plt.show()

[Figure: box plot with strip plot overlay of cross-validation accuracy per model]

cv_df.groupby('model_name').accuracy.mean()
model_name
LinearSVC                 0.822890
LogisticRegression        0.792927
MultinomialNB             0.688519
RandomForestClassifier    0.443826
Name: accuracy, dtype: float64
from sklearn.model_selection import train_test_split

model = LinearSVC()

X_train, X_test, y_train, y_test, indices_train, indices_test = train_test_split(features, labels, df.index, test_size=0.33, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
from sklearn.metrics import confusion_matrix

conf_mat = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(8,6))
sns.heatmap(conf_mat, annot=True, fmt='d',
            xticklabels=category_id_df.Product.values, yticklabels=category_id_df.Product.values)
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()

[Figure: confusion matrix heatmap, actual vs. predicted product]

from IPython.display import display

for predicted in category_id_df.category_id:
    for actual in category_id_df.category_id:
        if predicted != actual and conf_mat[actual, predicted] >= 6:
            print("'{}' predicted as '{}' : {} examples.".format(id_to_category[actual], id_to_category[predicted], conf_mat[actual, predicted]))
            display(df.loc[indices_test[(y_test == actual) & (y_pred == predicted)]][['Product', 'Consumer_complaint_narrative']])
            print('')
'Consumer Loan' predicted as 'Credit reporting' : 10 examples.

Product Consumer_complaint_narrative
2720 Consumer Loan Quoting them, your first loan application, the...
7091 Consumer Loan While reviewing my XXXX credit report, I notic...
5439 Consumer Loan I have been recently checking my credit report...
12763 Consumer Loan We went to buy XXXX cars, and the dealership s...
13158 Consumer Loan I got a 30 day late XX/XX/2017 and it 's repor...
4134 Consumer Loan I took out an instalment loan in the amount XX...
13848 Consumer Loan I was turned down for a loan by Honda Finacial...
19227 Consumer Loan ONEMAIN # XXXX XXXX , IN XXXX ( XXXX ) XXXX Da...
11258 Consumer Loan I have not been given credit for the payments ...
11242 Consumer Loan Reliable Credit falsely submitted an applicati...
'Debt collection' predicted as 'Credit reporting' : 18 examples.

Product Consumer_complaint_narrative
18410 Debt collection Dear CFPB, I am asking you for assistance to i...
5262 Debt collection XXXX XXXX, XXXX ( This letter describes in det...
11834 Debt collection XXXX XXXX XXXX is reporting negatively on my c...
19652 Debt collection I recently paid of both debts on my credit acc...
15557 Debt collection Never have been a XXXX XXXX customer. I was at...
4431 Debt collection someone tried getting credit information and i...
15949 Debt collection This debt is from account from $ XX/XX/2008 an...
12475 Debt collection In XXXX XXXX, there was an account opened thro...
13548 Debt collection DIVERSIFIELD CONSULTANTS INC HAVE VIOLATED FCR...
6988 Debt collection Also collections refuses to stop reporting to ...
16498 Debt collection They called my son and told him that they are ...
12028 Debt collection Rubin & Rothman LLC ( R & R ) received default...
7131 Debt collection THIS IS FRAUD. I HAVE REQUESTED VERIFICATION A...
15630 Debt collection Barclays Bank Delaware obtained a judgment aga...
11112 Debt collection This account was a joint account with XXXX and...
16 Debt collection This complaint is in regards to Square Two Fin...
311 Debt collection Hunter Warfield has be unable to provide prope...
15988 Debt collection Unknown account, never have been notified and ...
'Mortgage' predicted as 'Credit reporting' : 6 examples.

Product Consumer_complaint_narrative
4637 Mortgage This complaint is in follow-up to Complaint # ...
5269 Mortgage The attached complaint was initially written t...
7343 Mortgage In 2014, I went to XXXX in order to buy a mobi...
15048 Mortgage Company repeatedly corrects my credit report a...
861 Mortgage Mortgage broker did Credit inquiry on my credi...
19781 Mortgage I am a card carrying XXXX and wanted to see if...
'Credit card' predicted as 'Credit reporting' : 9 examples.

Product Consumer_complaint_narrative
18643 Credit card I was told this account wiuld be deleted from ...
18574 Credit card This inquiry was n't me
19868 Credit card Capital One/Kohls has been reporting a past du...
19963 Credit card on XX/XX/XXXX my wallet was stolen with all my...
4706 Credit card American Express is reporting an account on my...
21566 Credit card Have disputed the reporting of the status of a...
13906 Credit card I have been the victim of identity theft fraud...
16853 Credit card I have requested XXXX XXXX to run a credit rep...
10505 Credit card I have been working since XXXX 2016 to get a i...
---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-22-9932ab8bdc5b> in <module>()
      3 for predicted in category_id_df.category_id:
      4   for actual in category_id_df.category_id:
----> 5     if predicted != actual and conf_mat[actual, predicted] >= 6:
      6       print("'{}' predicted as '{}' : {} examples.".format(id_to_category[actual], id_to_category[predicted], conf_mat[actual, predicted]))
      7       display(df.loc[indices_test[(y_test == actual) & (y_pred == predicted)]][['Product', 'Consumer_complaint_narrative']])


IndexError: index 11 is out of bounds for axis 0 with size 11
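The IndexError above most likely comes from a size mismatch: the loop walks over all 12 category ids, while `confusion_matrix` only creates rows and columns for labels that actually occur in `y_test` or `y_pred`, which here gives an 11 x 11 matrix. One possible fix is to pass the full id list through the `labels` argument. The sketch below shows the effect on made-up data; `all_ids` and the arrays are illustrative, not the notebook's variables.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

all_ids = [0, 1, 2, 3]                     # every category id known from the full data set
y_test = np.array([0, 1, 2, 2, 1, 0])      # class 3 never shows up in this toy test split
y_pred = np.array([0, 2, 2, 2, 1, 1])

cm_default = confusion_matrix(y_test, y_pred)                 # only labels that occur: 3 x 3
cm_full = confusion_matrix(y_test, y_pred, labels=all_ids)    # one row/column per known id: 4 x 4

# Indexing by category id is now safe for every id, including the absent class.
for actual in all_ids:
    for predicted in all_ids:
        if actual != predicted and cm_full[actual, predicted] >= 1:
            print(actual, "predicted as", predicted, ":", cm_full[actual, predicted])
```

In the case study this would correspond to something like `confusion_matrix(y_test, y_pred, labels=category_id_df.category_id.values)`, which should also keep the heatmap tick labels aligned with the matrix.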
model.fit(features, labels)
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
from sklearn.feature_selection import chi2

N = 2
for Product, category_id in sorted(category_to_id.items()):
    # Sort features by the model coefficient for this category (ascending)
    indices = np.argsort(model.coef_[category_id])
    # get_feature_names_out() needs scikit-learn >= 1.0; older versions use get_feature_names()
    feature_names = np.array(tfidf.get_feature_names_out())[indices]
    unigrams = [v for v in reversed(feature_names) if len(v.split(' ')) == 1][:N]
    bigrams = [v for v in reversed(feature_names) if len(v.split(' ')) == 2][:N]
    print("# '{}':".format(Product))
    print(" . Top unigrams:\n . {}".format('\n . '.join(unigrams)))
    print(" . Top bigrams:\n . {}".format('\n . '.join(bigrams)))
# 'Bank account or service':
  . Top unigrams:
       . bank
       . account
  . Top bigrams:
       . debit card
       . overdraft fees
# 'Consumer Loan':
  . Top unigrams:
       . vehicle
       . car
  . Top bigrams:
       . personal loan
       . history xxxx
# 'Credit card':
  . Top unigrams:
       . card
       . discover
  . Top bigrams:
       . credit card
       . discover card
# 'Credit reporting':
  . Top unigrams:
       . equifax
       . transunion
  . Top bigrams:
       . xxxx account
       . trans union
# 'Debt collection':
  . Top unigrams:
       . debt
       . collection
  . Top bigrams:
       . account credit
       . time provided
# 'Money transfers':
  . Top unigrams:
       . paypal
       . transfer
  . Top bigrams:
       . money transfer
       . send money
# 'Mortgage':
  . Top unigrams:
       . mortgage
       . escrow
  . Top bigrams:
       . loan modification
       . mortgage company
# 'Other financial service':
  . Top unigrams:
       . passport
       . dental
  . Top bigrams:
       . stated pay
       . help pay
# 'Payday loan':
  . Top unigrams:
       . payday
       . loan
  . Top bigrams:
       . payday loan
       . pay day
# 'Prepaid card':
  . Top unigrams:
       . prepaid
       . serve
  . Top bigrams:
       . prepaid card
       . use card
# 'Student loan':
  . Top unigrams:
       . navient
       . loans
  . Top bigrams:
       . student loan
       . sallie mae
# 'Virtual currency':
  . Top unigrams:
       . https
       . tx
  . Top bigrams:
       . money want
       . xxxx provider
texts = ["I requested a home loan modification through Bank of America. Bank of America never got back to me.",
"It has been difficult for me to find my past due balance. I missed a regular monthly payment",
"I can't get the money out of the country.",
"I have no money to pay my tuition",
"Coinbase closed my account for no reason and furthermore refused to give me a reason despite dozens of request"]
text_features = tfidf.transform(texts)
predictions = model.predict(text_features)
for text, predicted in zip(texts, predictions):
    print('"{}"'.format(text))
    print(" - Predicted as: '{}'".format(id_to_category[predicted]))
    print("")
"I requested a home loan modification through Bank of America. Bank of America never got back to me."
  - Predicted as: 'Mortgage'

"It has been difficult for me to find my past due balance. I missed a regular monthly payment"
  - Predicted as: 'Credit reporting'

"I can't get the money out of the country."
  - Predicted as: 'Bank account or service'

"I have no money to pay my tuition"
  - Predicted as: 'Debt collection'

"Coinbase closed my account for no reason and furthermore refused to give me a reason despite dozens of request"
  - Predicted as: 'Bank account or service'
from sklearn import metrics
print(metrics.classification_report(y_test, y_pred,
                                    target_names=df['Product'].unique()))
                         precision    recall  f1-score   support

       Credit reporting       0.82      0.82      0.82       288
          Consumer Loan       0.83      0.60      0.70       100
        Debt collection       0.80      0.91      0.85       359
               Mortgage       0.90      0.93      0.92       317
            Credit card       0.73      0.77      0.75       165
Other financial service       0.00      0.00      0.00         1
Bank account or service       0.74      0.74      0.74       121
           Student loan       0.92      0.83      0.87       111
        Money transfers       0.50      0.23      0.32        13
            Payday loan       0.75      0.38      0.50        16
           Prepaid card       0.67      0.12      0.20        17

            avg / total       0.82      0.82      0.81      1508



/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1428: UserWarning: labels size, 11, does not match size of target_names, 12
  .format(len(labels), len(target_names))
/home/ec2-user/anaconda3/envs/python3/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
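The first warning says the report found 11 labels but was given 12 names, so the names in the table may not line up with the right rows in general. A possible way to avoid this is to pass `labels` and `target_names` together, built from the same id list. The sketch below uses a made-up mapping and toy predictions; `id_to_name` is hypothetical, and `zero_division` assumes scikit-learn 0.22 or later.

```python
from sklearn.metrics import classification_report

# Hypothetical id -> name mapping, standing in for id_to_category.
id_to_name = {0: "Credit reporting", 1: "Debt collection", 2: "Mortgage", 3: "Virtual currency"}
y_test = [0, 1, 2, 2, 1, 0]     # class 3 is absent from this toy test split
y_pred = [0, 2, 2, 2, 1, 1]

label_ids = sorted(id_to_name)
report = classification_report(
    y_test,
    y_pred,
    labels=label_ids,                                  # fixes the row order
    target_names=[id_to_name[i] for i in label_ids],   # names in the same order as labels
    zero_division=0,                                   # report 0 instead of warning for empty classes
)
print(report)
```

Classes with no test samples still get a row (with support 0), which makes it obvious that the split left them out.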