In this notebook, we will download a model, dataset, and metric from the Hugging Face Hub and generate an interactive HTML model card using the Intel AI Safety Model Card Generator tool.
1. Download and Import Dependencies
[ ]:
!pip install evaluate datasets transformers[torch] scikit-learn
[ ]:
from intel_ai_safety.model_card_gen.model_card_gen import ModelCardGen
from datasets import load_dataset
import evaluate
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer
import pandas as pd
from collections import Counter
from functools import reduce
import json
import numpy as np
2. Download Dataset from Hugging Face Datasets
[3]:
raw_dataset = load_dataset("hatexplain")
he_dataset = raw_dataset.map(lambda e: {'text': " ".join(e['post_tokens'])})
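To confirm the join produced what we expect, we can spot-check a single record; this check is illustrative and not required for the rest of the notebook.
[ ]:
# Spot-check: 'text' should be the whitespace-joined post_tokens
sample = he_dataset['test'][0]
assert sample['text'] == " ".join(sample['post_tokens'])
print(sample['text'][:80])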
3. Transform Dataset
[4]:
def get_common_targets(elm):
    """
    Merge the annotated targets from each annotator into a single
    list, keeping a target when at least two annotators agree
    """
    targets = elm['annotators']['target']
    counts = reduce(lambda x, y: Counter(x) + Counter(y), targets)
    return {'target': [target for target, count in counts.items() if count > 1]}
he_dataset = he_dataset.map(get_common_targets)
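To see how the Counter-based merge behaves, here is a minimal sketch on a made-up annotators record (the values are hypothetical, not taken from the dataset):
[ ]:
# Hypothetical record: two of three annotators agree on 'Women'
example_targets = [['Women', 'Other'], ['Women'], ['None']]
counts = reduce(lambda x, y: Counter(x) + Counter(y), example_targets)
print(counts)                                   # Counter({'Women': 2, 'Other': 1, 'None': 1})
print([t for t, c in counts.items() if c > 1])  # ['Women'] survives the agreement filter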
[5]:
def get_top_communities(targets, top=10):
    target_counts = reduce(lambda x, y: Counter(x) + Counter(y), targets)
    top_targets, _ = zip(*target_counts.most_common(top))
    return set(top_targets)
TOP = get_top_communities(he_dataset['test']['target'])
def filter_top_target(elm):
    """
    This function restricts the identity groups targeted
    in each item to the 10 most common identity groups
    """
    targets = set(elm['target']) & TOP
    return {'target': targets}
he_dataset = he_dataset.map(filter_top_target)
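The intersection with TOP keeps only the frequently targeted groups; a quick illustration with made-up group names and a made-up TOP set:
[ ]:
# Hypothetical: only groups present in the top set survive the intersection
fake_top = {'Women', 'Refugee', 'Islam'}
item = {'target': ['Women', 'Refugee', 'SomeRareGroup']}
print(set(item['target']) & fake_top)  # {'Women', 'Refugee'}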
[6]:
def get_label(elm):
    """
    This function derives a ground-truth label by majority vote
    over the annotator labels
    """
    label_map = {0: 1,  # hatespeech -> 1
                 1: 0,  # normal -> 0
                 2: 1}  # offensive -> 1
    labels = elm['annotators']['label']
    max_label = max(labels, key=labels.count)
    return {'label': label_map[max_label]}
he_dataset = he_dataset.map(get_label)
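A quick sketch of the majority vote with hypothetical annotator labels:
[ ]:
# Two annotators vote hatespeech (0), one votes normal (1)
votes = [0, 0, 1]
majority = max(votes, key=votes.count)  # 0, the most frequent vote
print({0: 1, 1: 0, 2: 1}[majority])     # 1 -> mapped to the positive (hate) class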
4. Download Model and Process Outputs
[7]:
import torch
from torch.nn.functional import softmax
he_dataset.set_format("pt", columns=["post_tokens"], output_all_columns=True)
model = AutoModelForSequenceClassification.from_pretrained("Hate-speech-CNERG/dehatebert-mono-english")
tokenizer = AutoTokenizer.from_pretrained("Hate-speech-CNERG/dehatebert-mono-english")
def process(examples):
    # Tokenize one example and score it, truncating to the model's maximum input length
    bert_tokens = tokenizer(examples['text'], return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        output = model(**bert_tokens)
    return {"output": softmax(output['logits'], dim=-1).flatten()}
test_ds = he_dataset['test'].map(process)
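Before scoring the whole split, it can be worth sanity-checking the model on a single made-up sentence (the text below is an assumption, not a dataset item); the two class probabilities should sum to 1:
[ ]:
# Sanity check on one hypothetical sentence
tokens = tokenizer("this is a perfectly ordinary sentence", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = softmax(model(**tokens)['logits'], dim=-1).flatten()
print(probs, probs.sum())  # two class probabilities summing to ~1.0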
5. Get Bias Metric from Hugging Face
[8]:
metric = evaluate.load('Intel/bias_auc')
print(metric)
EvaluationModule(name: "bias_auc", module_type: "metric", features: {'target': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None), 'label': Value(dtype='int64', id=None), 'output': Sequence(feature=Value(dtype='float32', id=None), length=-1, id=None)}, usage: """Args:
target list[list[str]]: list containing list of group targeted for each item
label list[int]: list containing label index for each item
output list[list[float]]: list of model output values for each
subgroup list[str] (optional): list of subgroups that appear in target to compute metric over
Returns (for each subgroup in target):
'Subgroup' : Subgroup AUC score,
'BPSN' : BPSN (Background Positive, Subgroup Negative) AUC,
'BNSP' : BNSP (Background Negative, Subgroup Positive) AUC score,
Example:
>>> from evaluate import load
>>> target = [['Islam'],
... ['Sexuality'],
... ['Sexuality'],
... ['Islam']]
>>> label = [0, 0, 1, 1]
>>> output = [[0.44452348351478577, 0.5554765462875366],
... [0.4341845214366913, 0.5658154487609863],
... [0.400595098733902, 0.5994048714637756],
... [0.3840397894382477, 0.6159601807594299]]
>>> metric = load('Intel/bias_auc')
>>> metric.add_batch(target=target,
label=label,
output=output)
>>> metric.compute(subgroups = None)
""", stored examples: 0)
6. Run Bias Metric
[9]:
metric.add_batch(target=test_ds['target'],
label=test_ds['label'],
output=test_ds['output'])
# Collect every group that appears as a target, excluding 'Disability'
subgroups = set(group for group_list in test_ds['target'] for group in group_list) - {'Disability'}
metric_output = metric.compute(subgroups=subgroups)
[10]:
metric_output
[10]:
{'Refugee': {'Subgroup': 0.7311320754716981,
'BPSN': 0.678562242798354,
'BNSP': 0.7630785937287905},
'Jewish': {'Subgroup': 0.5336263736263737,
'BPSN': 0.6644343891402714,
'BNSP': 0.6367426970788316},
'Homosexual': {'Subgroup': 0.7495069033530571,
'BPSN': 0.5786049393644331,
'BNSP': 0.8356651669775779},
'Islam': {'Subgroup': 0.6413432424993118,
'BPSN': 0.7368656382740889,
'BNSP': 0.6499369565573367},
'Women': {'Subgroup': 0.7131519274376417,
'BPSN': 0.7320302982149478,
'BNSP': 0.6984689059759881},
'None': {'Subgroup': 0.6752210816335599,
'BPSN': 0.7373480790382473,
'BNSP': 0.6139316239316239},
'Arab': {'Subgroup': 0.6023809523809524,
'BPSN': 0.541080680977054,
'BNSP': 0.7807407407407407},
'Caucasian': {'Subgroup': 0.7462406015037595,
'BPSN': 0.8013460703430308,
'BNSP': 0.6682660367609065},
'African': {'Subgroup': 0.7112082928409459,
'BPSN': 0.45742948342127554,
'BNSP': 0.8486346379911739},
'Other': {'Subgroup': 0.7295238095238096,
'BPSN': 0.7852631578947368,
'BNSP': 0.6521296296296296},
'Overall generalized mean': {'Subgroup': 0.657871089806175,
'BPSN': 0.606444668229416,
'BNSP': 0.690561798916551}}
7. Transform Output for Model Card
Model Card Generator takes two pandas DataFrames as input. We will create a metrics_by_group DataFrame from the Bias AUC metric above, as well as a metrics_by_threshold DataFrame containing performance metrics at each decision threshold.
[11]:
metrics_by_group = (pd.DataFrame.from_dict(metric_output)
                    .T
                    .reset_index()
                    .rename({'index': 'group'}, axis=1))
metrics_by_group['feature'] = 'target'
metrics_by_group
[11]:
| | group | Subgroup | BPSN | BNSP | feature |
|---|---|---|---|---|---|
| 0 | Refugee | 0.731132 | 0.678562 | 0.763079 | target |
| 1 | Jewish | 0.533626 | 0.664434 | 0.636743 | target |
| 2 | Homosexual | 0.749507 | 0.578605 | 0.835665 | target |
| 3 | Islam | 0.641343 | 0.736866 | 0.649937 | target |
| 4 | Women | 0.713152 | 0.732030 | 0.698469 | target |
| 5 | None | 0.675221 | 0.737348 | 0.613932 | target |
| 6 | Arab | 0.602381 | 0.541081 | 0.780741 | target |
| 7 | Caucasian | 0.746241 | 0.801346 | 0.668266 | target |
| 8 | African | 0.711208 | 0.457429 | 0.848635 | target |
| 9 | Other | 0.729524 | 0.785263 | 0.652130 | target |
| 10 | Overall generalized mean | 0.657871 | 0.606445 | 0.690562 | target |
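Before handing the table to the generator, it can be useful to surface the groups with the weakest scores; for example, a low BPSN AUC suggests the model confuses non-hateful mentions of a group with hateful background text:
[ ]:
# Three groups with the lowest BPSN AUC (excluding the overall row)
by_group = metrics_by_group[metrics_by_group['group'] != 'Overall generalized mean']
print(by_group.sort_values('BPSN')[['group', 'BPSN']].head(3))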
[ ]:
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
import numpy as np
thetas = np.linspace(0, 1, 1001)
# Probability of the positive (hate) class for each test item
y_pred_prob = np.array(test_ds['output'])[:, 1]
metrics_dict = {
    'threshold': thetas,
    'precision': [precision_score(test_ds['label'], y_pred_prob > theta, zero_division=0) for theta in thetas],
    'recall': [recall_score(test_ds['label'], y_pred_prob > theta) for theta in thetas],
    'f1': [f1_score(test_ds['label'], y_pred_prob > theta, zero_division=0) for theta in thetas],
    'accuracy': [accuracy_score(test_ds['label'], y_pred_prob > theta) for theta in thetas]
}
[13]:
metrics_by_threshold = pd.DataFrame.from_dict(metrics_dict)
[14]:
metrics_by_threshold
[14]:
| | threshold | precision | recall | f1 | accuracy |
|---|---|---|---|---|---|
| 0 | 0.000 | 0.593555 | 1.0 | 0.744945 | 0.593555 |
| 1 | 0.001 | 0.593555 | 1.0 | 0.744945 | 0.593555 |
| 2 | 0.002 | 0.593555 | 1.0 | 0.744945 | 0.593555 |
| 3 | 0.003 | 0.593555 | 1.0 | 0.744945 | 0.593555 |
| 4 | 0.004 | 0.593555 | 1.0 | 0.744945 | 0.593555 |
| ... | ... | ... | ... | ... | ... |
| 996 | 0.996 | 0.000000 | 0.0 | 0.000000 | 0.406445 |
| 997 | 0.997 | 0.000000 | 0.0 | 0.000000 | 0.406445 |
| 998 | 0.998 | 0.000000 | 0.0 | 0.000000 | 0.406445 |
| 999 | 0.999 | 0.000000 | 0.0 | 0.000000 | 0.406445 |
| 1000 | 1.000 | 0.000000 | 0.0 | 0.000000 | 0.406445 |
1001 rows × 5 columns
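As a quick check on the threshold sweep (not needed for the model card itself), we can pull out the F1-optimal operating point:
[ ]:
# Decision threshold that maximizes F1 on the test split
best = metrics_by_threshold.loc[metrics_by_threshold['f1'].idxmax()]
print(f"best threshold: {best['threshold']:.3f}, F1: {best['f1']:.3f}")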
8. Build Model Card
Simply pass the DataFrames to the ModelCardGen.generate class method to build a model card.
[15]:
mc = {
"schema_version": "0.0.1",
"model_details": {
"name": "Deep Learning Models for Multilingual Hate Speech Detection",
"version": {
"name": "25d0e4d9122d2a5c283e07405a325e3dfd4a73b3",
"date": "2020"
},
"graphics": {},
"citations": [
{
"citation": '''@article{aluru2020deep,
title={Deep Learning Models for Multilingual Hate Speech Detection},
author={Aluru, Sai Saket and Mathew, Binny and Saha, Punyajoy and Mukherjee, Animesh},
journal={arXiv preprint arXiv:2004.06465},
year={2020}
}'''
},
],
"overview": 'This model is used detecting hatespeech in English language. The mono in the name refers to the monolingual setting, where the model is trained using only English language data. It is finetuned on multilingual bert model. The model is trained with different learning rates and the best validation score achieved is 0.726030 for a learning rate of 2e-5. Training code can be found here https://github.com/punyajoy/DE-LIMIT',
}
}
[16]:
mcg = ModelCardGen.generate(metrics_by_group=metrics_by_group, metrics_by_threshold=metrics_by_threshold, model_card=mc)
mcg
[16]:
Model Details
Overview
This model detects hate speech in English-language text. "Mono" in the name refers to the monolingual setting, where the model is trained using only English-language data. It is fine-tuned from a multilingual BERT model. The model was trained with different learning rates, and the best validation score achieved was 0.726030 for a learning rate of 2e-5. Training code can be found at https://github.com/punyajoy/DE-LIMIT
Model Performance
Overall Accuracy/Precision/Recall/F1
Version
name: 25d0e4d9122d2a5c283e07405a325e3dfd4a73b3
date: 2020
Citations
- @article{aluru2020deep, title={Deep Learning Models for Multilingual Hate Speech Detection}, author={Aluru, Sai Saket and Mathew, Binny and Saha, Punyajoy and Mukherjee, Animesh}, journal={arXiv preprint arXiv:2004.06465}, year={2020} }
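To share the card outside the notebook, the generator object can be written to a standalone HTML file; export_html is an assumption here, so check the API of your installed intel_ai_safety version:
[ ]:
# Assumption: ModelCardGen exposes an export_html helper in this version
mcg.export_html('hatexplain_model_card.html')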