Merge branch 'IT20097660-Sashini' of http://gitlab.sliit.lk/2023-142/2023-142...

Merge branch 'IT20097660-Sashini' of http://gitlab.sliit.lk/2023-142/2023-142 into IT18161298-Srinidee
parents ba4ca7c7 ae517179
git-colab-terminal.ipynb
**/__pycache__/
.vscode/
.creds/
/models
/datasets
/wandb
/evaluations
\ No newline at end of file
Inputs ->
feature names -> List of features used in the TFIDF vectoriser. Used to map indices back to words in the final output
threshold T -> Threshold below which an output is considered a counterfactual
classifier_fn (C) -> prediction probability function of the random forest classifier
max_iter -> Maximum number of iterations run before termination if a CF is not found
max_time -> Maximum time the algorithm runs before termination if a CF is not found
Output ->
list of words to remove or to change to reverse the model output.
Process ->
input -> Instance W -> document to classify. Has m words
c = initial predicted class
p = probability of the predicted class
r = revert or not. Set to zero if predicted class is positive.
n_explanations = 0
explanations = {}
combinations_to_expand = {}
prob_combinations_to_expand = {}
shap_combinations_to_expand = {}
W = [] indices of features
R = [] indices of replacement features for each feature w_i, if such a replacement exists.
for i = 1 to m:
p_n = C(w_i) # Instance with w_i removed or changed to r_i
if (p_n < T):
explanations = explanations U w_i
else:
combinations_to_expand = combinations_to_expand U w_i
prob_combinations_to_expand = prob_combinations_to_expand U w_i
end if
end for
iteration = 1
start time
while True:
if iteration > max_iter OR time > max_time:
break out of the while loop
combi = word combination to remove where the change in prediction score towards the reverse class is maximal
new_combi_set = expanded combinations of combi, excluding the combinations already in explanations
for combo in new_combi_set do:
p_n = C(combo) # Instance with the words in combo removed or changed to their replacements
if (p_n < T):
explanations = explanations U combo
else:
combinations_to_expand = combinations_to_expand U combo
prob_combinations_to_expand = prob_combinations_to_expand U combo
shap_combinations_to_expand = shap_combinations_to_expand U shap_vals(combo)
end if
end for
iteration ++
increment time
end while
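
A minimal Python sketch of the search loop above (removal-only variant for brevity; find_counterfactual and its scoring helper are illustrative names under assumed inputs, not the project implementation):

import time
import numpy as np

def find_counterfactual(instance, classifier_fn, T, max_iter=50, max_time=60):
    # instance: 1 x n TF-IDF row (sparse); classifier_fn returns P(class 1)
    start = time.time()
    active = list(np.nonzero(instance)[1])            # indices of words present in the document

    def score(combo):                                  # prediction after removing the combination
        perturbed = instance.copy().tolil()
        perturbed[:, list(combo)] = 0
        return classifier_fn(perturbed)[0]

    explanations = []
    candidates = [frozenset([i]) for i in active]      # start from single-word removals
    for _ in range(max_iter):
        if not candidates or time.time() - start > max_time:
            break
        best = min(candidates, key=score)              # best-first: largest drop in predicted score
        if score(best) < T:
            explanations.append(best)                  # counterfactual found
            break
        candidates.remove(best)
        candidates += [best | {i} for i in active if i not in best]   # expansion step
    return explanations
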
Replacement antonyms are generated from the WordNet library.
Replacement gives faster results than removal, as it pushes the prediction towards the reverse class.
A proper antonym must be chosen, because an antonym may otherwise push the score towards the current class.
This is prevented by using SHAP values to choose antonyms (a minimal sketch follows).
Reference paper - Text Counterfactuals via Latent Optimization and Shapley-Guided Search
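
A rough sketch of the WordNet + importance-guided antonym choice; the importance function (e.g. |LR coefficient| or the word's SHAP value) and the vectoriser vocabulary are assumed to be supplied:

from nltk.corpus import wordnet

def pick_antonym(word, vocabulary, importance):
    # collect all antonyms WordNet knows for this word
    antonyms = set()
    for syn in wordnet.synsets(word):
        for lemma in syn.lemmas():
            for ant in lemma.antonyms():
                antonyms.add(ant.name())
    # keep only antonyms the vectoriser knows about
    in_vocab = [a for a in antonyms if a in vocabulary]
    if not in_vocab:
        return None                      # fall back to removing the word
    return max(in_vocab, key=importance) # most influential antonym wins
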
\ No newline at end of file
Inputs ->
feature names -> List of features used in the TFIDF vectoriser. Used to map indices back to words in the final output
threshold T -> Threshold below which an output is considered a counterfactual
classifier_fn (C) -> prediction probability function of the random forest classifier
max_iter -> Maximum number of iterations run before termination if a CF is not found
max_time -> Maximum time the algorithm runs before termination if a CF is not found
Output ->
list of words to remove to reverse the model output.
Process ->
input -> Instance W -> document to classify. Has m words
c = initial predicted class
p = probability of the predicted class
r = revert or not. Set to zero if predicted class is positive.
n_explanations = 0
explanations = {}
combinations_to_expand = {}
prob_combinations_to_expand = {}
shap_combinations_to_expand = {}
shap_vals = Shapley values of each feature, with the reference point taken as the zero vector
W = [] indices of features sorted in descending order of Shapley value
for i = 1 to m:
p_n = C(w_i) -> Instance with the feature w_i removed
if (p_n < T):
explanations = explanations U w_i
else:
combinations_to_expand = combinations_to_expand U w_i
prob_combinations_to_expand = prob_combinations_to_expand U w_i
shap_combinations_to_expand = shap_combinations_to_expand U shap_vals(w_i)
end if
end for
iteration = 1
start time
while True:
if iteration > max_iter OR time > max_time:
break out of the while loop
combi = word combination to remove where the summed SHAP value in shap_combinations_to_expand is maximal
new_combi_set = expanded combinations of combi, excluding the combinations already in explanations
for combo in new_combi_set do:
p_n = C(combo) # Instance with the words in combo removed
if (p_n < T):
explanations = explanations U combo
break out of the while loop once enough explanations are found
else:
combinations_to_expand = combinations_to_expand U combo
prob_combinations_to_expand = prob_combinations_to_expand U combo
shap_combinations_to_expand = shap_combinations_to_expand U shap_vals(combo)
end if
end for
iteration ++
increment time
end while
Does not always converge ->
Even though SHAP values individually give a better measure for each feature than the raw score change,
for a set of features the algebraic sum of SHAP values is not a good measure.
But for changes involving a small number of words (1-4 words):
using SHAP values gives faster results.
Observation - SHAP values also give better results when converting negative predictions to positive ones.
Can use feature_importances_ of the random forest instead of Shapley values, but then we need to check whether each feature contributes a positive or negative change for the current instance (see the sketch after the example below).
x1 -> (2.98)
x2 -> 2.98 - 0.6
x3 -> 2.98 + 1.3
x4 -> 2.98 + 2.0
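
The SHAP ordering assumed above (Shapley values against a zero-vector reference, features sorted in descending order) could be computed roughly as follows; this mirrors the commented-out KernelExplainer call in the notebook, and classifier_fn / n_features are placeholders:

import numpy as np
import shap
from scipy import sparse

def shap_ordered_features(instance, classifier_fn, n_features):
    reference = sparse.csr_matrix((1, n_features))              # zero vector as the reference point
    explainer = shap.KernelExplainer(classifier_fn, reference, link="identity")
    shap_vals = explainer.shap_values(instance, nsamples=1000)  # shape (1, n_features)
    active = np.nonzero(instance)[1]                            # words present in the document
    ordered = sorted(active, key=lambda i: shap_vals[0, i], reverse=True)
    return ordered, shap_vals
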
Current plan - Random forest
Get Shapley values of all features for the current model. -> Reduces the randomness of removing features.
(Text Counterfactuals via Latent Optimization and Shapley-Guided Search) This paper gives replacements instead of removing the feature.
Get the features contributing to each tree and to the whole model of the random forest -> Reduces the number of features to consider.
Order the features by Shapley value.
Change the value of leaf nodes and try to find counterfactuals.
Works only for the Random Forest.
Current Implementation - Random Forest
Get Shapley values of all features for the current model. -> Reduces the randomness of removing features.
Get Shapley values of features.
Expand and prune the required instance to generate counterfactuals.
The expand-and-prune order of the counterfactuals is sorted according to the Shapley value of each feature.
Run the prediction algorithm of the RF model to check whether a desirable counterfactual is generated.
If a counterfactual of a desirable length is generated, output the counterfactual in text format.
To be Developed ->
Get the decision path of each tree in the random forest (sketched below).
Find the features affecting the prediction.
Initially consider only the features that have high Shapley values and appear in the decision trees.
Check the speed of operation.
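
A small sketch of how the per-tree decision paths could be read out of a fitted scikit-learn random forest; rf and the 1 x n_features instance row are assumed to exist:

def features_on_decision_paths(rf, instance):
    # collect the feature indices actually tested on this instance's path in every tree
    used = set()
    for tree in rf.estimators_:
        node_indicator = tree.decision_path(instance)        # sparse (1, n_nodes) path indicator
        for node in node_indicator.indices:
            if tree.tree_.children_left[node] != -1:          # -1 marks a leaf node
                used.add(int(tree.tree_.feature[node]))
    return used
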
NEW
Previously -> used SHAP values (accurate, but they take time to calculate).
Therefore, use feature_importances_ of the random forest model.
These are calculated when training the model -> relatively fast compared to SHAP; no extra time is spent calculating values.
Issues -> 1) Does not give the direction of class change
2) Not instance specific
Solved -> Get feature_importances_
take the instance
remove feature importances not related to the current instance -> reduces memory consumption + takes less time to calculate
take each feature -> check the effect of removing that feature -> assign a class-change sign to its feature_importances_ entry (see the sketch below).
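
A sketch of this signed feature_importances_ idea; the fitted model, the TF-IDF instance row, and classifier_fn are assumed to be available:

import numpy as np

def signed_importances(model, instance, classifier_fn):
    base = classifier_fn(instance)[0]
    active = np.nonzero(instance)[1]                 # only features of the current instance
    signed = {}
    for idx in active:
        perturbed = instance.copy().tolil()
        perturbed[:, idx] = 0                        # remove this word
        delta = classifier_fn(perturbed)[0] - base   # > 0 means removal pushes towards class 1
        signed[idx] = np.sign(delta) * model.feature_importances_[idx]
    return signed
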
Current Implementation - Logistic Regression
Get Shapley values of all features for the current model. -> Reduces the randomness of removing features.
Get Shapley values of features.
Expand and prune the required instance to generate counterfactuals.
The expand-and-prune order of the counterfactuals is sorted according to the Shapley value of each feature.
Run the prediction algorithm of the LR model to check whether a desirable counterfactual is generated.
If a counterfactual of a desirable length is generated, output the counterfactual in text format.
To be Developed
Get the feature importance of the current model using weights of features.
Check for replacements of features which maximise the change in prediction probability. -> Calculated using shap values.
Extension to https://arxiv.org/pdf/1906.09293.pdf -> by adding replacements instead of removals to maximise the change
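
A sketch of picking a replacement word from the LR weights mentioned above; lr is a fitted LogisticRegression, vocabulary maps words to TF-IDF columns, and candidates would come from an antonym lookup (all names are illustrative):

def best_replacement(lr, vocabulary, candidates, target_negative=True):
    coefs = lr.coef_.reshape(-1)                    # one weight per TF-IDF feature
    in_vocab = [w for w in candidates if w in vocabulary]
    if not in_vocab:
        return None
    key = lambda w: coefs[vocabulary[w]]
    # the most negative weight drags the score down, the most positive pushes it up
    return min(in_vocab, key=key) if target_negative else max(in_vocab, key=key)
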
\ No newline at end of file
......@@ -45,15 +45,14 @@ Provide a novel post-hoc, model-specific, local XAI solution to enhance the mod
## Other necessary information
Frontend:
- ReactJS
- Flask
- Bootstrap
- NextJS
- Mantine UI
Backend:
- Python
Version Control:
- GitHub
- Gitlab
Tools:
- VS Code
......
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "2fc1f8d7",
"metadata": {
"_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
"_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5",
"execution": {
"iopub.execute_input": "2023-05-23T22:23:11.513543Z",
"iopub.status.busy": "2023-05-23T22:23:11.512957Z",
"iopub.status.idle": "2023-05-23T22:23:13.539048Z",
"shell.execute_reply": "2023-05-23T22:23:13.537538Z"
},
"papermill": {
"duration": 2.04584,
"end_time": "2023-05-23T22:23:13.542144",
"exception": false,
"start_time": "2023-05-23T22:23:11.496304",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"## IMPORTS\n",
"import numpy as np # linear algebra\n",
"import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"from scipy import sparse\n",
"\n",
"import time\n",
"\n",
"from sklearn.model_selection import RandomizedSearchCV\n",
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.model_selection import ParameterGrid\n",
"from sklearn import metrics\n",
"from sklearn.metrics import roc_auc_score, accuracy_score\n",
"from sklearn.svm import SVC\n",
"import sklearn.feature_extraction\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn.feature_extraction.text import TfidfVectorizer\n",
"from sklearn.feature_extraction.text import TfidfTransformer\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"\n",
"import nltk\n",
"from nltk.corpus import stopwords \n",
"from nltk.tokenize import word_tokenize\n",
"from nltk.stem import WordNetLemmatizer\n",
"from nltk.corpus import wordnet\n",
"import joblib\n",
"\n",
"# Importing Shap for shapley values\n",
"import shap\n",
"\n",
"from ordered_set import OrderedSet\n",
"from scipy.sparse import lil_matrix\n",
"from itertools import compress"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5fe6fbb5",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:23:13.616410Z",
"iopub.status.busy": "2023-05-23T22:23:13.615606Z",
"iopub.status.idle": "2023-05-23T22:23:16.332440Z",
"shell.execute_reply": "2023-05-23T22:23:16.330586Z"
},
"papermill": {
"duration": 2.736958,
"end_time": "2023-05-23T22:23:16.336104",
"exception": false,
"start_time": "2023-05-23T22:23:13.599146",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"# Import DataSet ds and Models\n",
"from src.datasets import IMDBDataset\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\", download=True)\n",
"print(\n",
" ds.x_test.shape,\n",
" ds.x_train.shape,\n",
" ds.x_val.shape,\n",
" ds.y_test.shape,\n",
" ds.y_train.shape,\n",
" ds.y_val.shape,\n",
")\n",
"print(\n",
" type(ds.x_test),\n",
" type(ds.x_train),\n",
" type(ds.x_val),\n",
" type(ds.y_test),\n",
" type(ds.y_train),\n",
" type(ds.y_val),\n",
")\n",
"\n",
"from src.models import AnalysisModels as Models\n",
"\n",
"models = Models(config_path=\"./configs/models/analysis-models.yaml\", root=\"./models/analysis-models\", download=True)\n",
"print(models)\n",
"\n",
"loaded_plain_model_rf = models.rf.model\n",
"loaded_plain_model_svc = models.svm.model\n",
"loaded_plain_model_lr = models.lr.model\n",
"loaded_plain_model_knn = models.knn.model\n",
"feature_names = ds.feature_names\n",
"\n",
"## Preprocess text\n",
"\n",
"x_train_imdb = ds.x_train\n",
"x_test_imdb = ds.x_test\n",
"x_val_imdb = ds.x_val\n",
"\n",
"# Binarize y - Positive is 1\n",
"y_train_imdb = ds.y_train\n",
"y_test_imdb = ds.y_test\n",
"y_val_imdb = ds.y_val"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "32fe8eca",
"metadata": {},
"outputs": [],
"source": [
"input_encoder = joblib.load(\"datasets/imdb/tfidf.pkl\")\n",
"loaded_vocab = input_encoder.vocabulary_"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f5b5b511",
"metadata": {
"papermill": {
"duration": 0.014519,
"end_time": "2023-05-23T22:23:57.584948",
"exception": false,
"start_time": "2023-05-23T22:23:57.570429",
"status": "completed"
},
"tags": []
},
"source": [
"## SEDC Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47081014",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:22.859676Z",
"iopub.status.busy": "2023-05-23T22:24:22.858557Z",
"iopub.status.idle": "2023-05-23T22:24:22.865677Z",
"shell.execute_reply": "2023-05-23T22:24:22.864477Z"
},
"papermill": {
"duration": 0.030868,
"end_time": "2023-05-23T22:24:22.870299",
"exception": false,
"start_time": "2023-05-23T22:24:22.839431",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"def classifier_fn_lr(x, negative_to_positive=0):\n",
" \"\"\"Returns the prediction probability of class 1 -> Not class 0\"\"\"\n",
" #print('loaded_plain_model_svc.decision_function(x) - ', loaded_plain_model_svc.decision_function(x))\n",
" prediction = loaded_plain_model_lr.predict_proba(x)\n",
" # If prediction is [1] retrurn the probability of class 1 else return probability of class 0\n",
" if (negative_to_positive == 1):\n",
" return prediction[:,0]\n",
" return prediction[:,1]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f737388",
"metadata": {},
"outputs": [],
"source": [
"# Do not need\n",
"# get the accuracy score of the model loaded_plain_model_lr\n",
"from sklearn.metrics import confusion_matrix\n",
"y_pred = loaded_plain_model_lr.predict(x_test_imdb)\n",
"accuracy = accuracy_score(y_test_imdb, y_pred)\n",
"print(f'Accuracy: {accuracy:.2f}')\n",
"\n",
"cm = confusion_matrix(y_test_imdb, y_pred)\n",
"\n",
"# Create a heatmap for the confusion matrix\n",
"plt.figure(figsize=(8, 6))\n",
"sns.heatmap(cm, annot=True, fmt=\"d\", cmap=\"Blues\", cbar=False)\n",
"plt.xlabel('Predicted')\n",
"plt.ylabel('Actual')\n",
"plt.title('Confusion Matrix')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f92714b2",
"metadata": {},
"outputs": [],
"source": [
"coefficients = loaded_plain_model_lr.coef_\n",
"coefficients = coefficients.reshape(-1)\n",
"coefficients[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c43e7a95",
"metadata": {},
"outputs": [],
"source": [
"def get_antonyms(word, model):\n",
" \"\"\"\" Get antonyms of a word and their indices in the feature vector\n",
" Args:\n",
" word: word to get antonyms for\n",
" model: trained model with feature_importances_\n",
"\n",
" Returns:\n",
" tuple of antonyms and their indices in the feature vector\n",
" \"\"\"\n",
" antonyms = []\n",
" antonyms_indices = []\n",
" feature_importance = []\n",
" temp_dict = {}\n",
" for syn in wordnet.synsets(word):\n",
" for i in syn.lemmas():\n",
" if i.antonyms():\n",
" antonyms.append(i.antonyms()[0].name())\n",
" # Remove duplicates in antonyms\n",
" antonyms = list(set(antonyms))\n",
"\n",
" for word in antonyms:\n",
" if word in loaded_vocab:\n",
" # antonyms_indices.append(ds.feature_names.tolist().index(word))\n",
" # feature_importance.append(\n",
" # abs(coefficients[loaded_vocab[word]]))\n",
" temp_dict[word] = abs(coefficients[loaded_vocab[word]])\n",
" # Sort the antonyms and their indices based on feature importance\n",
" # antonyms_indices = [x for _, x in sorted(\n",
" # zip(feature_importance, antonyms_indices), reverse=True)]\n",
" # antonyms = [x for _, x in sorted(\n",
" # zip(feature_importance, antonyms), reverse=True)]\n",
" # print(temp_dict)\n",
" \n",
" # return the key with the highest value\n",
" if len(temp_dict) > 0:\n",
" max_importance_idx = max(temp_dict, key=temp_dict.get)\n",
" return [loaded_vocab[max_importance_idx]]\n",
" else:\n",
" return []\n",
" \n",
" # print(antonyms)\n",
" # print(feature_importance)\n",
" # if len(feature_importance) > 0:\n",
" # max_importance_idx = np.argmax(feature_importance)\n",
" # return [antonyms_indices[max_importance_idx]]\n",
" # else:\n",
" # return []\n",
"\n",
" # if len(antonyms_indices) > 0:\n",
" # return [antonyms_indices[0]]\n",
" # else:\n",
" # return []"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f51db9b1",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:14.202042Z",
"iopub.status.busy": "2023-05-23T22:24:14.201564Z",
"iopub.status.idle": "2023-05-23T22:24:14.231018Z",
"shell.execute_reply": "2023-05-23T22:24:14.229850Z"
},
"papermill": {
"duration": 0.050135,
"end_time": "2023-05-23T22:24:14.234197",
"exception": false,
"start_time": "2023-05-23T22:24:14.184062",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"def perturb_fn(x,inst, print_flag=0):\n",
" \"\"\" Function to perturb instance x -> Deform the array -> assign 0 to the x-th column \"\"\"\n",
" \"\"\"\n",
" Returns perturbed instance inst\n",
" \"\"\"\n",
" inst[:,x]=0\n",
" return inst\n",
"\n",
"def replace_fn(x,y,inst, print_flag=0):\n",
" \"\"\" Function to perturb instance x -> Deform the array -> assign 0 to the x-th column \"\"\"\n",
" \"\"\"\n",
" Returns perturbed instance inst\n",
" \"\"\"\n",
" new_inst = inst.copy()\n",
" try:\n",
" temp_x = inst[:,x]\n",
" temp_y = inst[:,y]\n",
" new_inst[:,x] = temp_y\n",
" new_inst[:,y] = temp_x\n",
" except:\n",
" new_inst[:,x]=0\n",
" return new_inst\n",
"\n",
"def conditional_replace_fn(x,y,inst, print_flag=0):\n",
" for i in range(len(x)):\n",
" if isinstance(y[i], str):\n",
" inst[:,x[i]] = 0\n",
" else:\n",
" temp_x = inst[:,x[i]]\n",
" temp_y = inst[:,y[i]]\n",
" inst[:,x[i]] = temp_y\n",
" inst[:,y[i]] = temp_x\n",
" return inst\n",
"\n",
"\n",
"\n",
"def print_instance(pert_inst, ref_inst, feature_names):\n",
" \"\"\" Function to print the perturbed instance \"\"\"\n",
" \"\"\"\n",
" Returns perturbed instance inst\n",
" \"\"\"\n",
" indices_active_elements_ref = np.nonzero(ref_inst)[1]\n",
" indices_active_elements_pert = np.nonzero(pert_inst)[1]\n",
" ref_set = set(indices_active_elements_ref)\n",
" pert_set = set(indices_active_elements_pert)\n",
" # elements in ref_set but not in pert_set\n",
" removed_word_indices = ref_set - pert_set\n",
" # elements in pert_set but not in ref_set\n",
" added_word_indices = pert_set - ref_set\n",
" printable_array = []\n",
" for item in indices_active_elements_ref:\n",
" printable_array.append(\"..\" + feature_names[item] + \"..\")\n",
" # Change formatting of removed words\n",
" for item in removed_word_indices:\n",
" printable_array[printable_array.index(\"..\" + feature_names[item] + \"..\")] = \"--\" + feature_names[item] + \"--\"\n",
" # change formatting of added words\n",
" for item in added_word_indices:\n",
" printable_array.append(\"++\" + feature_names[item] + \"++\")\n",
" printable_array.append(classifier_fn_lr(pert_inst))\n",
" print(printable_array)\n",
" return printable_array\n",
"\n",
"def print_ref_instance(ref_inst, feaaure_names):\n",
" printable_array = []\n",
" indices_active_elements = np.nonzero(ref_inst)[1]\n",
" for item in indices_active_elements:\n",
" printable_array.append(\"..\" + feature_names[item] + \"..\")\n",
" print(printable_array)\n",
"\n",
"\"\"\"\n",
"Input:\n",
" - comb: \"best-first\" (combination of) feature(s) that is expanded\n",
" (e.g., comb_to_expand)\n",
" - expanded_combis: list of combinations of features that are already \n",
" expanded as \"best-first\"\n",
" - feature_set: indices of the active features of the instance \n",
" - candidates_to_expand: combinations of features that are candidates to be \n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - explanations_sets: counterfactual explanations already found\n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - instance: instance to be explained\n",
" - cf: classifier prediction probability function\n",
" or decision function. For ScikitClassifiers, this is classifier.predict_proba \n",
" or classifier.decision_function or classifier.predict_log_proba.\n",
" Make sure the function only returns one (float) value. For instance, if you\n",
" use a ScikitClassifier, transform the classifier.predict_proba as follows:\n",
"\n",
" def classifier_fn(X):\n",
" c=classification_model.predict_proba(X)\n",
" y_predicted_proba=c[:,1]\n",
" return y_predicted_proba\n",
"\n",
"Returns:\n",
" - explanation_candidates: combinations of features that are explanation\n",
" candidates to be checked in the next iteration\n",
" - candidates_to_expand: combinations of features that are candidates to be \n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - expanded_combis: [list] list of combinations of features that are already \n",
" expanded as \"best-first\" \n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - scores_explanation_candidates: scores after perturbation of explanation candidates\n",
"\"\"\"\n",
"\n",
"def expand_and_prune(comb, replacement_comb_to_expand, expanded_combis, feature_set, candidates_to_expand, candidates_to_expand_replacements, explanations_sets, explanation_replacement_sets, scores_candidates_to_expand, instance, cf, revert=0, replacements=[]):\n",
" \"\"\" Function to expand \"best-first\" feature combination and prune explanation_candidates and candidates_to_expand \"\"\" \n",
" \n",
" comb = OrderedSet(comb)\n",
" replacement_comb_to_expand = OrderedSet(replacement_comb_to_expand)\n",
" print(\"comb: \", comb)\n",
" print(\"replacement_comb_to_expand: \", replacement_comb_to_expand)\n",
" expanded_combis.append(comb)\n",
" \n",
" old_candidates_to_expand = [frozenset(x) for x in candidates_to_expand]\n",
" old_candidates_to_expand = set(old_candidates_to_expand)\n",
" print(\"feature_set: \", feature_set)\n",
" feature_set_new = []\n",
" feature_set_new_replacements = []\n",
" ## If the feature is not in the current combination -> add it to a new list\n",
" for feature in feature_set:\n",
" list_feature = list(feature)\n",
" if (len(comb & feature) == 0): #set operation: intersection\n",
" replacement_feature = get_antonyms(feature_names[list_feature[0]], loaded_plain_model_rf)\n",
" replacement_feature = frozenset(replacement_feature)\n",
" if replacement_feature == frozenset():\n",
" new_string = \"0\"*(len(comb)+1)\n",
" replacement_feature = frozenset([new_string])\n",
" #print(\"replacement_feature: \", replacement_feature, \"feature: \", feature)\n",
" feature_set_new.append(feature) # If the feature is not in the current combination to remove from the instance\n",
" feature_set_new_replacements.append(replacement_feature)\n",
"\n",
" print(\"feature_set_new: \", feature_set_new)\n",
" print(\"feature_set_new_replacements: \", feature_set_new_replacements)\n",
" # Add each element in the new set -> which were initially not present -> to the accepted combination -> create new combinations -> (EXPANSION)\n",
" new_explanation_candidates = []\n",
" new_explanation_candidates_replacements = []\n",
" # for element in feature_set_new:\n",
" # union = (comb|element) #set operation: union\n",
" # union_replacements = (replacement_comb_to_expand|feature_set_new_replacements[i])\n",
" # new_explanation_candidates.append(union) # Create new combinations to remove from the instance\n",
" # new_explanation_candidates_replacements.append(union_replacements)\n",
"\n",
" for i in range(len(feature_set_new)):\n",
" union = (comb|feature_set_new[i])\n",
" union_replacements = (replacement_comb_to_expand|feature_set_new_replacements[i])\n",
" new_explanation_candidates.append(union) # Create new combinations to remove from the instance\n",
" new_explanation_candidates_replacements.append(union_replacements)\n",
" \n",
" print(\"new_explanation_candidates: \", new_explanation_candidates)\n",
" print(\"new_explanation_candidates_replacements: \", new_explanation_candidates_replacements)\n",
" \n",
" #Add new explanation candidates to the list of candidates to expand\n",
" candidates_to_expand_notpruned = candidates_to_expand.copy()\n",
" candidates_to_expand_replacements_notpruned = candidates_to_expand_replacements.copy()\n",
" # for new_candidate in new_explanation_candidates:\n",
" # candidates_to_expand_notpruned.append(new_candidate)\n",
" for i in range(len(new_explanation_candidates_replacements)):\n",
" candidates_to_expand_notpruned.append(new_explanation_candidates[i])\n",
" candidates_to_expand_replacements_notpruned.append(new_explanation_candidates_replacements[i])\n",
"\n",
" print(\"candidates_to_expand_notpruned: \", candidates_to_expand_notpruned)\n",
" print(\"candidates_to_expand_replacements_notpruned: \", candidates_to_expand_replacements_notpruned)\n",
"\n",
" # Calculate scores of new combinations and add to scores_candidates_to_expand\n",
" # perturb each new candidate and get the score for each.\n",
" #perturbed_instances = [perturb_fn(x, inst=instance.copy(), print_flag=1) for x in new_explanation_candidates]\n",
" replaced_instances = []\n",
" for i in range(len(new_explanation_candidates)):\n",
" # #print(\"i: \", i, \"new_explanation_candidates[i]: \", new_explanation_candidates[i], \"new_explanation_candidates_replacements[i]: \", new_explanation_candidates_replacements[i])\n",
" # if isinstance(new_explanation_candidates_replacements[i][0], int):\n",
" # replaced_instances.append(perturb_fn(x=new_explanation_candidates[i], inst=instance.copy()))\n",
" # else:\n",
" replaced_instances.append(conditional_replace_fn(x=new_explanation_candidates[i], y=new_explanation_candidates_replacements[i], inst=instance.copy(), print_flag=1))\n",
" # -------------------------------------------\n",
" print(\"len(perturbed_instances): \", len(replaced_instances))\n",
" perturbed_instances = replaced_instances\n",
"\n",
" # word_sets = []\n",
" # replacement_word_sets = []\n",
" # for item in new_explanation_candidates:\n",
" # item_word = []\n",
" # word_replacement = []\n",
" # for index in item:\n",
" # item_word.append(feature_names[index])\n",
" # word_replacement.append(get_antonyms(feature_names[index], loaded_plain_model_rf))\n",
" # word_sets.append(item_word)\n",
" # replacement_word_sets.append(word_replacement)\n",
" # print(word_sets)\n",
" # print(replacement_word_sets)\n",
" # #replacements = np.array(replacement_word_sets).reshape(len(replacement_word_sets), 1)\n",
" # replacements = []\n",
" # # for features in replacement_word_sets:\n",
" # # replacements.append(OrderedSet(features))\n",
" # # replacements = [frozenset(x) for x in replacements]\n",
" \n",
" # replaced_instances = []\n",
" #perturbed_instances = [perturb_fn(x, inst=instance.copy()) for x in new_explanation_candidates]\n",
" print(\"Expanded sentences from the above chosen combination\")\n",
" for item in perturbed_instances:\n",
" print_instance(item, instance.copy(), feature_names)\n",
" scores_perturbed_new = [cf(x, revert) for x in perturbed_instances]\n",
" ## Append the newly created score array to the passes existing array\n",
" scores_candidates_to_expand_notpruned = scores_candidates_to_expand + scores_perturbed_new\n",
" # create a dictionary of scores dictionary where the \n",
" # keys are string representations of the candidates from candidates_to_expand_notpruned, and the \n",
" # values are the corresponding scores from scores_candidates_to_expand_notpruned\n",
" dictionary_scores = dict(zip([str(x) for x in candidates_to_expand_notpruned], scores_candidates_to_expand_notpruned))\n",
" \n",
" # *** Pruning step: remove all candidates to expand that have an explanation as subset ***\n",
" candidates_to_expand_pruned_explanations = []\n",
" candidates_to_expand_pruned_replacements_explanations = []\n",
" # take one combination from candidates\n",
" # Rewritten using list indices\n",
" # for combi in candidates_to_expand_notpruned:\n",
" # pruning=0\n",
" # for explanation in explanations_sets: # if an explanation is present as a subser in combi, does not add it to the to be expanded list -> because solution with a smaller size exists\n",
" # if ((explanation.issubset(combi)) or (explanation==combi)):\n",
" # pruning = pruning + 1\n",
" # if (pruning == 0): # If it is not a superset of a present explanation -> add it to the list\n",
" # candidates_to_expand_pruned_explanations.append(combi)\n",
" # Write the above function using list indices\n",
" for i in range(len(candidates_to_expand_notpruned)):\n",
" pruning=0\n",
" for explanation in explanations_sets:\n",
" if ((explanation.issubset(candidates_to_expand_notpruned[i])) or (explanation==candidates_to_expand_notpruned[i])):\n",
" pruning = pruning + 1\n",
" if (pruning == 0):\n",
" candidates_to_expand_pruned_explanations.append(candidates_to_expand_notpruned[i])\n",
" candidates_to_expand_pruned_replacements_explanations.append(candidates_to_expand_replacements_notpruned[i])\n",
"\n",
" # Each element is frozen as a set\n",
" candidates_to_expand_pruned_explanations_frozen = [frozenset(x) for x in candidates_to_expand_pruned_explanations]\n",
" candidates_to_expand_pruned_replacements_explanations_frozen = [frozenset(x) for x in candidates_to_expand_pruned_replacements_explanations]\n",
"\n",
" # But the total set f frozen sets are not frozen\n",
" candidates_to_expand_pruned_explanations_ = set(candidates_to_expand_pruned_explanations_frozen)\n",
" candidates_to_expand_pruned_replacements_explanations_ = set(candidates_to_expand_pruned_replacements_explanations_frozen)\n",
"\n",
" expanded_combis_frozen = [frozenset(x) for x in expanded_combis]\n",
" expanded_combis_ = set(expanded_combis_frozen)\n",
"\n",
" # *** Pruning step: remove all candidates to expand that are in expanded_combis *** -> Same as above\n",
" candidates_to_expand_pruned = (candidates_to_expand_pruned_explanations_ - expanded_combis_)\n",
" candidates_to_expand_pruned_replacements = (candidates_to_expand_pruned_replacements_explanations_ - expanded_combis_) \n",
" ind_dict = dict((k,i) for i,k in enumerate(candidates_to_expand_pruned_explanations_frozen))\n",
" indices = [ind_dict[x] for x in candidates_to_expand_pruned]\n",
" candidates_to_expand = [candidates_to_expand_pruned_explanations[i] for i in indices]\n",
" candidates_to_expand_replacements = [candidates_to_expand_pruned_replacements_explanations[i] for i in indices]\n",
"\n",
" #The new explanation candidates are the ones that are NOT in the old list of candidates to expand\n",
" new_explanation_candidates_pruned = (candidates_to_expand_pruned - old_candidates_to_expand) \n",
" candidates_to_expand_frozen = [frozenset(x) for x in candidates_to_expand]\n",
" candidates_to_expand_replacements_frozen = [frozenset(x) for x in candidates_to_expand_replacements]\n",
"\n",
" ind_dict2 = dict((k,i) for i,k in enumerate(candidates_to_expand_frozen))\n",
" indices2 = [ind_dict2[x] for x in new_explanation_candidates_pruned]\n",
" explanation_candidates = [candidates_to_expand[i] for i in indices2]\n",
" explanation_candidates_replacements = [candidates_to_expand_replacements[i] for i in indices2]\n",
"\n",
" # Get scores of the new candidates and explanations.\n",
" scores_candidates_to_expand = [dictionary_scores[x] for x in [str(c) for c in candidates_to_expand]]\n",
" scores_explanation_candidates = [dictionary_scores[x] for x in [str(c) for c in explanation_candidates]]\n",
" \n",
" return (explanation_candidates, explanation_candidates_replacements, candidates_to_expand, candidates_to_expand_replacements, expanded_combis, scores_candidates_to_expand, scores_explanation_candidates)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fdd1eefe",
"metadata": {},
"outputs": [],
"source": [
"class SEDC_Explainer(object):\n",
" \"\"\"Class for generating evidence counterfactuals for classifiers on behavioral/text data\"\"\"\n",
"\n",
" def __init__(\n",
" self,\n",
" feature_names,\n",
" classifier_fn,\n",
" threshold_classifier,\n",
" max_iter=100,\n",
" max_explained=1,\n",
" BB=True,\n",
" max_features=30,\n",
" time_maximum=120,\n",
" revert=0,\n",
" ):\n",
" \"\"\"Init function\n",
"\n",
" Args:\n",
" classifier_fn: [function] classifier prediction probability function\n",
" or decision function. For ScikitClassifiers, this is classifier.predict_proba\n",
" or classifier.decision_function or classifier.predict_log_proba.\n",
" Make sure the function only returns one (float) value. For instance, if you\n",
" use a ScikitClassifier, transform the classifier.predict_proba as follows:\n",
"\n",
" def classifier_fn(X):\n",
" c=classification_model.predict_proba(X)\n",
" y_predicted_proba=c[:,1]\n",
" return y_predicted_proba\n",
"\n",
" threshold_classifier: [float] the threshold that is used for classifying\n",
" instances as positive or not. When score or probability exceeds the\n",
" threshold value, then the instance is predicted as positive.\n",
" We have no default value, because it is important the user decides\n",
" a good value for the threshold.\n",
"\n",
" feature_names: [numpy.array] contains the interpretable feature names,\n",
" such as the words themselves in case of document classification or the names\n",
" of visited URLs.\n",
"\n",
" max_iter: [int] maximum number of iterations in the search procedure.\n",
" Default is set to 50.\n",
"\n",
" max_explained: [int] maximum number of EDC explanations generated.\n",
" Default is set to 1.\n",
"\n",
" BB: [“True” or “False”] when the algorithm is augmented with\n",
" branch-and-bound (BB=True), one is only interested in the (set of)\n",
" shortest explanation(s). Default is \"True\".\n",
"\n",
" max_features: [int] maximum number of features allowed in the explanation(s).\n",
" Default is set to 30.\n",
"\n",
" time_maximum: [int] maximum time allowed to generate explanations,\n",
" expressed in minutes. Default is set to 2 minutes (120 seconds).\n",
" \"\"\"\n",
"\n",
" self.feature_names = feature_names\n",
" self.classifier_fn = classifier_fn\n",
" self.threshold_classifier = threshold_classifier\n",
" self.max_iter = max_iter\n",
" self.max_explained = max_explained\n",
" self.BB = BB\n",
" self.max_features = max_features\n",
" self.time_maximum = time_maximum\n",
" self.revert = None\n",
" self.initial_class = None\n",
" self._report_data = {}\n",
"\n",
" def explanation(self, instance):\n",
" \"\"\"Generates evidence counterfactual explanation for the instance.\n",
" ONLY IF THE CURRENT INSTANCE IS POSITIVE -> Limitation\n",
"\n",
" Args:\n",
" instance: [numpy.array or sparse matrix] instance to explain\n",
"\n",
" Returns:\n",
" A dictionary where:\n",
"\n",
" explanation_set: explanation(s) ranked from high to low change\n",
" in predicted score or probability.\n",
" The number of explanations shown depends on the argument max_explained.\n",
"\n",
" number_active_elements: number of active elements of\n",
" the instance of interest.\n",
"\n",
" number_explanations: number of explanations found by algorithm.\n",
"\n",
" minimum_size_explanation: number of features in the smallest explanation.\n",
"\n",
" time_elapsed: number of seconds passed to generate explanation(s).\n",
"\n",
" explanations_score_change: change in predicted score/probability\n",
" when removing the features in the explanation, ranked from\n",
" high to low change.\n",
" \"\"\"\n",
"\n",
" # *** INITIALIZATION ***\n",
" print(\"Start initialization...\")\n",
" tic = time.time()\n",
" instance = lil_matrix(instance)\n",
" print(\"initial sentence is ... \")\n",
" print(instance.get_shape())\n",
" print_ref_instance(instance, self.feature_names)\n",
" iteration = 0\n",
" nb_explanations = 0\n",
" minimum_size_explanation = np.nan\n",
" explanations = []\n",
" explanations_replacements = []\n",
" explanations_sets = []\n",
" explanation_replacement_sets = []\n",
" explanations_score_change = []\n",
" expanded_combis = []\n",
" score_predicted = self.classifier_fn(instance) ## Returns Prediction Prob\n",
" # Intial class is 1 is score is greater than threshold\n",
" if score_predicted > self.threshold_classifier:\n",
" self.initial_class = [1]\n",
" else:\n",
" self.initial_class = [0]\n",
" self.revert = 1\n",
" print(\n",
" \"score_predicted \",\n",
" score_predicted,\n",
" \" initial_class \",\n",
" self.initial_class,\n",
" )\n",
"\n",
" reference = np.reshape(\n",
" np.zeros(np.shape(instance)[1]), (1, len(np.zeros(np.shape(instance)[1])))\n",
" )\n",
" reference = sparse.csr_matrix(reference)\n",
"\n",
" # explainer = shap.KernelExplainer(self.classifier_fn, reference, link=\"identity\")\n",
" # shapVals = explainer.shap_values(instance, nsamples=5000, l1_reg=\"aic\")\n",
"\n",
" # features = []\n",
" # for ind in range(len(shapVals[0])):\n",
" # if shapVals[0, ind] != 0:\n",
" # features.append({\"feature\": ind, \"shapValue\": shapVals[0, ind]})\n",
" # sorted_data_in = sorted(features, key=lambda x: x[\"shapValue\"], reverse=True)\n",
" # inverse_sorted_data_in = sorted(features, key=lambda x: x[\"shapValue\"])\n",
"\n",
" # if self.revert == 1:\n",
" # sorted_data_in = inverse_sorted_data_in\n",
"\n",
" indices_active_elements = np.nonzero(instance)[\n",
" 1\n",
" ] ## -> Gets non zero elements in the instance as an array [x, y, z]\n",
" # sorted_indices = sorted(\n",
" # indices_active_elements, key=lambda x: shapVals[0, x], reverse=True\n",
" # )\n",
" # indices_active_elements = np.array(sorted_indices)\n",
" number_active_elements = len(indices_active_elements)\n",
" indices_active_elements = indices_active_elements.reshape(\n",
" (number_active_elements, 1)\n",
" ) ## -> Reshape to get a predictable\n",
"\n",
" candidates_to_expand = ([]) # -> These combinations are further expanded -> These are the elements to be removed from the sentence\n",
" for features in indices_active_elements:\n",
" candidates_to_expand.append(OrderedSet(features))\n",
" ## > Gets an array with each element in reshaped incides as an ordered set -> [OrderedSet([430]), OrderedSet([588]), OrderedSet([595])]\n",
" candidates_to_expand_replacements = ([])\n",
" for features in indices_active_elements:\n",
" candidates_to_expand_replacements.append(OrderedSet(get_antonyms(self.feature_names[features[0]], loaded_plain_model_rf)))\n",
" for i in range(len(candidates_to_expand_replacements)):\n",
" if candidates_to_expand_replacements[i] == OrderedSet():\n",
" candidates_to_expand_replacements[i] = OrderedSet([\"0\"])\n",
" explanation_candidates = candidates_to_expand.copy()\n",
" explanation_candidates_replacements = candidates_to_expand_replacements.copy()\n",
" print(\"explanation_candidates /n\", explanation_candidates)\n",
" print(\"explanation_candidates_replacements /n\", explanation_candidates_replacements)\n",
" ## Gets a copy of the above array -> Initially\n",
"\n",
" feature_set = [\n",
" frozenset(x) for x in indices_active_elements\n",
" ] ## Immutable -> can be used as keys in dictionary\n",
" ## Used features in the current x-reference -> incides of the words in the review.\n",
"\n",
" print(\"Initialization is complete.\")\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (iteration < self.max_iter)\n",
" and (nb_explanations < self.max_explained)\n",
" and (len(candidates_to_expand) != 0)\n",
" and (len(explanation_candidates) != 0)\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" ## Stop if maximum iterations exceeded\n",
" # number of explanations generated is greater than the maximum explanations\n",
" # There are no candidates to expand\n",
" # There are no explanation candidates -> Used to force stop while loop below\n",
" # Or maximum allowed time exceeded\n",
" iteration += 1\n",
" print(\"\\n Iteration %d \\n\" % iteration)\n",
"\n",
" if iteration == 1:\n",
" print(\"Run in first iteration -> perturbation done \\n\")\n",
" # Print the word in each index in the explanation candidates\n",
" # for item in explanation_candidates:\n",
" # print([self.feature_names[x] for x in item])\n",
" replacements = [\n",
" get_antonyms(\n",
" self.feature_names[x[0]], loaded_plain_model_rf\n",
" )\n",
" for x in explanation_candidates\n",
" ]\n",
" # convert each element in replacement to a OrderedSet\n",
" replacements = explanation_candidates_replacements\n",
" print(\"replacements \\n\", replacements, \"\\n\")\n",
" print(\"explanation_candidates \\n\", explanation_candidates, \"\\n\")\n",
" perturbed_instances = []\n",
" print(\"After changing or removing words, \")\n",
" replaced_instances = []\n",
" for i in range(len(explanation_candidates)):\n",
" if replacements[i] == OrderedSet([\"0\"]):\n",
" replaced_instances.append(\n",
" perturb_fn(\n",
" x=explanation_candidates[i], inst=instance.copy()\n",
" )\n",
" )\n",
" else:\n",
" replaced_instances.append(\n",
" replace_fn(\n",
" x=explanation_candidates[i],\n",
" y=replacements[i],\n",
" inst=instance.copy(),\n",
" )\n",
" )\n",
" # Remove the elements in the indices given by the ordered set x and return an array fo such elements\n",
" # Removes only one element in the first run -> Contains sentences with one word removed\n",
" perturbed_instances = replaced_instances\n",
" for instance_p in perturbed_instances:\n",
" print_instance(instance_p, instance, self.feature_names)\n",
" scores_explanation_candidates = [\n",
" self.classifier_fn(x, self.revert) for x in perturbed_instances\n",
" ]\n",
" # Get predictions for each perturbed instance where one or more elements are removed from the initial instance\n",
" # It is in form of [[x], [y], [z]]\n",
" print(\n",
" \"scores_explanation_candidates \\n\",\n",
" scores_explanation_candidates,\n",
" \"\\n\",\n",
" )\n",
" scores_candidates_to_expand = scores_explanation_candidates.copy()\n",
"\n",
" scores_perturbed_new_combinations = [\n",
" x[0] for x in scores_explanation_candidates\n",
" ]\n",
" # Therefore get it to the shape [x, y, z] by getting the [0] th element of each element array\n",
" print(\n",
" \"scores_perturbed_new_combinations \", scores_perturbed_new_combinations\n",
" )\n",
"\n",
" # ***CHECK IF THERE ARE EXPLANATIONS***\n",
" new_explanations = list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" new_explanation_replacements = list(\n",
" compress(\n",
" explanation_candidates_replacements,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # Get explanation candidates where their probability is less than the threshold classifier -> Positive becomes negative\n",
" # print(\"New Explanations \\n\", new_explanations)\n",
" explanations += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" explanations_replacements += list(\n",
" compress(\n",
" explanation_candidates_replacements,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # print(\"\\n explanations, explanations_score_change\", explanations)\n",
" nb_explanations += len(\n",
" list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" ) # Update number of explanations which pass the required threshold\n",
" explanations_sets += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" explanation_replacement_sets += list(\n",
" compress(\n",
" explanation_candidates_replacements,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" explanations_sets = [\n",
" set(x) for x in explanations_sets\n",
" ] # Convert each array to a set -> to get the words\n",
" explanation_replacement_sets = [\n",
" set(x) for x in explanation_replacement_sets\n",
" ]\n",
" explanations_score_change += list(\n",
" compress(\n",
" scores_explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" print('explanations_score_change', explanations_score_change)\n",
"\n",
" # Adjust max_length\n",
" if self.BB == True:\n",
" if len(explanations) != 0:\n",
" lengths = [] # Record length of each explanation found\n",
" for explanation in explanations:\n",
" lengths.append(len(explanation))\n",
" lengths = np.array(lengths)\n",
" max_length = lengths.min()\n",
" # Get minimum length of the found explanations as max length -> Do not search for explanations with longer length\n",
" else:\n",
" max_length = number_active_elements # Else can find maximum length equal to number of words in instance\n",
" else:\n",
" max_length = number_active_elements\n",
" print(\"\\n-------------Max length updated to - \", max_length)\n",
"\n",
" # Eliminate combinations from candidates_to_expand (\"best-first\" candidates) that can not be expanded\n",
" # Pruning based on Branch & Bound=True, max. features allowed and number of active features\n",
" candidates_to_expand_updated = []\n",
" candidates_to_expand_updated_replacements = []\n",
" scores_candidates_to_expand_updated = ([]) # enumerate -> Find count of || to list one after another\n",
" for j, combination in enumerate(candidates_to_expand):\n",
" if (\n",
" (len(combination) < number_active_elements)\n",
" and (len(combination) < max_length)\n",
" and (len(combination) < self.max_features)\n",
" ):\n",
" # Combination length should be less than the words in the input and max length of the required explanation and required maximum features\n",
" candidates_to_expand_updated.append(\n",
" combination\n",
" ) # If the combination matches, it is further expanded\n",
" scores_candidates_to_expand_updated.append(\n",
" scores_candidates_to_expand[j]\n",
" )\n",
" # Add the prediction score to the new array\n",
" # get the score from the scores_candidates_to_expand using the current index\n",
" candidates_to_expand_updated_replacements.append(\n",
" candidates_to_expand_replacements[j]\n",
" )\n",
" # Add the replacement to the new array\n",
"\n",
" print(\n",
" \"\\nlen(candidates_to_expand_updated)\",\n",
" len(candidates_to_expand_updated),\n",
" \" 0 \",\n",
" )\n",
" print(\n",
" \"\\nnb_explanations\",\n",
" nb_explanations,\n",
" \" >= self.max_explained \",\n",
" self.max_explained,\n",
" )\n",
"\n",
" # *** IF LOOP ***\n",
" # expanding the candidates to update will exceed the max length set in the earlier loop\n",
" if (len(candidates_to_expand_updated) == 0) or (\n",
" nb_explanations >= self.max_explained\n",
" ):\n",
" ## If the number of explanations exceeded the required number\n",
" ## or no candidates\n",
" ## no explanations present\n",
"\n",
" print(\"nb_explanations Stop iterations...\")\n",
" explanation_candidates = [] # stop algorithm\n",
" ## Found all the candidates\n",
" print(\n",
" \"scores_candidates_to_expand_updated \",\n",
" scores_candidates_to_expand_updated,\n",
" )\n",
" # print(\"candidates_to_expand_updated \", candidates_to_expand_updated)\n",
"\n",
" elif len(candidates_to_expand_updated) != 0:\n",
" ## If there are possible candidates\n",
"\n",
" explanation_candidates = []\n",
" it = 0 # Iteration of the while loop\n",
" indices = []\n",
"\n",
" scores_candidates_to_expand2 = []\n",
" for score in scores_candidates_to_expand_updated:\n",
" if score[0] < self.threshold_classifier:\n",
" scores_candidates_to_expand2.append(2 * score_predicted)\n",
" else:\n",
" scores_candidates_to_expand2.append(score)\n",
" # update candidate scores if they have score less than threshold -> To expand them further\n",
" # shap_candidates_to_expand2 = []\n",
" # for candidate in candidates_to_expand_updated:\n",
" # shapValues = 0\n",
" # for word in candidate:\n",
" # # find word in feature column in sorted_data\n",
" # for ind in range(len(sorted_data_in)):\n",
" # if sorted_data_in[ind][\"feature\"] == word:\n",
" # shapValues += sorted_data_in[ind][\"shapValue\"]\n",
" # break\n",
" # shap_candidates_to_expand2.append(shapValues)\n",
"\n",
" print(\n",
" \"\\n scores_candidates_to_expand2 before loop\",\n",
" scores_candidates_to_expand2,\n",
" )\n",
" print(len(explanation_candidates), it, \"<\", len(scores_candidates_to_expand2))\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (len(explanation_candidates) == 0)\n",
" and (it < len(scores_candidates_to_expand2))\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" # Stop if candidates are found or looped through more than there are candidates or maximum time reached\n",
"\n",
" print(\"While loop iteration %d\" % it)\n",
"\n",
" if it != 0: # Because indices are not there in the first iteration\n",
" for index in indices:\n",
" scores_candidates_to_expand2[index] = 2 * score_predicted\n",
"\n",
" # print(\n",
" # \"\\n scores_candidates_to_expand2 after loop\",\n",
" # scores_candidates_to_expand2,\n",
" # )\n",
" # print(\"\\n indices\", indices)\n",
"\n",
" # do elementwise subtraction between score_predicted and scores_candidates_to_expand2\n",
" subtractionList = []\n",
" for item in scores_candidates_to_expand2:\n",
" subtractionList.append(item - score_predicted)\n",
" # for x, y in zip(score_predicted, scores_candidates_to_expand2):\n",
" # subtractionList.append(x - y)\n",
" #print(\"subtractionList\", subtractionList)\n",
"\n",
" # Do element wise subtraction between the prediction score of the x_ref and every element of the scores_candidates_to_expand2\n",
" index_combi_max = np.argmax(subtractionList)\n",
" # index_shap_max = np.argmax(shap_candidates_to_expand2)\n",
" # index_shap_min = np.argmin(shap_candidates_to_expand2)\n",
" if self.revert == 0:\n",
" index_combi_max = np.argmax(subtractionList)\n",
" else:\n",
" index_combi_max = np.argmin(subtractionList)\n",
" # if self.revert == 0:\n",
" # index_combi_max = index_shap_max\n",
" # else:\n",
" # index_combi_max = index_shap_min\n",
" # print(\n",
" # \"subtrac max \",\n",
" # index_combi_max,\n",
" # \" index_shap_max \",\n",
" # index_shap_max,\n",
" # )\n",
" # Get the index of the maximum value -> Expand it\n",
" print(\n",
" \"\\n index_combi_max\",\n",
" candidates_to_expand_updated[np.argmax(subtractionList)],\n",
" )\n",
" indices.append(index_combi_max)\n",
" expanded_combis.append(\n",
" candidates_to_expand_updated[index_combi_max]\n",
" )\n",
" # Add this combination to already expanded combinations as it will be expanded next by expand and prune function\n",
"\n",
" comb_to_expand = candidates_to_expand_updated[index_combi_max]\n",
" replacement_comb_to_expand = candidates_to_expand_updated_replacements[index_combi_max]\n",
" words_comb_selected = []\n",
" for item in candidates_to_expand_updated[index_combi_max]:\n",
" words_comb_selected.append(feature_names[item])\n",
" print(\"The chosen combination is \", words_comb_selected)\n",
" print_instance(conditional_replace_fn(candidates_to_expand_updated[index_combi_max], candidates_to_expand_updated_replacements[index_combi_max], instance.copy()), instance, self.feature_names)\n",
" print(\"It has a score of \", scores_candidates_to_expand_updated[index_combi_max])\n",
" # Expand the found combination with highest difference\n",
" print(\"comb_to_expand\", comb_to_expand)\n",
" print(\"replacement_comb_to_expand\", replacement_comb_to_expand)\n",
" func = expand_and_prune(\n",
" comb_to_expand,\n",
" replacement_comb_to_expand,\n",
" expanded_combis,\n",
" feature_set,\n",
" candidates_to_expand_updated,\n",
" candidates_to_expand_updated_replacements,\n",
" explanations_sets,\n",
" explanation_replacement_sets,\n",
" scores_candidates_to_expand_updated,\n",
" instance,\n",
" self.classifier_fn,\n",
" self.revert,\n",
" replacements,\n",
" )\n",
" \"\"\"Returns:\n",
" - explanation_candidates: combinations of features that are explanation\n",
" candidates to be checked in the next iteration\n",
" - candidates_to_expand: combinations of features that are candidates to\n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - expanded_combis: [list] list of combinations of features that are already\n",
" expanded as \"best-first\"\n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - scores_explanation_candidates: scores after perturbation of explanation candidates\"\"\"\n",
" explanation_candidates = func[0]\n",
" explanation_candidates_replacements = func[1]\n",
" candidates_to_expand = func[2]\n",
" candidates_to_expand_replacements = func[3]\n",
" expanded_combis = func[4]\n",
" scores_candidates_to_expand = func[5]\n",
" scores_explanation_candidates = func[6]\n",
"\n",
" it += 1\n",
"\n",
" print(\n",
" \"\\n\\n\\niteration - \", iteration, \" self.max_iter - \", self.max_iter\n",
" )\n",
" print(\n",
" \"\\n\\nlen(candidates_to_expand) - \",\n",
" len(candidates_to_expand),\n",
" \" != 0 \",\n",
" )\n",
" print(\n",
" \"\\n\\nlen(explanation_candidates) - \",\n",
" len(explanation_candidates),\n",
" \" !=0 \",\n",
" )\n",
" print(\n",
" \"\\n\\n(time.time() - tic) - \",\n",
" (time.time() - tic),\n",
" \" self.time_maximum - \",\n",
" self.time_maximum,\n",
" )\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** FINAL PART OF ALGORITHM ***\n",
" print(\"Iterations are done.\")\n",
"\n",
" explanation_set = []\n",
" explanation_feature_names = []\n",
" index_of_min_length_explanation = -1\n",
" for i in range(len(explanations)):\n",
" explanation_feature_names = []\n",
" for features in explanations[i]:\n",
" explanation_feature_names.append(self.feature_names[features])\n",
" explanation_set.append(explanation_feature_names)\n",
"\n",
" if len(explanations) != 0:\n",
" lengths_explanation = []\n",
" for explanation in explanations:\n",
" l = len(explanation)\n",
" lengths_explanation.append(l)\n",
" minimum_size_explanation = np.min(lengths_explanation)\n",
" index_of_min_length_explanation = np.argmin(lengths_explanation)\n",
" try:\n",
" print(\"argmin\", explanations[index_of_min_length_explanation])\n",
" except:\n",
" pass\n",
"\n",
" print(\"Final sentence\")\n",
" final_sentence = conditional_replace_fn(explanations[index_of_min_length_explanation], explanations_replacements[index_of_min_length_explanation], instance.copy())\n",
" final_sentence = print_instance(final_sentence, instance, self.feature_names)\n",
" new_instance = instance.copy()\n",
" new_replacements = []\n",
" replacement_features = []\n",
" for feature in explanations[index_of_min_length_explanation]:\n",
" feature_replacement = get_antonyms(\n",
" feature_names[feature], loaded_plain_model_rf\n",
" )\n",
" print(\"feature_replacement\", feature_replacement)\n",
" new_replacements.append(feature_replacement)\n",
" print(\"new_replacements\", new_replacements)\n",
" print(\"replacementfeature\", feature_names[new_replacements[0]])\n",
"\n",
" output_removed_words = []\n",
" for item in explanations[index_of_min_length_explanation]:\n",
" output_removed_words.append(feature_names[item])\n",
" try:\n",
" replacementWords = []\n",
" for item_ind in range(len(new_replacements)):\n",
" replacementWords.append(\n",
" {\n",
" \"feature\": feature_names[\n",
" explanations[index_of_min_length_explanation][item_ind]\n",
" ],\n",
" \"replacement\": feature_names[new_replacements[item_ind]][0],\n",
" }\n",
" )\n",
" print(\"replacementWords\", replacementWords)\n",
" except:\n",
" pass\n",
"\n",
" new_insatnce = instance.copy()\n",
" index_of_min_length_explanation = -1\n",
" for relpacement_feature_index in range(len(new_replacements)):\n",
" if new_replacements[relpacement_feature_index] != []:\n",
" new_insatnce = replace_fn(\n",
" x=explanations[index_of_min_length_explanation][\n",
" relpacement_feature_index\n",
" ],\n",
" y=new_replacements[relpacement_feature_index],\n",
" inst=new_insatnce,\n",
" )\n",
" replacement_features.append(\n",
" feature_names[new_replacements[relpacement_feature_index]]\n",
" )\n",
" else:\n",
" new_insatnce = perturb_fn(\n",
" explanations[index_of_min_length_explanation][\n",
" relpacement_feature_index\n",
" ],\n",
" new_insatnce,\n",
" )\n",
" replacement_features.append(feature_names[relpacement_feature_index])\n",
" final_prob = self.classifier_fn(new_insatnce)\n",
" print(\"final_prob\", final_prob)\n",
"\n",
" final_exp = []\n",
" for i in range (len(explanations[index_of_min_length_explanation])):\n",
" if new_replacements[i] != []:\n",
" final_exp.append([output_removed_words[i], feature_names[new_replacements[i]][0]])\n",
" else:\n",
" final_exp.append([output_removed_words[i], \"---\"])\n",
"\n",
" number_explanations = len(explanations)\n",
" if np.size(explanations_score_change) > 1:\n",
" inds = np.argsort(explanations_score_change, axis=0)\n",
" inds = np.fliplr([inds])[0]\n",
" inds_2 = []\n",
" for i in range(np.size(inds)):\n",
" inds_2.append(inds[i][0])\n",
" explanation_set_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanation_set_adjusted.append(explanation_set[j])\n",
" explanations_score_change_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanations_score_change_adjusted.append(explanations_score_change[j])\n",
" explanation_set = explanation_set_adjusted\n",
" explanations_score_change = explanations_score_change_adjusted\n",
"\n",
" time_elapsed = time.time() - tic\n",
" print(\"\\n Total elapsed time %d \\n\" % time_elapsed)\n",
"\n",
" indices_active_elements = np.nonzero(instance)[1]\n",
" # Find the elements in indices_active_elements_explain that are not in indices_active_elements\n",
" print(\"indices_active_elements\", indices_active_elements)\n",
"\n",
" return {\n",
" \"final_exp\": final_exp,\n",
" \"number active elements\": number_active_elements,\n",
" \"number explanations found\": number_explanations,\n",
" \"size smallest explanation\": minimum_size_explanation,\n",
" \"time elapsed\": time_elapsed,\n",
" \"differences score\": explanations_score_change[0 : self.max_explained],\n",
" \"iterations\": iteration,\n",
" \"final_sentence\": final_sentence,\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e5bb9902",
"metadata": {},
"outputs": [],
"source": [
"# Do no need\n",
"class SEDC_Explainer_no(object):\n",
" \"\"\"Class for generating evidence counterfactuals for classifiers on behavioral/text data\"\"\"\n",
"\n",
" def __init__(\n",
" self,\n",
" feature_names,\n",
" classifier_fn,\n",
" threshold_classifier,\n",
" max_iter=100,\n",
" max_explained=1,\n",
" BB=True,\n",
" max_features=30,\n",
" time_maximum=120,\n",
" revert=0,\n",
" ):\n",
" \"\"\"Init function\n",
"\n",
" Args:\n",
" classifier_fn: [function] classifier prediction probability function\n",
" or decision function. For ScikitClassifiers, this is classifier.predict_proba\n",
" or classifier.decision_function or classifier.predict_log_proba.\n",
" Make sure the function only returns one (float) value. For instance, if you\n",
" use a ScikitClassifier, transform the classifier.predict_proba as follows:\n",
"\n",
" def classifier_fn(X):\n",
" c=classification_model.predict_proba(X)\n",
" y_predicted_proba=c[:,1]\n",
" return y_predicted_proba\n",
"\n",
" threshold_classifier: [float] the threshold that is used for classifying\n",
" instances as positive or not. When score or probability exceeds the\n",
" threshold value, then the instance is predicted as positive.\n",
" We have no default value, because it is important the user decides\n",
" a good value for the threshold.\n",
"\n",
" feature_names: [numpy.array] contains the interpretable feature names,\n",
" such as the words themselves in case of document classification or the names\n",
" of visited URLs.\n",
"\n",
" max_iter: [int] maximum number of iterations in the search procedure.\n",
" Default is set to 50.\n",
"\n",
" max_explained: [int] maximum number of EDC explanations generated.\n",
" Default is set to 1.\n",
"\n",
" BB: [“True” or “False”] when the algorithm is augmented with\n",
" branch-and-bound (BB=True), one is only interested in the (set of)\n",
" shortest explanation(s). Default is \"True\".\n",
"\n",
" max_features: [int] maximum number of features allowed in the explanation(s).\n",
" Default is set to 30.\n",
"\n",
" time_maximum: [int] maximum time allowed to generate explanations,\n",
" expressed in minutes. Default is set to 2 minutes (120 seconds).\n",
" \"\"\"\n",
"\n",
" self.feature_names = feature_names\n",
" self.classifier_fn = classifier_fn\n",
" self.threshold_classifier = threshold_classifier\n",
" self.max_iter = max_iter\n",
" self.max_explained = max_explained\n",
" self.BB = BB\n",
" self.max_features = max_features\n",
" self.time_maximum = time_maximum\n",
" self.revert = None\n",
" self.initial_class = None\n",
"\n",
" def explanation(self, instance):\n",
" \"\"\"Generates evidence counterfactual explanation for the instance.\n",
" ONLY IF THE CURRENT INSTANCE IS POSITIVE -> Limitation\n",
"\n",
" Args:\n",
" instance: [numpy.array or sparse matrix] instance to explain\n",
"\n",
" Returns:\n",
" A dictionary where:\n",
"\n",
" explanation_set: explanation(s) ranked from high to low change\n",
" in predicted score or probability.\n",
" The number of explanations shown depends on the argument max_explained.\n",
"\n",
" number_active_elements: number of active elements of\n",
" the instance of interest.\n",
"\n",
" number_explanations: number of explanations found by algorithm.\n",
"\n",
" minimum_size_explanation: number of features in the smallest explanation.\n",
"\n",
" time_elapsed: number of seconds passed to generate explanation(s).\n",
"\n",
" explanations_score_change: change in predicted score/probability\n",
" when removing the features in the explanation, ranked from\n",
" high to low change.\n",
" \"\"\"\n",
"\n",
" # *** INITIALIZATION ***\n",
" print(\"Start initialization...\")\n",
" tic = time.time()\n",
" instance = lil_matrix(instance)\n",
" iteration = 0\n",
" nb_explanations = 0\n",
" minimum_size_explanation = np.nan\n",
" explanations = []\n",
" explanations_sets = []\n",
" explanations_score_change = []\n",
" expanded_combis = []\n",
" score_predicted = self.classifier_fn(instance) ## Returns Prediction Prob\n",
" # Intial class is 1 is score is greater than threshold\n",
" if score_predicted > self.threshold_classifier:\n",
" self.initial_class = [1]\n",
" else:\n",
" self.initial_class = [0]\n",
" self.revert = 1\n",
" print(\"score_predicted \", score_predicted, \" initial_class \", self.initial_class)\n",
"\n",
" reference = np.reshape(\n",
" np.zeros(np.shape(instance)[1]), (1, len(np.zeros(np.shape(instance)[1])))\n",
" )\n",
" reference = sparse.csr_matrix(reference)\n",
"\n",
" explainer = shap.KernelExplainer(self.classifier_fn, reference, link=\"identity\")\n",
" shapVals = explainer.shap_values(instance, nsamples=5000, l1_reg=\"aic\")\n",
"\n",
" features = []\n",
" for ind in range(len(shapVals[0])):\n",
" if shapVals[0, ind] != 0:\n",
" features.append({\"feature\": ind, \"shapValue\": shapVals[0, ind]})\n",
" sorted_data_in = sorted(features, key=lambda x: x[\"shapValue\"], reverse=True)\n",
" inverse_sorted_data_in = sorted(features, key=lambda x: x[\"shapValue\"])\n",
"\n",
" if self.revert == 1:\n",
" sorted_data_in = inverse_sorted_data_in\n",
"\n",
" indices_active_elements = np.nonzero(instance)[\n",
" 1\n",
" ] ## -> Gets non zero elements in the instance as an array [x, y, z]\n",
" sorted_indices = sorted(\n",
" indices_active_elements, key=lambda x: shapVals[0, x], reverse=True\n",
" )\n",
" indices_active_elements = np.array(sorted_indices)\n",
" number_active_elements = len(indices_active_elements)\n",
" indices_active_elements = indices_active_elements.reshape(\n",
" (number_active_elements, 1)\n",
" ) ## -> Reshape to get a predictable\n",
"\n",
" candidates_to_expand = (\n",
" []\n",
" ) # -> These combinations are further expanded -> These are the elements to be removed from the sentence\n",
" for features in indices_active_elements:\n",
" candidates_to_expand.append(OrderedSet(features))\n",
" print(\"candidates_to_expand \", candidates_to_expand)\n",
" ## > Gets an array with each element in reshaped incides as an ordered set -> [OrderedSet([430]), OrderedSet([588]), OrderedSet([595])]\n",
"\n",
" explanation_candidates = candidates_to_expand.copy()\n",
" print(\"explanation_candidates \", explanation_candidates)\n",
" ## Gets a copy of the above array -> Initially\n",
"\n",
" feature_set = [\n",
" frozenset(x) for x in indices_active_elements\n",
" ] ## Immutable -> can be used as keys in dictionary\n",
" ## Used features in the current x-reference -> incides of the words in the review.\n",
"\n",
" print(\"Initialization is complete.\")\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (iteration < self.max_iter)\n",
" and (nb_explanations < self.max_explained)\n",
" and (len(candidates_to_expand) != 0)\n",
" and (len(explanation_candidates) != 0)\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" ## Stop if maximum iterations exceeded\n",
" # number of explanations generated is greater than the maximum explanations\n",
" # There are no candidates to expand\n",
" # There are no explanation candidates -> Used to force stop while loop below\n",
" # Or maximum allowed time exceeded\n",
" iteration += 1\n",
" print(\"\\n Iteration %d \\n\" % iteration)\n",
"\n",
" if iteration == 1:\n",
" print(\"Run in first iteration -> perturbation done \\n\")\n",
" # Print the word in each index in the explanation candidates\n",
" # for item in explanation_candidates:\n",
" # print([self.feature_names[x] for x in item])\n",
" replacements = [\n",
" get_antonyms(\n",
" self.feature_names[x[0]], loaded_plain_model_rf\n",
" )\n",
" for x in explanation_candidates\n",
" ]\n",
" # convert each element in replacement to a OrderedSet\n",
" replacements = [OrderedSet(x) for x in replacements]\n",
" print(\"replacements \\n\", replacements, \"\\n\")\n",
" print(\"explanation_candidates \\n\", explanation_candidates, \"\\n\")\n",
" perturbed_instances = [\n",
" perturb_fn(x, inst=instance.copy()) for x in explanation_candidates\n",
" ]\n",
" replaced_instances = []\n",
" for i in range(len(explanation_candidates)):\n",
" if replacements[i] == OrderedSet():\n",
" replaced_instances.append(\n",
" perturb_fn(\n",
" x=explanation_candidates[i], inst=instance.copy()\n",
" )\n",
" )\n",
" else:\n",
" replaced_instances.append(\n",
" replace_fn(\n",
" x=explanation_candidates[i],\n",
" y=replacements[i],\n",
" inst=instance.copy(),\n",
" )\n",
" )\n",
" print('replaced_instances \\n', replaced_instances, '\\n')\n",
" # Remove the elements in the indices given by the ordered set x and return an array fo such elements\n",
" # Removes only one element in the first run -> Contains sentences with one word removed\n",
" perturbed_instances = replaced_instances\n",
" scores_explanation_candidates = [\n",
" self.classifier_fn(x, self.revert) for x in perturbed_instances\n",
" ]\n",
" # Get predictions for each perturbed instance where one or more elements are removed from the initial instance\n",
" # It is in form of [[x], [y], [z]]\n",
" print(\n",
" \"scores_explanation_candidates \\n\",\n",
" scores_explanation_candidates,\n",
" \"\\n\",\n",
" )\n",
" scores_candidates_to_expand = scores_explanation_candidates.copy()\n",
"\n",
" scores_perturbed_new_combinations = [\n",
" x[0] for x in scores_explanation_candidates\n",
" ]\n",
" # Therefore get it to the shape [x, y, z] by getting the [0] th element of each element array\n",
" # print(\n",
" # \"scores_perturbed_new_combinations \", scores_perturbed_new_combinations\n",
" # )\n",
"\n",
" # ***CHECK IF THERE ARE EXPLANATIONS***\n",
" new_explanations = list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # Get explanation candidates where their probability is less than the threshold classifier -> Positive becomes negative\n",
" # print(\"New Explanations \\n\", new_explanations)\n",
" explanations += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # print(\"\\n explanations, explanations_score_change\", explanations)\n",
" nb_explanations += len(\n",
" list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" ) # Update number of explanations which pass the required threshold\n",
" explanations_sets += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" explanations_sets = [\n",
" set(x) for x in explanations_sets\n",
" ] # Convert each array to a set -> to get the words\n",
" explanations_score_change += list(\n",
" compress(\n",
" scores_explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # print('explanations_score_change', explanations_score_change)\n",
"\n",
" # Adjust max_length\n",
" if self.BB == True:\n",
" if len(explanations) != 0:\n",
" lengths = [] # Record length of each explanation found\n",
" for explanation in explanations:\n",
" lengths.append(len(explanation))\n",
" lengths = np.array(lengths)\n",
" max_length = lengths.min()\n",
" # Get minimum length of the found explanations as max length -> Do not search for explanations with longer length\n",
" else:\n",
" max_length = number_active_elements # Else can find maximum length equal to number of words in instance\n",
" else:\n",
" max_length = number_active_elements\n",
" print(\"\\n-------------Max length updated to - \", max_length)\n",
"\n",
" # Eliminate combinations from candidates_to_expand (\"best-first\" candidates) that can not be expanded\n",
" # Pruning based on Branch & Bound=True, max. features allowed and number of active features\n",
" candidates_to_expand_updated = []\n",
" scores_candidates_to_expand_updated = (\n",
" []\n",
" ) # enumerate -> Find count of || to list one after another\n",
" for j, combination in enumerate(candidates_to_expand):\n",
" if (\n",
" (len(combination) < number_active_elements)\n",
" and (len(combination) < max_length)\n",
" and (len(combination) < self.max_features)\n",
" ):\n",
" # Combination length should be less than the words in the input and max length of the required explanation and required maximum features\n",
" candidates_to_expand_updated.append(\n",
" combination\n",
" ) # If the combination matches, it is further expanded\n",
" scores_candidates_to_expand_updated.append(\n",
" scores_candidates_to_expand[j]\n",
" )\n",
" # Add the prediction score to the new array\n",
" # get the score from the scores_candidates_to_expand using the current index\n",
"\n",
" print(\n",
" \"\\nlen(candidates_to_expand_updated)\",\n",
" len(candidates_to_expand_updated),\n",
" \" 0 \",\n",
" )\n",
" print(\n",
" \"\\nnb_explanations\",\n",
" nb_explanations,\n",
" \" >= self.max_explained \",\n",
" self.max_explained,\n",
" )\n",
"\n",
" # *** IF LOOP ***\n",
" # expanding the candidates to update will exceed the max length set in the earlier loop\n",
" if (len(candidates_to_expand_updated) == 0) or (\n",
" nb_explanations >= self.max_explained\n",
" ):\n",
" ## If the number of explanations exceeded the required number\n",
" ## or no candidates\n",
" ## no explanations present\n",
"\n",
" print(\"nb_explanations Stop iterations...\")\n",
" explanation_candidates = [] # stop algorithm\n",
" ## Found all the candidates\n",
" print(\n",
" \"scores_candidates_to_expand_updated \",\n",
" scores_candidates_to_expand_updated,\n",
" )\n",
" # print(\"candidates_to_expand_updated \", candidates_to_expand_updated)\n",
"\n",
" elif len(candidates_to_expand_updated) != 0:\n",
" ## If there are possible candidates\n",
" print(\"elif\", len(candidates_to_expand_updated), \" != 0\")\n",
"\n",
" explanation_candidates = []\n",
" it = 0 # Iteration of the while loop\n",
" indices = []\n",
"\n",
" scores_candidates_to_expand2 = []\n",
" for score in scores_candidates_to_expand_updated:\n",
" if score[0] < self.threshold_classifier:\n",
" scores_candidates_to_expand2.append(2 * score_predicted)\n",
" else:\n",
" scores_candidates_to_expand2.append(score)\n",
" # update candidate scores if they have score less than threshold -> To expand them further\n",
" shap_candidates_to_expand2 = []\n",
" for candidate in candidates_to_expand_updated:\n",
" shapValues = 0\n",
" for word in candidate:\n",
" # find word in feature column in sorted_data\n",
" for ind in range(len(sorted_data_in)):\n",
" if sorted_data_in[ind][\"feature\"] == word:\n",
" shapValues += sorted_data_in[ind][\"shapValue\"]\n",
" break\n",
" shap_candidates_to_expand2.append(shapValues)\n",
"\n",
" # print(\n",
" # \"\\n scores_candidates_to_expand2 before loop\",\n",
" # scores_candidates_to_expand2,\n",
" # )\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (len(explanation_candidates) == 0)\n",
" and (it < len(scores_candidates_to_expand2))\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" # Stop if candidates are found or looped through more than there are candidates or maximum time reached\n",
"\n",
" print(\"While loop iteration %d\" % it)\n",
"\n",
" if it != 0: # Because indices are not there in the first iteration\n",
" for index in indices:\n",
" scores_candidates_to_expand2[index] = 2 * score_predicted\n",
"\n",
" # print(\n",
" # \"\\n scores_candidates_to_expand2 after loop\",\n",
" # scores_candidates_to_expand2,\n",
" # )\n",
" # print(\"\\n indices\", indices)\n",
"\n",
" # do elementwise subtraction between score_predicted and scores_candidates_to_expand2\n",
" subtractionList = []\n",
" for x, y in zip(score_predicted, scores_candidates_to_expand2):\n",
" print(\"\\n x, y\", x - y)\n",
" subtractionList.append(x - y)\n",
"\n",
" # Do element wise subtraction between the prediction score of the x_ref and every element of the scores_candidates_to_expand2\n",
" index_combi_max = np.argmax(subtractionList)\n",
" if self.revert == 0:\n",
" index_combi_max = np.argmax(subtractionList)\n",
" else:\n",
" index_combi_max = np.argmin(subtractionList)\n",
" # index_shap_max = np.argmax(shap_candidates_to_expand2)\n",
" # index_shap_min = np.argmin(shap_candidates_to_expand2)\n",
" # if self.revert == 0:\n",
" # index_combi_max = index_shap_max\n",
" # else:\n",
" # index_combi_max = index_shap_min\n",
" # print(\n",
" # \"subtrac max \",\n",
" # index_combi_max,\n",
" # \" index_shap_max \",\n",
" # index_shap_max,\n",
" # )\n",
" # # Get the index of the maximum value -> Expand it\n",
" # print(\n",
" # \"\\n index_combi_max\",\n",
" # candidates_to_expand_updated[np.argmax(subtractionList)],\n",
" # \"\\n index_shap_max\",\n",
" # candidates_to_expand_updated[index_combi_max],\n",
" # )\n",
" indices.append(index_combi_max)\n",
" expanded_combis.append(\n",
" candidates_to_expand_updated[index_combi_max]\n",
" )\n",
" # Add this combination to already expanded combinations as it will be expanded next by expand and prune function\n",
"\n",
" comb_to_expand = candidates_to_expand_updated[index_combi_max]\n",
" # Expand the found combination with highest difference\n",
" func = expand_and_prune(\n",
" comb_to_expand,\n",
" expanded_combis,\n",
" feature_set,\n",
" candidates_to_expand_updated,\n",
" explanations_sets,\n",
" scores_candidates_to_expand_updated,\n",
" instance,\n",
" self.classifier_fn,\n",
" self.revert,\n",
" )\n",
" \"\"\"Returns:\n",
" - explanation_candidates: combinations of features that are explanation\n",
" candidates to be checked in the next iteration\n",
" - candidates_to_expand: combinations of features that are candidates to\n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - expanded_combis: [list] list of combinations of features that are already\n",
" expanded as \"best-first\"\n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - scores_explanation_candidates: scores after perturbation of explanation candidates\"\"\"\n",
" explanation_candidates = func[0]\n",
" candidates_to_expand = func[1]\n",
" expanded_combis = func[2]\n",
" scores_candidates_to_expand = func[3]\n",
" scores_explanation_candidates = func[4]\n",
"\n",
" it += 1\n",
"\n",
" print(\n",
" \"\\n\\n\\niteration - \", iteration, \" self.max_iter - \", self.max_iter\n",
" )\n",
" print(\n",
" \"\\n\\nlen(candidates_to_expand) - \",\n",
" len(candidates_to_expand),\n",
" \" != 0 \",\n",
" )\n",
" print(\n",
" \"\\n\\nlen(explanation_candidates) - \",\n",
" len(explanation_candidates),\n",
" \" !=0 \",\n",
" )\n",
" print(\n",
" \"\\n\\n(time.time() - tic) - \",\n",
" (time.time() - tic),\n",
" \" self.time_maximum - \",\n",
" self.time_maximum,\n",
" )\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** FINAL PART OF ALGORITHM ***\n",
" print(\"Iterations are done.\")\n",
"\n",
" explanation_set = []\n",
" explanation_feature_names = []\n",
" for i in range(len(explanations)):\n",
" explanation_feature_names = []\n",
" for features in explanations[i]:\n",
" explanation_feature_names.append(self.feature_names[features])\n",
" explanation_set.append(explanation_feature_names)\n",
"\n",
" if len(explanations) != 0:\n",
" lengths_explanation = []\n",
" for explanation in explanations:\n",
" l = len(explanation)\n",
" lengths_explanation.append(l)\n",
" minimum_size_explanation = np.min(lengths_explanation)\n",
"\n",
" number_explanations = len(explanations)\n",
" if np.size(explanations_score_change) > 1:\n",
" inds = np.argsort(explanations_score_change, axis=0)\n",
" inds = np.fliplr([inds])[0]\n",
" inds_2 = []\n",
" for i in range(np.size(inds)):\n",
" inds_2.append(inds[i][0])\n",
" explanation_set_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanation_set_adjusted.append(explanation_set[j])\n",
" explanations_score_change_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanations_score_change_adjusted.append(explanations_score_change[j])\n",
" explanation_set = explanation_set_adjusted\n",
" explanations_score_change = explanations_score_change_adjusted\n",
"\n",
" time_elapsed = time.time() - tic\n",
" print(\"\\n Total elapsed time %d \\n\" % time_elapsed)\n",
"\n",
" print(\n",
" \"If we remove the words \",\n",
" explanation_set[0 : self.max_explained],\n",
" \"From the review, the prediction will be reversed\",\n",
" )\n",
"\n",
" return {\n",
" \"explanation set\": explanation_set[0 : self.max_explained],\n",
" \"number active elements\": number_active_elements,\n",
" \"number explanations found\": number_explanations,\n",
" \"size smallest explanation\": minimum_size_explanation,\n",
" \"time elapsed\": time_elapsed,\n",
" \"differences score\": explanations_score_change[0 : self.max_explained],\n",
" \"iterations\": iteration,\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e52259f6",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:14.775696Z",
"iopub.status.busy": "2023-05-23T22:24:14.775084Z",
"iopub.status.idle": "2023-05-23T22:24:22.763555Z",
"shell.execute_reply": "2023-05-23T22:24:22.762523Z"
},
"papermill": {
"duration": 8.009867,
"end_time": "2023-05-23T22:24:22.767023",
"exception": false,
"start_time": "2023-05-23T22:24:14.757156",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"# Get threshold_classifier_probs\n",
"p = np.sum(y_train_imdb)/np.size(y_train_imdb)\n",
"\n",
"probs = loaded_plain_model_lr.predict(x_test_imdb)\n",
"threshold_classifier_probs = np.percentile(probs,(50.41))\n",
"print(threshold_classifier_probs)\n",
"predictions_probs = (probs >= threshold_classifier_probs) \n",
"\n",
"accuracy_test = accuracy_score(y_test_imdb, np.array(predictions_probs))\n",
"print(\"The accuracy of the model on the test data is %f\" %accuracy_test)\n",
"\n",
"#indices_probs_pos = np.nonzero(predictions_probs)"
]
},
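{
"cell_type": "code",
"execution_count": null,
"id": "7f3a1c9b",
"metadata": {},
"outputs": [],
"source": [
"# Minimal usage sketch (illustration only, not part of the original pipeline): a score at or above\n",
"# threshold_classifier_probs is read as a positive prediction, as described in the explainer docstrings.\n",
"# Assumes classifier_fn_lr, x_test_imdb and threshold_classifier_probs from the cells above are available.\n",
"sample_score = classifier_fn_lr(x_test_imdb[10, :])[0]\n",
"predicted_positive = sample_score >= threshold_classifier_probs\n",
"print('score %.4f, threshold %.4f -> predicted class %d' % (sample_score, threshold_classifier_probs, int(predicted_positive)))"
]
},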
{
"cell_type": "code",
"execution_count": null,
"id": "426353cc",
"metadata": {},
"outputs": [],
"source": [
"# Do not need\n",
"for i in range(x_test_imdb.shape[0]):\n",
" counter = 0\n",
" if round(classifier_fn_lr(x_test_imdb[i,:])[0], 1) == 0.6:\n",
" counter = counter +1\n",
" print(i)\n",
" if(counter > 10):\n",
" break"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cf2da0a6",
"metadata": {},
"outputs": [],
"source": [
"indices_arr = []\n",
"for index in range(100):\n",
" score = classifier_fn_lr(x_test_imdb[index,:])[0]\n",
" if score < 0.8 and score > 0.6:\n",
" indices_arr.append(index)\n",
" print(index, score)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba08552c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:22.905668Z",
"iopub.status.busy": "2023-05-23T22:24:22.903844Z",
"iopub.status.idle": "2023-05-23T22:24:22.911474Z",
"shell.execute_reply": "2023-05-23T22:24:22.910280Z"
},
"papermill": {
"duration": 0.028079,
"end_time": "2023-05-23T22:24:22.914460",
"exception": false,
"start_time": "2023-05-23T22:24:22.886381",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"explainer_shap = SEDC_Explainer(feature_names = feature_names,\n",
" threshold_classifier = threshold_classifier_probs,\n",
" classifier_fn = classifier_fn_lr,\n",
" max_iter = 50,\n",
" time_maximum = 120)\n",
"\n",
"\n",
"explanation_normal = explainer_shap.explanation(x_test_imdb[10, :])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"papermill": {
"default_parameters": {},
"duration": 127.097894,
"end_time": "2023-05-23T22:25:02.784322",
"environment_variables": {},
"exception": null,
"input_path": "__notebook__.ipynb",
"output_path": "__notebook__.ipynb",
"parameters": {},
"start_time": "2023-05-23T22:22:55.686428",
"version": "2.4.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "2fc1f8d7",
"metadata": {
"_cell_guid": "b1076dfc-b9ad-4769-8c92-a6c4dae69d19",
"_uuid": "8f2839f25d086af736a60e9eeb907d3b93b6e0e5",
"execution": {
"iopub.execute_input": "2023-05-23T22:23:11.513543Z",
"iopub.status.busy": "2023-05-23T22:23:11.512957Z",
"iopub.status.idle": "2023-05-23T22:23:13.539048Z",
"shell.execute_reply": "2023-05-23T22:23:13.537538Z"
},
"papermill": {
"duration": 2.04584,
"end_time": "2023-05-23T22:23:13.542144",
"exception": false,
"start_time": "2023-05-23T22:23:11.496304",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"## IMPORTS\n",
"import numpy as np # linear algebra\n",
"import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n",
"#import seaborn as sns\n",
"#import matplotlib.pyplot as plt\n",
"from scipy import sparse\n",
"\n",
"import time\n",
"\n",
"from sklearn.model_selection import RandomizedSearchCV\n",
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.model_selection import ParameterGrid\n",
"from sklearn import metrics\n",
"from sklearn.metrics import roc_auc_score, accuracy_score\n",
"\n",
"from ordered_set import OrderedSet\n",
"from scipy.sparse import lil_matrix\n",
"from itertools import compress\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20b5ace6",
"metadata": {},
"outputs": [],
"source": [
"# Import DataSet ds and Models\n",
"from src.datasets import IMDBDataset\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"print(\n",
" ds.x_test.shape,\n",
" ds.x_train.shape,\n",
" ds.x_val.shape,\n",
" ds.y_test.shape,\n",
" ds.y_train.shape,\n",
" ds.y_val.shape,\n",
")\n",
"print(\n",
" type(ds.x_test),\n",
" type(ds.x_train),\n",
" type(ds.x_val),\n",
" type(ds.y_test),\n",
" type(ds.y_train),\n",
" type(ds.y_val),\n",
")\n",
"\n",
"from src.models import AnalysisModels as Models\n",
"\n",
"models = Models(config_path=\"./configs/models/analysis-models.yaml\", root=\"models/analysis-models/\")\n",
"print(models)\n",
"\n",
"loaded_plain_model_rf = models.rf.model\n",
"loaded_plain_model_svc = models.svm.model\n",
"loaded_plain_model_lr = models.lr.model\n",
"loaded_plain_model_knn = models.knn.model\n",
"feature_names = ds.feature_names\n",
"\n",
"## Preprocess text\n",
"\n",
"x_train_imdb = ds.x_train\n",
"x_test_imdb = ds.x_test\n",
"x_val_imdb = ds.x_val\n",
"\n",
"# Binarize y - Positive is 1\n",
"y_train_imdb = ds.y_train\n",
"y_test_imdb = ds.y_test\n",
"y_val_imdb = ds.y_val"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "47081014",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:22.859676Z",
"iopub.status.busy": "2023-05-23T22:24:22.858557Z",
"iopub.status.idle": "2023-05-23T22:24:22.865677Z",
"shell.execute_reply": "2023-05-23T22:24:22.864477Z"
},
"papermill": {
"duration": 0.030868,
"end_time": "2023-05-23T22:24:22.870299",
"exception": false,
"start_time": "2023-05-23T22:24:22.839431",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"def classifier_fn_rf(x, negative_to_positive=0):\n",
" \"\"\"Returns the prediction probability of class 1 -> Not class 0\"\"\"\n",
" #print('loaded_plain_model_svc.decision_function(x) - ', loaded_plain_model_svc.decision_function(x))\n",
" prediction = loaded_plain_model_rf.predict_proba(x)\n",
" # If prediction is [1] retrurn the probability of class 1 else return probability of class 0\n",
" if (negative_to_positive == 1):\n",
" return prediction[:,0]\n",
" return prediction[:,1]"
]
},
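{
"cell_type": "code",
"execution_count": null,
"id": "9a8b7c6d",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (illustration only): the negative_to_positive flag of classifier_fn_rf selects which\n",
"# class probability is returned; for a single instance the two values sum to 1.\n",
"# Assumes x_test_imdb has been loaded in the cell above.\n",
"p_positive = classifier_fn_rf(x_test_imdb[0, :])\n",
"p_negative = classifier_fn_rf(x_test_imdb[0, :], negative_to_positive=1)\n",
"print('P(class 1) =', p_positive, ' P(class 0) =', p_negative, ' sum =', p_positive + p_negative)"
]
},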
{
"cell_type": "code",
"execution_count": null,
"id": "28b8026d",
"metadata": {},
"outputs": [],
"source": [
"def get_featues_importances(instance):\n",
" \"\"\"Get feature importances with the sign of the change in prediction probability for a given instance.\n",
" Uses the gini impurity in the RF model.\n",
" Fast calculation as values are calculated during training period.\n",
" reference: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.feature_importances_\n",
" \n",
" Args:\n",
" antonyms_indices: indices of antonyms in the feature vector\n",
" model: trained model with feature_importances_\n",
"\n",
" Returns:\n",
" tuple of features and their indices in the feature vector\n",
" \"\"\"\n",
" feature_importance = models.rf.model.feature_importances_\n",
" initial_score = models.rf.model.predict_proba(instance)[0][1]\n",
" print(\"Initial score: \", initial_score)\n",
" indices_active_elements = np.array(np.nonzero(instance)[1]).reshape(len(np.nonzero(instance)[1]), 1)\n",
" feature_set = [frozenset(x) for x in indices_active_elements]\n",
" candidates_to_expand = ([])\n",
" for features in indices_active_elements:\n",
" candidates_to_expand.append(OrderedSet(features))\n",
" explanation_candidates = candidates_to_expand.copy()\n",
" perturbed_instances = [perturb_fn(x, inst=instance.copy()) for x in explanation_candidates]\n",
" scores_explanation_candidates = [classifier_fn_rf(x) for x in perturbed_instances]\n",
" sign_change = [1 if (x-initial_score) > 0 else -1 for x in scores_explanation_candidates]\n",
" print(\"sign changes: \", sign_change)\n",
" # if sign change is 0, feature_importance value set to -value\n",
" feature_importance = [x if x > 0 else -x for x in feature_importance]\n",
" return feature_importance\n"
]
},
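{
"cell_type": "code",
"execution_count": null,
"id": "3c2d1e0f",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (illustration only): the gini-based importances referenced in the docstring above\n",
"# are global, precomputed values on the trained random forest, so looking them up is cheap.\n",
"# Assumes models.rf.model and feature_names are available from the earlier cells.\n",
"global_importances = models.rf.model.feature_importances_\n",
"top_indices = np.argsort(global_importances)[::-1][:10]\n",
"for idx in top_indices:\n",
"    print(feature_names[idx], global_importances[idx])"
]
},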
{
"cell_type": "code",
"execution_count": null,
"id": "580642f8",
"metadata": {},
"outputs": [],
"source": [
"# Print the Forest\n",
"sample_id = 0\n",
"positiveCount = 0\n",
"negativeCount = 0\n",
"X_train = x_train_imdb\n",
"\n",
"for j, tree in enumerate(loaded_plain_model_rf.estimators_):\n",
" print('----------------------------------------------------------------------------------------------------------------')\n",
"\n",
" n_nodes = tree.tree_.node_count\n",
" children_left = tree.tree_.children_left # Left child of node j -> access the left child by children_left[j]\n",
" children_right = tree.tree_.children_right # Right child of node j -> access the right child by children_right[j]\n",
" feature = tree.tree_.feature # Stores features used in each node j -> access the feature by feature[j]\n",
" threshold = tree.tree_.threshold # Stores the threshold value at node j -> access the threshold by threshold[j]\n",
"\n",
" print(\"Decision path for DecisionTree {0}\".format(j))\n",
" node_depth = np.zeros(shape=n_nodes, dtype=np.int64)\n",
" is_leaves = np.zeros(shape=n_nodes, dtype=bool)\n",
"\n",
" node_indicator = tree.decision_path(X_train[0])\n",
" leave_id = tree.apply(X_train)\n",
" node_index = node_indicator.indices[node_indicator.indptr[sample_id]:\n",
" node_indicator.indptr[sample_id + 1]] # Indices of nodes visited by sample_id in the current tree\n",
"\n",
"\n",
" print('Leave id: ', leave_id[sample_id])\n",
" print(' Rules used to predict sample %s, node index : ' % (sample_id))\n",
" stack = [(0, 0)] # start with the root node id (0) and its depth (0)\n",
" while len(stack) > 0:\n",
" # `pop` ensures each node is only visited once\n",
" node_id, depth = stack.pop()\n",
" node_depth[node_id] = depth\n",
"\n",
" # If the left and right child of a node is not the same we have a split\n",
" # node\n",
" is_split_node = children_left[node_id] != children_right[node_id]\n",
" # If a split node, append left and right children and depth to `stack`\n",
" # so we can loop through them\n",
" if is_split_node:\n",
" stack.append((children_left[node_id], depth + 1))\n",
" stack.append((children_right[node_id], depth + 1))\n",
" else:\n",
" is_leaves[node_id] = True\n",
"\n",
" print(\n",
" \"The binary tree structure has {n} nodes and has \"\n",
" \"the following tree structure:\\n\".format(n=n_nodes)\n",
" )\n",
" for i in range(n_nodes):\n",
" # if i is in node_index, then this is a node we care about\n",
" if i in node_index:\n",
" if is_leaves[i]:\n",
" if tree.tree_.value[i][0][0] > tree.tree_.value[i][0][1]:\n",
" positiveCount += 1\n",
" else:\n",
" negativeCount += 1\n",
" print(\n",
" \"{space}node={node} is a leaf node with value {value}.\".format(\n",
" space=node_depth[i] * \"\\t\", node=i, value=tree.tree_.value[i]\n",
" )\n",
" )\n",
" else:\n",
" print(\n",
" \"{space}node={node} is a split node: \"\n",
" \"go to node {left} if X[:, {feature} {name}] <= {threshold} \"\n",
" \"else to node {right}.\".format(\n",
" space=node_depth[i] * \"\\t\",\n",
" node=i,\n",
" left=children_left[i],\n",
" feature=feature[i],\n",
" name=feature_names[feature[i]],\n",
" threshold=threshold[i],\n",
" right=children_right[i],\n",
" )\n",
" )\n",
"\n",
"print('Positive Count: ', positiveCount), print('Negative Count: ', negativeCount)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "f5b5b511",
"metadata": {
"papermill": {
"duration": 0.014519,
"end_time": "2023-05-23T22:23:57.584948",
"exception": false,
"start_time": "2023-05-23T22:23:57.570429",
"status": "completed"
},
"tags": []
},
"source": [
"## SHAP-SEDC Function"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f51db9b1",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:14.202042Z",
"iopub.status.busy": "2023-05-23T22:24:14.201564Z",
"iopub.status.idle": "2023-05-23T22:24:14.231018Z",
"shell.execute_reply": "2023-05-23T22:24:14.229850Z"
},
"papermill": {
"duration": 0.050135,
"end_time": "2023-05-23T22:24:14.234197",
"exception": false,
"start_time": "2023-05-23T22:24:14.184062",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"def perturb_fn(x,inst):\n",
" \"\"\" Function to perturb instance x -> Deform the array -> assign 0 to the x-th column \"\"\"\n",
" \"\"\"\n",
" Returns perturbed instance inst\n",
" \"\"\"\n",
" inst[:,x]=0\n",
" return inst\n",
"\n",
"\n",
"def print_instance(pert_inst, ref_inst, feature_names):\n",
" \"\"\" Function to print the perturbed instance \"\"\"\n",
" \"\"\"\n",
" Returns perturbed instance inst\n",
" \"\"\"\n",
" indices_active_elements_ref = np.nonzero(ref_inst)[1]\n",
" indices_active_elements_pert = np.nonzero(pert_inst)[1]\n",
" ref_set = set(indices_active_elements_ref)\n",
" pert_set = set(indices_active_elements_pert)\n",
" # elements in ref_set but not in pert_set\n",
" removed_word_indices = ref_set - pert_set\n",
" # elements in pert_set but not in ref_set\n",
" added_word_indices = pert_set - ref_set\n",
" printable_array = []\n",
" for item in indices_active_elements_ref:\n",
" printable_array.append(\"..\" + feature_names[item] + \"..\")\n",
" # Change formatting of removed words\n",
" for item in removed_word_indices:\n",
" printable_array[printable_array.index(\"..\" + feature_names[item] + \"..\")] = \"--\" + feature_names[item] + \"--\"\n",
" # change formatting of added words\n",
" for item in added_word_indices:\n",
" printable_array.append(\"++\" + feature_names[item] + \"++\")\n",
" printable_array.append(classifier_fn_rf(pert_inst))\n",
" print(printable_array)\n",
" return printable_array\n",
"\n",
"\n",
"\"\"\"\n",
"Input:\n",
" - comb: \"best-first\" (combination of) feature(s) that is expanded\n",
" (e.g., comb_to_expand)\n",
" - expanded_combis: list of combinations of features that are already \n",
" expanded as \"best-first\"\n",
" - feature_set: indices of the active features of the instance \n",
" - candidates_to_expand: combinations of features that are candidates to be \n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - explanations_sets: counterfactual explanations already found\n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - instance: instance to be explained\n",
" - cf: classifier prediction probability function\n",
" or decision function. For ScikitClassifiers, this is classifier.predict_proba \n",
" or classifier.decision_function or classifier.predict_log_proba.\n",
" Make sure the function only returns one (float) value. For instance, if you\n",
" use a ScikitClassifier, transform the classifier.predict_proba as follows:\n",
" \n",
" def classifier_fn(X):\n",
" c=classification_model.predict_proba(X)\n",
" y_predicted_proba=c[:,1]\n",
" return y_predicted_proba\n",
" \n",
"Returns:\n",
" - explanation_candidates: combinations of features that are explanation\n",
" candidates to be checked in the next iteration\n",
" - candidates_to_expand: combinations of features that are candidates to be \n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - expanded_combis: [list] list of combinations of features that are already \n",
" expanded as \"best-first\" \n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - scores_explanation_candidates: scores after perturbation of explanation candidates\n",
"\"\"\"\n",
"\n",
"def print_ref_instance(ref_inst, feaaure_names):\n",
" printable_array = []\n",
" indices_active_elements = np.nonzero(ref_inst)[1]\n",
" for item in indices_active_elements:\n",
" printable_array.append(\"..\" + feature_names[item] + \"..\")\n",
" print(printable_array)\n",
"\n",
"\n",
"def expand_and_prune(comb, expanded_combis, feature_set, candidates_to_expand, explanations_sets, scores_candidates_to_expand, instance, cf, feature_names, revert=0):\n",
" \"\"\" Function to expand \"best-first\" feature combination and prune explanation_candidates and candidates_to_expand \"\"\" \n",
" \n",
" comb = OrderedSet(comb)\n",
" expanded_combis.append(comb)\n",
" \n",
" old_candidates_to_expand = [frozenset(x) for x in candidates_to_expand]\n",
" old_candidates_to_expand = set(old_candidates_to_expand)\n",
" \n",
" feature_set_new = []\n",
" ## If the feature is not in the current combination -> add it to a new list\n",
" for feature in feature_set:\n",
" if (len(comb & feature) == 0): #set operation: intersection\n",
" feature_set_new.append(feature) # If the feature is not in the current combination to remove from the instance\n",
" \n",
" # Add each element in the new set -> which were initially not present -> to the accepted combination -> create new combinations -> (EXPANSION)\n",
" new_explanation_candidates = []\n",
" for element in feature_set_new:\n",
" union = (comb|element) #set operation: union\n",
" new_explanation_candidates.append(union) # Create new combinations to remove from the instance\n",
" \n",
" #Add new explanation candidates to the list of candidates to expand\n",
" candidates_to_expand_notpruned = candidates_to_expand.copy()\n",
" for new_candidate in new_explanation_candidates:\n",
" candidates_to_expand_notpruned.append(new_candidate)\n",
" \n",
" # Calculate scores of new combinations and add to scores_candidates_to_expand\n",
" # perturb each new candidate and get the score for each.\n",
" perturbed_instances = [perturb_fn(x, inst=instance.copy()) for x in new_explanation_candidates]\n",
" for instance_p in perturbed_instances:\n",
" print_instance(instance_p, instance, feature_names)\n",
" scores_perturbed_new = [cf(x, revert) for x in perturbed_instances]\n",
" ## Append the newly created score array to the passes existing array\n",
" scores_candidates_to_expand_notpruned = scores_candidates_to_expand + scores_perturbed_new\n",
" # create a dictionary of scores dictionary where the \n",
" # keys are string representations of the candidates from candidates_to_expand_notpruned, and the \n",
" # values are the corresponding scores from scores_candidates_to_expand_notpruned\n",
" dictionary_scores = dict(zip([str(x) for x in candidates_to_expand_notpruned], scores_candidates_to_expand_notpruned))\n",
" \n",
" # *** Pruning step: remove all candidates to expand that have an explanation as subset ***\n",
" candidates_to_expand_pruned_explanations = []\n",
" # take one combination from candidates\n",
" for combi in candidates_to_expand_notpruned:\n",
" pruning=0\n",
" for explanation in explanations_sets: # if an explanation is present as a subser in combi, does not add it to the to be expanded list -> because solution with a smaller size exists\n",
" if ((explanation.issubset(combi)) or (explanation==combi)):\n",
" pruning = pruning + 1\n",
" if (pruning == 0): # If it is not a superset of a present explanation -> add it to the list\n",
" candidates_to_expand_pruned_explanations.append(combi)\n",
" # Each element is frozen as a set\n",
" candidates_to_expand_pruned_explanations_frozen = [frozenset(x) for x in candidates_to_expand_pruned_explanations]\n",
" # But the total set f frozen sets are not frozen\n",
" candidates_to_expand_pruned_explanations_ = set(candidates_to_expand_pruned_explanations_frozen)\n",
" \n",
" expanded_combis_frozen = [frozenset(x) for x in expanded_combis]\n",
" expanded_combis_ = set(expanded_combis_frozen)\n",
" \n",
" # *** Pruning step: remove all candidates to expand that are in expanded_combis *** -> Same as above\n",
" candidates_to_expand_pruned = (candidates_to_expand_pruned_explanations_ - expanded_combis_) \n",
" ind_dict = dict((k,i) for i,k in enumerate(candidates_to_expand_pruned_explanations_frozen))\n",
" indices = [ind_dict[x] for x in candidates_to_expand_pruned]\n",
" candidates_to_expand = [candidates_to_expand_pruned_explanations[i] for i in indices]\n",
" \n",
" #The new explanation candidates are the ones that are NOT in the old list of candidates to expand\n",
" new_explanation_candidates_pruned = (candidates_to_expand_pruned - old_candidates_to_expand) \n",
" candidates_to_expand_frozen = [frozenset(x) for x in candidates_to_expand]\n",
" ind_dict2 = dict((k,i) for i,k in enumerate(candidates_to_expand_frozen))\n",
" indices2 = [ind_dict2[x] for x in new_explanation_candidates_pruned]\n",
" explanation_candidates = [candidates_to_expand[i] for i in indices2]\n",
" \n",
" # Get scores of the new candidates and explanations.\n",
" scores_candidates_to_expand = [dictionary_scores[x] for x in [str(c) for c in candidates_to_expand]]\n",
" scores_explanation_candidates = [dictionary_scores[x] for x in [str(c) for c in explanation_candidates]]\n",
" \n",
" return (explanation_candidates, candidates_to_expand, expanded_combis, scores_candidates_to_expand, scores_explanation_candidates)"
]
},
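{
"cell_type": "code",
"execution_count": null,
"id": "5e4f3a2b",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch (illustration only) of the helper functions above: drop one active word from a test\n",
"# review with perturb_fn and show the marked-up sentence and score change with print_instance.\n",
"# Assumes x_test_imdb, feature_names and classifier_fn_rf are available from the earlier cells.\n",
"demo_instance = lil_matrix(x_test_imdb[0, :])\n",
"active_indices = demo_instance.nonzero()[1]\n",
"removed_index = OrderedSet([active_indices[0]])  # remove the first active feature\n",
"perturbed_instance = perturb_fn(removed_index, inst=demo_instance.copy())\n",
"print('score before:', classifier_fn_rf(demo_instance), ' score after:', classifier_fn_rf(perturbed_instance))\n",
"print_instance(perturbed_instance, demo_instance, feature_names)"
]
},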
{
"cell_type": "code",
"execution_count": null,
"id": "5c61b551",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:14.268763Z",
"iopub.status.busy": "2023-05-23T22:24:14.268238Z",
"iopub.status.idle": "2023-05-23T22:24:14.329025Z",
"shell.execute_reply": "2023-05-23T22:24:14.327494Z"
},
"papermill": {
"duration": 0.082213,
"end_time": "2023-05-23T22:24:14.332213",
"exception": false,
"start_time": "2023-05-23T22:24:14.250000",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"class FIC_Explainer(object):\n",
" \"\"\"Class for generating evidence counterfactuals for classifiers on behavioral/text data\"\"\"\n",
"\n",
" def __init__(\n",
" self,\n",
" feature_names,\n",
" classifier_fn,\n",
" threshold_classifier,\n",
" max_iter=100,\n",
" max_explained=1,\n",
" BB=True,\n",
" max_features=30,\n",
" time_maximum=120,\n",
" revert=0,\n",
" ):\n",
" \"\"\"Init function\n",
"\n",
" Args:\n",
" classifier_fn: [function] classifier prediction probability function\n",
" or decision function. For ScikitClassifiers, this is classifier.predict_proba\n",
" or classifier.decision_function or classifier.predict_log_proba.\n",
" Make sure the function only returns one (float) value. For instance, if you\n",
" use a ScikitClassifier, transform the classifier.predict_proba as follows:\n",
"\n",
" def classifier_fn(X):\n",
" c=classification_model.predict_proba(X)\n",
" y_predicted_proba=c[:,1]\n",
" return y_predicted_proba\n",
"\n",
" threshold_classifier: [float] the threshold that is used for classifying\n",
" instances as positive or not. When score or probability exceeds the\n",
" threshold value, then the instance is predicted as positive.\n",
" We have no default value, because it is important the user decides\n",
" a good value for the threshold.\n",
"\n",
" feature_names: [numpy.array] contains the interpretable feature names,\n",
" such as the words themselves in case of document classification or the names\n",
" of visited URLs.\n",
"\n",
" max_iter: [int] maximum number of iterations in the search procedure.\n",
" Default is set to 50.\n",
"\n",
" max_explained: [int] maximum number of EDC explanations generated.\n",
" Default is set to 1.\n",
"\n",
" BB: [“True” or “False”] when the algorithm is augmented with\n",
" branch-and-bound (BB=True), one is only interested in the (set of)\n",
" shortest explanation(s). Default is \"True\".\n",
"\n",
" max_features: [int] maximum number of features allowed in the explanation(s).\n",
" Default is set to 30.\n",
"\n",
" time_maximum: [int] maximum time allowed to generate explanations,\n",
" expressed in minutes. Default is set to 2 minutes (120 seconds).\n",
" \"\"\"\n",
"\n",
" self.feature_names = feature_names\n",
" self.classifier_fn = classifier_fn\n",
" self.threshold_classifier = threshold_classifier\n",
" self.max_iter = max_iter\n",
" self.max_explained = max_explained\n",
" self.BB = BB\n",
" self.max_features = max_features\n",
" self.time_maximum = time_maximum\n",
" self.revert = None\n",
" self.initial_class = None\n",
"\n",
" def explanation(self, instance):\n",
" \"\"\"Generates evidence counterfactual explanation for the instance.\n",
" ONLY IF THE CURRENT INSTANCE IS POSITIVE -> Limitation\n",
"\n",
" Args:\n",
" instance: [numpy.array or sparse matrix] instance to explain\n",
"\n",
" Returns:\n",
" A dictionary where:\n",
"\n",
" explanation_set: explanation(s) ranked from high to low change\n",
" in predicted score or probability.\n",
" The number of explanations shown depends on the argument max_explained.\n",
"\n",
" number_active_elements: number of active elements of\n",
" the instance of interest.\n",
"\n",
" number_explanations: number of explanations found by algorithm.\n",
"\n",
" minimum_size_explanation: number of features in the smallest explanation.\n",
"\n",
" time_elapsed: number of seconds passed to generate explanation(s).\n",
"\n",
" explanations_score_change: change in predicted score/probability\n",
" when removing the features in the explanation, ranked from\n",
" high to low change.\n",
" \"\"\"\n",
"\n",
" # *** INITIALIZATION ***\n",
" print(\"Start initialization...\")\n",
" tic = time.time()\n",
" instance = lil_matrix(instance)\n",
" iteration = 0\n",
" nb_explanations = 0\n",
" minimum_size_explanation = np.nan\n",
" explanations = []\n",
"\n",
" explanations_sets = []\n",
" explanations_score_change = []\n",
"\n",
" expanded_combis = []\n",
"\n",
" score_predicted = self.classifier_fn(instance) ## Returns Prediction Prob\n",
" # Intial class is 1 is score is greater than threshold\n",
" if score_predicted > self.threshold_classifier:\n",
" self.initial_class = [1]\n",
" else:\n",
" self.initial_class = [0]\n",
" self.revert = 1\n",
" print(\n",
" \"score_predicted \",\n",
" score_predicted,\n",
" \" initial_class \",\n",
" self.initial_class,\n",
" )\n",
"\n",
" importances = get_featues_importances(instance)\n",
" features = []\n",
" for ind in range(len(importances)):\n",
" if importances[ind] != 0:\n",
" features.append({\"feature\": ind, \"importance\": importances[ind]})\n",
" sorted_data_in = sorted(features, key=lambda x: x[\"importance\"], reverse=True)\n",
" inverse_sorted_data_in = sorted(features, key=lambda x: x[\"importance\"])\n",
"\n",
" if self.revert == 1:\n",
" sorted_data_in = inverse_sorted_data_in\n",
"\n",
" indices_active_elements = np.nonzero(instance)[\n",
" 1\n",
" ] ## -> Gets non zero elements in the instance as an array [x, y, z]\n",
" sorted_indices = sorted(\n",
" indices_active_elements, key=lambda x: importances[x], reverse=True\n",
" )\n",
" indices_active_elements = np.array(sorted_indices)\n",
" number_active_elements = len(indices_active_elements)\n",
" indices_active_elements = indices_active_elements.reshape(\n",
" (number_active_elements, 1)\n",
" ) ## -> Reshape to get a predictable\n",
"\n",
" candidates_to_expand = (\n",
" []\n",
" ) # -> These combinations are further expanded -> These are the elements to be removed from the sentence\n",
" for features in indices_active_elements:\n",
" candidates_to_expand.append(OrderedSet(features))\n",
" print(\"candidates_to_expand \", candidates_to_expand)\n",
" ## > Gets an array with each element in reshaped incides as an ordered set -> [OrderedSet([430]), OrderedSet([588]), OrderedSet([595])]\n",
"\n",
" explanation_candidates = candidates_to_expand.copy()\n",
" print(\"explanation_candidates \", explanation_candidates)\n",
" ## Gets a copy of the above array -> Initially\n",
"\n",
" feature_set = [\n",
" frozenset(x) for x in indices_active_elements\n",
" ] ## Immutable -> can be used as keys in dictionary\n",
" ## Used features in the current x-reference -> incides of the words in the review.\n",
"\n",
" print(\"Initialization is complete.\")\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (iteration < self.max_iter)\n",
" and (nb_explanations < self.max_explained)\n",
" and (len(candidates_to_expand) != 0)\n",
" and (len(explanation_candidates) != 0)\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" ## Stop if maximum iterations exceeded\n",
" # number of explanations generated is greater than the maximum explanations\n",
" # There are no candidates to expand\n",
" # There are no explanation candidates -> Used to force stop while loop below\n",
" # Or maximum allowed time exceeded\n",
" iteration += 1\n",
" print(\"\\n Iteration %d \\n\" % iteration)\n",
"\n",
" if iteration == 1:\n",
" print(\"Run in first iteration -> perturbation done \\n\")\n",
" # Print the word in each index in the explanation candidates\n",
" # for item in explanation_candidates:\n",
" # print([self.feature_names[x] for x in item])\n",
" print(\"explanation_candidates \\n\", explanation_candidates, \"\\n\")\n",
" perturbed_instances = [\n",
" perturb_fn(x, inst=instance.copy()) for x in explanation_candidates\n",
" ]\n",
"\n",
" for instance_p in perturbed_instances:\n",
" print_instance(instance_p, instance, self.feature_names)\n",
"\n",
" scores_explanation_candidates = [\n",
" self.classifier_fn(x, self.revert) for x in perturbed_instances\n",
" ]\n",
" # Get predictions for each perturbed instance where one or more elements are removed from the initial instance\n",
" # It is in form of [[x], [y], [z]]\n",
" print(\n",
" \"scores_explanation_candidates \\n\",\n",
" scores_explanation_candidates,\n",
" \"\\n\",\n",
" )\n",
" scores_candidates_to_expand = scores_explanation_candidates.copy()\n",
"\n",
" scores_perturbed_new_combinations = [\n",
" x[0] for x in scores_explanation_candidates\n",
" ]\n",
" # Therefore get it to the shape [x, y, z] by getting the [0] th element of each element array\n",
" # print(\n",
" # \"scores_perturbed_new_combinations \", scores_perturbed_new_combinations\n",
" # )\n",
"\n",
" # ***CHECK IF THERE ARE EXPLANATIONS***\n",
" new_explanations = list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # Get explanation candidates where their probability is less than the threshold classifier -> Positive becomes negative\n",
" # print(\"New Explanations \\n\", new_explanations)\n",
" explanations += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # print(\"\\n explanations, explanations_score_change\", explanations)\n",
" nb_explanations += len(\n",
" list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" ) # Update number of explanations which pass the required threshold\n",
" explanations_sets += list(\n",
" compress(\n",
" explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" explanations_sets = [\n",
" set(x) for x in explanations_sets\n",
" ] # Convert each array to a set -> to get the words\n",
" explanations_score_change += list(\n",
" compress(\n",
" scores_explanation_candidates,\n",
" scores_perturbed_new_combinations < self.threshold_classifier,\n",
" )\n",
" )\n",
" # print('explanations_score_change', explanations_score_change)\n",
"\n",
" # Adjust max_length\n",
" if self.BB == True:\n",
" if len(explanations) != 0:\n",
" lengths = [] # Record length of each explanation found\n",
" for explanation in explanations:\n",
" lengths.append(len(explanation))\n",
" lengths = np.array(lengths)\n",
" max_length = lengths.min()\n",
" # Get minimum length of the found explanations as max length -> Do not search for explanations with longer length\n",
" else:\n",
" max_length = number_active_elements # Else can find maximum length equal to number of words in instance\n",
" else:\n",
" max_length = number_active_elements\n",
" print(\"\\n-------------Max length updated to - \", max_length)\n",
"\n",
" # Eliminate combinations from candidates_to_expand (\"best-first\" candidates) that can not be expanded\n",
" # Pruning based on Branch & Bound=True, max. features allowed and number of active features\n",
" candidates_to_expand_updated = []\n",
" scores_candidates_to_expand_updated = (\n",
" []\n",
" ) # enumerate -> Find count of || to list one after another\n",
" for j, combination in enumerate(candidates_to_expand):\n",
" if (\n",
" (len(combination) < number_active_elements)\n",
" and (len(combination) < max_length)\n",
" and (len(combination) < self.max_features)\n",
" ):\n",
" # Combination length should be less than the words in the input and max length of the required explanation and required maximum features\n",
" candidates_to_expand_updated.append(\n",
" combination\n",
" ) # If the combination matches, it is further expanded\n",
" scores_candidates_to_expand_updated.append(\n",
" scores_candidates_to_expand[j]\n",
" )\n",
" # Add the prediction score to the new array\n",
" # get the score from the scores_candidates_to_expand using the current index\n",
"\n",
" print(\n",
" \"\\nlen(candidates_to_expand_updated)\",\n",
" len(candidates_to_expand_updated),\n",
" \" 0 \",\n",
" )\n",
" print(\n",
" \"\\nnb_explanations\",\n",
" nb_explanations,\n",
" \" >= self.max_explained \",\n",
" self.max_explained,\n",
" )\n",
"\n",
" # *** IF LOOP ***\n",
" # expanding the candidates to update will exceed the max length set in the earlier loop\n",
" if (len(candidates_to_expand_updated) == 0) or (\n",
" nb_explanations >= self.max_explained\n",
" ):\n",
" ## If the number of explanations exceeded the required number\n",
" ## or no candidates\n",
" ## no explanations present\n",
"\n",
" print(\"nb_explanations Stop iterations...\")\n",
" explanation_candidates = [] # stop algorithm\n",
" ## Found all the candidates\n",
" print(\n",
" \"scores_candidates_to_expand_updated \",\n",
" scores_candidates_to_expand_updated,\n",
" )\n",
" # print(\"candidates_to_expand_updated \", candidates_to_expand_updated)\n",
"\n",
" elif len(candidates_to_expand_updated) != 0:\n",
" ## If there are possible candidates\n",
"\n",
" explanation_candidates = []\n",
" it = 0 # Iteration of the while loop\n",
" indices = []\n",
"\n",
" scores_candidates_to_expand2 = []\n",
" for score in scores_candidates_to_expand_updated:\n",
" if score[0] < self.threshold_classifier:\n",
" scores_candidates_to_expand2.append(2 * score_predicted)\n",
" else:\n",
" scores_candidates_to_expand2.append(score)\n",
" # update candidate scores if they have score less than threshold -> To expand them further\n",
" shap_candidates_to_expand2 = []\n",
" for candidate in candidates_to_expand_updated:\n",
" importancess = 0\n",
" for word in candidate:\n",
" # find word in feature column in sorted_data\n",
" for ind in range(len(sorted_data_in)):\n",
" if sorted_data_in[ind][\"feature\"] == word:\n",
" importancess += sorted_data_in[ind][\"importance\"]\n",
" break\n",
" shap_candidates_to_expand2.append(importancess)\n",
"\n",
" # print(\n",
" # \"\\n scores_candidates_to_expand2 before loop\",\n",
" # scores_candidates_to_expand2,\n",
" # )\n",
"\n",
" # *** WHILE LOOP ***\n",
" while (\n",
" (len(explanation_candidates) == 0)\n",
" and (it < len(scores_candidates_to_expand2))\n",
" and ((time.time() - tic) < self.time_maximum)\n",
" ):\n",
" # Stop if candidates are found or looped through more than there are candidates or maximum time reached\n",
"\n",
" print(\"While loop iteration %d\" % it)\n",
"\n",
" if it != 0: # Because indices are not there in the first iteration\n",
" for index in indices:\n",
" scores_candidates_to_expand2[index] = 2 * score_predicted\n",
"\n",
" # print(\n",
" # \"\\n scores_candidates_to_expand2 after loop\",\n",
" # scores_candidates_to_expand2,\n",
" # )\n",
" # print(\"\\n indices\", indices)\n",
"\n",
" # do elementwise subtraction between score_predicted and scores_candidates_to_expand2\n",
" subtractionList = []\n",
" for x, y in zip(score_predicted, scores_candidates_to_expand2):\n",
" print(\"\\n x, y\", x - y)\n",
" subtractionList.append(x - y)\n",
"\n",
" # Do element wise subtraction between the prediction score of the x_ref and every element of the scores_candidates_to_expand2\n",
" index_combi_max = np.argmax(subtractionList)\n",
" index_importance_max = np.argmax(shap_candidates_to_expand2)\n",
" index_importance_min = np.argmin(shap_candidates_to_expand2)\n",
"\n",
" print(\n",
" \"subtrac max \",\n",
" index_combi_max,\n",
" \" index_shap_max \",\n",
" index_importance_max,\n",
" )\n",
" if(iteration < 3):\n",
" print(\"---------USING IMPORTANCE----------\")\n",
" if self.revert == 0:\n",
" index_combi_max = index_importance_max\n",
" else:\n",
" index_combi_max = index_importance_min\n",
" #Get the index of the maximum value -> Expand it\n",
" else:\n",
" print(\"++++++++USING DIFFERENCE+++++++++\")\n",
" print(\n",
" \"\\n index_combi_max\",\n",
" candidates_to_expand_updated[np.argmax(subtractionList)],\n",
" \"\\n index_importance_max\",\n",
" candidates_to_expand_updated[index_importance_max],\n",
" \"\\n using combination\",\n",
" candidates_to_expand_updated[index_combi_max],\n",
" )\n",
" indices.append(index_combi_max)\n",
" expanded_combis.append(\n",
" candidates_to_expand_updated[index_combi_max]\n",
" )\n",
" # Add this combination to already expanded combinations as it will be expanded next by expand and prune function\n",
"\n",
" comb_to_expand = candidates_to_expand_updated[index_combi_max]\n",
" # Expand the found combination with highest difference\n",
" func = expand_and_prune(\n",
" comb_to_expand,\n",
" expanded_combis,\n",
" feature_set,\n",
" candidates_to_expand_updated,\n",
" explanations_sets,\n",
" scores_candidates_to_expand_updated,\n",
" instance,\n",
" self.classifier_fn,\n",
" self.feature_names,\n",
" self.revert,\n",
" )\n",
" \"\"\"Returns:\n",
" - explanation_candidates: combinations of features that are explanation\n",
" candidates to be checked in the next iteration\n",
" - candidates_to_expand: combinations of features that are candidates to\n",
" expanded in next iterations or candidates for \"best-first\"\n",
" - expanded_combis: [list] list of combinations of features that are already\n",
" expanded as \"best-first\"\n",
" - scores_candidates_to_expand: scores after perturbation for the candidate\n",
" combinations of features to be expanded\n",
" - scores_explanation_candidates: scores after perturbation of explanation candidates\"\"\"\n",
" explanation_candidates = func[0]\n",
" candidates_to_expand = func[1]\n",
" expanded_combis = func[2]\n",
" scores_candidates_to_expand = func[3]\n",
" scores_explanation_candidates = func[4]\n",
"\n",
" it += 1\n",
"\n",
" print(\n",
" \"\\n\\n\\niteration - \", iteration, \" self.max_iter - \", self.max_iter\n",
" )\n",
" print(\n",
" \"\\n\\nlen(candidates_to_expand) - \",\n",
" len(candidates_to_expand),\n",
" \" != 0 \",\n",
" )\n",
" print(\n",
" \"\\n\\nlen(explanation_candidates) - \",\n",
" len(explanation_candidates),\n",
" \" !=0 \",\n",
" )\n",
" print(\n",
" \"\\n\\n(time.time() - tic) - \",\n",
" (time.time() - tic),\n",
" \" self.time_maximum - \",\n",
" self.time_maximum,\n",
" )\n",
" print(\"\\n Elapsed time %d \\n\" % (time.time() - tic))\n",
"\n",
" # *** FINAL PART OF ALGORITHM ***\n",
" print(\"Iterations are done.\")\n",
"\n",
" explanation_set = []\n",
" explanation_feature_names = []\n",
" for i in range(len(explanations)):\n",
" explanation_feature_names = []\n",
" for features in explanations[i]:\n",
" explanation_feature_names.append(self.feature_names[features])\n",
" explanation_set.append(explanation_feature_names)\n",
"\n",
" if len(explanations) != 0:\n",
" lengths_explanation = []\n",
" for explanation in explanations:\n",
" l = len(explanation)\n",
" lengths_explanation.append(l)\n",
" minimum_size_explanation = np.min(lengths_explanation)\n",
"\n",
" number_explanations = len(explanations)\n",
" if np.size(explanations_score_change) > 1:\n",
" inds = np.argsort(explanations_score_change, axis=0)\n",
" inds = np.fliplr([inds])[0]\n",
" inds_2 = []\n",
" for i in range(np.size(inds)):\n",
" inds_2.append(inds[i][0])\n",
" explanation_set_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanation_set_adjusted.append(explanation_set[j])\n",
" explanations_score_change_adjusted = []\n",
" for i in range(np.size(inds)):\n",
" j = inds_2[i]\n",
" explanations_score_change_adjusted.append(explanations_score_change[j])\n",
" explanation_set = explanation_set_adjusted\n",
" explanations_score_change = explanations_score_change_adjusted\n",
"\n",
" time_elapsed = time.time() - tic\n",
" print(\"\\n Total elapsed time %d \\n\" % time_elapsed)\n",
"\n",
" print(\n",
" \"If we remove the words \",\n",
" explanation_set[0 : self.max_explained],\n",
" \"From the review, the prediction will be reversed\",\n",
" )\n",
"\n",
" return {\n",
" \"explanation set\": explanation_set[0 : self.max_explained],\n",
" \"number active elements\": number_active_elements,\n",
" \"number explanations found\": number_explanations,\n",
" \"size smallest explanation\": minimum_size_explanation,\n",
" \"time elapsed\": time_elapsed,\n",
" \"differences score\": explanations_score_change[0 : self.max_explained],\n",
" \"iterations\": iteration,\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e52259f6",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:14.775696Z",
"iopub.status.busy": "2023-05-23T22:24:14.775084Z",
"iopub.status.idle": "2023-05-23T22:24:22.763555Z",
"shell.execute_reply": "2023-05-23T22:24:22.762523Z"
},
"papermill": {
"duration": 8.009867,
"end_time": "2023-05-23T22:24:22.767023",
"exception": false,
"start_time": "2023-05-23T22:24:14.757156",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"# Get threshold_classifier_probs\n",
"p = np.sum(y_train_imdb)/np.size(y_train_imdb)\n",
"\n",
"probs = loaded_plain_model_rf.predict(x_test_imdb)\n",
"threshold_classifier_probs = np.percentile(probs,(50.33))\n",
"print(threshold_classifier_probs)\n",
"predictions_probs = (probs >= threshold_classifier_probs) \n",
"\n",
"accuracy_test = accuracy_score(y_test_imdb, np.array(predictions_probs))\n",
"print(\"The accuracy of the model on the test data is %f\" %accuracy_test)\n",
"\n",
"#indices_probs_pos = np.nonzero(predictions_probs)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aac68535",
"metadata": {},
"outputs": [],
"source": [
"indices_arr = []\n",
"for index in range(100):\n",
" score = classifier_fn_rf(x_test_imdb[index,:])[0]\n",
" if score < 0.45 and score > 0.4:\n",
" indices_arr.append(index)\n",
" print(index, score)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d9687a5f",
"metadata": {},
"outputs": [],
"source": [
"classifier_fn_rf(x_test_imdb[11,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ba08552c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:22.905668Z",
"iopub.status.busy": "2023-05-23T22:24:22.903844Z",
"iopub.status.idle": "2023-05-23T22:24:22.911474Z",
"shell.execute_reply": "2023-05-23T22:24:22.910280Z"
},
"papermill": {
"duration": 0.028079,
"end_time": "2023-05-23T22:24:22.914460",
"exception": false,
"start_time": "2023-05-23T22:24:22.886381",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"# Run feature importance counterfactual\n",
"explainer_FIC = FIC_Explainer(feature_names = feature_names,\n",
" threshold_classifier = threshold_classifier_probs,\n",
" classifier_fn = classifier_fn_rf,\n",
" max_iter = 50,\n",
" time_maximum = 120)\n",
"\n",
"explanation_FIC = explainer_FIC.explanation(x_test_imdb[6,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c740f4c3",
"metadata": {},
"outputs": [],
"source": [
"# Run feature importance counterfactual\n",
"explainer_FIC = FIC_Explainer(feature_names = feature_names,\n",
" threshold_classifier = threshold_classifier_probs,\n",
" classifier_fn = classifier_fn_rf,\n",
" max_iter = 50,\n",
" time_maximum = 120)\n",
"\n",
"for element in indices_arr:\n",
" print('element', element)\n",
" explanation_FIC = explainer_FIC.explanation(x_test_imdb[element,:])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "541af84c",
"metadata": {
"execution": {
"iopub.execute_input": "2023-05-23T22:24:58.903375Z",
"iopub.status.busy": "2023-05-23T22:24:58.902856Z",
"iopub.status.idle": "2023-05-23T22:24:58.913086Z",
"shell.execute_reply": "2023-05-23T22:24:58.911659Z"
},
"papermill": {
"duration": 0.042328,
"end_time": "2023-05-23T22:24:58.915972",
"exception": false,
"start_time": "2023-05-23T22:24:58.873644",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"print(explanation_FIC)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"papermill": {
"default_parameters": {},
"duration": 127.097894,
"end_time": "2023-05-23T22:25:02.784322",
"environment_variables": {},
"exception": null,
"input_path": "__notebook__.ipynb",
"output_path": "__notebook__.ipynb",
"parameters": {},
"start_time": "2023-05-23T22:22:55.686428",
"version": "2.4.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
name: imdb
source_url: https://sliit-xai.s3.ap-south-1.amazonaws.com/datasets/imdb.zip
paths:
data: imdb.csv
split:
test: 0.1
train: 0.8
val: 0.1
labels:
- negative
- positive
extras:
input_encoder_path: tfidf.pkl
min_df: 30
name: snli_1.0_contra
paths:
test: snli_1.0_contra_test.csv
train: snli_1.0_contra_train.csv
val: snli_1.0_contra_val.csv
source_url: https://sliit-xai.s3.ap-south-1.amazonaws.com/datasets/snli_1.0_contra.zip
model_name: t5-small
max_token_len: 64
name: analysis-models
source_url: https://sliit-xai.s3.ap-south-1.amazonaws.com/models/analysis-models.zip
paths:
tfidf: tfidf.pkl
knn: knn.pkl
lr: lr.pkl
rf: rf.pkl
svm: svm.pkl
models:
knn: knn.pkl
lr: lr.pkl
rf: rf.pkl
svm: svm.pkl
encoders:
input_encoder_name: tfidf
output_encoder_name: lut
output_labels:
- negative
- positive
name: t5-cf-generator
paths:
model: model.pt
model_config: t5-small
source_url: https://sliit-xai.s3.ap-south-1.amazonaws.com/models/t5-cf-generator.zip
name: wf-cf-generator
flip_prob: 0.5
flipping_tags:
- VB
- VBD
- VBG
- VBN
- VBP
- VBZ
sample_prob_decay_factor: 0.2
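A minimal sketch (assuming PyYAML, which is pinned in the requirements below; the helper name is illustrative) of how one of the model configs above, e.g. configs/models/wf-cf-generator.yaml, could be loaded:
import yaml  # pyyaml==6.0 from requirements.txt

def load_model_config(path: str) -> dict:
    # Hypothetical helper: parse a YAML model config into a plain dict
    with open(path, "r") as f:
        return yaml.safe_load(f)

cfg = load_model_config("configs/models/wf-cf-generator.yaml")
print(cfg["name"], cfg["flip_prob"], cfg["flipping_tags"])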
/app/models
/app/src
/app/configs
\ No newline at end of file
# Define function directory
ARG FUNCTION_DIR="/function"
FROM python:3.9-buster as build-image
# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
apt-get install -y \
g++ \
make \
cmake \
unzip \
libcurl4-openssl-dev
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Create function directory
RUN mkdir -p ${FUNCTION_DIR}
# Install pip dependencies
COPY cpu-requirements.txt .
RUN pip install -r cpu-requirements.txt --index-url https://download.pytorch.org/whl/cpu --target ${FUNCTION_DIR}
COPY requirements.txt .
RUN pip install -r requirements.txt --target ${FUNCTION_DIR}
# Install the runtime interface client
RUN pip install \
--target ${FUNCTION_DIR} \
awslambdaric
# Multi-stage build: grab a fresh copy of the base image
FROM python:3.9-buster
# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}
# Copy in the build image dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
# Download nltk data
RUN python3 -m nltk.downloader --dir /usr/share/nltk_data wordnet punkt stopwords averaged_perceptron_tagger tagsets
# copy function code
COPY app/models ${FUNCTION_DIR}/models
COPY app/configs ${FUNCTION_DIR}/configs
COPY app/src ${FUNCTION_DIR}/src
COPY app/handlers ${FUNCTION_DIR}/handlers
COPY app/app.py ${FUNCTION_DIR}/app.py
# Setting the entry point
ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
CMD [ "app.handler" ]
from typing import Dict
from handlers import evaluate, analyze
import traceback
def handler(event: Dict, context: Dict):
task = event["task"]
payload = event["payload"]
try:
if task == "evaluation":
body = evaluate(payload)
return {"status": 200, "body": body}
elif task == "analysis":
body = analyze(payload)
return {"status": 200, "body": body}
else:
body = "Invocation error"
return {"status": 400, "body": body}
except Exception as e:
body = str(e)
print("Error:", body)
print("Input:", event)
traceback.print_exc()
return {"status": 500, "body": body}
from .evaluation import evaluate
from .analysis import analyze
from typing import Dict
from src import TestBench
tb_kwargs = {
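# Per-model TestBench keyword arguments; the threshold_classifier values below are
# presumably the percentile-based probability thresholds computed as in the notebook above.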
"svm": {"cf_generator_config": "configs/models/wf-cf-generator.yaml"},
"knn": {"cf_generator_config": "configs/models/wf-cf-generator.yaml"},
"rf": {
"threshold_classifier": 0.49339999999983775,
"max_iter": 50,
"time_maximum": 120,
},
"lr": {
"threshold_classifier": 0.49179999999978463,
"max_iter": 50,
"time_maximum": 120,
},
}
def analyze(payload: Dict):
model_name = payload["model_name"]
configurations = payload["configurations"]
prompt = payload["prompt"]
variations = payload["variations"]
tb = TestBench(
model_path=f"models/analysis-models/{model_name}.pkl",
vectorizer_path="models/analysis-models/tfidf.pkl",
analyzer_name=model_name,
**tb_kwargs[model_name],
)
reports = tb(configurations, prompt, variations)
reports = "\n\n".join(reports)
return reports
from src.models import AnalysisModels as Models
from typing import Dict
def evaluate(payload: Dict) -> Dict:
texts = payload["texts"]
model_name = payload["model_name"]
models = Models("configs/models/analysis-models.yaml", "models/analysis-models/")
model = getattr(models, model_name)
scores, preds = model(texts)
return {"scores": scores, "predictions": preds}
#!/bin/bash
if [ -e Dockerfile ]; then
# remove old copies of models, configs and src if they exist
if [ -e app/models ]; then
rm -r app/models
fi
if [ -e app/configs ]; then
rm -r app/configs
fi
if [ -e app/src ]; then
rm -r app/src
fi
# copy fresh models, configs and src into the app directory
cp -r ../../models app/models
cp -r ../../configs app/configs
cp -r ../../src app/src
# log in to AWS ECR with docker
aws ecr get-login-password --region ap-south-1 | sudo docker login --username AWS --password-stdin 065257926712.dkr.ecr.ap-south-1.amazonaws.com
# build and push
sudo docker build -t 065257926712.dkr.ecr.ap-south-1.amazonaws.com/xai:latest .
sudo docker push 065257926712.dkr.ecr.ap-south-1.amazonaws.com/xai:latest
else
echo "Please change the working directory to the directory containing the Dockerfile"
exit 1
fi
\ No newline at end of file
torch==2.0.1
\ No newline at end of file
scikit-learn==1.2.2
nltk==3.8.1
ipykernel==6.24.0
ipywidgets==7.6.5
pyyaml==6.0
pandas==2.0.3
beautifulsoup4==4.12.2
wget==3.2
numpy==1.23.5
shap==0.41.0
matplotlib==3.5.1
seaborn==0.11.2
ordered-set==4.1.0
boto3==1.27.0
transformers==4.31.0
sagemaker==2.173.0
sentencepiece==0.1.99
\ No newline at end of file
@@ -8,16 +8,28 @@
# testing
/coverage
# next.js
/.next/
/out/
# production
/build
# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
*.pem
# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*
# local env files
.env*.local
# vercel
.vercel
# typescript
*.tsbuildinfo
next-env.d.ts
{
"compilerOptions": {
"paths": {
"@/*": ["./src/*"]
}
}
}
/** @type {import('next').NextConfig} */
const nextConfig = {
reactStrictMode: true,
}
module.exports = nextConfig
{
"name": "xai-frontend",
"version": "0.1.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "xai-frontend",
"version": "0.1.0",
"dependencies": {
"@aws-sdk/client-lambda": "^3.359.0",
"@emotion/react": "^11.11.1",
"@emotion/styled": "^11.11.0",
"@mui/icons-material": "^5.11.16",
"@mui/lab": "^5.0.0-alpha.134",
"@mui/material": "^5.13.6",
"next": "13.4.7",
"react": "18.2.0",
"react-dom": "18.2.0",
"react-hook-form": "^7.45.4"
}
},
"node_modules/@aws-crypto/crc32": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/crc32/-/crc32-3.0.0.tgz",
"integrity": "sha512-IzSgsrxUcsrejQbPVilIKy16kAT52EwB6zSaI+M3xxIhKh5+aldEyvI+z6erM7TCLB2BJsFrtHjp6/4/sr+3dA==",
"dependencies": {
"@aws-crypto/util": "^3.0.0",
"@aws-sdk/types": "^3.222.0",
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/crc32/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-crypto/ie11-detection": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/ie11-detection/-/ie11-detection-3.0.0.tgz",
"integrity": "sha512-341lBBkiY1DfDNKai/wXM3aujNBkXR7tq1URPQDL9wi3AUbI80NR74uF1TXHMm7po1AcnFk8iu2S2IeU/+/A+Q==",
"dependencies": {
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/ie11-detection/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-crypto/sha256-browser": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/sha256-browser/-/sha256-browser-3.0.0.tgz",
"integrity": "sha512-8VLmW2B+gjFbU5uMeqtQM6Nj0/F1bro80xQXCW6CQBWgosFWXTx77aeOF5CAIAmbOK64SdMBJdNr6J41yP5mvQ==",
"dependencies": {
"@aws-crypto/ie11-detection": "^3.0.0",
"@aws-crypto/sha256-js": "^3.0.0",
"@aws-crypto/supports-web-crypto": "^3.0.0",
"@aws-crypto/util": "^3.0.0",
"@aws-sdk/types": "^3.222.0",
"@aws-sdk/util-locate-window": "^3.0.0",
"@aws-sdk/util-utf8-browser": "^3.0.0",
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/sha256-browser/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-crypto/sha256-js": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/sha256-js/-/sha256-js-3.0.0.tgz",
"integrity": "sha512-PnNN7os0+yd1XvXAy23CFOmTbMaDxgxXtTKHybrJ39Y8kGzBATgBFibWJKH6BhytLI/Zyszs87xCOBNyBig6vQ==",
"dependencies": {
"@aws-crypto/util": "^3.0.0",
"@aws-sdk/types": "^3.222.0",
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/sha256-js/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-crypto/supports-web-crypto": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/supports-web-crypto/-/supports-web-crypto-3.0.0.tgz",
"integrity": "sha512-06hBdMwUAb2WFTuGG73LSC0wfPu93xWwo5vL2et9eymgmu3Id5vFAHBbajVWiGhPO37qcsdCap/FqXvJGJWPIg==",
"dependencies": {
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/supports-web-crypto/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-crypto/util": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/@aws-crypto/util/-/util-3.0.0.tgz",
"integrity": "sha512-2OJlpeJpCR48CC8r+uKVChzs9Iungj9wkZrl8Z041DWEWvyIHILYKCPNzJghKsivj+S3mLo6BVc7mBNzdxA46w==",
"dependencies": {
"@aws-sdk/types": "^3.222.0",
"@aws-sdk/util-utf8-browser": "^3.0.0",
"tslib": "^1.11.1"
}
},
"node_modules/@aws-crypto/util/node_modules/tslib": {
"version": "1.14.1",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-1.14.1.tgz",
"integrity": "sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg=="
},
"node_modules/@aws-sdk/abort-controller": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/abort-controller/-/abort-controller-3.357.0.tgz",
"integrity": "sha512-nQYDJon87quPwt2JZJwUN2GFKJnvE5kWb6tZP4xb5biSGUKBqDQo06oYed7yokatCuCMouIXV462aN0fWODtOw==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/client-lambda": {
"version": "3.359.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-lambda/-/client-lambda-3.359.0.tgz",
"integrity": "sha512-o6A3x+R6Oi64+mmK+mbYO1BLr18z5U/NiBevqE+FqQFptAoA6vp8CQW7zpeuUeC1J0ZdSNZCHc3HqUAlv7l/bg==",
"dependencies": {
"@aws-crypto/sha256-browser": "3.0.0",
"@aws-crypto/sha256-js": "3.0.0",
"@aws-sdk/client-sts": "3.359.0",
"@aws-sdk/config-resolver": "3.357.0",
"@aws-sdk/credential-provider-node": "3.358.0",
"@aws-sdk/eventstream-serde-browser": "3.357.0",
"@aws-sdk/eventstream-serde-config-resolver": "3.357.0",
"@aws-sdk/eventstream-serde-node": "3.357.0",
"@aws-sdk/fetch-http-handler": "3.357.0",
"@aws-sdk/hash-node": "3.357.0",
"@aws-sdk/invalid-dependency": "3.357.0",
"@aws-sdk/middleware-content-length": "3.357.0",
"@aws-sdk/middleware-endpoint": "3.357.0",
"@aws-sdk/middleware-host-header": "3.357.0",
"@aws-sdk/middleware-logger": "3.357.0",
"@aws-sdk/middleware-recursion-detection": "3.357.0",
"@aws-sdk/middleware-retry": "3.357.0",
"@aws-sdk/middleware-serde": "3.357.0",
"@aws-sdk/middleware-signing": "3.357.0",
"@aws-sdk/middleware-stack": "3.357.0",
"@aws-sdk/middleware-user-agent": "3.357.0",
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/node-http-handler": "3.357.0",
"@aws-sdk/smithy-client": "3.358.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"@aws-sdk/util-body-length-browser": "3.310.0",
"@aws-sdk/util-body-length-node": "3.310.0",
"@aws-sdk/util-defaults-mode-browser": "3.358.0",
"@aws-sdk/util-defaults-mode-node": "3.358.0",
"@aws-sdk/util-endpoints": "3.357.0",
"@aws-sdk/util-retry": "3.357.0",
"@aws-sdk/util-stream": "3.358.0",
"@aws-sdk/util-user-agent-browser": "3.357.0",
"@aws-sdk/util-user-agent-node": "3.357.0",
"@aws-sdk/util-utf8": "3.310.0",
"@aws-sdk/util-waiter": "3.357.0",
"@smithy/protocol-http": "^1.0.1",
"@smithy/types": "^1.0.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/client-sso": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-sso/-/client-sso-3.358.0.tgz",
"integrity": "sha512-Kc9IsoPIHJfkjDuStyItwQAOpnxw/I9xfF3vvukeN9vkXcRiWeMDhEXACN4L1AYFlU9FHQSRdNwpYTIz7OrD2A==",
"dependencies": {
"@aws-crypto/sha256-browser": "3.0.0",
"@aws-crypto/sha256-js": "3.0.0",
"@aws-sdk/config-resolver": "3.357.0",
"@aws-sdk/fetch-http-handler": "3.357.0",
"@aws-sdk/hash-node": "3.357.0",
"@aws-sdk/invalid-dependency": "3.357.0",
"@aws-sdk/middleware-content-length": "3.357.0",
"@aws-sdk/middleware-endpoint": "3.357.0",
"@aws-sdk/middleware-host-header": "3.357.0",
"@aws-sdk/middleware-logger": "3.357.0",
"@aws-sdk/middleware-recursion-detection": "3.357.0",
"@aws-sdk/middleware-retry": "3.357.0",
"@aws-sdk/middleware-serde": "3.357.0",
"@aws-sdk/middleware-stack": "3.357.0",
"@aws-sdk/middleware-user-agent": "3.357.0",
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/node-http-handler": "3.357.0",
"@aws-sdk/smithy-client": "3.358.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"@aws-sdk/util-body-length-browser": "3.310.0",
"@aws-sdk/util-body-length-node": "3.310.0",
"@aws-sdk/util-defaults-mode-browser": "3.358.0",
"@aws-sdk/util-defaults-mode-node": "3.358.0",
"@aws-sdk/util-endpoints": "3.357.0",
"@aws-sdk/util-retry": "3.357.0",
"@aws-sdk/util-user-agent-browser": "3.357.0",
"@aws-sdk/util-user-agent-node": "3.357.0",
"@aws-sdk/util-utf8": "3.310.0",
"@smithy/protocol-http": "^1.0.1",
"@smithy/types": "^1.0.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/client-sso-oidc": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-sso-oidc/-/client-sso-oidc-3.358.0.tgz",
"integrity": "sha512-Gy09fSlhJdGbr8rNNR8EdLaUynB1B34nw8kN1aFT4CdAnjFKxTainqG6Aq4vx64TbMDMhvMYWpNAluvq7UHVhw==",
"dependencies": {
"@aws-crypto/sha256-browser": "3.0.0",
"@aws-crypto/sha256-js": "3.0.0",
"@aws-sdk/config-resolver": "3.357.0",
"@aws-sdk/fetch-http-handler": "3.357.0",
"@aws-sdk/hash-node": "3.357.0",
"@aws-sdk/invalid-dependency": "3.357.0",
"@aws-sdk/middleware-content-length": "3.357.0",
"@aws-sdk/middleware-endpoint": "3.357.0",
"@aws-sdk/middleware-host-header": "3.357.0",
"@aws-sdk/middleware-logger": "3.357.0",
"@aws-sdk/middleware-recursion-detection": "3.357.0",
"@aws-sdk/middleware-retry": "3.357.0",
"@aws-sdk/middleware-serde": "3.357.0",
"@aws-sdk/middleware-stack": "3.357.0",
"@aws-sdk/middleware-user-agent": "3.357.0",
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/node-http-handler": "3.357.0",
"@aws-sdk/smithy-client": "3.358.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"@aws-sdk/util-body-length-browser": "3.310.0",
"@aws-sdk/util-body-length-node": "3.310.0",
"@aws-sdk/util-defaults-mode-browser": "3.358.0",
"@aws-sdk/util-defaults-mode-node": "3.358.0",
"@aws-sdk/util-endpoints": "3.357.0",
"@aws-sdk/util-retry": "3.357.0",
"@aws-sdk/util-user-agent-browser": "3.357.0",
"@aws-sdk/util-user-agent-node": "3.357.0",
"@aws-sdk/util-utf8": "3.310.0",
"@smithy/protocol-http": "^1.0.1",
"@smithy/types": "^1.0.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/client-sts": {
"version": "3.359.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/client-sts/-/client-sts-3.359.0.tgz",
"integrity": "sha512-zpyui8hXvEUvq8MwzZsm51ni0intvPjtV8dgx10nVJnm605nqrLlAMGqQ1S/UxO7CVmhqWbh5dnGHEc//UJlsw==",
"dependencies": {
"@aws-crypto/sha256-browser": "3.0.0",
"@aws-crypto/sha256-js": "3.0.0",
"@aws-sdk/config-resolver": "3.357.0",
"@aws-sdk/credential-provider-node": "3.358.0",
"@aws-sdk/fetch-http-handler": "3.357.0",
"@aws-sdk/hash-node": "3.357.0",
"@aws-sdk/invalid-dependency": "3.357.0",
"@aws-sdk/middleware-content-length": "3.357.0",
"@aws-sdk/middleware-endpoint": "3.357.0",
"@aws-sdk/middleware-host-header": "3.357.0",
"@aws-sdk/middleware-logger": "3.357.0",
"@aws-sdk/middleware-recursion-detection": "3.357.0",
"@aws-sdk/middleware-retry": "3.357.0",
"@aws-sdk/middleware-sdk-sts": "3.357.0",
"@aws-sdk/middleware-serde": "3.357.0",
"@aws-sdk/middleware-signing": "3.357.0",
"@aws-sdk/middleware-stack": "3.357.0",
"@aws-sdk/middleware-user-agent": "3.357.0",
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/node-http-handler": "3.357.0",
"@aws-sdk/smithy-client": "3.358.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"@aws-sdk/util-body-length-browser": "3.310.0",
"@aws-sdk/util-body-length-node": "3.310.0",
"@aws-sdk/util-defaults-mode-browser": "3.358.0",
"@aws-sdk/util-defaults-mode-node": "3.358.0",
"@aws-sdk/util-endpoints": "3.357.0",
"@aws-sdk/util-retry": "3.357.0",
"@aws-sdk/util-user-agent-browser": "3.357.0",
"@aws-sdk/util-user-agent-node": "3.357.0",
"@aws-sdk/util-utf8": "3.310.0",
"@smithy/protocol-http": "^1.0.1",
"@smithy/types": "^1.0.0",
"fast-xml-parser": "4.2.5",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/config-resolver": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/config-resolver/-/config-resolver-3.357.0.tgz",
"integrity": "sha512-cukfg0nX7Tzx/xFyH5F4Eyb8DA1ITCGtSQv4vnEjgUop+bkzckuGLKEeBcBhyZY+aw+2C9CVwIHwIMhRm0ul5w==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-config-provider": "3.310.0",
"@aws-sdk/util-middleware": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-env": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-env/-/credential-provider-env-3.357.0.tgz",
"integrity": "sha512-UOecwfqvXgJVqhfWSZ2S44v2Nq2oceW0PQVQp0JAa9opc2rxSVIfyOhPr0yMoPmpyNcP22rgeg6ce70KULYwiA==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-imds": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-imds/-/credential-provider-imds-3.357.0.tgz",
"integrity": "sha512-upw/bfsl7/WydT6gM0lBuR4Ipp4fzYm/E3ObFr0Mg5OkgVPt5ZJE+eeFTvwCpDdBSTKs4JfrK6/iEK8A23Q1jQ==",
"dependencies": {
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-ini": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-ini/-/credential-provider-ini-3.358.0.tgz",
"integrity": "sha512-Blmw4bhGxpaYvPmrbRKAltqnNDDSf6ZegNqJasc5OWvAlHJNvB/hYPmyQN0oFy79BXn7PbBip1QaLWaEhJvpAA==",
"dependencies": {
"@aws-sdk/credential-provider-env": "3.357.0",
"@aws-sdk/credential-provider-imds": "3.357.0",
"@aws-sdk/credential-provider-process": "3.357.0",
"@aws-sdk/credential-provider-sso": "3.358.0",
"@aws-sdk/credential-provider-web-identity": "3.357.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-node": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-node/-/credential-provider-node-3.358.0.tgz",
"integrity": "sha512-iLjyRNOT0ycdLqkzXNW+V2zibVljkLjL8j45FpK6mNrAwc/Ynr7EYuRRp5OuRiiYDO3ZoneAxpBJQ5SqmK2Jfg==",
"dependencies": {
"@aws-sdk/credential-provider-env": "3.357.0",
"@aws-sdk/credential-provider-imds": "3.357.0",
"@aws-sdk/credential-provider-ini": "3.358.0",
"@aws-sdk/credential-provider-process": "3.357.0",
"@aws-sdk/credential-provider-sso": "3.358.0",
"@aws-sdk/credential-provider-web-identity": "3.357.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-process": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-process/-/credential-provider-process-3.357.0.tgz",
"integrity": "sha512-qFWWilFPsc2hR7O0KIhwcE78w+pVIK+uQR6MQMfdRyxUndgiuCorJwVjedc3yZtmnoELHF34j+m8whTBXv9E7Q==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-sso": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-sso/-/credential-provider-sso-3.358.0.tgz",
"integrity": "sha512-hKu5NshKohSDoHaXKyeCW88J8dBt4TMljrL+WswTMifuThO9ptyMq4PCdl4z7CNjIq6zo3ftc/uNf8TY7Ga8+w==",
"dependencies": {
"@aws-sdk/client-sso": "3.358.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/token-providers": "3.358.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/credential-provider-web-identity": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/credential-provider-web-identity/-/credential-provider-web-identity-3.357.0.tgz",
"integrity": "sha512-0KRRAFrXy5HJe2vqnCWCoCS+fQw7IoIj3KQsuURJMW4F+ifisxCgEsh3brJ2LQlN4ElWTRJhlrDHNZ/pd61D4w==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/eventstream-codec": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-codec/-/eventstream-codec-3.357.0.tgz",
"integrity": "sha512-bqenTHG6GH6aCk/Il+ooWXVVAZuc8lOgVEy9bE2hI49oVqT8zSuXxQB+w1WWyZoAOPcelsjayB1wfPub8VDBxQ==",
"dependencies": {
"@aws-crypto/crc32": "3.0.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-hex-encoding": "3.310.0",
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/eventstream-serde-browser": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-serde-browser/-/eventstream-serde-browser-3.357.0.tgz",
"integrity": "sha512-hBabtmwuspVHGSKnUccDiSIbg+IVoBThx6wYt6i4edbWAITHF3ADVKXy7icV400CAyG0XTZgxjE6FKpiDxj9rQ==",
"dependencies": {
"@aws-sdk/eventstream-serde-universal": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/eventstream-serde-config-resolver": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-serde-config-resolver/-/eventstream-serde-config-resolver-3.357.0.tgz",
"integrity": "sha512-E6rwk+1KFXhKmJ+v7JW5Uyyda1yN5XRVupCnCrtFsHFmhVGQxFacoUZIee3bfuCpC58dLSyESggxGpUd3XOSsw==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/eventstream-serde-node": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-serde-node/-/eventstream-serde-node-3.357.0.tgz",
"integrity": "sha512-boXDy+JWcPfHc9OIKV6I4Bh2XrLcg+eac+/LldNZFcDIB33/gHIM2CJw8u565Iebdz1NKEkP/QPPZbk2y+abPA==",
"dependencies": {
"@aws-sdk/eventstream-serde-universal": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/eventstream-serde-universal": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/eventstream-serde-universal/-/eventstream-serde-universal-3.357.0.tgz",
"integrity": "sha512-9/Wcdxx38XQAturqOAGYNCaLOzFVnW+xwxd4af9eNOfZfZ5PP5PRKBIpvKDsN26e3l4f3GodHx7MS1WB7BBc2w==",
"dependencies": {
"@aws-sdk/eventstream-codec": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/fetch-http-handler": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/fetch-http-handler/-/fetch-http-handler-3.357.0.tgz",
"integrity": "sha512-5sPloTO8y8fAnS/6/Sfp/aVoL9zuhzkLdWBORNzMazdynVNEzWKWCPZ27RQpgkaCDHiXjqUY4kfuFXAGkvFfDQ==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/querystring-builder": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/hash-node": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/hash-node/-/hash-node-3.357.0.tgz",
"integrity": "sha512-fq3LS9AxHKb7dTZkm6iM1TrGk6XOTZz96iEZPME1+vjiSEXGWuebHt87q92n+KozVGRypn9MId3lHOPBBjygNQ==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-buffer-from": "3.310.0",
"@aws-sdk/util-utf8": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/invalid-dependency": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/invalid-dependency/-/invalid-dependency-3.357.0.tgz",
"integrity": "sha512-HnCYZczf0VdyxMVMMxmA3QJAyyPSFbcMtZzgKbxVTWTG7GKpQe0psWZu/7O2Nk31mKg6vEUdiP1FylqLBsgMOA==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/is-array-buffer": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/is-array-buffer/-/is-array-buffer-3.310.0.tgz",
"integrity": "sha512-urnbcCR+h9NWUnmOtet/s4ghvzsidFmspfhYaHAmSRdy9yDjdjBJMFjjsn85A1ODUktztm+cVncXjQ38WCMjMQ==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-content-length": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-content-length/-/middleware-content-length-3.357.0.tgz",
"integrity": "sha512-zQOFEyzOXAgN4M54tYNWGxKxnyzY0WwYDTFzh9riJRmxN1hTEKHUKmze4nILIf5rkQmOG4kTf1qmfazjkvZAhw==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-endpoint": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-endpoint/-/middleware-endpoint-3.357.0.tgz",
"integrity": "sha512-ScJi0SL8X/Lyi0Fp5blg0QN/Z6PoRwV/ZJXd8dQkXSznkbSvJHfqPP0xk/w3GcQ1TKsu5YEPfeYy8ejcq+7Pgg==",
"dependencies": {
"@aws-sdk/middleware-serde": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/url-parser": "3.357.0",
"@aws-sdk/util-middleware": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-host-header": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-host-header/-/middleware-host-header-3.357.0.tgz",
"integrity": "sha512-HuGLcP7JP1qJ5wGT9GSlEknDaTSnOzHY4T6IPFuvFjAy3PvY5siQNm6+VRqdVS+n6/kzpL3JP5sAVM3aoxHT6Q==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-logger": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-logger/-/middleware-logger-3.357.0.tgz",
"integrity": "sha512-dncT3tr+lZ9+duZo52rASgO6AKVwRcsc2/T93gmaYVrJqI6WWAwQ7yML5s72l9ZjQ5LZ+4jjrgtlufavAS0eCg==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-recursion-detection": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-recursion-detection/-/middleware-recursion-detection-3.357.0.tgz",
"integrity": "sha512-AXC54IeDS3jC1dbbkYHML4STvBPcKZ4IJTWdjEK1RCOgqXd0Ze1cE1e21wyj1tM6prF03zLyvpBd+3TS++nqfA==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-retry": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-retry/-/middleware-retry-3.357.0.tgz",
"integrity": "sha512-ZCbXCYv3nglQqwREYxxpclrnR9MYPAnHlLcC8e9PbApqxGnaZdhoywxoqbgqT3hf/RM7kput4vEHDl1fyymcRQ==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/service-error-classification": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-middleware": "3.357.0",
"@aws-sdk/util-retry": "3.357.0",
"tslib": "^2.5.0",
"uuid": "^8.3.2"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-sdk-sts": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-sdk-sts/-/middleware-sdk-sts-3.357.0.tgz",
"integrity": "sha512-Ng2VjLrPiL02QOcs1qs9jG2boO4Gn+v3VIbOJLG4zXcfbSq55iIWtlmr2ljfw9vP5aLhWtcODfmKHS5Bp+019Q==",
"dependencies": {
"@aws-sdk/middleware-signing": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-serde": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-serde/-/middleware-serde-3.357.0.tgz",
"integrity": "sha512-bGI4kYuuEsFjlANbyJLyy4AovETnyf/SukgLOG7Qjbua+ZGuzvRhMsk21mBKKGrnsTO4PmtieJo6xClThGAN8g==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-signing": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-signing/-/middleware-signing-3.357.0.tgz",
"integrity": "sha512-yB9ewEqI6Fw1OrmKFrUypbCqN5ijk06UGPochybamMuPxxkwMT3bnrm7eezsCA+TZbJyKhpffpyobwuv+xGNrA==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/signature-v4": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-middleware": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-stack": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-stack/-/middleware-stack-3.357.0.tgz",
"integrity": "sha512-nNV+jfwGwmbOGZujAY/U8AW3EbVlxa9DJDLz3TPp/39o6Vu5KEzHJyDDNreo2k9V/TMvV+nOzHafufgPdagv7w==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/middleware-user-agent": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/middleware-user-agent/-/middleware-user-agent-3.357.0.tgz",
"integrity": "sha512-M/CsAXjGblZS4rEbMb0Dn9IXbfq4EjVaTHBfvuILU/dKRppWvjnm2lRtqCZ+LIT3ATbAjA3/dY7dWsjxQWwijA==",
"dependencies": {
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-endpoints": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/node-config-provider": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/node-config-provider/-/node-config-provider-3.357.0.tgz",
"integrity": "sha512-kwBIzKCaW3UWqLdELhy7TcN8itNMOjbzga530nalFILMvn2IxrkdKQhNgxGBXy6QK6kCOtH6OmcrG3/oZkLwig==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/node-http-handler": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/node-http-handler/-/node-http-handler-3.357.0.tgz",
"integrity": "sha512-uoab4xIJux+Q9hQ9A/vWEAjojtBQ0U4K7xEQVa0BXEv7MHH5zv51H+VtrelU1Ed6hsHq4Sx0bxBMFpbbWhNyjA==",
"dependencies": {
"@aws-sdk/abort-controller": "3.357.0",
"@aws-sdk/protocol-http": "3.357.0",
"@aws-sdk/querystring-builder": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/property-provider": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/property-provider/-/property-provider-3.357.0.tgz",
"integrity": "sha512-im4W0u8WaYxG7J7ko4Xl3OEzK3Mrm1Rz6/txTGe6hTIHlyUISu1ekOQJXK6XYPqNMn8v1G3BiQREoRXUEJFbHg==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/protocol-http": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/protocol-http/-/protocol-http-3.357.0.tgz",
"integrity": "sha512-w1JHiI50VEea7duDeAspUiKJmmdIQblvRyjVMOqWA6FIQAyDVuEiPX7/MdQr0ScxhtRQxHbP0I4MFyl7ctRQvA==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/querystring-builder": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/querystring-builder/-/querystring-builder-3.357.0.tgz",
"integrity": "sha512-aQcicqB6Y2cNaXPPwunz612a01SMiQQPsdz632F/3Lzn0ua82BJKobHOtaiTUlmVJ5Q4/EAeNfwZgL7tTUNtDQ==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-uri-escape": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/querystring-parser": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/querystring-parser/-/querystring-parser-3.357.0.tgz",
"integrity": "sha512-Svvq+atRNP9s2VxiklcUNgCzmt3T5kfs7X2C+yjmxHvOQTPjLNaNGbfC/vhjOK7aoXw0h+lBac48r5ymx1PbQA==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/service-error-classification": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/service-error-classification/-/service-error-classification-3.357.0.tgz",
"integrity": "sha512-VuXeL4g5vKO9HjgCZlxmH8Uv1FcqUSjmbPpQkbNtYIDck6u0qzM0rG+n0/1EjyQbPSr3MhW/pkWs5nx2Nljlyg==",
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/shared-ini-file-loader": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/shared-ini-file-loader/-/shared-ini-file-loader-3.357.0.tgz",
"integrity": "sha512-ceyqM4XxQe0Plb/oQAD2t1UOV2Iy4PFe1oAGM8dfJzYrRKu7zvMwru7/WaB3NYq+/mIY6RU+jjhRmjQ3GySVqA==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/signature-v4": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/signature-v4/-/signature-v4-3.357.0.tgz",
"integrity": "sha512-itt4/Jh9FqnzK30qIjXFBvM4J7zN4S/AAqsRMnaX7U4f/MV+1YxQHmzimpdMnsCXXs2jqFqKVRu6DewxJ3nbxg==",
"dependencies": {
"@aws-sdk/eventstream-codec": "3.357.0",
"@aws-sdk/is-array-buffer": "3.310.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-hex-encoding": "3.310.0",
"@aws-sdk/util-middleware": "3.357.0",
"@aws-sdk/util-uri-escape": "3.310.0",
"@aws-sdk/util-utf8": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/smithy-client": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/smithy-client/-/smithy-client-3.358.0.tgz",
"integrity": "sha512-oqctxWb9yAqCh4ENwUkt9MC01l5uKoy+QCiSUUhQ76k7R3lyGOge9ycyRyoKl+oZWvEpnjZevXQFqEfGzkL7bA==",
"dependencies": {
"@aws-sdk/middleware-stack": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-stream": "3.358.0",
"@smithy/types": "^1.0.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/token-providers": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/token-providers/-/token-providers-3.358.0.tgz",
"integrity": "sha512-vATKNCwNhCSo2LzvtkIzW9Yp2/aKNR032VPtIWlDtWGGFhkzGi4FPS0VTdfefxz4rqPWfBz53mh54d9xylsWVw==",
"dependencies": {
"@aws-sdk/client-sso-oidc": "3.358.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/shared-ini-file-loader": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/types": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/types/-/types-3.357.0.tgz",
"integrity": "sha512-/riCRaXg3p71BeWnShrai0y0QTdXcouPSM0Cn1olZbzTf7s71aLEewrc96qFrL70XhY4XvnxMpqQh+r43XIL3g==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/url-parser": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/url-parser/-/url-parser-3.357.0.tgz",
"integrity": "sha512-fAaU6cFsaAba01lCRsRJiYR/LfXvX2wudyEyutBVglE4dWSoSeu3QJNxImIzTBULfbiFhz59++NQ1JUVx88IVg==",
"dependencies": {
"@aws-sdk/querystring-parser": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/util-base64": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-base64/-/util-base64-3.310.0.tgz",
"integrity": "sha512-v3+HBKQvqgdzcbL+pFswlx5HQsd9L6ZTlyPVL2LS9nNXnCcR3XgGz9jRskikRUuUvUXtkSG1J88GAOnJ/apTPg==",
"dependencies": {
"@aws-sdk/util-buffer-from": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-body-length-browser": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-body-length-browser/-/util-body-length-browser-3.310.0.tgz",
"integrity": "sha512-sxsC3lPBGfpHtNTUoGXMQXLwjmR0zVpx0rSvzTPAuoVILVsp5AU/w5FphNPxD5OVIjNbZv9KsKTuvNTiZjDp9g==",
"dependencies": {
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/util-body-length-node": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-body-length-node/-/util-body-length-node-3.310.0.tgz",
"integrity": "sha512-2tqGXdyKhyA6w4zz7UPoS8Ip+7sayOg9BwHNidiGm2ikbDxm1YrCfYXvCBdwaJxa4hJfRVz+aL9e+d3GqPI9pQ==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-buffer-from": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-buffer-from/-/util-buffer-from-3.310.0.tgz",
"integrity": "sha512-i6LVeXFtGih5Zs8enLrt+ExXY92QV25jtEnTKHsmlFqFAuL3VBeod6boeMXkN2p9lbSVVQ1sAOOYZOHYbYkntw==",
"dependencies": {
"@aws-sdk/is-array-buffer": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-config-provider": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-config-provider/-/util-config-provider-3.310.0.tgz",
"integrity": "sha512-xIBaYo8dwiojCw8vnUcIL4Z5tyfb1v3yjqyJKJWV/dqKUFOOS0U591plmXbM+M/QkXyML3ypon1f8+BoaDExrg==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-defaults-mode-browser": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-defaults-mode-browser/-/util-defaults-mode-browser-3.358.0.tgz",
"integrity": "sha512-KGfw64wRL/gROLD4Gatda8cUsaNKNhSnx+yDDcG2WkFlFfLr6FHvTijpRxvIM2Jau2ZhcdGzbegLjsFxviTJAA==",
"dependencies": {
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"bowser": "^2.11.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">= 10.0.0"
}
},
"node_modules/@aws-sdk/util-defaults-mode-node": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-defaults-mode-node/-/util-defaults-mode-node-3.358.0.tgz",
"integrity": "sha512-2C5on0yppDS0xGpFkHRqfrG9TeTq6ive1hPX1V8UCkiI/TBQYl88XCKCKct8zTcejyK9klZUDGI8QQTan2UWkw==",
"dependencies": {
"@aws-sdk/config-resolver": "3.357.0",
"@aws-sdk/credential-provider-imds": "3.357.0",
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/property-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">= 10.0.0"
}
},
"node_modules/@aws-sdk/util-endpoints": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-endpoints/-/util-endpoints-3.357.0.tgz",
"integrity": "sha512-XHKyS5JClT9su9hDif715jpZiWHQF9gKZXER8tW0gOizU3R9cyWc9EsJ2BRhFNhi7nt/JF/CLUEc5qDx3ETbUw==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-hex-encoding": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-hex-encoding/-/util-hex-encoding-3.310.0.tgz",
"integrity": "sha512-sVN7mcCCDSJ67pI1ZMtk84SKGqyix6/0A1Ab163YKn+lFBQRMKexleZzpYzNGxYzmQS6VanP/cfU7NiLQOaSfA==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-locate-window": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-locate-window/-/util-locate-window-3.310.0.tgz",
"integrity": "sha512-qo2t/vBTnoXpjKxlsC2e1gBrRm80M3bId27r0BRB2VniSSe7bL1mmzM+/HFtujm0iAxtPM+aLEflLJlJeDPg0w==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-middleware": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-middleware/-/util-middleware-3.357.0.tgz",
"integrity": "sha512-pV1krjZs7BdahZBfsCJMatE8kcor7GFsBOWrQgQDm9T0We5b5xPpOO2vxAD0RytBpY8Ky2ELs/+qXMv7l5fWIA==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-retry": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-retry/-/util-retry-3.357.0.tgz",
"integrity": "sha512-SUqYJE9msbuOVq+vnUy+t0LH7XuYNFz66dSF8q6tedsbJK4j8tgya0I1Ct3m06ynGrXDJMaj39I7AXCyW9bjtw==",
"dependencies": {
"@aws-sdk/service-error-classification": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">= 14.0.0"
}
},
"node_modules/@aws-sdk/util-stream": {
"version": "3.358.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-stream/-/util-stream-3.358.0.tgz",
"integrity": "sha512-zUhpjxAXV2+0eALlTU6uXRYMs10XYpcYzl3NtLRe4wWgnrOOOZnF/t5LQDoKXOfaMdzwZ+i90+PYr+6JQ58+7g==",
"dependencies": {
"@aws-sdk/fetch-http-handler": "3.357.0",
"@aws-sdk/node-http-handler": "3.357.0",
"@aws-sdk/types": "3.357.0",
"@aws-sdk/util-base64": "3.310.0",
"@aws-sdk/util-buffer-from": "3.310.0",
"@aws-sdk/util-hex-encoding": "3.310.0",
"@aws-sdk/util-utf8": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-uri-escape": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-uri-escape/-/util-uri-escape-3.310.0.tgz",
"integrity": "sha512-drzt+aB2qo2LgtDoiy/3sVG8w63cgLkqFIa2NFlGpUgHFWTXkqtbgf4L5QdjRGKWhmZsnqkbtL7vkSWEcYDJ4Q==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-user-agent-browser": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-user-agent-browser/-/util-user-agent-browser-3.357.0.tgz",
"integrity": "sha512-JHaWlNIUkPNvXkqeDOrqFzAlAgdwZK5mZw7FQnCRvf8tdSogpGZSkuyb9Z6rLD9gC40Srbc2nepO1cFpeMsDkA==",
"dependencies": {
"@aws-sdk/types": "3.357.0",
"bowser": "^2.11.0",
"tslib": "^2.5.0"
}
},
"node_modules/@aws-sdk/util-user-agent-node": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-user-agent-node/-/util-user-agent-node-3.357.0.tgz",
"integrity": "sha512-RdpQoaJWQvcS99TVgSbT451iGrlH4qpWUWFA9U1IRhxOSsmC1hz8ME7xc8nci9SREx/ZlfT3ai6LpoAzAtIEMA==",
"dependencies": {
"@aws-sdk/node-config-provider": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
},
"peerDependencies": {
"aws-crt": ">=1.0.0"
},
"peerDependenciesMeta": {
"aws-crt": {
"optional": true
}
}
},
"node_modules/@aws-sdk/util-utf8": {
"version": "3.310.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-utf8/-/util-utf8-3.310.0.tgz",
"integrity": "sha512-DnLfFT8uCO22uOJc0pt0DsSNau1GTisngBCDw8jQuWT5CqogMJu4b/uXmwEqfj8B3GX6Xsz8zOd6JpRlPftQoA==",
"dependencies": {
"@aws-sdk/util-buffer-from": "3.310.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@aws-sdk/util-utf8-browser": {
"version": "3.259.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-utf8-browser/-/util-utf8-browser-3.259.0.tgz",
"integrity": "sha512-UvFa/vR+e19XookZF8RzFZBrw2EUkQWxiBW0yYQAhvk3C+QVGl0H3ouca8LDBlBfQKXwmW3huo/59H8rwb1wJw==",
"dependencies": {
"tslib": "^2.3.1"
}
},
"node_modules/@aws-sdk/util-waiter": {
"version": "3.357.0",
"resolved": "https://registry.npmjs.org/@aws-sdk/util-waiter/-/util-waiter-3.357.0.tgz",
"integrity": "sha512-jQQGA5G8bm0JP5C4U85VzMpkFHdeeT7fOSUncXLG9Sh8Ambzi4XTud8m5/dA7aNJkvPwZeIF9QdgWCOzpkp1xA==",
"dependencies": {
"@aws-sdk/abort-controller": "3.357.0",
"@aws-sdk/types": "3.357.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@babel/code-frame": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/code-frame/-/code-frame-7.22.5.tgz",
"integrity": "sha512-Xmwn266vad+6DAqEB2A6V/CcZVp62BbwVmcOJc2RPuwih1kw02TjQvWVWlcKGbBPd+8/0V5DEkOcizRGYsspYQ==",
"dependencies": {
"@babel/highlight": "^7.22.5"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-module-imports": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/helper-module-imports/-/helper-module-imports-7.22.5.tgz",
"integrity": "sha512-8Dl6+HD/cKifutF5qGd/8ZJi84QeAKh+CEe1sBzz8UayBBGg1dAIJrdHOcOM5b2MpzWL2yuotJTtGjETq0qjXg==",
"dependencies": {
"@babel/types": "^7.22.5"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-string-parser": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.22.5.tgz",
"integrity": "sha512-mM4COjgZox8U+JcXQwPijIZLElkgEpO5rsERVDJTc2qfCDfERyob6k5WegS14SX18IIjv+XD+GrqNumY5JRCDw==",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-validator-identifier": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-identifier/-/helper-validator-identifier-7.22.5.tgz",
"integrity": "sha512-aJXu+6lErq8ltp+JhkJUfk1MTGyuA4v7f3pA+BJ5HLfNC6nAQ0Cpi9uOquUj8Hehg0aUiHzWQbOVJGao6ztBAQ==",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/highlight": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/highlight/-/highlight-7.22.5.tgz",
"integrity": "sha512-BSKlD1hgnedS5XRnGOljZawtag7H1yPfQp0tdNJCHoH6AZ+Pcm9VvkrK59/Yy593Ypg0zMxH2BxD1VPYUQ7UIw==",
"dependencies": {
"@babel/helper-validator-identifier": "^7.22.5",
"chalk": "^2.0.0",
"js-tokens": "^4.0.0"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/runtime": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/runtime/-/runtime-7.22.5.tgz",
"integrity": "sha512-ecjvYlnAaZ/KVneE/OdKYBYfgXV3Ptu6zQWmgEF7vwKhQnvVS6bjMD2XYgj+SNvQ1GfK/pjgokfPkC/2CO8CuA==",
"dependencies": {
"regenerator-runtime": "^0.13.11"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/types": {
"version": "7.22.5",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.22.5.tgz",
"integrity": "sha512-zo3MIHGOkPOfoRXitsgHLjEXmlDaD/5KU1Uzuc9GNiZPhSqVxVRtxuPaSBZDsYZ9qV88AjtMtWW7ww98loJ9KA==",
"dependencies": {
"@babel/helper-string-parser": "^7.22.5",
"@babel/helper-validator-identifier": "^7.22.5",
"to-fast-properties": "^2.0.0"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@emotion/babel-plugin": {
"version": "11.11.0",
"resolved": "https://registry.npmjs.org/@emotion/babel-plugin/-/babel-plugin-11.11.0.tgz",
"integrity": "sha512-m4HEDZleaaCH+XgDDsPF15Ht6wTLsgDTeR3WYj9Q/k76JtWhrJjcP4+/XlG8LGT/Rol9qUfOIztXeA84ATpqPQ==",
"dependencies": {
"@babel/helper-module-imports": "^7.16.7",
"@babel/runtime": "^7.18.3",
"@emotion/hash": "^0.9.1",
"@emotion/memoize": "^0.8.1",
"@emotion/serialize": "^1.1.2",
"babel-plugin-macros": "^3.1.0",
"convert-source-map": "^1.5.0",
"escape-string-regexp": "^4.0.0",
"find-root": "^1.1.0",
"source-map": "^0.5.7",
"stylis": "4.2.0"
}
},
"node_modules/@emotion/cache": {
"version": "11.11.0",
"resolved": "https://registry.npmjs.org/@emotion/cache/-/cache-11.11.0.tgz",
"integrity": "sha512-P34z9ssTCBi3e9EI1ZsWpNHcfY1r09ZO0rZbRO2ob3ZQMnFI35jB536qoXbkdesr5EUhYi22anuEJuyxifaqAQ==",
"dependencies": {
"@emotion/memoize": "^0.8.1",
"@emotion/sheet": "^1.2.2",
"@emotion/utils": "^1.2.1",
"@emotion/weak-memoize": "^0.3.1",
"stylis": "4.2.0"
}
},
"node_modules/@emotion/hash": {
"version": "0.9.1",
"resolved": "https://registry.npmjs.org/@emotion/hash/-/hash-0.9.1.tgz",
"integrity": "sha512-gJB6HLm5rYwSLI6PQa+X1t5CFGrv1J1TWG+sOyMCeKz2ojaj6Fnl/rZEspogG+cvqbt4AE/2eIyD2QfLKTBNlQ=="
},
"node_modules/@emotion/is-prop-valid": {
"version": "1.2.1",
"resolved": "https://registry.npmjs.org/@emotion/is-prop-valid/-/is-prop-valid-1.2.1.tgz",
"integrity": "sha512-61Mf7Ufx4aDxx1xlDeOm8aFFigGHE4z+0sKCa+IHCeZKiyP9RLD0Mmx7m8b9/Cf37f7NAvQOOJAbQQGVr5uERw==",
"dependencies": {
"@emotion/memoize": "^0.8.1"
}
},
"node_modules/@emotion/memoize": {
"version": "0.8.1",
"resolved": "https://registry.npmjs.org/@emotion/memoize/-/memoize-0.8.1.tgz",
"integrity": "sha512-W2P2c/VRW1/1tLox0mVUalvnWXxavmv/Oum2aPsRcoDJuob75FC3Y8FbpfLwUegRcxINtGUMPq0tFCvYNTBXNA=="
},
"node_modules/@emotion/react": {
"version": "11.11.1",
"resolved": "https://registry.npmjs.org/@emotion/react/-/react-11.11.1.tgz",
"integrity": "sha512-5mlW1DquU5HaxjLkfkGN1GA/fvVGdyHURRiX/0FHl2cfIfRxSOfmxEH5YS43edp0OldZrZ+dkBKbngxcNCdZvA==",
"dependencies": {
"@babel/runtime": "^7.18.3",
"@emotion/babel-plugin": "^11.11.0",
"@emotion/cache": "^11.11.0",
"@emotion/serialize": "^1.1.2",
"@emotion/use-insertion-effect-with-fallbacks": "^1.0.1",
"@emotion/utils": "^1.2.1",
"@emotion/weak-memoize": "^0.3.1",
"hoist-non-react-statics": "^3.3.1"
},
"peerDependencies": {
"react": ">=16.8.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@emotion/serialize": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/@emotion/serialize/-/serialize-1.1.2.tgz",
"integrity": "sha512-zR6a/fkFP4EAcCMQtLOhIgpprZOwNmCldtpaISpvz348+DP4Mz8ZoKaGGCQpbzepNIUWbq4w6hNZkwDyKoS+HA==",
"dependencies": {
"@emotion/hash": "^0.9.1",
"@emotion/memoize": "^0.8.1",
"@emotion/unitless": "^0.8.1",
"@emotion/utils": "^1.2.1",
"csstype": "^3.0.2"
}
},
"node_modules/@emotion/sheet": {
"version": "1.2.2",
"resolved": "https://registry.npmjs.org/@emotion/sheet/-/sheet-1.2.2.tgz",
"integrity": "sha512-0QBtGvaqtWi+nx6doRwDdBIzhNdZrXUppvTM4dtZZWEGTXL/XE/yJxLMGlDT1Gt+UHH5IX1n+jkXyytE/av7OA=="
},
"node_modules/@emotion/styled": {
"version": "11.11.0",
"resolved": "https://registry.npmjs.org/@emotion/styled/-/styled-11.11.0.tgz",
"integrity": "sha512-hM5Nnvu9P3midq5aaXj4I+lnSfNi7Pmd4EWk1fOZ3pxookaQTNew6bp4JaCBYM4HVFZF9g7UjJmsUmC2JlxOng==",
"dependencies": {
"@babel/runtime": "^7.18.3",
"@emotion/babel-plugin": "^11.11.0",
"@emotion/is-prop-valid": "^1.2.1",
"@emotion/serialize": "^1.1.2",
"@emotion/use-insertion-effect-with-fallbacks": "^1.0.1",
"@emotion/utils": "^1.2.1"
},
"peerDependencies": {
"@emotion/react": "^11.0.0-rc.0",
"react": ">=16.8.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@emotion/unitless": {
"version": "0.8.1",
"resolved": "https://registry.npmjs.org/@emotion/unitless/-/unitless-0.8.1.tgz",
"integrity": "sha512-KOEGMu6dmJZtpadb476IsZBclKvILjopjUii3V+7MnXIQCYh8W3NgNcgwo21n9LXZX6EDIKvqfjYxXebDwxKmQ=="
},
"node_modules/@emotion/use-insertion-effect-with-fallbacks": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/@emotion/use-insertion-effect-with-fallbacks/-/use-insertion-effect-with-fallbacks-1.0.1.tgz",
"integrity": "sha512-jT/qyKZ9rzLErtrjGgdkMBn2OP8wl0G3sQlBb3YPryvKHsjvINUhVaPFfP+fpBcOkmrVOVEEHQFJ7nbj2TH2gw==",
"peerDependencies": {
"react": ">=16.8.0"
}
},
"node_modules/@emotion/utils": {
"version": "1.2.1",
"resolved": "https://registry.npmjs.org/@emotion/utils/-/utils-1.2.1.tgz",
"integrity": "sha512-Y2tGf3I+XVnajdItskUCn6LX+VUDmP6lTL4fcqsXAv43dnlbZiuW4MWQW38rW/BVWSE7Q/7+XQocmpnRYILUmg=="
},
"node_modules/@emotion/weak-memoize": {
"version": "0.3.1",
"resolved": "https://registry.npmjs.org/@emotion/weak-memoize/-/weak-memoize-0.3.1.tgz",
"integrity": "sha512-EsBwpc7hBUJWAsNPBmJy4hxWx12v6bshQsldrVmjxJoc3isbxhOrF2IcCpaXxfvq03NwkI7sbsOLXbYuqF/8Ww=="
},
"node_modules/@mui/base": {
"version": "5.0.0-beta.5",
"resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.5.tgz",
"integrity": "sha512-vy3TWLQYdGNecTaufR4wDNQFV2WEg6wRPi6BVbx6q1vP3K1mbxIn1+XOqOzfYBXjFHvMx0gZAo2TgWbaqfgvAA==",
"dependencies": {
"@babel/runtime": "^7.22.5",
"@emotion/is-prop-valid": "^1.2.1",
"@mui/types": "^7.2.4",
"@mui/utils": "^5.13.6",
"@popperjs/core": "^2.11.8",
"clsx": "^1.2.1",
"prop-types": "^15.8.1",
"react-is": "^18.2.0"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0",
"react-dom": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/core-downloads-tracker": {
"version": "5.13.4",
"resolved": "https://registry.npmjs.org/@mui/core-downloads-tracker/-/core-downloads-tracker-5.13.4.tgz",
"integrity": "sha512-yFrMWcrlI0TqRN5jpb6Ma9iI7sGTHpytdzzL33oskFHNQ8UgrtPas33Y1K7sWAMwCrr1qbWDrOHLAQG4tAzuSw==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
}
},
"node_modules/@mui/icons-material": {
"version": "5.11.16",
"resolved": "https://registry.npmjs.org/@mui/icons-material/-/icons-material-5.11.16.tgz",
"integrity": "sha512-oKkx9z9Kwg40NtcIajF9uOXhxiyTZrrm9nmIJ4UjkU2IdHpd4QVLbCc/5hZN/y0C6qzi2Zlxyr9TGddQx2vx2A==",
"dependencies": {
"@babel/runtime": "^7.21.0"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@mui/material": "^5.0.0",
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/lab": {
"version": "5.0.0-alpha.134",
"resolved": "https://registry.npmjs.org/@mui/lab/-/lab-5.0.0-alpha.134.tgz",
"integrity": "sha512-GhvuM2dNOi6hzjbeGEocWVozgyyeUn7RBmZhLFtniROauxmPCZMcTsEU+GAxmpyYppqHuI8flP6tGKgMuEAK/g==",
"dependencies": {
"@babel/runtime": "^7.21.0",
"@mui/base": "5.0.0-beta.4",
"@mui/system": "^5.13.5",
"@mui/types": "^7.2.4",
"@mui/utils": "^5.13.1",
"clsx": "^1.2.1",
"prop-types": "^15.8.1",
"react-is": "^18.2.0"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@emotion/react": "^11.5.0",
"@emotion/styled": "^11.3.0",
"@mui/material": "^5.0.0",
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0",
"react-dom": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@emotion/react": {
"optional": true
},
"@emotion/styled": {
"optional": true
},
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/lab/node_modules/@mui/base": {
"version": "5.0.0-beta.4",
"resolved": "https://registry.npmjs.org/@mui/base/-/base-5.0.0-beta.4.tgz",
"integrity": "sha512-ejhtqYJpjDgHGEljjMBQWZ22yEK0OzIXNa7toJmmXsP4TT3W7xVy8bTJ0TniPDf+JNjrsgfgiFTDGdlEhV1E+g==",
"dependencies": {
"@babel/runtime": "^7.21.0",
"@emotion/is-prop-valid": "^1.2.1",
"@mui/types": "^7.2.4",
"@mui/utils": "^5.13.1",
"@popperjs/core": "^2.11.8",
"clsx": "^1.2.1",
"prop-types": "^15.8.1",
"react-is": "^18.2.0"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0",
"react-dom": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/material": {
"version": "5.13.6",
"resolved": "https://registry.npmjs.org/@mui/material/-/material-5.13.6.tgz",
"integrity": "sha512-/c2ZApeQm2sTYdQXjqEnldaBMBcUEiyu2VRS6bS39ZeNaAcCLBQbYocLR46R+f0S5dgpBzB0T4AsOABPOFYZ5Q==",
"dependencies": {
"@babel/runtime": "^7.22.5",
"@mui/base": "5.0.0-beta.5",
"@mui/core-downloads-tracker": "^5.13.4",
"@mui/system": "^5.13.6",
"@mui/types": "^7.2.4",
"@mui/utils": "^5.13.6",
"@types/react-transition-group": "^4.4.6",
"clsx": "^1.2.1",
"csstype": "^3.1.2",
"prop-types": "^15.8.1",
"react-is": "^18.2.0",
"react-transition-group": "^4.4.5"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@emotion/react": "^11.5.0",
"@emotion/styled": "^11.3.0",
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0",
"react-dom": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@emotion/react": {
"optional": true
},
"@emotion/styled": {
"optional": true
},
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/private-theming": {
"version": "5.13.1",
"resolved": "https://registry.npmjs.org/@mui/private-theming/-/private-theming-5.13.1.tgz",
"integrity": "sha512-HW4npLUD9BAkVppOUZHeO1FOKUJWAwbpy0VQoGe3McUYTlck1HezGHQCfBQ5S/Nszi7EViqiimECVl9xi+/WjQ==",
"dependencies": {
"@babel/runtime": "^7.21.0",
"@mui/utils": "^5.13.1",
"prop-types": "^15.8.1"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/styled-engine": {
"version": "5.13.2",
"resolved": "https://registry.npmjs.org/@mui/styled-engine/-/styled-engine-5.13.2.tgz",
"integrity": "sha512-VCYCU6xVtXOrIN8lcbuPmoG+u7FYuOERG++fpY74hPpEWkyFQG97F+/XfTQVYzlR2m7nPjnwVUgATcTCMEaMvw==",
"dependencies": {
"@babel/runtime": "^7.21.0",
"@emotion/cache": "^11.11.0",
"csstype": "^3.1.2",
"prop-types": "^15.8.1"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@emotion/react": "^11.4.1",
"@emotion/styled": "^11.3.0",
"react": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@emotion/react": {
"optional": true
},
"@emotion/styled": {
"optional": true
}
}
},
"node_modules/@mui/system": {
"version": "5.13.6",
"resolved": "https://registry.npmjs.org/@mui/system/-/system-5.13.6.tgz",
"integrity": "sha512-G3Xr28uLqU3DyF6r2LQkHGw/ku4P0AHzlKVe7FGXOPl7X1u+hoe2xxj8Vdiq/69II/mh9OP21i38yBWgWb7WgQ==",
"dependencies": {
"@babel/runtime": "^7.22.5",
"@mui/private-theming": "^5.13.1",
"@mui/styled-engine": "^5.13.2",
"@mui/types": "^7.2.4",
"@mui/utils": "^5.13.6",
"clsx": "^1.2.1",
"csstype": "^3.1.2",
"prop-types": "^15.8.1"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"@emotion/react": "^11.5.0",
"@emotion/styled": "^11.3.0",
"@types/react": "^17.0.0 || ^18.0.0",
"react": "^17.0.0 || ^18.0.0"
},
"peerDependenciesMeta": {
"@emotion/react": {
"optional": true
},
"@emotion/styled": {
"optional": true
},
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/types": {
"version": "7.2.4",
"resolved": "https://registry.npmjs.org/@mui/types/-/types-7.2.4.tgz",
"integrity": "sha512-LBcwa8rN84bKF+f5sDyku42w1NTxaPgPyYKODsh01U1fVstTClbUoSA96oyRBnSNyEiAVjKm6Gwx9vjR+xyqHA==",
"peerDependencies": {
"@types/react": "*"
},
"peerDependenciesMeta": {
"@types/react": {
"optional": true
}
}
},
"node_modules/@mui/utils": {
"version": "5.13.6",
"resolved": "https://registry.npmjs.org/@mui/utils/-/utils-5.13.6.tgz",
"integrity": "sha512-ggNlxl5NPSbp+kNcQLmSig6WVB0Id+4gOxhx644987v4fsji+CSXc+MFYLocFB/x4oHtzCUlSzbVHlJfP/fXoQ==",
"dependencies": {
"@babel/runtime": "^7.22.5",
"@types/prop-types": "^15.7.5",
"@types/react-is": "^18.2.0",
"prop-types": "^15.8.1",
"react-is": "^18.2.0"
},
"engines": {
"node": ">=12.0.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/mui"
},
"peerDependencies": {
"react": "^17.0.0 || ^18.0.0"
}
},
"node_modules/@next/env": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/env/-/env-13.4.7.tgz",
"integrity": "sha512-ZlbiFulnwiFsW9UV1ku1OvX/oyIPLtMk9p/nnvDSwI0s7vSoZdRtxXNsaO+ZXrLv/pMbXVGq4lL8TbY9iuGmVw=="
},
"node_modules/@next/swc-darwin-arm64": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-13.4.7.tgz",
"integrity": "sha512-VZTxPv1b59KGiv/pZHTO5Gbsdeoxcj2rU2cqJu03btMhHpn3vwzEK0gUSVC/XW96aeGO67X+cMahhwHzef24/w==",
"cpu": [
"arm64"
],
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-darwin-x64": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-13.4.7.tgz",
"integrity": "sha512-gO2bw+2Ymmga+QYujjvDz9955xvYGrWofmxTq7m70b9pDPvl7aDFABJOZ2a8SRCuSNB5mXU8eTOmVVwyp/nAew==",
"cpu": [
"x64"
],
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-linux-arm64-gnu": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-13.4.7.tgz",
"integrity": "sha512-6cqp3vf1eHxjIDhEOc7Mh/s8z1cwc/l5B6ZNkOofmZVyu1zsbEM5Hmx64s12Rd9AYgGoiCz4OJ4M/oRnkE16/Q==",
"cpu": [
"arm64"
],
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-linux-arm64-musl": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-13.4.7.tgz",
"integrity": "sha512-T1kD2FWOEy5WPidOn1si0rYmWORNch4a/NR52Ghyp4q7KyxOCuiOfZzyhVC5tsLIBDH3+cNdB5DkD9afpNDaOw==",
"cpu": [
"arm64"
],
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-linux-x64-gnu": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-13.4.7.tgz",
"integrity": "sha512-zaEC+iEiAHNdhl6fuwl0H0shnTzQoAoJiDYBUze8QTntE/GNPfTYpYboxF5LRYIjBwETUatvE0T64W6SKDipvg==",
"cpu": [
"x64"
],
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-linux-x64-musl": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-13.4.7.tgz",
"integrity": "sha512-X6r12F8d8SKAtYJqLZBBMIwEqcTRvUdVm+xIq+l6pJqlgT2tNsLLf2i5Cl88xSsIytBICGsCNNHd+siD2fbWBA==",
"cpu": [
"x64"
],
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-win32-arm64-msvc": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-13.4.7.tgz",
"integrity": "sha512-NPnmnV+vEIxnu6SUvjnuaWRglZzw4ox5n/MQTxeUhb5iwVWFedolPFebMNwgrWu4AELwvTdGtWjqof53AiWHcw==",
"cpu": [
"arm64"
],
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-win32-ia32-msvc": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-win32-ia32-msvc/-/swc-win32-ia32-msvc-13.4.7.tgz",
"integrity": "sha512-6Hxijm6/a8XqLQpOOf/XuwWRhcuc/g4rBB2oxjgCMuV9Xlr2bLs5+lXyh8w9YbAUMYR3iC9mgOlXbHa79elmXw==",
"cpu": [
"ia32"
],
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@next/swc-win32-x64-msvc": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-13.4.7.tgz",
"integrity": "sha512-sW9Yt36Db1nXJL+mTr2Wo0y+VkPWeYhygvcHj1FF0srVtV+VoDjxleKtny21QHaG05zdeZnw2fCtf2+dEqgwqA==",
"cpu": [
"x64"
],
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">= 10"
}
},
"node_modules/@popperjs/core": {
"version": "2.11.8",
"resolved": "https://registry.npmjs.org/@popperjs/core/-/core-2.11.8.tgz",
"integrity": "sha512-P1st0aksCrn9sGZhp8GMYwBnQsbvAWsZAX44oXNNvLHGqAOcoVxmjZiohstwQ7SqKnbR47akdNi+uleWD8+g6A==",
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/popperjs"
}
},
"node_modules/@smithy/protocol-http": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@smithy/protocol-http/-/protocol-http-1.1.0.tgz",
"integrity": "sha512-H5y/kZOqfJSqRkwtcAoVbqONmhdXwSgYNJ1Glk5Ry8qlhVVy5qUzD9EklaCH8/XLnoCsLO/F/Giee8MIvaBRkg==",
"dependencies": {
"@smithy/types": "^1.1.0",
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@smithy/types": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@smithy/types/-/types-1.1.0.tgz",
"integrity": "sha512-KzmvisMmuwD2jZXuC9e65JrgsZM97y5NpDU7g347oB+Q+xQLU6hQZ5zFNNbEfwwOJHoOvEVTna+dk1h/lW7alw==",
"dependencies": {
"tslib": "^2.5.0"
},
"engines": {
"node": ">=14.0.0"
}
},
"node_modules/@swc/helpers": {
"version": "0.5.1",
"resolved": "https://registry.npmjs.org/@swc/helpers/-/helpers-0.5.1.tgz",
"integrity": "sha512-sJ902EfIzn1Fa+qYmjdQqh8tPsoxyBz+8yBKC2HKUxyezKJFwPGOn7pv4WY6QuQW//ySQi5lJjA/ZT9sNWWNTg==",
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@types/parse-json": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/@types/parse-json/-/parse-json-4.0.0.tgz",
"integrity": "sha512-//oorEZjL6sbPcKUaCdIGlIUeH26mgzimjBB77G6XRgnDl/L5wOnpyBGRe/Mmf5CVW3PwEBE1NjiMZ/ssFh4wA=="
},
"node_modules/@types/prop-types": {
"version": "15.7.5",
"resolved": "https://registry.npmjs.org/@types/prop-types/-/prop-types-15.7.5.tgz",
"integrity": "sha512-JCB8C6SnDoQf0cNycqd/35A7MjcnK+ZTqE7judS6o7utxUCg6imJg3QK2qzHKszlTjcj2cn+NwMB2i96ubpj7w=="
},
"node_modules/@types/react": {
"version": "18.2.14",
"resolved": "https://registry.npmjs.org/@types/react/-/react-18.2.14.tgz",
"integrity": "sha512-A0zjq+QN/O0Kpe30hA1GidzyFjatVvrpIvWLxD+xv67Vt91TWWgco9IvrJBkeyHm1trGaFS/FSGqPlhyeZRm0g==",
"dependencies": {
"@types/prop-types": "*",
"@types/scheduler": "*",
"csstype": "^3.0.2"
}
},
"node_modules/@types/react-is": {
"version": "18.2.1",
"resolved": "https://registry.npmjs.org/@types/react-is/-/react-is-18.2.1.tgz",
"integrity": "sha512-wyUkmaaSZEzFZivD8F2ftSyAfk6L+DfFliVj/mYdOXbVjRcS87fQJLTnhk6dRZPuJjI+9g6RZJO4PNCngUrmyw==",
"dependencies": {
"@types/react": "*"
}
},
"node_modules/@types/react-transition-group": {
"version": "4.4.6",
"resolved": "https://registry.npmjs.org/@types/react-transition-group/-/react-transition-group-4.4.6.tgz",
"integrity": "sha512-VnCdSxfcm08KjsJVQcfBmhEQAPnLB8G08hAxn39azX1qYBQ/5RVQuoHuKIcfKOdncuaUvEpFKFzEvbtIMsfVew==",
"dependencies": {
"@types/react": "*"
}
},
"node_modules/@types/scheduler": {
"version": "0.16.3",
"resolved": "https://registry.npmjs.org/@types/scheduler/-/scheduler-0.16.3.tgz",
"integrity": "sha512-5cJ8CB4yAx7BH1oMvdU0Jh9lrEXyPkar6F9G/ERswkCuvP4KQZfZkSjcMbAICCpQTN4OuZn8tz0HiKv9TGZgrQ=="
},
"node_modules/ansi-styles": {
"version": "3.2.1",
"resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-3.2.1.tgz",
"integrity": "sha512-VT0ZI6kZRdTh8YyJw3SMbYm/u+NqfsAxEpWO0Pf9sq8/e94WxxOpPKx9FR1FlyCtOVDNOQ+8ntlqFxiRc+r5qA==",
"dependencies": {
"color-convert": "^1.9.0"
},
"engines": {
"node": ">=4"
}
},
"node_modules/babel-plugin-macros": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/babel-plugin-macros/-/babel-plugin-macros-3.1.0.tgz",
"integrity": "sha512-Cg7TFGpIr01vOQNODXOOaGz2NpCU5gl8x1qJFbb6hbZxR7XrcE2vtbAsTAbJ7/xwJtUuJEw8K8Zr/AE0LHlesg==",
"dependencies": {
"@babel/runtime": "^7.12.5",
"cosmiconfig": "^7.0.0",
"resolve": "^1.19.0"
},
"engines": {
"node": ">=10",
"npm": ">=6"
}
},
"node_modules/bowser": {
"version": "2.11.0",
"resolved": "https://registry.npmjs.org/bowser/-/bowser-2.11.0.tgz",
"integrity": "sha512-AlcaJBi/pqqJBIQ8U9Mcpc9i8Aqxn88Skv5d+xBX006BY5u8N3mGLHa5Lgppa7L/HfwgwLgZ6NYs+Ag6uUmJRA=="
},
"node_modules/busboy": {
"version": "1.6.0",
"resolved": "https://registry.npmjs.org/busboy/-/busboy-1.6.0.tgz",
"integrity": "sha512-8SFQbg/0hQ9xy3UNTB0YEnsNBbWfhf7RtnzpL7TkBiTBRfrQ9Fxcnz7VJsleJpyp6rVLvXiuORqjlHi5q+PYuA==",
"dependencies": {
"streamsearch": "^1.1.0"
},
"engines": {
"node": ">=10.16.0"
}
},
"node_modules/callsites": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/callsites/-/callsites-3.1.0.tgz",
"integrity": "sha512-P8BjAsXvZS+VIDUI11hHCQEv74YT67YUi5JJFNWIqL235sBmjX4+qx9Muvls5ivyNENctx46xQLQ3aTuE7ssaQ==",
"engines": {
"node": ">=6"
}
},
"node_modules/caniuse-lite": {
"version": "1.0.30001507",
"resolved": "https://registry.npmjs.org/caniuse-lite/-/caniuse-lite-1.0.30001507.tgz",
"integrity": "sha512-SFpUDoSLCaE5XYL2jfqe9ova/pbQHEmbheDf5r4diNwbAgR3qxM9NQtfsiSscjqoya5K7kFcHPUQ+VsUkIJR4A==",
"funding": [
{
"type": "opencollective",
"url": "https://opencollective.com/browserslist"
},
{
"type": "tidelift",
"url": "https://tidelift.com/funding/github/npm/caniuse-lite"
},
{
"type": "github",
"url": "https://github.com/sponsors/ai"
}
]
},
"node_modules/chalk": {
"version": "2.4.2",
"resolved": "https://registry.npmjs.org/chalk/-/chalk-2.4.2.tgz",
"integrity": "sha512-Mti+f9lpJNcwF4tWV8/OrTTtF1gZi+f8FqlyAdouralcFWFQWF2+NgCHShjkCb+IFBLq9buZwE1xckQU4peSuQ==",
"dependencies": {
"ansi-styles": "^3.2.1",
"escape-string-regexp": "^1.0.5",
"supports-color": "^5.3.0"
},
"engines": {
"node": ">=4"
}
},
"node_modules/chalk/node_modules/escape-string-regexp": {
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/escape-string-regexp/-/escape-string-regexp-1.0.5.tgz",
"integrity": "sha512-vbRorB5FUQWvla16U8R/qgaFIya2qGzwDrNmCZuYKrbdSUMG6I1ZCGQRefkRVhuOkIGVne7BQ35DSfo1qvJqFg==",
"engines": {
"node": ">=0.8.0"
}
},
"node_modules/client-only": {
"version": "0.0.1",
"resolved": "https://registry.npmjs.org/client-only/-/client-only-0.0.1.tgz",
"integrity": "sha512-IV3Ou0jSMzZrd3pZ48nLkT9DA7Ag1pnPzaiQhpW7c3RbcqqzvzzVu+L8gfqMp/8IM2MQtSiqaCxrrcfu8I8rMA=="
},
"node_modules/clsx": {
"version": "1.2.1",
"resolved": "https://registry.npmjs.org/clsx/-/clsx-1.2.1.tgz",
"integrity": "sha512-EcR6r5a8bj6pu3ycsa/E/cKVGuTgZJZdsyUYHOksG/UHIiKfjxzRxYJpyVBwYaQeOvghal9fcc4PidlgzugAQg==",
"engines": {
"node": ">=6"
}
},
"node_modules/color-convert": {
"version": "1.9.3",
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-1.9.3.tgz",
"integrity": "sha512-QfAUtd+vFdAtFQcC8CCyYt1fYWxSqAiK2cSD6zDB8N3cpsEBAvRxp9zOGg6G/SHHJYAT88/az/IuDGALsNVbGg==",
"dependencies": {
"color-name": "1.1.3"
}
},
"node_modules/color-name": {
"version": "1.1.3",
"resolved": "https://registry.npmjs.org/color-name/-/color-name-1.1.3.tgz",
"integrity": "sha512-72fSenhMw2HZMTVHeCA9KCmpEIbzWiQsjN+BHcBbS9vr1mtt+vJjPdksIBNUmKAW8TFUDPJK5SUU3QhE9NEXDw=="
},
"node_modules/convert-source-map": {
"version": "1.9.0",
"resolved": "https://registry.npmjs.org/convert-source-map/-/convert-source-map-1.9.0.tgz",
"integrity": "sha512-ASFBup0Mz1uyiIjANan1jzLQami9z1PoYSZCiiYW2FczPbenXc45FZdBZLzOT+r6+iciuEModtmCti+hjaAk0A=="
},
"node_modules/cosmiconfig": {
"version": "7.1.0",
"resolved": "https://registry.npmjs.org/cosmiconfig/-/cosmiconfig-7.1.0.tgz",
"integrity": "sha512-AdmX6xUzdNASswsFtmwSt7Vj8po9IuqXm0UXz7QKPuEUmPB4XyjGfaAr2PSuELMwkRMVH1EpIkX5bTZGRB3eCA==",
"dependencies": {
"@types/parse-json": "^4.0.0",
"import-fresh": "^3.2.1",
"parse-json": "^5.0.0",
"path-type": "^4.0.0",
"yaml": "^1.10.0"
},
"engines": {
"node": ">=10"
}
},
"node_modules/csstype": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/csstype/-/csstype-3.1.2.tgz",
"integrity": "sha512-I7K1Uu0MBPzaFKg4nI5Q7Vs2t+3gWWW648spaF+Rg7pI9ds18Ugn+lvg4SHczUdKlHI5LWBXyqfS8+DufyBsgQ=="
},
"node_modules/dom-helpers": {
"version": "5.2.1",
"resolved": "https://registry.npmjs.org/dom-helpers/-/dom-helpers-5.2.1.tgz",
"integrity": "sha512-nRCa7CK3VTrM2NmGkIy4cbK7IZlgBE/PYMn55rrXefr5xXDP0LdtfPnblFDoVdcAfslJ7or6iqAUnx0CCGIWQA==",
"dependencies": {
"@babel/runtime": "^7.8.7",
"csstype": "^3.0.2"
}
},
"node_modules/error-ex": {
"version": "1.3.2",
"resolved": "https://registry.npmjs.org/error-ex/-/error-ex-1.3.2.tgz",
"integrity": "sha512-7dFHNmqeFSEt2ZBsCriorKnn3Z2pj+fd9kmI6QoWw4//DL+icEBfc0U7qJCisqrTsKTjw4fNFy2pW9OqStD84g==",
"dependencies": {
"is-arrayish": "^0.2.1"
}
},
"node_modules/escape-string-regexp": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/escape-string-regexp/-/escape-string-regexp-4.0.0.tgz",
"integrity": "sha512-TtpcNJ3XAzx3Gq8sWRzJaVajRs0uVxA2YAkdb1jm2YkPz4G6egUFAyA3n5vtEIZefPk5Wa4UXbKuS5fKkJWdgA==",
"engines": {
"node": ">=10"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/fast-xml-parser": {
"version": "4.2.5",
"resolved": "https://registry.npmjs.org/fast-xml-parser/-/fast-xml-parser-4.2.5.tgz",
"integrity": "sha512-B9/wizE4WngqQftFPmdaMYlXoJlJOYxGQOanC77fq9k8+Z0v5dDSVh+3glErdIROP//s/jgb7ZuxKfB8nVyo0g==",
"funding": [
{
"type": "paypal",
"url": "https://paypal.me/naturalintelligence"
},
{
"type": "github",
"url": "https://github.com/sponsors/NaturalIntelligence"
}
],
"dependencies": {
"strnum": "^1.0.5"
},
"bin": {
"fxparser": "src/cli/cli.js"
}
},
"node_modules/find-root": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/find-root/-/find-root-1.1.0.tgz",
"integrity": "sha512-NKfW6bec6GfKc0SGx1e07QZY9PE99u0Bft/0rzSD5k3sO/vwkVUpDUKVm5Gpp5Ue3YfShPFTX2070tDs5kB9Ng=="
},
"node_modules/function-bind": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/function-bind/-/function-bind-1.1.1.tgz",
"integrity": "sha512-yIovAzMX49sF8Yl58fSCWJ5svSLuaibPxXQJFLmBObTuCr0Mf1KiPopGM9NiFjiYBCbfaa2Fh6breQ6ANVTI0A=="
},
"node_modules/glob-to-regexp": {
"version": "0.4.1",
"resolved": "https://registry.npmjs.org/glob-to-regexp/-/glob-to-regexp-0.4.1.tgz",
"integrity": "sha512-lkX1HJXwyMcprw/5YUZc2s7DrpAiHB21/V+E1rHUrVNokkvB6bqMzT0VfV6/86ZNabt1k14YOIaT7nDvOX3Iiw=="
},
"node_modules/graceful-fs": {
"version": "4.2.11",
"resolved": "https://registry.npmjs.org/graceful-fs/-/graceful-fs-4.2.11.tgz",
"integrity": "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ=="
},
"node_modules/has": {
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/has/-/has-1.0.3.tgz",
"integrity": "sha512-f2dvO0VU6Oej7RkWJGrehjbzMAjFp5/VKPp5tTpWIV4JHHZK1/BxbFRtf/siA2SWTe09caDmVtYYzWEIbBS4zw==",
"dependencies": {
"function-bind": "^1.1.1"
},
"engines": {
"node": ">= 0.4.0"
}
},
"node_modules/has-flag": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/has-flag/-/has-flag-3.0.0.tgz",
"integrity": "sha512-sKJf1+ceQBr4SMkvQnBDNDtf4TXpVhVGateu0t918bl30FnbE2m4vNLX+VWe/dpjlb+HugGYzW7uQXH98HPEYw==",
"engines": {
"node": ">=4"
}
},
"node_modules/hoist-non-react-statics": {
"version": "3.3.2",
"resolved": "https://registry.npmjs.org/hoist-non-react-statics/-/hoist-non-react-statics-3.3.2.tgz",
"integrity": "sha512-/gGivxi8JPKWNm/W0jSmzcMPpfpPLc3dY/6GxhX2hQ9iGj3aDfklV4ET7NjKpSinLpJ5vafa9iiGIEZg10SfBw==",
"dependencies": {
"react-is": "^16.7.0"
}
},
"node_modules/hoist-non-react-statics/node_modules/react-is": {
"version": "16.13.1",
"resolved": "https://registry.npmjs.org/react-is/-/react-is-16.13.1.tgz",
"integrity": "sha512-24e6ynE2H+OKt4kqsOvNd8kBpV65zoxbA4BVsEOB3ARVWQki/DHzaUoC5KuON/BiccDaCCTZBuOcfZs70kR8bQ=="
},
"node_modules/import-fresh": {
"version": "3.3.0",
"resolved": "https://registry.npmjs.org/import-fresh/-/import-fresh-3.3.0.tgz",
"integrity": "sha512-veYYhQa+D1QBKznvhUHxb8faxlrwUnxseDAbAp457E0wLNio2bOSKnjYDhMj+YiAq61xrMGhQk9iXVk5FzgQMw==",
"dependencies": {
"parent-module": "^1.0.0",
"resolve-from": "^4.0.0"
},
"engines": {
"node": ">=6"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/is-arrayish": {
"version": "0.2.1",
"resolved": "https://registry.npmjs.org/is-arrayish/-/is-arrayish-0.2.1.tgz",
"integrity": "sha512-zz06S8t0ozoDXMG+ube26zeCTNXcKIPJZJi8hBrF4idCLms4CG9QtK7qBl1boi5ODzFpjswb5JPmHCbMpjaYzg=="
},
"node_modules/is-core-module": {
"version": "2.12.1",
"resolved": "https://registry.npmjs.org/is-core-module/-/is-core-module-2.12.1.tgz",
"integrity": "sha512-Q4ZuBAe2FUsKtyQJoQHlvP8OvBERxO3jEmy1I7hcRXcJBGGHFh/aJBswbXuS9sgrDH2QUO8ilkwNPHvHMd8clg==",
"dependencies": {
"has": "^1.0.3"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/js-tokens": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-4.0.0.tgz",
"integrity": "sha512-RdJUflcE3cUzKiMqQgsCu06FPu9UdIJO0beYbPhHN4k6apgJtifcoCtT9bcxOpYBtpD2kCM6Sbzg4CausW/PKQ=="
},
"node_modules/json-parse-even-better-errors": {
"version": "2.3.1",
"resolved": "https://registry.npmjs.org/json-parse-even-better-errors/-/json-parse-even-better-errors-2.3.1.tgz",
"integrity": "sha512-xyFwyhro/JEof6Ghe2iz2NcXoj2sloNsWr/XsERDK/oiPCfaNhl5ONfp+jQdAZRQQ0IJWNzH9zIZF7li91kh2w=="
},
"node_modules/lines-and-columns": {
"version": "1.2.4",
"resolved": "https://registry.npmjs.org/lines-and-columns/-/lines-and-columns-1.2.4.tgz",
"integrity": "sha512-7ylylesZQ/PV29jhEDl3Ufjo6ZX7gCqJr5F7PKrqc93v7fzSymt1BpwEU8nAUXs8qzzvqhbjhK5QZg6Mt/HkBg=="
},
"node_modules/loose-envify": {
"version": "1.4.0",
"resolved": "https://registry.npmjs.org/loose-envify/-/loose-envify-1.4.0.tgz",
"integrity": "sha512-lyuxPGr/Wfhrlem2CL/UcnUc1zcqKAImBDzukY7Y5F/yQiNdko6+fRLevlw1HgMySw7f611UIY408EtxRSoK3Q==",
"dependencies": {
"js-tokens": "^3.0.0 || ^4.0.0"
},
"bin": {
"loose-envify": "cli.js"
}
},
"node_modules/nanoid": {
"version": "3.3.6",
"resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.6.tgz",
"integrity": "sha512-BGcqMMJuToF7i1rt+2PWSNVnWIkGCU78jBG3RxO/bZlnZPK2Cmi2QaffxGO/2RvWi9sL+FAiRiXMgsyxQ1DIDA==",
"funding": [
{
"type": "github",
"url": "https://github.com/sponsors/ai"
}
],
"bin": {
"nanoid": "bin/nanoid.cjs"
},
"engines": {
"node": "^10 || ^12 || ^13.7 || ^14 || >=15.0.1"
}
},
"node_modules/next": {
"version": "13.4.7",
"resolved": "https://registry.npmjs.org/next/-/next-13.4.7.tgz",
"integrity": "sha512-M8z3k9VmG51SRT6v5uDKdJXcAqLzP3C+vaKfLIAM0Mhx1um1G7MDnO63+m52qPdZfrTFzMZNzfsgvm3ghuVHIQ==",
"dependencies": {
"@next/env": "13.4.7",
"@swc/helpers": "0.5.1",
"busboy": "1.6.0",
"caniuse-lite": "^1.0.30001406",
"postcss": "8.4.14",
"styled-jsx": "5.1.1",
"watchpack": "2.4.0",
"zod": "3.21.4"
},
"bin": {
"next": "dist/bin/next"
},
"engines": {
"node": ">=16.8.0"
},
"optionalDependencies": {
"@next/swc-darwin-arm64": "13.4.7",
"@next/swc-darwin-x64": "13.4.7",
"@next/swc-linux-arm64-gnu": "13.4.7",
"@next/swc-linux-arm64-musl": "13.4.7",
"@next/swc-linux-x64-gnu": "13.4.7",
"@next/swc-linux-x64-musl": "13.4.7",
"@next/swc-win32-arm64-msvc": "13.4.7",
"@next/swc-win32-ia32-msvc": "13.4.7",
"@next/swc-win32-x64-msvc": "13.4.7"
},
"peerDependencies": {
"@opentelemetry/api": "^1.1.0",
"fibers": ">= 3.1.0",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"sass": "^1.3.0"
},
"peerDependenciesMeta": {
"@opentelemetry/api": {
"optional": true
},
"fibers": {
"optional": true
},
"sass": {
"optional": true
}
}
},
"node_modules/object-assign": {
"version": "4.1.1",
"resolved": "https://registry.npmjs.org/object-assign/-/object-assign-4.1.1.tgz",
"integrity": "sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/parent-module": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/parent-module/-/parent-module-1.0.1.tgz",
"integrity": "sha512-GQ2EWRpQV8/o+Aw8YqtfZZPfNRWZYkbidE9k5rpl/hC3vtHHBfGm2Ifi6qWV+coDGkrUKZAxE3Lot5kcsRlh+g==",
"dependencies": {
"callsites": "^3.0.0"
},
"engines": {
"node": ">=6"
}
},
"node_modules/parse-json": {
"version": "5.2.0",
"resolved": "https://registry.npmjs.org/parse-json/-/parse-json-5.2.0.tgz",
"integrity": "sha512-ayCKvm/phCGxOkYRSCM82iDwct8/EonSEgCSxWxD7ve6jHggsFl4fZVQBPRNgQoKiuV/odhFrGzQXZwbifC8Rg==",
"dependencies": {
"@babel/code-frame": "^7.0.0",
"error-ex": "^1.3.1",
"json-parse-even-better-errors": "^2.3.0",
"lines-and-columns": "^1.1.6"
},
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/path-parse": {
"version": "1.0.7",
"resolved": "https://registry.npmjs.org/path-parse/-/path-parse-1.0.7.tgz",
"integrity": "sha512-LDJzPVEEEPR+y48z93A0Ed0yXb8pAByGWo/k5YYdYgpY2/2EsOsksJrq7lOHxryrVOn1ejG6oAp8ahvOIQD8sw=="
},
"node_modules/path-type": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/path-type/-/path-type-4.0.0.tgz",
"integrity": "sha512-gDKb8aZMDeD/tZWs9P6+q0J9Mwkdl6xMV8TjnGP3qJVJ06bdMgkbBlLU8IdfOsIsFz2BW1rNVT3XuNEl8zPAvw==",
"engines": {
"node": ">=8"
}
},
"node_modules/picocolors": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.0.0.tgz",
"integrity": "sha512-1fygroTLlHu66zi26VoTDv8yRgm0Fccecssto+MhsZ0D/DGW2sm8E8AjW7NU5VVTRt5GxbeZ5qBuJr+HyLYkjQ=="
},
"node_modules/postcss": {
"version": "8.4.14",
"resolved": "https://registry.npmjs.org/postcss/-/postcss-8.4.14.tgz",
"integrity": "sha512-E398TUmfAYFPBSdzgeieK2Y1+1cpdxJx8yXbK/m57nRhKSmk1GB2tO4lbLBtlkfPQTDKfe4Xqv1ASWPpayPEig==",
"funding": [
{
"type": "opencollective",
"url": "https://opencollective.com/postcss/"
},
{
"type": "tidelift",
"url": "https://tidelift.com/funding/github/npm/postcss"
}
],
"dependencies": {
"nanoid": "^3.3.4",
"picocolors": "^1.0.0",
"source-map-js": "^1.0.2"
},
"engines": {
"node": "^10 || ^12 || >=14"
}
},
"node_modules/prop-types": {
"version": "15.8.1",
"resolved": "https://registry.npmjs.org/prop-types/-/prop-types-15.8.1.tgz",
"integrity": "sha512-oj87CgZICdulUohogVAR7AjlC0327U4el4L6eAvOqCeudMDVU0NThNaV+b9Df4dXgSP1gXMTnPdhfe/2qDH5cg==",
"dependencies": {
"loose-envify": "^1.4.0",
"object-assign": "^4.1.1",
"react-is": "^16.13.1"
}
},
"node_modules/prop-types/node_modules/react-is": {
"version": "16.13.1",
"resolved": "https://registry.npmjs.org/react-is/-/react-is-16.13.1.tgz",
"integrity": "sha512-24e6ynE2H+OKt4kqsOvNd8kBpV65zoxbA4BVsEOB3ARVWQki/DHzaUoC5KuON/BiccDaCCTZBuOcfZs70kR8bQ=="
},
"node_modules/react": {
"version": "18.2.0",
"resolved": "https://registry.npmjs.org/react/-/react-18.2.0.tgz",
"integrity": "sha512-/3IjMdb2L9QbBdWiW5e3P2/npwMBaU9mHCSCUzNln0ZCYbcfTsGbTJrU/kGemdH2IWmB2ioZ+zkxtmq6g09fGQ==",
"dependencies": {
"loose-envify": "^1.1.0"
},
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/react-dom": {
"version": "18.2.0",
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-18.2.0.tgz",
"integrity": "sha512-6IMTriUmvsjHUjNtEDudZfuDQUoWXVxKHhlEGSk81n4YFS+r/Kl99wXiwlVXtPBtJenozv2P+hxDsw9eA7Xo6g==",
"dependencies": {
"loose-envify": "^1.1.0",
"scheduler": "^0.23.0"
},
"peerDependencies": {
"react": "^18.2.0"
}
},
"node_modules/react-hook-form": {
"version": "7.45.4",
"resolved": "https://registry.npmjs.org/react-hook-form/-/react-hook-form-7.45.4.tgz",
"integrity": "sha512-HGDV1JOOBPZj10LB3+OZgfDBTn+IeEsNOKiq/cxbQAIbKaiJUe/KV8DBUzsx0Gx/7IG/orWqRRm736JwOfUSWQ==",
"engines": {
"node": ">=12.22.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/react-hook-form"
},
"peerDependencies": {
"react": "^16.8.0 || ^17 || ^18"
}
},
"node_modules/react-is": {
"version": "18.2.0",
"resolved": "https://registry.npmjs.org/react-is/-/react-is-18.2.0.tgz",
"integrity": "sha512-xWGDIW6x921xtzPkhiULtthJHoJvBbF3q26fzloPCK0hsvxtPVelvftw3zjbHWSkR2km9Z+4uxbDDK/6Zw9B8w=="
},
"node_modules/react-transition-group": {
"version": "4.4.5",
"resolved": "https://registry.npmjs.org/react-transition-group/-/react-transition-group-4.4.5.tgz",
"integrity": "sha512-pZcd1MCJoiKiBR2NRxeCRg13uCXbydPnmB4EOeRrY7480qNWO8IIgQG6zlDkm6uRMsURXPuKq0GWtiM59a5Q6g==",
"dependencies": {
"@babel/runtime": "^7.5.5",
"dom-helpers": "^5.0.1",
"loose-envify": "^1.4.0",
"prop-types": "^15.6.2"
},
"peerDependencies": {
"react": ">=16.6.0",
"react-dom": ">=16.6.0"
}
},
"node_modules/regenerator-runtime": {
"version": "0.13.11",
"resolved": "https://registry.npmjs.org/regenerator-runtime/-/regenerator-runtime-0.13.11.tgz",
"integrity": "sha512-kY1AZVr2Ra+t+piVaJ4gxaFaReZVH40AKNo7UCX6W+dEwBo/2oZJzqfuN1qLq1oL45o56cPaTXELwrTh8Fpggg=="
},
"node_modules/resolve": {
"version": "1.22.2",
"resolved": "https://registry.npmjs.org/resolve/-/resolve-1.22.2.tgz",
"integrity": "sha512-Sb+mjNHOULsBv818T40qSPeRiuWLyaGMa5ewydRLFimneixmVy2zdivRl+AF6jaYPC8ERxGDmFSiqui6SfPd+g==",
"dependencies": {
"is-core-module": "^2.11.0",
"path-parse": "^1.0.7",
"supports-preserve-symlinks-flag": "^1.0.0"
},
"bin": {
"resolve": "bin/resolve"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/resolve-from": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/resolve-from/-/resolve-from-4.0.0.tgz",
"integrity": "sha512-pb/MYmXstAkysRFx8piNI1tGFNQIFA3vkE3Gq4EuA1dF6gHp/+vgZqsCGJapvy8N3Q+4o7FwvquPJcnZ7RYy4g==",
"engines": {
"node": ">=4"
}
},
"node_modules/scheduler": {
"version": "0.23.0",
"resolved": "https://registry.npmjs.org/scheduler/-/scheduler-0.23.0.tgz",
"integrity": "sha512-CtuThmgHNg7zIZWAXi3AsyIzA3n4xx7aNyjwC2VJldO2LMVDhFK+63xGqq6CsJH4rTAt6/M+N4GhZiDYPx9eUw==",
"dependencies": {
"loose-envify": "^1.1.0"
}
},
"node_modules/source-map": {
"version": "0.5.7",
"resolved": "https://registry.npmjs.org/source-map/-/source-map-0.5.7.tgz",
"integrity": "sha512-LbrmJOMUSdEVxIKvdcJzQC+nQhe8FUZQTXQy6+I75skNgn3OoQ0DZA8YnFa7gp8tqtL3KPf1kmo0R5DoApeSGQ==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/source-map-js": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.0.2.tgz",
"integrity": "sha512-R0XvVJ9WusLiqTCEiGCmICCMplcCkIwwR11mOSD9CR5u+IXYdiseeEuXCVAjS54zqwkLcPNnmU4OeJ6tUrWhDw==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/streamsearch": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/streamsearch/-/streamsearch-1.1.0.tgz",
"integrity": "sha512-Mcc5wHehp9aXz1ax6bZUyY5afg9u2rv5cqQI3mRrYkGC8rW2hM02jWuwjtL++LS5qinSyhj2QfLyNsuc+VsExg==",
"engines": {
"node": ">=10.0.0"
}
},
"node_modules/strnum": {
"version": "1.0.5",
"resolved": "https://registry.npmjs.org/strnum/-/strnum-1.0.5.tgz",
"integrity": "sha512-J8bbNyKKXl5qYcR36TIO8W3mVGVHrmmxsd5PAItGkmyzwJvybiw2IVq5nqd0i4LSNSkB/sx9VHllbfFdr9k1JA=="
},
"node_modules/styled-jsx": {
"version": "5.1.1",
"resolved": "https://registry.npmjs.org/styled-jsx/-/styled-jsx-5.1.1.tgz",
"integrity": "sha512-pW7uC1l4mBZ8ugbiZrcIsiIvVx1UmTfw7UkC3Um2tmfUq9Bhk8IiyEIPl6F8agHgjzku6j0xQEZbfA5uSgSaCw==",
"dependencies": {
"client-only": "0.0.1"
},
"engines": {
"node": ">= 12.0.0"
},
"peerDependencies": {
"react": ">= 16.8.0 || 17.x.x || ^18.0.0-0"
},
"peerDependenciesMeta": {
"@babel/core": {
"optional": true
},
"babel-plugin-macros": {
"optional": true
}
}
},
"node_modules/stylis": {
"version": "4.2.0",
"resolved": "https://registry.npmjs.org/stylis/-/stylis-4.2.0.tgz",
"integrity": "sha512-Orov6g6BB1sDfYgzWfTHDOxamtX1bE/zo104Dh9e6fqJ3PooipYyfJ0pUmrZO2wAvO8YbEyeFrkV91XTsGMSrw=="
},
"node_modules/supports-color": {
"version": "5.5.0",
"resolved": "https://registry.npmjs.org/supports-color/-/supports-color-5.5.0.tgz",
"integrity": "sha512-QjVjwdXIt408MIiAqCX4oUKsgU2EqAGzs2Ppkm4aQYbjm+ZEWEcW4SfFNTr4uMNZma0ey4f5lgLrkB0aX0QMow==",
"dependencies": {
"has-flag": "^3.0.0"
},
"engines": {
"node": ">=4"
}
},
"node_modules/supports-preserve-symlinks-flag": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/supports-preserve-symlinks-flag/-/supports-preserve-symlinks-flag-1.0.0.tgz",
"integrity": "sha512-ot0WnXS9fgdkgIcePe6RHNk1WA8+muPa6cSjeR3V8K27q9BB1rTE3R1p7Hv0z1ZyAc8s6Vvv8DIyWf681MAt0w==",
"engines": {
"node": ">= 0.4"
},
"funding": {
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/to-fast-properties": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/to-fast-properties/-/to-fast-properties-2.0.0.tgz",
"integrity": "sha512-/OaKK0xYrs3DmxRYqL/yDc+FxFUVYhDlXMhRmv3z915w2HF1tnN1omB354j8VUGO/hbRzyD6Y3sA7v7GS/ceog==",
"engines": {
"node": ">=4"
}
},
"node_modules/tslib": {
"version": "2.5.3",
"resolved": "https://registry.npmjs.org/tslib/-/tslib-2.5.3.tgz",
"integrity": "sha512-mSxlJJwl3BMEQCUNnxXBU9jP4JBktcEGhURcPR6VQVlnP0FdDEsIaz0C35dXNGLyRfrATNofF0F5p2KPxQgB+w=="
},
"node_modules/uuid": {
"version": "8.3.2",
"resolved": "https://registry.npmjs.org/uuid/-/uuid-8.3.2.tgz",
"integrity": "sha512-+NYs2QeMWy+GWFOEm9xnn6HCDp0l7QBD7ml8zLUmJ+93Q5NF0NocErnwkTkXVFNiX3/fpC6afS8Dhb/gz7R7eg==",
"bin": {
"uuid": "dist/bin/uuid"
}
},
"node_modules/watchpack": {
"version": "2.4.0",
"resolved": "https://registry.npmjs.org/watchpack/-/watchpack-2.4.0.tgz",
"integrity": "sha512-Lcvm7MGST/4fup+ifyKi2hjyIAwcdI4HRgtvTpIUxBRhB+RFtUh8XtDOxUfctVCnhVi+QQj49i91OyvzkJl6cg==",
"dependencies": {
"glob-to-regexp": "^0.4.1",
"graceful-fs": "^4.1.2"
},
"engines": {
"node": ">=10.13.0"
}
},
"node_modules/yaml": {
"version": "1.10.2",
"resolved": "https://registry.npmjs.org/yaml/-/yaml-1.10.2.tgz",
"integrity": "sha512-r3vXyErRCYJ7wg28yvBY5VSoAF8ZvlcW9/BwUzEtUsjvX/DKs24dIkuwjtuprwJJHsbyUbLApepYTR1BN4uHrg==",
"engines": {
"node": ">= 6"
}
},
"node_modules/zod": {
"version": "3.21.4",
"resolved": "https://registry.npmjs.org/zod/-/zod-3.21.4.tgz",
"integrity": "sha512-m46AKbrzKVzOzs/DZgVnG5H55N1sv1M8qZU3A8RIKbs3mrACDNeIOeilDymVb2HdmP8uwshOCF4uJ8uM9rCqJw==",
"funding": {
"url": "https://github.com/sponsors/colinhacks"
}
}
}
}
{
"name": "xai-frontend",
"version": "0.1.0",
"private": true,
"scripts": {
"dev": "next dev",
"build": "next build",
"start": "next start",
"lint": "next lint"
},
"dependencies": {
"@aws-sdk/client-lambda": "^3.359.0",
"@emotion/react": "^11.11.1",
"@emotion/styled": "^11.11.0",
"@mui/icons-material": "^5.11.16",
"@mui/lab": "^5.0.0-alpha.134",
"@mui/material": "^5.13.6",
"next": "13.4.7",
"react": "18.2.0",
"react-dom": "18.2.0",
"react-hook-form": "^7.45.4"
}
}
              precision    recall  f1-score   support

           0       0.82      0.78      0.80      2529
           1       0.79      0.82      0.81      2470

    accuracy                           0.80      4999
   macro avg       0.80      0.80      0.80      4999
weighted avg       0.80      0.80      0.80      4999

---- Classification report for LR ----
              precision    recall  f1-score   support

           0       0.90      0.89      0.90      2529
           1       0.89      0.90      0.89      2470

    accuracy                           0.89      4999
   macro avg       0.89      0.89      0.89      4999
weighted avg       0.89      0.89      0.89      4999

              precision    recall  f1-score   support

           0       0.86      0.85      0.86      2529
           1       0.85      0.86      0.85      2470

    accuracy                           0.85      4999
   macro avg       0.85      0.85      0.85      4999
weighted avg       0.85      0.85      0.85      4999

              precision    recall  f1-score   support

           0       0.90      0.90      0.90      2529
           1       0.90      0.89      0.90      2470

    accuracy                           0.90      4999
   macro avg       0.90      0.90      0.90      4999
weighted avg       0.90      0.90      0.90      4999
import { AWS_ACCESS_KEY, AWS_REGION, AWS_SECRET_KEY } from "@/constants";
import { LambdaClient } from "@aws-sdk/client-lambda";
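// Shared AWS Lambda client; the region and credentials are read from the NEXT_PUBLIC_* env-driven constants.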
const lambdaClient = new LambdaClient({
region: AWS_REGION,
credentials: {
accessKeyId: AWS_ACCESS_KEY,
secretAccessKey: AWS_SECRET_KEY,
},
});
export { lambdaClient };
import * as React from "react";
import TextareaAutosize from "@mui/base/TextareaAutosize";
import { styled } from "@mui/system";
export default React.forwardRef((props, ref) => {
const blue = {
100: "#DAECFF",
200: "#b6daff",
400: "#3399FF",
500: "#007FFF",
600: "#0072E5",
900: "#003A75",
};
const grey = {
50: "#f6f8fa",
100: "#eaeef2",
200: "#d0d7de",
300: "#afb8c1",
400: "#8c959f",
500: "#6e7781",
600: "#57606a",
700: "#424a53",
800: "#32383f",
900: "#24292f",
};
const StyledTextarea = styled(TextareaAutosize)(
({ theme }) => `
width: 100%;
font-family: IBM Plex Sans, sans-serif;
font-size: 0.875rem;
font-weight: 400;
line-height: 1.5;
padding: 12px;
border-radius: 12px 12px 0 12px;
color: ${theme.palette.mode === "dark" ? grey[300] : grey[900]};
background: ${theme.palette.mode === "dark" ? grey[900] : "#fff"};
border: 1px solid ${theme.palette.mode === "dark" ? grey[700] : grey[200]};
box-shadow: 0px 2px 24px ${
theme.palette.mode === "dark" ? blue[900] : blue[100]
};
&:hover {
border-color: ${blue[400]};
}
&:focus {
border-color: ${blue[400]};
box-shadow: 0 0 0 3px ${
theme.palette.mode === "dark" ? blue[600] : blue[200]
};
}
// firefox
&:focus-visible {
outline: 0;
}
`
);
return <StyledTextarea {...props} ref={ref} />;
});
import commonStyles from "@/styles/commonStyles";
import { Box, Button } from "@mui/material";
const FormButtons = ({ reset, close }) => {
return (
<Box sx={commonStyles.btnContainer}>
<Button variant="outlined" type="submit">
Add
</Button>
<Button variant="outlined" onClick={reset}>
Reset
</Button>
<Button variant="outlined" onClick={close}>
Close
</Button>
</Box>
);
};
export default FormButtons;
import commonStyles from "@/styles/commonStyles";
import { Box, IconButton, Paper } from "@mui/material";
import styles from "./styles";
import { Close } from "@mui/icons-material";
const ModalContainer = ({ children, show, close }) => {
return (
<Box sx={commonStyles.backdrop(show)} onClick={close}>
<Paper sx={styles.modal} onClick={(e) => e.stopPropagation()}>
<Box sx={styles.headRibbon}>
<IconButton sx={styles.cross} onClick={close}>
<Close />
</IconButton>
</Box>
<Box sx={styles.body}>{children}</Box>
</Paper>
</Box>
);
};
export default ModalContainer;
import commonStyles from "@/styles/commonStyles";
export default {
modal: {
minWidth: "640px",
},
headRibbon: {
display: "flex",
justifyContent: "flex-end",
},
cross: {
...commonStyles.iconBtn,
margin: "7px",
padding: "2px",
},
body: {
padding: "20px",
paddingTop: "0px",
},
};
export const AWS_ACCESS_KEY = process.env["NEXT_PUBLIC_AWS_ACCESS_KEY"];
export const AWS_SECRET_KEY = process.env["NEXT_PUBLIC_AWS_SECRET_KEY"];
export const AWS_REGION = process.env["NEXT_PUBLIC_AWS_REGION"];
export const AWS_LAMBDA_NAME = process.env["NEXT_PUBLIC_AWS_XAI_LAMBDA"];
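// Model identifiers shared by the analysis UI (form defaults, field sets, and config tables).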
export const MODEL_NAME_KNN = "knn";
export const MODEL_NAME_SVM = "svm";
export const MODEL_NAME_LR = "lr";
export const MODEL_NAME_RF = "rf";
export const STATUS_CODE_MAP = {
200: "OK",
400: "Bad Request",
500: "Server Error",
};
import commonStyles from "@/styles/commonStyles";
import { Box, IconButton, TextField, Typography } from "@mui/material";
import Textarea from "@/components/Textarea";
import { useContext, useEffect, useRef, useState } from "react";
import { startTestCase } from "@/functions/api";
import { LoadingButton } from "@mui/lab";
import Configurations from "./configurations/Configurations";
import { ModalContext } from "@/providers/modalProvider/ModalProvider";
import { MODEL_NAME_KNN, MODEL_NAME_SVM } from "@/constants";
import { Close } from "@mui/icons-material";
import styles from "./styles";
const Analysis = ({ model }) => {
const { setNotification } = useContext(ModalContext);
const textareaRef = useRef();
const variationsRef = useRef();
const [loading, setLoading] = useState(false);
const [configurations, setConfigurations] = useState([]);
const [report, setReport] = useState();
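// Read the prompt (and, for SVM/KNN, the variation count), call startTestCase with the
// selected model and test-case configurations, and store the returned report; API errors
// are surfaced through the notification modal.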
const evaluationHandler = () => {
const prompt = textareaRef.current && textareaRef.current.value;
const variations = variationsRef.current
? parseInt(variationsRef.current.value)
: null;
if (prompt !== "" && model && variations !== 0) {
setLoading(true);
startTestCase({
model_name: model,
prompt,
variations,
configurations,
})
.then(setReport)
.catch(setNotification)
.finally(() => {
setLoading(false);
});
}
};
useEffect(() => {
setConfigurations([]);
setReport();
}, [model]);
return (
<Box sx={commonStyles.sectionContainer}>
<Typography variant="h3">Analysis</Typography>
<Textarea
ref={textareaRef}
placeholder="Prompt"
sx={commonStyles.text}
/>
{(model === MODEL_NAME_SVM || model === MODEL_NAME_KNN) && (
<TextField
inputRef={variationsRef}
sx={commonStyles.text}
label="Variations"
size="small"
type="number"
defaultValue={2}
/>
)}
<Typography variant="h4">Test Cases</Typography>
<Configurations
model={model}
configurations={configurations}
setConfigurations={setConfigurations}
/>
<LoadingButton
loading={loading}
sx={commonStyles.btn}
onClick={evaluationHandler}
>
Analyze
</LoadingButton>
{report && (
<>
<Typography variant="h4">Report</Typography>
<Box sx={commonStyles.outputContainer()}>
<IconButton
sx={styles.close}
onClick={() => setReport()}
>
<Close />
</IconButton>
<Typography variant="body1" sx={commonStyles.codeBlock}>
{report}
</Typography>
</Box>
</>
)}
</Box>
);
};
export default Analysis;
import { Box, IconButton, Typography } from "@mui/material";
import styles from "./configForm/styles";
import { Add } from "@mui/icons-material";
import { useState } from "react";
import ConfigForm from "./configForm/ConfigForm";
import ConfigTable from "./configTable/ConfigTable";
import ModalContainer from "@/components/modalContainer/ModalContainer";
const Configurations = ({ model, configurations, setConfigurations }) => {
const [displayModal, setDisplayModal] = useState(false);
const addConfig = (newConfig) => {
setConfigurations([...configurations, newConfig]);
setDisplayModal(false);
};
return (
<Box sx={styles.root}>
{configurations.length !== 0 && (
<ConfigTable
configurations={configurations}
setConfigurations={setConfigurations}
model={model}
/>
)}
<IconButton
onClick={() => setDisplayModal(true)}
sx={styles.addRow}
>
<Add />
</IconButton>
<ModalContainer
show={displayModal}
close={() => setDisplayModal(false)}
>
<Typography variant="h4">Add Test Case</Typography>
<ConfigForm
model={model}
addConfig={addConfig}
close={() => setDisplayModal(false)}
displayModal={displayModal}
/>
</ModalContainer>
</Box>
);
};
export default Configurations;
import { useForm } from "react-hook-form";
import { useEffect } from "react";
import {
MODEL_NAME_KNN,
MODEL_NAME_LR,
MODEL_NAME_RF,
MODEL_NAME_SVM,
} from "@/constants";
import SVMKNN, { defaultValues as SVMKNNDefaultValues } from "./SVMKNN";
import RFLR, { LRDefaultValues, RFDefaultValues } from "./RFLR";
import { Box } from "@mui/material";
const ConfigForm = ({ model, addConfig, close, displayModal }) => {
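// Pick the model-specific default values, then render the matching field set:
// SVM and KNN share one form, RF and LR share another.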
let defaultValues = SVMKNNDefaultValues;
if (model === MODEL_NAME_RF) defaultValues = RFDefaultValues;
else if (model === MODEL_NAME_LR) defaultValues = LRDefaultValues;
const {
register,
handleSubmit,
control,
formState: { errors },
reset,
} = useForm({
defaultValues,
});
useEffect(() => {
if (displayModal) reset();
}, [displayModal]);
if (model === MODEL_NAME_KNN || model === MODEL_NAME_SVM) {
return (
<SVMKNN
handleSubmit={handleSubmit}
addConfig={addConfig}
register={register}
control={control}
errors={errors}
reset={reset}
close={close}
/>
);
} else if (model === MODEL_NAME_RF || model === MODEL_NAME_LR) {
return (
<RFLR
handleSubmit={handleSubmit}
addConfig={addConfig}
register={register}
errors={errors}
reset={reset}
close={close}
/>
);
} else {
return <Box>unknown</Box>;
}
};
export default ConfigForm;
import commonStyles from "@/styles/commonStyles";
import { TextField } from "@mui/material";
import styles from "./styles";
import FormButtons from "@/components/formButtons/FormButtons";
export const RFDefaultValues = {
threshold_classifier: 0.493399999999838,
max_iter: 50,
time_maximum: 120,
};
export const LRDefaultValues = {
threshold_classifier: 0.491799999999785,
max_iter: 50,
time_maximum: 120,
};
const RFLR = ({ handleSubmit, addConfig, register, errors, reset, close }) => {
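// Numeric fields arrive as strings from the inputs; cast them before handing the test case to the parent.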
const onSubmit = (config) => {
config.threshold_classifier = parseFloat(config.threshold_classifier);
config.max_iter = parseInt(config.max_iter);
config.time_maximum = parseInt(config.time_maximum);
addConfig(config);
};
return (
<form onSubmit={handleSubmit(onSubmit)} style={styles.modalRoot}>
<TextField
sx={commonStyles.text}
label="Name"
type="text"
{...register("name", { required: "Name is required" })}
helperText={errors.name && errors.name.message}
error={errors.name ? true : false}
/>
<TextField
sx={commonStyles.text}
label="Classification Threshold"
type="number"
inputProps={{ step: 0.000000000000001 }}
{...register("threshold_classifier", {
required: "Classification Threshold is required",
min: 0,
})}
helperText={
errors.threshold_classifier &&
errors.threshold_classifier.message
}
error={errors.threshold_classifier ? true : false}
/>
<TextField
sx={commonStyles.text}
label="Maximum Iterations"
type="number"
{...register("max_iter", {
required: "Maximum Iterations is required",
})}
helperText={errors.max_iter && errors.max_iter.message}
error={errors.max_iter ? true : false}
/>
<TextField
sx={commonStyles.text}
label="Maximum Time"
type="number"
{...register("time_maximum", {
required: "Maximum Time is required",
})}
helperText={errors.time_maximum && errors.time_maximum.message}
error={errors.time_maximum ? true : false}
/>
<FormButtons reset={reset} close={close} />
</form>
);
};
export default RFLR;
import commonStyles from "@/styles/commonStyles";
import styles from "./styles";
import { Autocomplete, TextField, Typography } from "@mui/material";
import { Controller } from "react-hook-form";
import { tags } from "../flippingTags";
import FormButtons from "@/components/formButtons/FormButtons";
export const defaultValues = {
sample_prob_decay_factor: 0.2,
flip_prob: 0.5,
};
const SVMKNN = ({
handleSubmit,
addConfig,
register,
errors,
control,
close,
reset,
}) => {
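// Flipping tags arrive JSON-encoded from the Autocomplete controller; parse them and the
// numeric fields, then nest everything except the name under generator_config.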
const onAddConfig = (newConfig) => {
if (newConfig.flipping_tags)
newConfig.flipping_tags = JSON.parse(newConfig.flipping_tags);
const { name, ...generator_config } = newConfig;
generator_config["flip_prob"] = parseFloat(
generator_config["flip_prob"]
);
generator_config["sample_prob_decay_factor"] = parseFloat(
generator_config["sample_prob_decay_factor"]
);
const formattedConfig = { name, generator_config };
addConfig(formattedConfig);
};
return (
<form onSubmit={handleSubmit(onAddConfig)} style={styles.modalRoot}>
<TextField
sx={commonStyles.text}
label="Name"
type="text"
{...register("name", { required: "Name is required" })}
helperText={errors.name && errors.name.message}
error={errors.name ? true : false}
/>
<Typography variant="h5">Generator Configurations</Typography>
<TextField
sx={commonStyles.text}
label="Sampling Probability Decay Factor"
type="number"
inputProps={{ step: 0.000000000000001 }}
{...register("sample_prob_decay_factor", {
required: "Sampling Probability Decay Factor is required",
min: 0,
})}
helperText={
errors.sample_prob_decay_factor &&
errors.sample_prob_decay_factor.message
}
error={errors.sample_prob_decay_factor ? true : false}
/>
<TextField
sx={commonStyles.text}
label="Flipping Probability"
type="number"
inputProps={{ step: 0.000000000000001 }}
{...register("flip_prob", {
required: "Flipping Probability is required",
min: 0,
max: 1,
})}
helperText={errors.flip_prob && errors.flip_prob.message}
error={errors.flip_prob ? true : false}
/>
<Controller
name="flipping_tags"
control={control}
rules={{ required: "Flipping Tags are required" }}
render={({ field }) => (
<Autocomplete
{...field}
onChange={(e, val) => {
field.onChange({
target: { value: JSON.stringify(val) },
});
}}
value={field.value ? JSON.parse(field.value) : []}
multiple
options={tags}
renderInput={(params) => (
<TextField
{...params}
label="Flipping Tags"
error={!!errors.flipping_tags}
helperText={errors.flipping_tags?.message}
/>
)}
/>
)}
/>
<FormButtons reset={reset} close={close} />
</form>
);
};
export default SVMKNN;
import { alpha } from "@mui/material";
export default {
root: {
width: "100%",
},
addRow: {
border: (theme) =>
`1px dashed ${alpha(theme.palette.text.primary, 0.5)}`,
textAlign: "center",
width: "100%",
borderRadius: "10px",
},
modalRoot: {
display: "flex",
flexDirection: "column",
},
};
import {
MODEL_NAME_KNN,
MODEL_NAME_LR,
MODEL_NAME_RF,
MODEL_NAME_SVM,
} from "@/constants";
import { Table } from "@mui/material";
import SVMKNN from "./SVMKNN";
import RFLR from "./RFLR";
const ConfigTable = ({ configurations, setConfigurations, model }) => {
const handleDelete = (i) => {
const newConfigs = [...configurations];
newConfigs.splice(i, 1);
setConfigurations(newConfigs);
};
if (model === MODEL_NAME_KNN || model === MODEL_NAME_SVM) {
return (
<SVMKNN
configurations={configurations}
handleDelete={handleDelete}
/>
);
} else if (model === MODEL_NAME_RF || model === MODEL_NAME_LR) {
return (
<RFLR configurations={configurations} handleDelete={handleDelete} />
);
} else {
return <Table></Table>;
}
};
export default ConfigTable;
import { Close } from "@mui/icons-material";
import {
IconButton,
Table,
TableBody,
TableCell,
TableHead,
TableRow,
} from "@mui/material";
const RFLR = ({ configurations, handleDelete }) => {
return (
<Table>
<TableHead>
<TableRow>
<TableCell>Name</TableCell>
<TableCell>Classification Threshold</TableCell>
<TableCell>Maximum Iterations</TableCell>
<TableCell>Maximum Time</TableCell>
<TableCell />
</TableRow>
</TableHead>
<TableBody>
{configurations.map((config, key) => (
<TableRow key={key}>
<TableCell>{config.name}</TableCell>
<TableCell>{config.threshold_classifier}</TableCell>
<TableCell>{config.max_iter}</TableCell>
<TableCell>{config.time_maximum}</TableCell>
<TableCell>
<IconButton
color="error"
onClick={() => handleDelete(key)}
>
<Close />
</IconButton>
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
);
};
export default RFLR;
import { Close } from "@mui/icons-material";
import {
IconButton,
Table,
TableBody,
TableCell,
TableHead,
TableRow,
} from "@mui/material";
const SVMKNN = ({ configurations, handleDelete }) => {
return (
<Table>
<TableHead>
<TableRow>
<TableCell>Name</TableCell>
<TableCell>Sampling Probability Decay Factor</TableCell>
<TableCell>Flipping Probability</TableCell>
<TableCell>Flipping Tags</TableCell>
<TableCell />
</TableRow>
</TableHead>
<TableBody>
{configurations.map((config, key) => (
<TableRow key={key}>
<TableCell>{config.name}</TableCell>
<TableCell>
{config.generator_config?.sample_prob_decay_factor}
</TableCell>
<TableCell>
{config.generator_config?.flip_prob}
</TableCell>
<TableCell>
{config.generator_config?.flipping_tags.join(", ")}
</TableCell>
<TableCell>
<IconButton
color="error"
onClick={() => handleDelete(key)}
>
<Close />
</IconButton>
</TableCell>
</TableRow>
))}
</TableBody>
</Table>
);
};
export default SVMKNN;
export const tagHelp = `CC: conjunction, coordinating
& 'n and both but either et for less minus neither nor or plus so
therefore times v. versus vs. whether yet
CD: numeral, cardinal
mid-1890 nine-thirty forty-two one-tenth ten million 0.5 one forty-
seven 1987 twenty '79 zero two 78-degrees eighty-four IX '60s .025
fifteen 271,124 dozen quintillion DM2,000 ...
DT: determiner
all an another any both del each either every half la many much nary
neither no some such that the them these this those
EX: existential there
there
FW: foreign word
gemeinschaft hund ich jeux habeas Haementeria Herr K'ang-si vous
lutihaw alai je jour objets salutaris fille quibusdam pas trop Monte
terram fiche oui corporis ...
IN: preposition or conjunction, subordinating
astride among uppon whether out inside pro despite on by throughout
below within for towards near behind atop around if like until below
next into if beside ...
JJ: adjective or numeral, ordinal
third ill-mannered pre-war regrettable oiled calamitous first separable
ectoplasmic battery-powered participatory fourth still-to-be-named
multilingual multi-disciplinary ...
JJR: adjective, comparative
bleaker braver breezier briefer brighter brisker broader bumper busier
calmer cheaper choosier cleaner clearer closer colder commoner costlier
cozier creamier crunchier cuter ...
JJS: adjective, superlative
calmest cheapest choicest classiest cleanest clearest closest commonest
corniest costliest crassest creepiest crudest cutest darkest deadliest
dearest deepest densest dinkiest ...
LS: list item marker
A A. B B. C C. D E F First G H I J K One SP-44001 SP-44002 SP-44005
SP-44007 Second Third Three Two * a b c d first five four one six three
two
MD: modal auxiliary
can cannot could couldn't dare may might must need ought shall should
shouldn't will would
NN: noun, common, singular or mass
common-carrier cabbage knuckle-duster Casino afghan shed thermostat
investment slide humour falloff slick wind hyena override subhumanity
machinist ...
NNP: noun, proper, singular
Motown Venneboerger Czestochwa Ranzer Conchita Trumplane Christos
Oceanside Escobar Kreisler Sawyer Cougar Yvette Ervin ODI Darryl CTCA
Shannon A.K.C. Meltex Liverpool ...
NNPS: noun, proper, plural
Americans Americas Amharas Amityvilles Amusements Anarcho-Syndicalists
Andalusians Andes Andruses Angels Animals Anthony Antilles Antiques
Apache Apaches Apocrypha ...
NNS: noun, common, plural
undergraduates scotches bric-a-brac products bodyguards facets coasts
divestitures storehouses designs clubs fragrances averages
subjectivists apprehensions muses factory-jobs ...
PDT: pre-determiner
all both half many quite such sure this
POS: genitive marker
' 's
PRP: pronoun, personal
hers herself him himself hisself it itself me myself one oneself ours
ourselves ownself self she thee theirs them themselves they thou thy us
PRP$: pronoun, possessive
her his mine my our ours their thy your
RB: adverb
occasionally unabatingly maddeningly adventurously professedly
stirringly prominently technologically magisterially predominately
swiftly fiscally pitilessly ...
RBR: adverb, comparative
further gloomier grander graver greater grimmer harder harsher
healthier heavier higher however larger later leaner lengthier less-
perfectly lesser lonelier longer louder lower more ...
RBS: adverb, superlative
best biggest bluntest earliest farthest first furthest hardest
heartiest highest largest least less most nearest second tightest worst
RP: particle
aboard about across along apart around aside at away back before behind
by crop down ever fast for forth from go high i.e. in into just later
low more off on open out over per pie raising start teeth that through
under unto up up-pp upon whole with you
SYM: symbol
% & ' '' ''. ) ). * + ,. < = > @ A[fj] U.S U.S.S.R * ** ***
TO: "to" as preposition or infinitive marker
to
UH: interjection
Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey Kee-reist Oops amen
huh howdy uh dammit whammo shucks heck anyways whodunnit honey golly
man baby diddle hush sonuvabitch ...
VB: verb, base form
ask assemble assess assign assume atone attention avoid bake balkanize
bank begin behold believe bend benefit bevel beware bless boil bomb
boost brace break bring broil brush build ...
VBD: verb, past tense
dipped pleaded swiped regummed soaked tidied convened halted registered
cushioned exacted snubbed strode aimed adopted belied figgered
speculated wore appreciated contemplated ...
VBG: verb, present participle or gerund
telegraphing stirring focusing angering judging stalling lactating
hankerin' alleging veering capping approaching traveling besieging
encrypting interrupting erasing wincing ...
VBN: verb, past participle
multihulled dilapidated aerosolized chaired languished panelized used
experimented flourished imitated reunifed factored condensed sheared
unsettled primed dubbed desired ...
VBP: verb, present tense, not 3rd person singular
predominate wrap resort sue twist spill cure lengthen brush terminate
appear tend stray glisten obtain comprise detest tease attract
emphasize mold postpone sever return wag ...
VBZ: verb, present tense, 3rd person singular
bases reconstructs marks mixes displeases seals carps weaves snatches
slumps stretches authorizes smolders pictures emerges stockpiles
seduces fizzes uses bolsters slaps speaks pleads ...
WDT: WH-determiner
that what whatever which whichever
WP: WH-pronoun
that what whatever whatsoever which who whom whosoever
WP$: WH-pronoun, possessive
whose
WRB: Wh-adverb
how however whence whenever where whereby whereever wherein whereof why`;
export const tags = [
"CC",
"CD",
"DT",
"EX",
"FW",
"IN",
"JJ",
"JJR",
"JJS",
"LS",
"MD",
"NN",
"NNP",
"NNPS",
"NNS",
"PDT",
"POS",
"PRP",
"PRP$",
"RB",
"RBR",
"RBS",
"RP",
"SYM",
"TO",
"UH",
"VB",
"VBD",
"VBG",
"VBN",
"VBP",
"VBZ",
"WDT",
"WP",
"WP$",
"WRB",
];
export default {
close: {
position: "absolute",
right: 0,
top: 0,
},
};
import Textarea from "@/components/Textarea";
import { Box, Typography } from "@mui/material";
import { LoadingButton } from "@mui/lab";
import { ThumbDownOffAlt, ThumbUpOffAlt } from "@mui/icons-material";
import styles from "./styles";
import commonStyles from "@/styles/commonStyles";
import { useContext, useRef, useState } from "react";
import { getSentiment } from "@/functions/api";
import { ModalContext } from "@/providers/modalProvider/ModalProvider";
const Evaluation = ({ model }) => {
const { setNotification } = useContext(ModalContext);
const textareaRef = useRef();
const [loading, setLoading] = useState(false);
const [prompt, setPrompt] = useState("");
const [result, setResult] = useState();
const evaluationHandler = () => {
const prompt = textareaRef.current.value;
if (prompt !== "" && model) {
setLoading(true);
setPrompt(prompt);
setResult();
getSentiment(model, prompt)
.then((sentiment) => setResult(sentiment))
.catch(setNotification)
.finally(() => {
setLoading(false);
});
}
};
const status = result && result.prediction === "positive";
return (
<Box sx={commonStyles.sectionContainer}>
<Typography variant="h3">Prompt Evaluation</Typography>
<Textarea
ref={textareaRef}
placeholder="Prompt"
sx={commonStyles.text}
/>
<LoadingButton
loading={loading}
sx={commonStyles.btn}
onClick={evaluationHandler}
>
Evaluate
</LoadingButton>
{prompt !== "" && (
<Box sx={commonStyles.outputContainer(status)}>
<Typography>{prompt}</Typography>
{status !== undefined && (
<Box sx={styles.iconWrapper}>
{status ? <ThumbUpOffAlt /> : <ThumbDownOffAlt />}
<Typography>{result.score}</Typography>
</Box>
)}
</Box>
)}
</Box>
);
};
export default Evaluation;
export default {
iconWrapper: {
width: "100%",
textAlign: "center",
},
};
import {
MODEL_NAME_KNN,
MODEL_NAME_LR,
MODEL_NAME_RF,
MODEL_NAME_SVM,
} from "@/constants";
import commonStyles from "@/styles/commonStyles";
import { Autocomplete, Box, TextField } from "@mui/material";
const ModelSelection = ({ setModel }) => {
const models = [
{ label: "K Nearest Neighbour Model", value: MODEL_NAME_KNN },
{ label: "Logistic Regression Model", value: MODEL_NAME_LR },
{ label: "Random Forest Model", value: MODEL_NAME_RF },
{ label: "Support Vector Machine Model", value: MODEL_NAME_SVM },
];
return (
<Box sx={{ width: "300px" }}>
<Autocomplete
disablePortal
onChange={(e, obj) => (obj ? setModel(obj.value) : setModel())}
options={models}
sx={commonStyles.autocomplete}
                isOptionEqualToValue={(option, value) => option.value === value.value}
size="small"
renderInput={(params) => (
<TextField {...params} label="Models" />
)}
/>
</Box>
);
};
export default ModelSelection;
import { Box, Typography } from "@mui/material";
import { useEffect, useState } from "react";
import commonStyles from "@/styles/commonStyles";
const Stats = ({ model }) => {
const [text, setText] = useState("");
useEffect(() => {
const updateText = async () => {
const response = await fetch(
`/evaluations/${model}/evaluation.txt`
);
            if (response.status === 200) {
const text = await response.text();
setText(text);
}
};
updateText();
}, [model]);
return (
<Box sx={commonStyles.sectionContainer}>
<Typography variant="h3">Test Set Performance</Typography>
<Typography variant="h4">Report</Typography>
<Typography variant="body1" sx={commonStyles.codeBlock}>
{text}
</Typography>
<Typography variant="h4">Visualizations</Typography>
<img src={`/evaluations/${model}/evaluation.jpg`} />
</Box>
);
};
export default Stats;
import { lambdaClient } from "@/clients/aws";
import { AWS_LAMBDA_NAME, STATUS_CODE_MAP } from "@/constants";
import { InvokeCommand } from "@aws-sdk/client-lambda";
export const getSentiment = async (model_name, prompt) => {
const payload = {
task: "evaluation",
payload: { model_name, texts: [prompt] },
};
const command = new InvokeCommand({
FunctionName: AWS_LAMBDA_NAME,
Payload: JSON.stringify(payload),
});
const { Payload } = await lambdaClient.send(command);
const result = Buffer.from(Payload).toString();
const parsedResult = JSON.parse(result);
if (parsedResult.status === 200) {
const response = {
score: parsedResult.body.scores[0],
prediction: parsedResult.body.predictions[0],
};
return response;
} else {
console.error(parsedResult.body);
const bubble = STATUS_CODE_MAP[parsedResult.status];
throw bubble;
}
};
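// Assumed Lambda response contract, inferred from the parsing above rather than documented
// in this file: a successful invocation is expected to resolve to something like
//   { status: 200, body: { scores: [0.87], predictions: ["positive"] } } // hypothetical values
// while any other status is mapped to a user-facing message through STATUS_CODE_MAP.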
export const startTestCase = async ({
model_name,
prompt,
variations,
configurations,
}) => {
const payload = {
task: "analysis",
payload: { model_name, prompt, variations, configurations },
};
const command = new InvokeCommand({
FunctionName: AWS_LAMBDA_NAME,
Payload: JSON.stringify(payload),
});
const { Payload } = await lambdaClient.send(command);
const result = Buffer.from(Payload).toString();
const parsedResult = JSON.parse(result);
if (parsedResult.status === 200) {
return parsedResult.body;
} else {
console.error(parsedResult.body);
const bubble = STATUS_CODE_MAP[parsedResult.status];
throw bubble;
}
};
export const deepMerge = (...objects) => {
const isObject = (obj) => obj && typeof obj === "object";
return objects.reduce((prev, obj) => {
Object.keys(obj).forEach((key) => {
const pVal = prev[key];
const oVal = obj[key];
if (Array.isArray(pVal) && Array.isArray(oVal)) {
prev[key] = pVal.concat(...oVal);
} else if (isObject(pVal) && isObject(oVal)) {
                prev[key] = deepMerge(pVal, oVal);
} else {
prev[key] = oVal;
}
});
return prev;
}, {});
};
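// Usage sketch (illustrative): the theme configs merge a shared base with a mode override,
// so, assuming the objects defined under styles/themes,
//   deepMerge({ typography: { h3: { fontSize: "2rem" } } }, { palette: { mode: "dark" } })
//   // => { typography: { h3: { fontSize: "2rem" } }, palette: { mode: "dark" } }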
import ModalProvider from "@/providers/modalProvider/ModalProvider";
import "@/styles/globals.css";
import { darkTheme, lightTheme } from "@/styles/themes";
import { ThemeProvider } from "@mui/material";
import { useEffect, useState } from "react";
export default function App({ Component, pageProps }) {
const [theme, setTheme] = useState(lightTheme);
useEffect(() => {
if (
window.matchMedia &&
window.matchMedia("(prefers-color-scheme: dark)").matches
) {
setTheme(darkTheme);
}
}, []);
return (
<ThemeProvider theme={theme}>
<ModalProvider>
<Component {...pageProps} />
</ModalProvider>
</ThemeProvider>
);
}
import { Html, Head, Main, NextScript } from 'next/document'
export default function Document() {
return (
<Html lang="en">
<Head />
<body>
<Main />
<NextScript />
</body>
</Html>
)
}
import Head from "next/head";
import { Box } from "@mui/material";
import pageStyles from "@/styles/pageStyles/index";
import Evaluation from "@/containers/evaluation/Evaluation";
import { useState } from "react";
import Stats from "@/containers/stats/Stats";
import ModelSelection from "@/containers/modelSelection/ModelSelection";
import Analysis from "@/containers/analysis/Analysis";
import { MODEL_NAME_LR } from "@/constants";
export default function Home() {
const [model, setModel] = useState();
return (
<>
<Head>
<title>XAI</title>
<meta
name="description"
content="An implementation of explainable AI algorithms"
/>
<meta
name="viewport"
content="width=device-width, initial-scale=1"
/>
<link rel="icon" href="/favicon.ico" />
</Head>
<Box sx={pageStyles.main}>
<ModelSelection setModel={setModel} />
{model && (
<>
<Evaluation model={model} />
{/* <Stats model={model} /> */}
<Analysis model={model} />
</>
)}
</Box>
</>
);
}
import ModalContainer from "@/components/modalContainer/ModalContainer";
import { Box, Button, IconButton, Typography, useTheme } from "@mui/material";
import { createContext, useState } from "react";
import styles from "./styles";
import { Close, ErrorOutline } from "@mui/icons-material";
export const ModalContext = createContext();
const ModalProvider = ({ children }) => {
const theme = useTheme();
const [modalContent, setModalContent] = useState();
const [notificationText, setNotificationText] = useState();
const [notificationActive, setNotificationActive] = useState(false);
const displayModal = ({
heading,
body,
state,
closeBtnText,
closeBtnProps,
closeBtnExtraActions,
extraBtnTexts,
extraBtnActions,
}) => {
let color = undefined;
let headIcon = undefined;
if (state === "error") {
color = theme.palette.error.main;
headIcon = (
<ErrorOutline
color={"error"}
fontSize="inherit"
sx={styles.headIcon}
/>
);
}
const extraBtns = [];
if (extraBtnTexts) {
for (let i = 0; i < extraBtnTexts.length; i++) {
const text = extraBtnTexts[i];
const action = extraBtnActions[i];
extraBtns.push({ text, action });
}
}
let [sx, rest] = [undefined, undefined];
if (closeBtnProps) {
const { sxN, ...restN } = closeBtnProps;
sx = sxN;
rest = restN;
}
setModalContent({
headIcon,
heading,
body,
color,
closeBtnText,
closeBtnProps: rest,
closeBtnSx: sx,
closeBtnExtraActions,
extraBtns,
});
};
const setNotification = (text) => {
setNotificationText(text);
setNotificationActive(true);
setTimeout(() => {
setNotificationActive(false);
}, 3000);
};
const closeModal = () => {
modalContent.closeBtnExtraActions &&
modalContent.closeBtnExtraActions();
setModalContent();
};
return (
<ModalContext.Provider
value={{ displayModal, setNotification, closeModal }}
>
<Box sx={styles.root}>
{children}
{modalContent && (
<ModalContainer show={true} close={closeModal}>
<Typography
sx={styles.heading(modalContent.color)}
variant="h3"
>
{modalContent.headIcon} {modalContent.heading}
</Typography>
{typeof modalContent.body === "string" ? (
<Typography sx={styles.body} variant="body1">
{modalContent.body}
</Typography>
) : (
modalContent.body
)}
<Box sx={styles.btnContainer}>
{modalContent.extraBtns.map((btn) => (
<Button
key={btn.text}
sx={styles.btn}
variant="outlined"
onClick={(e) => {
btn.action(e);
closeModal();
}}
>
{btn.text}
</Button>
))}
<Button
sx={{
...styles.btn,
...modalContent.closeBtnSx,
}}
{...modalContent.closeBtnProps}
variant="outlined"
onClick={closeModal}
>
{modalContent.closeBtnText
? modalContent.closeBtnText
: "Close"}
</Button>
</Box>
</ModalContainer>
)}
<Box sx={styles.notification(notificationActive)}>
<Typography variant="body1">{notificationText}</Typography>
<IconButton
sx={styles.cross}
onClick={() => setNotificationActive(false)}
>
<Close />
</IconButton>
</Box>
</Box>
</ModalContext.Provider>
);
};
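// Usage sketch (illustrative, with hypothetical copy and handlers): consumers read the
// context and call displayModal / setNotification, e.g.
//   const { displayModal } = useContext(ModalContext);
//   displayModal({
//     heading: "Delete configuration?",
//     body: "This action cannot be undone.",
//     state: "error",
//     extraBtnTexts: ["Delete"],
//     extraBtnActions: [() => handleDelete(0)],
//   });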
export default ModalProvider;
import common from "@/styles/commonStyles";
import { alpha } from "@mui/material";
export default {
root: {
position: "relative",
overflow: "hidden",
},
heading: (color) => ({
display: "flex",
alignItems: "center",
justifyContent: "center",
backgroundColor: color && alpha(color, 0.2),
}),
body: {
padding: "20px 0px",
},
headIcon: {
margin: "0px 0.7rem",
},
btnContainer: {
display: "flex",
justifyContent: "center",
},
btn: {
margin: "0px 5px",
},
notification: (active) => ({
display: "flex",
alignItems: "center",
position: "absolute",
bottom: active ? "0px" : "-70px",
left: "0px",
backgroundColor: (theme) => theme.palette.background.paper,
padding: "10px 20px",
paddingRight: "5px",
margin: "5px",
borderRadius: "5px",
transition: "0.3s",
zIndex: 1,
boxShadow: (theme) => theme.shadows[10],
}),
cross: {
...common.iconBtn,
margin: "7px",
padding: "2px",
},
};
import { alpha } from "@mui/material";
export default {
btn: {
margin: (theme) => theme.spacing(2),
},
btnContainer: {
textAlign: "center",
".MuiButton-root": {
margin: (theme) => theme.spacing(1),
},
},
autocomplete: {
margin: (theme) => `${theme.spacing(2)} 0`,
width: "100%",
},
text: {
margin: (theme) => `${theme.spacing(2)} 0`,
},
codeBlock: {
fontFamily: "monospace",
whiteSpace: "pre",
overflowX: "scroll",
},
sectionContainer: {
display: "flex",
flexDirection: "column",
alignItems: "center",
border: (theme) => `1px solid ${theme.palette.text.primary}`,
padding: "30px",
margin: "10px",
width: "100%",
borderRadius: "10px",
".MuiTypography-h3": {
margin: "15px 0px",
width: "100%",
},
".MuiTypography-h4": {
margin: "10px 0px",
width: "100%",
},
},
outputContainer: (ok) => ({
position: "relative",
border: (theme) =>
`1px solid ${
ok === undefined
? theme.palette.text.icon
: ok
? theme.palette.success.main
: theme.palette.error.main
}`,
width: "100%",
padding: (theme) => theme.spacing(2),
borderRadius: (theme) => theme.spacing(1),
color: (theme) =>
ok === undefined
? theme.palette.text.icon
: ok
? theme.palette.success.main
: theme.palette.error.main,
}),
backdrop: (active) => ({
display: active ? "flex" : "none",
justifyContent: "center",
alignItems: "center",
position: "fixed",
width: "100vw",
height: "100vh",
left: 0,
top: 0,
backgroundColor: (theme) =>
alpha(theme.palette.background.default, 0.5),
zIndex: 1,
}),
};
:root {
--max-width: 1100px;
--border-radius: 12px;
--font-mono: ui-monospace, Menlo, Monaco, "Cascadia Mono", "Segoe UI Mono",
"Roboto Mono", "Oxygen Mono", "Ubuntu Monospace", "Source Code Pro",
"Fira Mono", "Droid Sans Mono", "Courier New", monospace;
--foreground-rgb: 0, 0, 0;
--background-start-rgb: 214, 219, 220;
--background-end-rgb: 255, 255, 255;
--primary-glow: conic-gradient(
from 180deg at 50% 50%,
#1e72a333 0deg,
#3890e933 55deg,
#54d6ff33 120deg,
#0071ff33 160deg,
transparent 360deg
);
--secondary-glow: radial-gradient(
rgba(255, 255, 255, 1),
rgba(255, 255, 255, 0)
);
--tile-start-rgb: 239, 245, 249;
--tile-end-rgb: 228, 232, 233;
--tile-border: conic-gradient(
#00000080,
#00000040,
#00000030,
#00000020,
#00000010,
#00000010,
#00000080
);
--callout-rgb: 238, 240, 241;
--callout-border-rgb: 172, 175, 176;
--card-rgb: 180, 185, 188;
--card-border-rgb: 131, 134, 135;
}
@media (prefers-color-scheme: dark) {
:root {
--foreground-rgb: 255, 255, 255;
--background-start-rgb: 0, 0, 0;
--background-end-rgb: 0, 0, 0;
--primary-glow: radial-gradient(
rgba(1, 175, 255, 0.4),
rgba(1, 65, 255, 0)
);
--secondary-glow: linear-gradient(
to bottom right,
rgba(1, 65, 255, 0),
rgba(1, 65, 255, 0),
rgba(1, 175, 255, 0.4)
);
--tile-start-rgb: 2, 13, 46;
--tile-end-rgb: 2, 5, 19;
--tile-border: conic-gradient(
#ffffff80,
#ffffff40,
#ffffff30,
#ffffff20,
#ffffff10,
#ffffff10,
#ffffff80
);
--callout-rgb: 20, 20, 20;
--callout-border-rgb: 108, 108, 108;
--card-rgb: 100, 100, 100;
--card-border-rgb: 200, 200, 200;
}
}
* {
box-sizing: border-box;
padding: 0;
margin: 0;
}
html,
body {
max-width: 100vw;
overflow-x: hidden;
}
body {
color: rgb(var(--foreground-rgb));
background: linear-gradient(
to bottom,
transparent,
rgb(var(--background-end-rgb))
)
rgb(var(--background-start-rgb));
}
a {
color: inherit;
text-decoration: none;
}
@media (prefers-color-scheme: dark) {
html {
color-scheme: dark;
}
}
export default {
main: {
display: "flex",
flexDirection: "column",
justifyContent: "flex-start",
alignItems: "center",
padding: "6rem",
minHeight: "100vh",
"&::before": {
background: "var(--secondary-glow)",
borderRadius: "50%",
width: "480px",
height: "360px",
marginLeft: "-400px",
content: '""',
left: "50%",
position: "absolute",
filter: "blur(45px)",
transform: "translateZ(0)",
zIndex: "-1",
},
"&::after": {
background: "var(--primary-glow)",
width: "240px",
height: "180px",
zIndex: "-1",
content: '""',
left: "50%",
position: "absolute",
filter: "blur(45px)",
transform: "translateZ(0)",
},
},
};
export default {
typography: {
h3: {
fontSize: "2rem",
},
h4: {
fontSize: "1.5rem",
},
},
};
import { deepMerge } from "@/functions/util";
import base from "./base";
const overWrite = {
palette: {
mode: "dark",
},
};
export default deepMerge(base, overWrite);
import { createTheme } from "@mui/material";
import darkThemeConfig from "./darkThemeConfig";
import lightThemeConfig from "./lightThemeConfig";
export const darkTheme = createTheme(darkThemeConfig);
export const lightTheme = createTheme(lightThemeConfig);
import { deepMerge } from "@/functions/util";
import base from "./base";
const overWrite = {
palette: {
mode: "light",
},
};
export default deepMerge(base, overWrite);
name: xai
channels:
- defaults
dependencies:
- python=3.9
- pip=23.1.2
- pip:
- scikit-learn==1.2.2
- nltk==3.8.1
- ipykernel==6.24.0
- ipywidgets==7.6.5
- pyyaml==6.0
- pandas==2.0.3
- beautifulsoup4==4.12.2
- wget==3.2
- numpy==1.23.5
- shap==0.41.0
- matplotlib==3.5.1
- seaborn==0.11.2
- ordered-set==4.1.0
- boto3==1.27.0
- torch==2.0.1
- transformers==4.31.0
- sagemaker==2.173.0
- sentencepiece==0.1.99
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.test_bench import TestBench\n",
"from src.datasets import IMDBDataset\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"x, y = ds.x_test, ds.y_test"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/knn.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config_path=\"./configs/models/wf-cf-generator.yaml\",\n",
" analyzer_name=\"knn\"\n",
")\n",
"tb.evaluate(x, y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/svm.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config_path=\"./configs/models/wf-cf-generator.yaml\",\n",
" analyzer_name=\"svc\"\n",
")\n",
"tb.evaluate(x, y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/lr.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config_path=\"./configs/models/wf-cf-generator.yaml\",\n",
" analyzer_name=\"lr\"\n",
")\n",
"tb.evaluate(x, y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/rf.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config_path=\"./configs/models/wf-cf-generator.yaml\",\n",
" analyzer_name=\"rf\"\n",
")\n",
"tb.evaluate(ds.x_test, ds.y_test)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "xai",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"name": "frontend",
"version": "0.1.0",
"private": true,
"dependencies": {
"@emotion/react": "^11.10.5",
"@mantine/core": "6.0.0",
"@mantine/hooks": "6.0.0",
"@testing-library/jest-dom": "^5.16.5",
"@testing-library/react": "^13.4.0",
"@testing-library/user-event": "^14.4.3",
"@types/jest": "^29.2.3",
"@types/node": "^18.11.9",
"@types/react": "^18.0.25",
"@types/react-dom": "^18.0.9",
"react": "^18.2.0",
"react-dom": "^18.2.0",
"react-router-dom": "^6.11.2",
"react-scripts": "5.0.1",
"typescript": "^4.9.3",
"web-vitals": "^3.1.0"
},
"scripts": {
"start": "react-scripts start",
"build": "react-scripts build",
"test": "react-scripts test",
"eject": "react-scripts eject",
"typecheck": "tsc --noEmit"
},
"eslintConfig": {
"extends": [
"react-app",
"react-app/jest"
]
},
"browserslist": {
"production": [
">0.2%",
"not dead",
"not op_mini all"
],
"development": [
"last 1 chrome version",
"last 1 firefox version",
"last 1 safari version"
]
}
}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<link rel="icon" href="%PUBLIC_URL%/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="theme-color" content="#000000" />
<meta
name="description"
content="Web site created using create-react-app"
/>
<link rel="apple-touch-icon" href="%PUBLIC_URL%/logo192.png" />
<!--
manifest.json provides metadata used when your web app is installed on a
user's mobile device or desktop. See https://developers.google.com/web/fundamentals/web-app-manifest/
-->
<link rel="manifest" href="%PUBLIC_URL%/manifest.json" />
<!--
Notice the use of %PUBLIC_URL% in the tags above.
It will be replaced with the URL of the `public` folder during the build.
Only files inside the `public` folder can be referenced from the HTML.
Unlike "/favicon.ico" or "favicon.ico", "%PUBLIC_URL%/favicon.ico" will
work correctly both with client-side routing and a non-root public URL.
Learn how to configure a non-root public URL by running `npm run build`.
-->
<title>React App</title>
</head>
<body>
<noscript>You need to enable JavaScript to run this app.</noscript>
<div id="root"></div>
<!--
This HTML file is a template.
If you open it directly in the browser, you will see an empty page.
You can add webfonts, meta tags, or analytics to this file.
The build step will place the bundled scripts into the <body> tag.
To begin the development, run `npm start` or `yarn start`.
To create a production bundle, use `npm run build` or `yarn build`.
-->
</body>
</html>
{
"short_name": "React App",
"name": "Create React App Sample",
"icons": [
{
"src": "favicon.ico",
"sizes": "64x64 32x32 24x24 16x16",
"type": "image/x-icon"
},
{
"src": "logo192.png",
"type": "image/png",
"sizes": "192x192"
},
{
"src": "logo512.png",
"type": "image/png",
"sizes": "512x512"
}
],
"start_url": ".",
"display": "standalone",
"theme_color": "#000000",
"background_color": "#ffffff"
}
# https://www.robotstxt.org/robotstxt.html
User-agent: *
Disallow:
import { createStyles } from "@mantine/core";
import { ThemeProvider } from "./ThemeProvider";
import { NavbarSimpleColored } from "./Components/NavBar/NavBar";
import {BrowserRouter as Router, Routes, Route} from 'react-router-dom';
import Knn from "./Components/k-NN/k-NN";
import SVM from "./Components/SVM/SVM";
import RandomForest from "./Components/RandomForest/RandomForest";
import LogisticRegression from "./Components/LogisticRegression/LogisticRegression";
import Settings from "./Components/Settings/Settings";
import Home from "./Components/Home/Home";
const useStyles = createStyles((theme) => ({
sections: {
display: "flex",
flexDirection: "row",
}
}))
export default function App() {
const { classes } = useStyles()
return (
<ThemeProvider>
<Router>
<div className={classes.sections}>
<NavbarSimpleColored />
<div style={{width: window.innerWidth/5*4}}>
<Routes>
<Route path='/' element={<Home />} />
<Route path='/svm' element={<SVM />} />
<Route path='/knn' element={<Knn />} />
<Route path='/rf' element={<RandomForest />} />
<Route path='/lr' element={<LogisticRegression />} />
<Route path='/settings' element={<Settings />} />
</Routes>
</div>
</div>
</Router>
</ThemeProvider>
);
}
function Home() {
return (
<div>Home</div>
)
}
export default Home
\ No newline at end of file
function LogisticRegression() {
return (
<div>LogisticRegression</div>
)
}
export default LogisticRegression
\ No newline at end of file
import { useState } from 'react';
import { createStyles, Navbar, Group, Code, getStylesRef, rem } from '@mantine/core';
import { useNavigate } from "react-router-dom";
import HOME_IMG from '../../assets/home-icon.png';
import SVM_IMG from '../../assets/svm-icon.png'
import KNN_IMG from '../../assets/knn-icon.png'
import LR_IMG from '../../assets/lr-icon.png'
import RF_IMG from '../../assets/rf-icon.png'
import SETTING_IMG from '../../assets/setting-icon.png'
import { LOGO } from '../consts';
const useStyles = createStyles((theme) => ({
navbar: {
backgroundColor: theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background,
},
version: {
backgroundColor: theme.fn.lighten(
theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background!,
0.1
),
color: theme.white,
fontWeight: 700,
},
header: {
paddingBottom: theme.spacing.md,
marginBottom: `calc(${theme.spacing.md} * 1.5)`,
borderBottom: `${rem(1)} solid ${theme.fn.lighten(
theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background!,
0.1
)}`,
},
footer: {
paddingTop: theme.spacing.md,
marginTop: theme.spacing.md,
borderTop: `${rem(1)} solid ${theme.fn.lighten(
theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background!,
0.1
)}`,
},
link: {
...theme.fn.focusStyles(),
display: 'flex',
alignItems: 'center',
textDecoration: 'none',
fontSize: theme.fontSizes.sm,
color: theme.white,
padding: `${theme.spacing.xs} ${theme.spacing.sm}`,
borderRadius: theme.radius.sm,
fontWeight: 500,
'&:hover': {
backgroundColor: theme.fn.lighten(
theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background!,
0.1
),
},
},
linkIcon: {
ref: getStylesRef('icon'),
color: theme.white,
// opacity: 0.75,
marginRight: theme.spacing.sm,
width: "30px",
height: "30px"
},
linkActive: {
'&, &:hover': {
backgroundColor: theme.fn.lighten(
theme.fn.variant({ variant: 'filled', color: theme.primaryColor }).background!,
0.15
),
[`& .${getStylesRef('icon')}`]: {
opacity: 0.9,
},
},
},
}));
const data = [
{ link: '/', label: 'Home', icon: HOME_IMG },
{ link: '/svm', label: 'SVM', icon: SVM_IMG },
{ link: '/knn', label: 'k-NN', icon: KNN_IMG },
{ link: '/rf', label: 'Random Forest', icon: RF_IMG },
{ link: '/lr', label: 'Logistic Regression', icon: LR_IMG },
];
export function NavbarSimpleColored() {
const { classes, cx } = useStyles();
    const [active, setActive] = useState('Home');
const navigate = useNavigate();
const links = data.map((item) => (
<a
className={cx(classes.link, { [classes.linkActive]: item.label === active })}
href=""
key={item.label}
onClick={(event) => {
event.preventDefault();
setActive(item.label);
navigate(item.link)
}}
>
{/* <item.icon className={classes.linkIcon} stroke={1.5} /> */}
<img src={item.icon} className={classes.linkIcon} />
<span>{item.label}</span>
</a>
));
return (
<Navbar height={window.innerHeight} width={{ sm: window.innerWidth/5 }} p="md" className={classes.navbar}>
<Navbar.Section grow>
<Group className={classes.header} position="apart">
{/* <MantineLogo size={28} inverted /> */}
<img src={LOGO} className={classes.linkIcon} />
<Code className={classes.version}>v1.0.0</Code>
</Group>
{links}
</Navbar.Section>
<Navbar.Section className={classes.footer}>
<a href="#" className={classes.link} onClick={(event) => {
event.preventDefault()
navigate("/settings")
}}>
{/* <IconSwitchHorizontal className={classes.linkIcon} stroke={1.5} /> */}
<img src={SETTING_IMG} className={classes.linkIcon} />
<span>Settings</span>
</a>
{/* <a href="#" className={classes.link} onClick={(event) => event.preventDefault()}>
<img src={HOME_IMG} className={classes.linkIcon} />
<span>Logout</span>
</a> */}
</Navbar.Section>
</Navbar>
);
}
\ No newline at end of file
function RandomForest() {
return (
<div>RandomForest</div>
)
}
export default RandomForest
\ No newline at end of file
function SVM() {
return (
<div>SVM</div>
)
}
export default SVM
\ No newline at end of file
function Settings() {
return (
<div>Settings</div>
)
}
export default Settings
\ No newline at end of file
import LOGO_IMG from '../assets/logo.png';
export const LOGO = LOGO_IMG;
\ No newline at end of file
function Knn() {
return (
<div>k-NN</div>
)
}
export default Knn
\ No newline at end of file
import { MantineProvider, MantineThemeOverride } from "@mantine/core";
export const theme: MantineThemeOverride = {
colorScheme: "light",
};
interface ThemeProviderProps {
children: React.ReactNode;
}
export function ThemeProvider({ children }: ThemeProviderProps) {
return (
<MantineProvider withGlobalStyles withNormalizeCSS theme={theme}>
{children}
</MantineProvider>
);
}
import { StrictMode } from "react";
import ReactDOM from "react-dom/client";
import App from "./App";
import reportWebVitals from "./reportWebVitals";
const root = ReactDOM.createRoot(
document.getElementById("root") as HTMLElement
);
root.render(
<StrictMode>
<App />
</StrictMode>
);
// If you want to start measuring performance in your app, pass a function
// to log results (for example: reportWebVitals(console.log))
// or send to an analytics endpoint. Learn more: https://bit.ly/CRA-vitals
reportWebVitals();
/// <reference types="react-scripts" />
import { ReportHandler } from 'web-vitals';
const reportWebVitals = (onPerfEntry?: ReportHandler) => {
if (onPerfEntry && onPerfEntry instanceof Function) {
import('web-vitals').then(({ getCLS, getFID, getFCP, getLCP, getTTFB }) => {
getCLS(onPerfEntry);
getFID(onPerfEntry);
getFCP(onPerfEntry);
getLCP(onPerfEntry);
getTTFB(onPerfEntry);
});
}
};
export default reportWebVitals;
// jest-dom adds custom jest matchers for asserting on DOM nodes.
// allows you to do things like:
// expect(element).toHaveTextContent(/react/i)
// learn more: https://github.com/testing-library/jest-dom
import '@testing-library/jest-dom';
{
"compilerOptions": {
"target": "es5",
"lib": [
"dom",
"dom.iterable",
"esnext"
],
"allowJs": true,
"skipLibCheck": true,
"esModuleInterop": true,
"allowSyntheticDefaultImports": true,
"strict": true,
"forceConsistentCasingInFileNames": true,
"noFallthroughCasesInSwitch": true,
"module": "esnext",
"moduleResolution": "node",
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": true,
"jsx": "react-jsx"
},
"include": [
"src"
]
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# T5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Local"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.train.t5 import fit\n",
"from src.datasets import CFGenerativeDataset\n",
"from torch.utils.data import DataLoader, Subset\n",
"from transformers import T5ForConditionalGeneration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"BATCH_SIZE=16\n",
"EPOCHS=100\n",
"PATIENCE=10\n",
"SAVE_DIR=\".\"\n",
"MODEL_NAME=\"t5-small\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"train_ds = CFGenerativeDataset(\"./configs/datasets/snli_1.0_contra.yaml\", \"./datasets/snli_1.0_contra\", split=\"train\")\n",
"val_ds = CFGenerativeDataset(\"./configs/datasets/snli_1.0_contra.yaml\", \"./datasets/snli_1.0_contra\", split=\"val\")\n",
"\n",
"subset_indices = list(range(100))\n",
"train_ds = Subset(train_ds, subset_indices)\n",
"val_ds = Subset(val_ds, subset_indices)\n",
"\n",
"train_dl = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True)\n",
"val_dl = DataLoader(val_ds, batch_size=BATCH_SIZE)\n",
"\n",
"model=T5ForConditionalGeneration.from_pretrained(MODEL_NAME)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"fit(\n",
" train_dl,\n",
" val_dl,\n",
" model,\n",
" epochs= 2,\n",
" patience= 10,\n",
" save_dir= \"models/t5-model\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sagemaker"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sagemaker.pytorch import PyTorch\n",
"from sagemaker.inputs import TrainingInput\n",
"\n",
"def train()->None:\n",
" estimator = PyTorch(\n",
" entry_point=f\"sagemaker_t5.py\",\n",
" role=\"arn:aws:iam::065257926712:role/SagemakerRole\",\n",
" framework_version=\"2.0\",\n",
" py_version=\"py310\",\n",
" source_dir=\"src\",\n",
" output_path=f\"s3://sliit-xai/training-jobs/results\",\n",
" code_location=f\"s3://sliit-xai/training-jobs/code\",\n",
" instance_count=1,\n",
" instance_type=\"ml.g4dn.xlarge\",\n",
" max_run=5 * 24 * 60 * 60\n",
" )\n",
" # Setting the input channels for tuning job\n",
" s3_input_train = TrainingInput(s3_data=\"s3://sliit-xai/datasets/snli_1.0_contra/\", s3_data_type=\"S3Prefix\")\n",
"\n",
" # Start job\n",
" estimator.fit(inputs={\"train\": s3_input_train})\n",
"\n",
"train()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.cf_generators import T5Generator\n",
"\n",
"cf_gen = T5Generator(\"./configs/models/t5-cf-generator.yaml\", \"./models/t5-cf-generator\", download=True)\n",
"review = \"\\\"Ice Age\\\" is an animated masterpiece that captivates both young and old audiences alike. The film's heartwarming and humorous storyline follows a mismatched group of prehistoric creatures on an epic adventure, which is filled with laughter, action, and valuable life lessons. The endearing characters, including Manny the mammoth, Sid the sloth, and Diego the saber-toothed tiger, effortlessly steal our hearts with their lovable quirks and undeniable chemistry. The animation is visually stunning, with breathtaking ice-capped landscapes and attention to detail that immerses viewers in a prehistoric wonderland. The movie's witty dialogue, clever jokes, and hilarious antics ensure that every moment is a joy to watch. Beyond the entertainment, \\\"Ice Age\\\" touches on themes of friendship, acceptance, and the importance of family, making it a truly heartwarming experience. This timeless classic stands the test of time, and its charm remains undiminished, making it a must-watch for anyone seeking an enchanting and delightful cinematic experience.\"\n",
"sentence_count = 4\n",
"contrads = cf_gen(review, sentence_count)\n",
"\n",
"print(\"\\n\".join(contrads))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# WordFlippingGenerator"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.cf_generators import WordFlippingGenerator\n",
"\n",
"review = \"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me. The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"sentence_count = 4\n",
"\n",
"config_path = \"./configs/models/wf-cf-generator.yaml\"\n",
"wf = WordFlippingGenerator(config_path)\n",
"contrads = wf(review, sentence_count)\n",
"print(\"\\n\".join(contrads))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"wf.describe_tags()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "xai",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from src.datasets import IMDBDataset\n",
"from src.models import RFModel, SVCModel, KNNModel, LRModel\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/avishka/Personal/Projects/xai/src/datasets.py:25: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.\n",
" soup = BeautifulSoup(text, \"html.parser\")\n"
]
}
],
"source": [
"ds_config_path = \"./datasets/imdb/dataset.yaml\"\n",
"ds = IMDBDataset(ds_config_path)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Number of trees in random forest\n",
"n_estimators = np.linspace(start = 10, stop = 100, num = 10).astype(int).tolist()\n",
"# Maximum number of levels in tree\n",
"max_depth = np.linspace(10, 100, num = 5).astype(int).tolist()\n",
"max_depth.append(None)\n",
"# Minimum number of samples required to split a node\n",
"min_samples_split = [2, 5, 10]\n",
"# Minimum number of samples required at each leaf node\n",
"min_samples_leaf = [1, 2, 4]\n",
"# Method of selecting samples for training each tree\n",
"bootstrap = [True, False]\n",
"rf_model = RFModel(n_estimators, max_depth, min_samples_split, min_samples_leaf, bootstrap)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"C = [0.1, 1, 10, 100]\n",
"gamma = [1, 0.1, 0.01, 0.001]\n",
"kernel = [\"rbf\"]\n",
"svc_model = SVCModel(C, gamma, kernel)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"n_neighbors = [30, 40, 50, 60, 70, 80, 90]\n",
"metric = [\"manhattan\", \"minkowski\"]\n",
"weights = [\"uniform\", \"distance\"]\n",
"knn_model = KNNModel(n_neighbors, metric, weights)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"penalty = [\"l1\", \"l2\", \"elasticnet\"]\n",
"C = np.logspace(-4, 4, 20)\n",
"solver = [\"lbfgs\", \"newton-cg\", \"sag\"]\n",
"max_iter = [100, 1000, 5000]\n",
"lr_model = LRModel(penalty, C, solver, max_iter)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "xai-env",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "9bVgiFiQYiSS"
},
"source": [
"# Initialisation"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"cellView": "form",
"executionInfo": {
"elapsed": 3329,
"status": "ok",
"timestamp": 1686739620332,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "P9uxhgMQYiSX",
"trusted": true
},
"outputs": [],
"source": [
"# @title Install Packages\n",
"\n",
"!pip install -qq ordered_set"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"cellView": "form",
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 3043,
"status": "ok",
"timestamp": 1686739623368,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "flFwmgm8Y14i",
"outputId": "8b5eb0fb-967d-4784-ba45-85f2b06fb919"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount(\"/content/drive\", force_remount=True).\n"
]
}
],
"source": [
"# @title Mount Google Drive for Credentials\n",
"\n",
"from google.colab import drive\n",
"drive.mount(\"/content/drive\")\n",
"!rm -r -f /content/sample_data\n",
"!cp -r /content/drive/MyDrive/.kaggle ~"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 239763,
"status": "ok",
"timestamp": 1686739863120,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "5OexGopuYtfI",
"outputId": "7a083161-a021-4028-f7f9-bcc65476bea9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading imdb-dataset-of-50k-movie-reviews.zip to /content\n",
"\r 0% 0.00/25.7M [00:00<?, ?B/s]\r 51% 13.0M/25.7M [00:00<00:00, 134MB/s]\n",
"\r100% 25.7M/25.7M [00:00<00:00, 176MB/s]\n",
"mkdir: cannot create directory ‘/content/data’: File exists\n",
"replace /content/data/imdb-dataset-of-50k-movie-reviews/IMDB Dataset.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: Downloading models.zip to /content\n",
" 99% 16.0M/16.2M [00:00<00:00, 167MB/s]\n",
"100% 16.2M/16.2M [00:00<00:00, 168MB/s]\n",
"mkdir: cannot create directory ‘/content/models-weights’: File exists\n",
"replace /content/models-weights/grid_imdb_knn.pickle? [y]es, [n]o, [A]ll, [N]one, [r]ename: "
]
}
],
"source": [
"# @title Downloads\n",
"\n",
"# nltk\n",
"import nltk\n",
"nltk.download('wordnet', quiet=True)\n",
"nltk.download('stopwords', quiet=True)\n",
"nltk.download('punkt', quiet=True)\n",
"\n",
"# imdb sentiment dataset\n",
"!kaggle datasets download -d lakshmi25npathi/imdb-dataset-of-50k-movie-reviews\n",
"!mkdir /content/data\n",
"!mv ./imdb-dataset-of-50k-movie-reviews.zip /content/data/imdb-dataset-of-50k-movie-reviews.zip\n",
"!unzip -qq /content/data/imdb-dataset-of-50k-movie-reviews.zip -d /content/data/imdb-dataset-of-50k-movie-reviews\n",
"\n",
"# model weights\n",
"!kaggle datasets download -d tharushalekamge/models\n",
"!mkdir /content/models-weights\n",
"!mv ./models.zip /content/models-weights/models.zip\n",
"!unzip -qq /content/models-weights/models.zip -d /content/models-weights"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"cellView": "form",
"executionInfo": {
"elapsed": 22,
"status": "ok",
"timestamp": 1686739863122,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "FktEw3aCPeNg"
},
"outputs": [],
"source": [
"# @title Static paths\n",
"\n",
"dataset_csv_path = \"/content/data/imdb-dataset-of-50k-movie-reviews/IMDB Dataset.csv\"\n",
"model_weights_dir = \"/content/models-weights\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "7r-WrRBrOYJU"
},
"source": [
"# Create dataset"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"cellView": "form",
"executionInfo": {
"elapsed": 20,
"status": "ok",
"timestamp": 1686739863124,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "d0tuyM0-YiSa",
"trusted": true
},
"outputs": [],
"source": [
"# @title Module Imports\n",
"\n",
"import time\n",
"\n",
"import numpy as np # linear algebra\n",
"import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n",
"\n",
"from sklearn.model_selection import RandomizedSearchCV\n",
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.metrics import roc_auc_score, accuracy_score\n",
"from sklearn.model_selection import ParameterGrid\n",
"from sklearn.svm import SVC\n",
"import sklearn.feature_extraction\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.pipeline import Pipeline\n",
"from sklearn.feature_extraction.text import TfidfVectorizer\n",
"from sklearn.feature_extraction.text import TfidfTransformer\n",
"\n",
"import nltk\n",
"from nltk.corpus import stopwords\n",
"from nltk.tokenize import word_tokenize\n",
"from nltk.stem import WordNetLemmatizer\n",
"\n",
"\n",
"from bs4 import BeautifulSoup\n",
"import re\n",
"import pickle\n",
"import seaborn as sns\n",
"\n",
"from ordered_set import OrderedSet\n",
"from scipy.sparse import lil_matrix\n",
"from itertools import compress"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"executionInfo": {
"elapsed": 17,
"status": "ok",
"timestamp": 1686739863125,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "peZNRrqQoCOK"
},
"outputs": [],
"source": [
"# @title Dataset definition\n",
"\n",
"class IMDBDataset:\n",
" def _strip_html(self, text):\n",
" soup = BeautifulSoup(text, \"html.parser\")\n",
" return soup.get_text()\n",
"\n",
" def _remove_special_characters(self, text, remove_digits=True):\n",
" pattern=r'[^a-zA-z0-9\\s]'\n",
" text=re.sub(pattern,'',text)\n",
" return text\n",
"\n",
" def _remove_stopwords(self, text, is_lower_case=False):\n",
" tokens = self.tokenizer.tokenize(text)\n",
" tokens = [token.strip() for token in tokens]\n",
" if is_lower_case:\n",
" filtered_tokens = [token for token in tokens if token not in self.stop_words]\n",
" else:\n",
" filtered_tokens = [token for token in tokens if token.lower() not in self.stop_words]\n",
" filtered_text = ' '.join(filtered_tokens)\n",
" return filtered_text\n",
"\n",
" def _lemmatize_text(self, text):\n",
" words=word_tokenize(text)\n",
" edited_text = ''\n",
" for word in words:\n",
" lemma_word=self.lemmatizer.lemmatize(word)\n",
" extra=\" \"+str(lemma_word)\n",
" edited_text+=extra\n",
" return edited_text\n",
"\n",
" def __init__(self, stop_words, tokenizer, lemmatizer, loaded_vectorizer, label_binarizer, dataset_csv_path):\n",
" self.stop_words = stop_words\n",
" self.tokenizer = tokenizer\n",
" self.lemmatizer = lemmatizer\n",
"\n",
" ## Import\n",
" data = pd.read_csv(dataset_csv_path)\n",
" data = data.sample(10000)\n",
"\n",
" ## Preprocess\n",
" data.review = data.review.str.lower()\n",
" data.review = data.review.apply(self._strip_html)\n",
" data.review = data.review.apply(self._remove_special_characters)\n",
" data.review = data.review.apply(self._remove_stopwords)\n",
" data.review = data.review.apply(self._lemmatize_text)\n",
"\n",
" ## Split Data\n",
" x_imdb = data['review']\n",
" y_imdb = data['sentiment']\n",
"\n",
" x_train_i, x_test_i, y_train_i, y_test_i = train_test_split(x_imdb,y_imdb,test_size=0.2)\n",
" x_test, x_val, y_test_i, y_val_i = train_test_split(x_test_i,y_test_i,test_size=0.5)\n",
"\n",
" ## X data\n",
" x_train_imdb = loaded_vectorizer.fit_transform(x_train_i)\n",
" x_test_imdb = loaded_vectorizer.transform(x_test)\n",
" x_val_imdb = loaded_vectorizer.transform(x_val)\n",
"\n",
" # Y data - Positive is 1\n",
" y_train_imdb = label_binarizer.fit_transform(y_train_i)\n",
" y_test_imdb = label_binarizer.fit_transform(y_test_i)\n",
" y_val_imdb = label_binarizer.fit_transform(y_val_i)\n",
"\n",
" self.x_train_imdb = x_train_imdb\n",
" self.x_test_imdb = x_test_imdb\n",
" self.x_val_imdb = x_val_imdb\n",
" self.y_train_imdb = y_train_imdb\n",
" self.y_test_imdb = y_test_imdb\n",
" self.y_val_imdb = y_val_imdb\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 13663,
"status": "ok",
"timestamp": 1686739876774,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "6_ZrsD-hMj9-",
"outputId": "6c825b8e-0fb1-45e3-c408-30d370bb0778"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"<ipython-input-36-e6a152c6ae4c>:5: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.\n",
" soup = BeautifulSoup(text, \"html.parser\")\n"
]
}
],
"source": [
"# @title Dataset instantiation\n",
"\n",
"loaded_vocab = pickle.load(open(f'{model_weights_dir}/vectorizer_imdb.pkl', 'rb'))\n",
"stop_words = set(stopwords.words('english'))\n",
"tokenizer = nltk.tokenize.toktok.ToktokTokenizer()\n",
"lemmatizer = WordNetLemmatizer()\n",
"loaded_vectorizer = TfidfVectorizer(min_df=2, vocabulary=loaded_vocab)\n",
"label_binarizer = sklearn.preprocessing.LabelBinarizer()\n",
"feature_names = loaded_vectorizer.get_feature_names_out()\n",
"\n",
"ds = IMDBDataset(stop_words, tokenizer, lemmatizer, loaded_vectorizer, label_binarizer, dataset_csv_path)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "5k-p3Hu3Ptw1"
},
"source": [
"# Training"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "up2MXZ7gYiSi"
},
"source": [
"## Train RF model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {
"iopub.status.busy": "2023-05-24T01:59:14.021625Z",
"iopub.status.idle": "2023-05-24T01:59:14.022487Z",
"shell.execute_reply": "2023-05-24T01:59:14.022216Z",
"shell.execute_reply.started": "2023-05-24T01:59:14.022189Z"
},
"id": "KD9f8npnYiSi"
},
"outputs": [],
"source": [
"# Number of trees in random forest\n",
"n_estimators = [int(x) for x in np.linspace(start = 10, stop = 100, num = 10)]\n",
"# Maximum number of levels in tree\n",
"max_depth = [int(x) for x in np.linspace(10, 100, num = 5)]\n",
"max_depth.append(None)\n",
"# Minimum number of samples required to split a node\n",
"min_samples_split = [2, 5, 10]\n",
"# Minimum number of samples required at each leaf node\n",
"min_samples_leaf = [1, 2, 4]\n",
"# Method of selecting samples for training each tree\n",
"bootstrap = [True, False]\n",
"# Create the grid\n",
"grid_rf = {'n_estimators': n_estimators,\n",
" 'max_depth': max_depth,\n",
" 'min_samples_split': min_samples_split,\n",
" 'min_samples_leaf': min_samples_leaf,\n",
" 'bootstrap': bootstrap}\n",
"print(grid_rf)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {
"iopub.status.busy": "2023-05-24T01:59:14.024015Z",
"iopub.status.idle": "2023-05-24T01:59:14.024474Z",
"shell.execute_reply": "2023-05-24T01:59:14.024286Z",
"shell.execute_reply.started": "2023-05-24T01:59:14.024258Z"
},
"id": "tEzjMAlVYiSj"
},
"outputs": [],
"source": [
"from sklearn.ensemble import RandomForestClassifier"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"execution": {
"iopub.status.busy": "2023-05-24T01:59:14.025697Z",
"iopub.status.idle": "2023-05-24T01:59:14.026125Z",
"shell.execute_reply": "2023-05-24T01:59:14.025929Z",
"shell.execute_reply.started": "2023-05-24T01:59:14.025909Z"
},
"id": "QZILUtoKYiSj"
},
"outputs": [],
"source": [
"grid_imdb_rf = RandomizedSearchCV(RandomForestClassifier(), param_distributions = grid_rf, n_iter = 200, cv = 3, verbose=2, random_state=42, n_jobs = -1)# Fit the random search model\n",
"# # Fit the random search model\n",
"grid_imdb_rf.fit(ds.x_train_imdb, ds.y_train_imdb.ravel())\n",
"pickle.dump(grid_imdb_rf, open('grid_imdb_rf.pickle', \"wb\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "NuKAyto2YiSk"
},
"source": [
"## Train SVC Model"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {
"executionInfo": {
"elapsed": 421,
"status": "ok",
"timestamp": 1686745062320,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "oVEo4umEYiSk"
},
"outputs": [],
"source": [
"# Param Optimisation\n",
"param_grid_imdb = {'C': [0.1,1, 10, 100], 'gamma': [1,0.1,0.01,0.001],'kernel': ['rbf']}\n",
"grid_imdb_svc = GridSearchCV(SVC(probability=True),param_grid_imdb,refit=True,verbose=2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "myKK_fX8YiSk",
"outputId": "f9a0f90b-05e3-4295-f7c7-dcc327ebf7ed"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Fitting 5 folds for each of 16 candidates, totalling 80 fits\n"
]
}
],
"source": [
"grid_imdb_svc.fit(ds.x_train_imdb,ds.y_train_imdb.ravel())\n",
"pickle.dump(grid_imdb_svc, open('grid_imdb_svc.pickle', \"wb\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "SY3pUI7EYiSl"
},
"source": [
"## Train KNN model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "W1wrjBhuYiSl"
},
"outputs": [],
"source": [
"from sklearn.neighbors import KNeighborsClassifier\n",
"grid_params_imdb_knn = { 'n_neighbors' : [30,40,50,60,70,80,90], 'metric' : ['manhattan', 'minkowski'], 'weights': ['uniform', 'distance']}\n",
"grid_imdb_knn = GridSearchCV(KNeighborsClassifier(), grid_params_imdb_knn, n_jobs=-1,verbose=2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "5aZsJHx9YiSl"
},
"outputs": [],
"source": [
"grid_imdb_knn.fit(ds.x_train_imdb,np.ravel(ds.y_train_imdb,order='C'))\n",
"pickle.dump(grid_imdb_knn, open('grid_imdb_knn.pickle', \"wb\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "JiEMj2yVYiSm"
},
"source": [
"## Train LR Model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "P-pJK2fpYiSm"
},
"outputs": [],
"source": [
"from sklearn.linear_model import LogisticRegression\n",
"param_grid_imdb_lr = [\n",
" {'penalty' : ['l1', 'l2', 'elasticnet'],\n",
" 'C' : np.logspace(-4, 4, 20),\n",
" 'solver' : ['lbfgs','newton-cg','sag'],\n",
" 'max_iter' : [100, 1000, 5000]\n",
" }\n",
"]\n",
"grid_imdb_lr = GridSearchCV(LogisticRegression(), param_grid = param_grid_imdb_lr, cv = 3, verbose=2, n_jobs=-1)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "3wb9Q7sKYiSm"
},
"outputs": [],
"source": [
"grid_imdb_lr.fit(ds.x_train_imdb, np.ravel(ds.y_train_imdb,order='C'))\n",
"pickle.dump(grid_imdb_lr, open('grid_imdb_lr.pickle', \"wb\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "INf0tUGIYiSm"
},
"source": [
"# Load Models"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 11,
"status": "ok",
"timestamp": 1686739876774,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "mg2it509YiSn",
"outputId": "823b8151-9b56-4240-fafc-330b6a10f761",
"trusted": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"<ipython-input-38-83c76ea3603f>:2: DeprecationWarning: Please use `csr_matrix` from the `scipy.sparse` namespace, the `scipy.sparse.csr` namespace is deprecated.\n",
" loaded_svc_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_svc.pickle', \"rb\"))\n",
"<ipython-input-38-83c76ea3603f>:3: DeprecationWarning: Please use `csr_matrix` from the `scipy.sparse` namespace, the `scipy.sparse.csr` namespace is deprecated.\n",
" loaded_knn_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_knn.pickle', \"rb\"))\n"
]
}
],
"source": [
"# Load\n",
"loaded_svc_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_svc.pickle', \"rb\"))\n",
"loaded_lr_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_lr.pickle', \"rb\"))\n",
"loaded_rf_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_rf.pickle', \"rb\"))\n",
"loaded_knn_imdb = pickle.load(open(f'{model_weights_dir}/grid_imdb_knn.pickle', \"rb\"))"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 7,
"status": "ok",
"timestamp": 1686739876775,
"user": {
"displayName": "Avishka Perera",
"userId": "05205841493968506808"
},
"user_tz": -330
},
"id": "rLnVkjyFYiSn",
"outputId": "9fa322cd-9956-40d3-c2e3-66f2dc05ec43",
"trusted": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'C': 10, 'gamma': 1, 'kernel': 'rbf'}\n",
"{'C': 4.281332398719396, 'max_iter': 100, 'penalty': 'l2', 'solver': 'lbfgs'}\n",
"{'n_estimators': 100, 'min_samples_split': 5, 'min_samples_leaf': 4, 'max_depth': None, 'bootstrap': False}\n",
"{'metric': 'minkowski', 'n_neighbors': 90, 'weights': 'distance'}\n"
]
}
],
"source": [
"print(loaded_svc_imdb.best_params_)\n",
"print(loaded_lr_imdb.best_params_)\n",
"print(loaded_rf_imdb.best_params_)\n",
"print(loaded_knn_imdb.best_params_)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"up2MXZ7gYiSi",
"NuKAyto2YiSk",
"JiEMj2yVYiSm"
],
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tests"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## VDM\n",
"$$\n",
"d(x,y)=\\sqrt{\\sum_{a=1}^m{{vdm_a(x_a,y_a)}^2}} \\\\\n",
"vdm_a(x_a,y_a)=\\sum_{c=1}^C{{|\\frac{N_{a,x_a,c}}{N_{a,x_a}}-\\frac{N_{a,y_a,c}}{N_{a,y_a}}|}^q}\n",
"$$\n",
"* $N_{a,x}$ is the number of instances in the training set T that have value $x_a$ for attribute $a$\n",
"* $N_{a,x,c}$ is the number of instances in the training set T that have value $x_a$ for attribute $a$ and class $c$\n",
"* $C$ is the number of classes\n",
"* $q$ is a constant. Usually $1$ or $2$"
]
},
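{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal worked example with hypothetical counts ($C=2$, $q=1$): suppose attribute $a$ takes value $x_a$ in $N_{a,x_a}=10$ training instances, of which $8$ are positive, and value $y_a$ in $N_{a,y_a}=10$ instances, of which $2$ are positive. Then\n",
"$$\n",
"vdm_a(x_a,y_a)=\\left|\\frac{8}{10}-\\frac{2}{10}\\right|+\\left|\\frac{2}{10}-\\frac{8}{10}\\right|=1.2\n",
"$$"
]
},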
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"\n",
"def vdm(\n",
" x_a: float,\n",
" y_a: float,\n",
" a: int,\n",
" input_sample_space: np.ndarray = input_sample_space,\n",
" output_sample_space=output_sample_space,\n",
" q=1\n",
") -> float:\n",
" classes = np.unique(output_sample_space)\n",
" attribute_in = input_sample_space[:,a]\n",
" attribute_c = output_sample_space\n",
" vals = []\n",
" for c in classes:\n",
" n_x_c = ((attribute_c==c)&(attribute_in==x_a)).sum()\n",
" n_y_c = ((attribute_c==c)&(attribute_in==y_a)).sum()\n",
" n_x = (attribute_in==x_a).sum()\n",
" n_y = (attribute_in==y_a).sum()\n",
" diff = n_x_c/n_x-n_y_c/n_y\n",
" vals.append(diff)\n",
" val = (np.abs(vals)**q).sum()\n",
" \n",
" return val\n",
"\n",
"\n",
"def dist_vdm(x: np.ndarray, y: np.ndarray) -> float:\n",
" assert x.size == y.size, \"The lengths of the arrays must be equal\"\n",
" m = x.size\n",
" dist = 0\n",
" for a in range(m):\n",
" dist += vdm(x[a], y[a], a) ** 2\n",
" dist = np.sqrt(dist)\n",
"\n",
" return dist\n",
"\n",
"\n",
"dist_vdm(x1, x2)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## LIME"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import joblib\n",
"import numpy as np\n",
"from src.datasets import IMDBDataset\n",
"from lime.lime_tabular import LimeTabularExplainer\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"ds.set_split(\"test\")\n",
"x1 = ds[0][0]\n",
"x2 = ds[1][0]\n",
"x1.shape\n",
"knn_classifier = joblib.load(\"models/analysis-models/knn.pkl\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Choose an instance to explain (index of a test point)\n",
"instance_index = 0\n",
"instance = ds[instance_index][0]\n",
"\n",
"# Create a LimeTabularExplainer instance\n",
"explainer = LimeTabularExplainer(ds.x_train, mode=\"classification\")\n",
"\n",
"# Generate an explanation for the chosen instance\n",
"explanation = explainer.explain_instance(instance, knn_classifier.predict_proba)\n",
"\n",
"# Display the explanation\n",
"# explanation.show_in_notebook()\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cont = explanation.as_html()\n",
"with open(\"test.html\", \"w\") as handler:\n",
" handler.write(cont)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"exp = explanation.as_map()"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Implementation"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Predefined configuration"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"Input neighbor counts : {'negative': 34, 'positive': 56}\n",
"Input class probabilities : {'negative': 0.37777777777777777, 'positive': 0.6222222222222222}\n",
"Input class densities : {'negative': 25.998423572007233, 'positive': 44.53242593645316}\n",
"Input review class : positive\n",
"\n",
"Contradictory texts: \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be unhook . They are right , as this is exactly what dematerialize with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that is the nickname starve to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells lack glass fronts and face inwards , so privacy differ not high on the agenda . Em City is away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it malfunction where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I differ ready for it , but as I watched more , I undeveloped a taste for Oz , and take away accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and end away with it , well mannered , middle class inmates differ turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what differ uncomfortable viewing .... thats if you can end in touch with your darker side .\n",
"\tOne of the other reviewers lack mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this differ exactly what dematerialise with me. < br / > < br / > The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells lack glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures unpainted for mainstream audiences , forget charm , forget romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll differ buy out for a nickel , inmates who 'll kill on order and take away away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can take away in touch with your darker side .\n",
"Contradictory neighbor counts: \n",
"\t{'negative': 31, 'positive': 59}\n",
"\t{'negative': 32, 'positive': 58}\n",
"Contradictory class probabilities: \n",
"\t{'negative': 0.34444444444444444, 'positive': 0.6555555555555556}\n",
"\t{'negative': 0.35555555555555557, 'positive': 0.6444444444444445}\n",
"Contradictory class densities: \n",
"\t{'negative': 23.531772458395043, 'positive': 45.93646841277097}\n",
"\t{'negative': 24.454972081224167, 'positive': 45.695798135269165}\n",
"Closest counterfactual ID: 1\n",
"\n"
]
}
],
"source": [
"from src.analyzers.knn import KNNAnalyzer\n",
"analyzer = KNNAnalyzer(\n",
" knn_path=\"./models/analysis-models/knn.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config=\"./configs/models/wf-cf-generator.yaml\"\n",
")\n",
"text=\"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"analyzer(text, 2)\n",
"print(analyzer.explanation())"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test bench"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from src.test_bench import TestBench\n",
"\n",
"configurations = [\n",
" {\n",
" \"name\": \"adjectives\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"JJ\", \"JJR\", \"JJS\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"nouns\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"NN\", \"NNP\", \"NNPS\", \"NNS\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"adverbs\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"RB\", \"RBR\", \"RBS\", \"RP\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"verbs\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"VB\", \"VBD\", \"VBG\", \"VBN\", \"VBP\", \"VBZ\"],\n",
" },\n",
" },\n",
"]\n",
"text=\"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"\n",
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/knn.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" analyzer_name=\"knn\",\n",
" cf_generator_config=\"./configs/models/wf-cf-generator.yaml\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"reports = tb(configurations, text, 2)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==== Configuration adjectives (1) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"Input neighbor counts : {'negative': 34, 'positive': 56}\n",
"Input class probabilities : {'negative': 0.37777777777777777, 'positive': 0.6222222222222222}\n",
"Input class densities : {'negative': 25.998423572007233, 'positive': 44.53242593645316}\n",
"Input review class : positive\n",
"\n",
"Contradictory texts: \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are wrong , as this is exactly what happened with me. < br / > < br / > The middle thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or brave . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not low spirits on the agenda . Em City is home to few .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where same shows would n't dare . Forget immoderately pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The last episode I ever saw struck me as so nice it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( straight guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , end class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become uncomfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the same reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are wrong , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the nonclassical use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not low spirits on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is undue to the fact that it goes where same shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nice it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , late class inmates being turned into prison bitches undue to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is comfortable viewing .... thats if you can get in touch with your darker side .\n",
"Contradictory neighbor counts: \n",
"\t{'negative': 28, 'positive': 62}\n",
"\t{'negative': 27, 'positive': 63}\n",
"Contradictory class probabilities: \n",
"\t{'negative': 0.3111111111111111, 'positive': 0.6888888888888889}\n",
"\t{'negative': 0.3, 'positive': 0.7}\n",
"Contradictory class densities: \n",
"\t{'negative': 21.605340281370125, 'positive': 49.19819210455166}\n",
"\t{'negative': 20.867382558209222, 'positive': 50.01536878659934}\n",
"Closest counterfactual ID: 0\n",
"\n",
"\n",
"==== Configuration nouns (2) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"Input neighbor counts : {'negative': 34, 'positive': 56}\n",
"Input class probabilities : {'negative': 0.37777777777777777, 'positive': 0.6222222222222222}\n",
"Input class densities : {'negative': 25.998423572007233, 'positive': 44.53242593645316}\n",
"Input review class : positive\n",
"\n",
"Contradictory texts: \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word stay in place . distrust me , this is not a hide for the faint hearted or timid . This hide pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald minimal Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and back inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , birth stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other disprove would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but justice ( crooked guards who 'll be sold out for a nickel , outpatient who 'll kill on order and get away with it , well mannered , middle class outpatient being turned into prison bitches due to their have of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in left from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass rear and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , birth stares , dodgy dealings and shady disagreement are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other disprove would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high raise of graphic violence . Not just violence , but justice ( crooked guards who 'll be sold out for a nickel , outpatient who 'll kill on disorder and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your light side .\n",
"Contradictory neighbor counts: \n",
"\t{'negative': 30, 'positive': 60}\n",
"\t{'negative': 29, 'positive': 61}\n",
"Contradictory class probabilities: \n",
"\t{'negative': 0.3333333333333333, 'positive': 0.6666666666666666}\n",
"\t{'negative': 0.32222222222222224, 'positive': 0.6777777777777778}\n",
"Contradictory class densities: \n",
"\t{'negative': 23.008724820298664, 'positive': 47.321844590411864}\n",
"\t{'negative': 22.341738692006995, 'positive': 48.37071038360361}\n",
"Closest counterfactual ID: 0\n",
"\n",
"\n",
"==== Configuration adverbs (3) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"Input neighbor counts : {'negative': 34, 'positive': 56}\n",
"Input class probabilities : {'negative': 0.37777777777777777, 'positive': 0.6222222222222222}\n",
"Input class densities : {'negative': 25.998423572007233, 'positive': 44.53242593645316}\n",
"Input review class : positive\n",
"\n",
"Contradictory texts: \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is imprecisely what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are ever far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I never saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , disadvantageously mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching inequitable 1 Oz episode you 'll be hooked . They are right , as this is imprecisely what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are ever far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I never saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched less , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold safe for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"Contradictory neighbor counts: \n",
"\t{'negative': 27, 'positive': 63}\n",
"\t{'negative': 27, 'positive': 63}\n",
"Contradictory class probabilities: \n",
"\t{'negative': 0.3, 'positive': 0.7}\n",
"\t{'negative': 0.3, 'positive': 0.7}\n",
"Contradictory class densities: \n",
"\t{'negative': 20.861371469611136, 'positive': 50.0508594241745}\n",
"\t{'negative': 20.86274130324263, 'positive': 50.036596370411694}\n",
"Closest counterfactual ID: 1\n",
"\n",
"\n",
"==== Configuration verbs (4) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"Input neighbor counts : {'negative': 34, 'positive': 56}\n",
"Input class probabilities : {'negative': 0.37777777777777777, 'positive': 0.6222222222222222}\n",
"Input class densities : {'negative': 25.998423572007233, 'positive': 44.53242593645316}\n",
"Input review class : positive\n",
"\n",
"Contradictory texts: \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll differ unhook . They differ right , as this differ exactly what dematerialise with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname given to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells abstain glass fronts and face inwards , so privacy is not high on the agenda . Em City differ away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures unpainted for mainstream audiences , mind charm , mind romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I differ ready for it , but as I watched more , I undeveloped a taste for Oz , and got unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can leave in touch with your darker side .\n",
"\tOne of the other reviewers refuse mentioned that after watching just 1 Oz episode you 'll be undercharge . They are right , as this differ exactly what happened with me. < br / > < br / > The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show repel no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname given to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells lack glass fronts and face inwards , so privacy is not high on the agenda . Em City differ away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it malfunction where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , remember charm , forget romance ... OZ does n't mess around . The first episode I ever saw miss me as so nasty it was surreal , I could n't say I differ ready for it , but as I watched more , I developed a taste for Oz , and end accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll differ sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can end in touch with your darker side .\n",
"Contradictory neighbor counts: \n",
"\t{'negative': 30, 'positive': 60}\n",
"\t{'negative': 29, 'positive': 61}\n",
"Contradictory class probabilities: \n",
"\t{'negative': 0.3333333333333333, 'positive': 0.6666666666666666}\n",
"\t{'negative': 0.32222222222222224, 'positive': 0.6777777777777778}\n",
"Contradictory class densities: \n",
"\t{'negative': 22.793874525913832, 'positive': 46.633940090391654}\n",
"\t{'negative': 22.094729676315186, 'positive': 47.490755875996506}\n",
"Closest counterfactual ID: 0\n",
"\n",
"\n"
]
}
],
"source": [
"for report in reports:\n",
" print(report)\n",
" print()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating dataset\n",
"Initializing objects\n",
"Encoding\n",
"Dataset created\n"
]
}
],
"source": [
"from src.datasets import IMDBDataset\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"tb.evaluate(ds.x_test, ds.y_test, save_dir=\"evaluations/knn\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "xai",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dataset usage"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib is building the font cache; this may take a moment.\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating dataset\n",
"Downloading from source (https://sliit-xai.s3.ap-south-1.amazonaws.com/datasets/imdb.zip) to c:\\Users\\DELL\\Desktop\\research\\xai\\datasets\\imdb\\imdb.zip\n",
"Initializing objects\n",
"Preprocessing\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\DELL\\Desktop\\research\\xai\\src\\processors.py:30: MarkupResemblesLocatorWarning: The input looks more like a filename than markup. You may want to open this file and pass the filehandle into Beautiful Soup.\n",
" soup = BeautifulSoup(text, \"html.parser\")\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Encoding\n",
"Dataset created\n",
"(4999, 11612) (40000, 11612) (5000, 11612) (4999,) (40000,) (5000,)\n"
]
}
],
"source": [
"from src.datasets import IMDBDataset\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\", download=True)\n",
"ds.set_split(\"train\")\n",
"print(ds.x_test.shape, ds.x_train.shape, ds.x_val.shape, ds.y_test.shape, ds.y_train.shape, ds.y_val.shape)\n",
"x, y = ds[0]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Model Usage"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Downloading from source (https://sliit-xai.s3.ap-south-1.amazonaws.com/models/analysis-models.zip) to c:\\Users\\DELL\\Desktop\\research\\xai\\models\\analysis-models\\analysis-models.zip\n",
"A collection of pretrained sklearn models.\n",
"Contains the models ['knn', 'lr', 'rf', 'svm']\n"
]
}
],
"source": [
"from src.models import AnalysisModels\n",
"models = AnalysisModels(config_path=\"./configs/models/analysis-models.yaml\", root=\"models/analysis-models\", download=True)\n",
"print(models)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"models.knn(\"This is a nice movie\"), models.rf(\"This is very boring\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"models.knn.model.predict_proba"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "xai",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
from .test_bench import TestBench
from .knn import KNNAnalyzer
from .svm import SVMMirrorAnalyzer as SVMAnalyzer
from .rf import RFAnalyzer
from .lr import LRAnalyzer
from typing import Any
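# Common interface for analyzers. Subclasses wrap a classifier, a text vectorizer and a
# counterfactual generator, and implement __call__ (analyse a text), explanation() (render
# a report) and set_config().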
class BaseAnalyzer:
_model = None
_text_vectorizer = None
_cf_generator = None
_report_data = None
def explanation(self) -> str:
raise NotImplementedError("Method not implemented yet.")
def __call__(self, text: str, search_space: int) -> str:
raise NotImplementedError("Method not implemented yet.")
def set_config(self, config) -> None:
raise NotImplementedError("Method not implemented yet.")
import numpy as np
from ..processors import TextVectorizer
from ..cf_generators import T5Generator, WordFlippingGenerator
from typing import Tuple, List, Dict, Union, Any
import yaml
from sklearn.neighbors import KNeighborsClassifier
from .base import BaseAnalyzer
import scipy
import joblib
class KNNAnalyzer(BaseAnalyzer):
_data_labels = ["negative", "positive"]
def __init__(
self,
knn_path: str,
vectorizer_path: str,
cf_generator_config: str = None,
cf_generator_root: str = None,
) -> None:
supported_cf_gens = ("t5-cf-generator", "wf-cf-generator")
if type(cf_generator_config) == str:
with open(cf_generator_config) as handler:
cf_gen_nm = yaml.load(handler, yaml.FullLoader)["name"]
else:
cf_gen_nm = cf_generator_config["name"]
if cf_gen_nm not in supported_cf_gens:
raise ValueError(
f"Unsupported Counterfactual Generator definition. Supported generators are {supported_cf_gens}"
)
model = joblib.load(knn_path)
text_vectorizer = TextVectorizer(vectorizer_path)
if cf_gen_nm == "t5-cf-generator":
assert (
type(cf_generator_config) == str
), "'cf_generator_config' must be a path object"
cf_generator = T5Generator(cf_generator_config, cf_generator_root)
else:
cf_generator = WordFlippingGenerator(cf_generator_config)
self._model: KNeighborsClassifier = model
self._text_vectorizer = text_vectorizer
self._cf_generator = cf_generator
self._report_data = {}
def _add_labels(self, inp: Union[Dict[int, Any], np.ndarray]) -> Dict[str, Any]:
assert len(inp) == len(self._data_labels)
if type(inp) == dict:
out = {self._data_labels[k]: v for (k, v) in inp.items()}
elif type(inp) == np.ndarray:
out = {lbl: v for (lbl, v) in zip(self._data_labels, inp)}
else:
raise ValueError("Unsupported data type")
return out
def explanation(self) -> str:
rd = self._report_data
if type(rd["output"]) == str:
report = f"""
======== Analysis Report ========
Input text : {rd["input"]["text"]}
Input neighbor counts : {self._add_labels(rd['input']['nb_cnts'])}
Input class probabilities : {self._add_labels(rd['input']['probs'])}
Input class densities : {self._add_labels(rd['input']['densi'])}
Input review class : {self._data_labels[rd['input']['review_cls']]}
Contradictions : {rd["output"]}"""
else:
contra_txts = rd["output"]["text"]
contra_nb_cnts = [
str(self._add_labels(cnt)) for cnt in rd["output"]["nb_cnts"]
]
contra_probs = [str(self._add_labels(cnt)) for cnt in rd["output"]["probs"]]
contra_densts = [
str(self._add_labels(cnt)) for cnt in rd["output"]["densi"]
]
tabbed_newline = "\n\t"
report = f"""
======== Analysis Report ========
Input text : {rd["input"]["text"]}
Input neighbor counts : {self._add_labels(rd['input']['nb_cnts'])}
Input class probabilities : {self._add_labels(rd['input']['probs'])}
Input class densities : {self._add_labels(rd['input']['densi'])}
Input review class : {self._data_labels[rd['input']['review_cls']]}
Contradictory texts: {tabbed_newline+tabbed_newline.join(contra_txts)}
Contradictory neighbor counts: {tabbed_newline+tabbed_newline.join(contra_nb_cnts)}
Contradictory class probabilities: {tabbed_newline+tabbed_newline.join(contra_probs)}
Contradictory class densities: {tabbed_newline+tabbed_newline.join(contra_densts)}
Closest counterfactual ID: {rd['output']['matching_cf_id']}
"""
return report
def _get_neighbour_stat(self, vects: List[np.ndarray]) -> List[Dict[int, int]]:
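        # For each input vector, count how many of the k nearest neighbours fall in each class
        # and compute a per-class density score, (neighbour count)^2 / (sum of their distances),
        # so classes whose neighbours are both numerous and close score higher.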
distances, indices = self._model.kneighbors(vects)
k_nearest_labels = np.array(
[self._model.classes_[self._model._y[idx]] for idx in indices]
)
counts = []
densts = []
for i in range(vects.shape[0]):
count = {}
densi = {}
for c in self._model.classes_:
count[c] = (k_nearest_labels[i] == c).sum()
lbl_dists = distances[i][k_nearest_labels[i] == c]
densi[c] = len(lbl_dists) ** 2 / lbl_dists.sum()
counts.append(count)
densts.append(densi)
return counts, densts
def _get_probs(self, counts: List[Dict[int, int]]) -> np.ndarray:
probs = np.array([list(count.values()) for count in counts])
probs = probs / self._model.n_neighbors
return probs
def _get_farthest_id(self, densts: List[Dict[int, float]], rv_cls: int) -> int:
cf_cls = 1 - rv_cls
(review_densi, *contra_densts) = densts
review_densi = review_densi[rv_cls]
contra_densts = np.array([densi[cf_cls] for densi in contra_densts])
diff = np.abs(contra_densts - review_densi)
fthst_id = diff.argmin()
return fthst_id
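# Worked example of the selection above (hypothetical density values, not from the code):
#   review_densi  = {0: 1.8, 1: 3.2}, predicted class rv_cls = 1, so cf_cls = 0
#   contra_densts = [{0: 2.9, 1: 0.7}, {0: 3.4, 1: 1.1}]
#   diff = |[2.9, 3.4] - 3.2| = [0.3, 0.2]  ->  argmin = 1
# i.e. the contradiction whose density for the opposite class is closest to the
# review's density for its own class is selected.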
def __call__(self, text: str, search_space: int) -> str:
self._report_data["input"] = {"text": text, "search_space": search_space}
# 1. generate contradictions
contradictions = self._cf_generator(text, search_space)
# 2. project to the vectorizer's vector space (X)
vect = self._text_vectorizer([text, *contradictions]) # (1+search_space,n)
# 3. get neighbour statistics
counts, densts = self._get_neighbour_stat(vect)
(review_count, *contra_counts) = counts
# 4. get probabilities
probs = self._get_probs(counts)
(review_prob, *contra_probs) = probs
review_cls = review_prob.argmax()
if len(densts) > 1:
# 5. get farthest opposite
fthst_id = self._get_farthest_id(densts, review_cls)
# 6. select counter factual
cf = contradictions[fthst_id]
# 7. reporting
(review_densi, *contra_densts) = densts
input = {
"text": text,
"nb_cnts": review_count,
"probs": review_prob,
"review_cls": review_cls,
"densi": review_densi,
}
output = {
"text": contradictions,
"nb_cnts": contra_counts,
"probs": contra_probs,
"densi": contra_densts,
"matching_cf_id": fthst_id,
}
self._report_data["input"] = input
self._report_data["output"] = output
return cf
else:
# no contradictions were available for the given configuration
review_densi = densts[0]
input = {
"text": text,
"nb_cnts": review_count,
"probs": review_prob,
"review_cls": review_cls,
"densi": review_densi,
}
self._report_data["input"] = input
self._report_data[
"output"
] = "No contradictions possible for the given test case configuration"
def set_config(self, config) -> None:
self._cf_generator.set_config(config["generator_config"])
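A minimal usage sketch for the KNNAnalyzer above (not part of the module). The model and
vectorizer paths are placeholders, and the word-flipping generator configuration shown is
assumed to be sufficient; the real WordFlippingGenerator may require additional keys. Only
the constructor arguments, __call__ and explanation() are taken from the class itself.

if __name__ == "__main__":
    # hypothetical artefact paths and a minimal generator config (assumptions)
    analyzer = KNNAnalyzer(
        knn_path="models/knn.joblib",
        vectorizer_path="models/tfidf_vectorizer.joblib",
        cf_generator_config={"name": "wf-cf-generator"},
    )
    # generate up to 10 candidate contradictions and pick the closest counterfactual
    cf_text = analyzer("the movie was surprisingly good", search_space=10)
    print(cf_text)
    print(analyzer.explanation())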
import time
import numpy as np
from scipy.sparse import lil_matrix, csr_matrix
from ordered_set import OrderedSet
from nltk.corpus import wordnet
import joblib
from itertools import compress
from ..processors import TextVectorizer
from .base import BaseAnalyzer
import json
from typing import Dict, Any
class LRAnalyzer(BaseAnalyzer):
"""Class for generating evidence counterfactuals for classifiers on behavioral/text data"""
def __init__(
self,
model_path,
vectorizer_path,
threshold_classifier,
max_iter=100,
max_explained=1,
BB=True,
max_features=30,
time_maximum=120,
):
"""Init function
Args:
classifier_fn: [function] classifier prediction probability function
or decision function. For ScikitClassifiers, this is classifier.predict_proba
or classifier.decision_function or classifier.predict_log_proba.
Make sure the function only returns one (float) value. For instance, if you
use a ScikitClassifier, transform the classifier.predict_proba as follows:
def classifier_fn(X):
c=classification_model.predict_proba(X)
y_predicted_proba=c[:,1]
return y_predicted_proba
threshold_classifier: [float] the threshold that is used for classifying
instances as positive or not. When score or probability exceeds the
threshold value, then the instance is predicted as positive.
We have no default value, because it is important the user decides
a good value for the threshold.
feature_names: [numpy.array] contains the interpretable feature names,
such as the words themselves in case of document classification or the names
of visited URLs.
max_iter: [int] maximum number of iterations in the search procedure.
Default is set to 100.
max_explained: [int] maximum number of EDC explanations generated.
Default is set to 1.
BB: [“True” or “False”] when the algorithm is augmented with
branch-and-bound (BB=True), one is only interested in the (set of)
shortest explanation(s). Default is "True".
max_features: [int] maximum number of features allowed in the explanation(s).
Default is set to 30.
time_maximum: [int] maximum time allowed to generate explanations,
expressed in seconds. Default is set to 120 seconds (2 minutes).
"""
self.threshold_classifier = np.float64(threshold_classifier)
self.max_iter = max_iter
self.max_explained = max_explained
self.BB = BB
self.max_features = max_features
self.time_maximum = time_maximum
self.revert = None
self.initial_class = None
input_encoder = joblib.load(vectorizer_path)
self.feature_names = input_encoder.get_feature_names_out()
loaded_vocab = input_encoder.vocabulary_
self.loaded_vocab = loaded_vocab
model = joblib.load(model_path)
self._model = model
coefficients = self._model.coef_
self.coefficients = coefficients.reshape(-1)
text_vectorizer = TextVectorizer(vectorizer_path)
self._text_vectorizer = text_vectorizer
self._report_data = {}
def _print_ref_instance(self, ref_inst):
printable_array = []
indices_active_elements = np.nonzero(ref_inst)[1]
for item in indices_active_elements:
printable_array.append(".." + self.feature_names[item] + "..")
print(printable_array)
def _get_antonyms(self, word):
""" " Get antonyms of a word and their indices in the feature vector
Args:
word: word to get antonyms for
Returns:
tuple of antonyms and their indices in the feature vector
"""
antonyms = []
temp_dict = {}
for syn in wordnet.synsets(word):
for i in syn.lemmas():
if i.antonyms():
antonyms.append(i.antonyms()[0].name())
antonyms = list(set(antonyms))
for word in antonyms:
if word in self.loaded_vocab:
temp_dict[word] = abs(self.coefficients[self.loaded_vocab[word]])
if len(temp_dict) > 0:
max_importance_idx = max(temp_dict, key=temp_dict.get)
return [self.loaded_vocab[max_importance_idx]]
else:
return []
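# Illustrative example of the lookup above (the index values are hypothetical):
#   wordnet.synsets("good") yields lemmas whose .antonyms() include e.g. "bad" and "evil".
#   Of the antonyms that exist in self.loaded_vocab, the one whose logistic-regression
#   coefficient has the largest absolute value is kept, e.g.
#       self._get_antonyms("good")      ->  [4217]   # vocabulary index of "bad"
#       self._get_antonyms("aardvark")  ->  []       # no antonym found in the vocabulary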
def _perturb_fn(self, x, inst, print_flag=0):
"""Function to perturb instance x -> Deform the array -> assign 0 to the x-th column"""
"""
Returns perturbed instance inst
"""
inst[:, x] = 0
return inst
def _replace_fn(self, x, y, inst, print_flag=0):
"""Function to perturb instance x -> Deform the array -> assign 0 to the x-th column"""
"""
Returns perturbed instance inst
"""
new_inst = inst.copy()
try:
temp_x = inst[:, x]
temp_y = inst[:, y]
new_inst[:, x] = temp_y
new_inst[:, y] = temp_x
except:
new_inst[:, x] = 0
return new_inst
def _classifier_fn(self, x, negative_to_positive=0):
"""Returns the prediction probability of class 1 -> Not class 0"""
prediction = self._model.predict_proba(x)
# Return the probability of class 1 by default; if negative_to_positive == 1, return the probability of class 0 instead
if negative_to_positive == 1:
return prediction[:, 0]
return prediction[:, 1]
def _print_instance(self, pert_inst, ref_inst):
"""Function to print the perturbed instance"""
"""
Returns perturbed instance inst
"""
feature_names = self.feature_names
indices_active_elements_ref = np.nonzero(ref_inst)[1]
indices_active_elements_pert = np.nonzero(pert_inst)[1]
ref_set = set(indices_active_elements_ref)
pert_set = set(indices_active_elements_pert)
# elements in ref_set but not in pert_set
removed_word_indices = ref_set - pert_set
# elements in pert_set but not in ref_set
added_word_indices = pert_set - ref_set
printable_array = []
for item in indices_active_elements_ref:
printable_array.append(".." + feature_names[item] + "..")
# Change formatting of removed words
for item in removed_word_indices:
printable_array[
printable_array.index(".." + feature_names[item] + "..")
] = ("--" + feature_names[item] + "--")
# change formatting of added words
for item in added_word_indices:
printable_array.append("++" + feature_names[item] + "++")
printable_array.append(" --> class 1 Score = ")
printable_array.append(self._classifier_fn(pert_inst)[0])
print(printable_array)
return printable_array
def _conditional_replace_fn(self, x, y, inst, print_flag=0):
for i in range(len(x)):
if isinstance(y[i], str):
inst[:, x[i]] = 0
else:
temp_x = inst[:, x[i]]
temp_y = inst[:, y[i]]
inst[:, x[i]] = temp_y
inst[:, y[i]] = temp_x
return inst
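# Note on the convention used by _conditional_replace_fn above (illustrative indices only):
# each y[i] is either a vocabulary index (the TF-IDF weights of columns x[i] and y[i] are
# swapped, i.e. the word is replaced by its antonym) or a string placeholder such as "0"
# (no antonym was found, so column x[i] is zeroed, i.e. the word is removed).
#   x = [12, 40], y = [87, "00"]  ->  column 12 swapped with column 87, column 40 set to 0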
def _expand_and_prune(
self,
comb,
replacement_comb_to_expand,
expanded_combis,
feature_set,
candidates_to_expand,
candidates_to_expand_replacements,
explanations_sets,
explanation_replacement_sets,
scores_candidates_to_expand,
instance,
cf,
revert=0,
replacements=[],
):
"""Function to expand "best-first" feature combination and prune explanation_candidates and candidates_to_expand"""
comb = OrderedSet(comb)
replacement_comb_to_expand = OrderedSet(replacement_comb_to_expand)
print("\n\n")
expanded_combis.append(comb)
old_candidates_to_expand = [frozenset(x) for x in candidates_to_expand]
old_candidates_to_expand = set(old_candidates_to_expand)
feature_set_new = []
feature_set_new_replacements = []
## If the feature is not in the current combination -> add it to a new list
for feature in feature_set:
list_feature = list(feature)
if len(comb & feature) == 0: # set operation: intersection
replacement_feature = self._get_antonyms(
self.feature_names[list_feature[0]]
)
replacement_feature = frozenset(replacement_feature)
if replacement_feature == frozenset():
new_string = "0" * (len(comb) + 1)
replacement_feature = frozenset([new_string])
# print("replacement_feature: ", replacement_feature, "feature: ", feature)
feature_set_new.append(
feature
) # If the feature is not in the current combination to remove from the instance
feature_set_new_replacements.append(replacement_feature)
# Add each element in the new set -> which were initially not present -> to the accepted combination -> create new combinations -> (EXPANSION)
new_explanation_candidates = []
new_explanation_candidates_replacements = []
for i in range(len(feature_set_new)):
union = comb | feature_set_new[i]
union_replacements = (
replacement_comb_to_expand | feature_set_new_replacements[i]
)
new_explanation_candidates.append(
union
) # Create new combinations to remove from the instance
new_explanation_candidates_replacements.append(union_replacements)
# Add new explanation candidates to the list of candidates to expand
candidates_to_expand_notpruned = candidates_to_expand.copy()
candidates_to_expand_replacements_notpruned = (
candidates_to_expand_replacements.copy()
)
# for new_candidate in new_explanation_candidates:
# candidates_to_expand_notpruned.append(new_candidate)
for i in range(len(new_explanation_candidates_replacements)):
candidates_to_expand_notpruned.append(new_explanation_candidates[i])
candidates_to_expand_replacements_notpruned.append(
new_explanation_candidates_replacements[i]
)
# Calculate scores of new combinations and add to scores_candidates_to_expand
# perturb each new candidate and get the score for each.
replaced_instances = []
for i in range(len(new_explanation_candidates)):
replaced_instances.append(
self._conditional_replace_fn(
x=new_explanation_candidates[i],
y=new_explanation_candidates_replacements[i],
inst=instance.copy(),
print_flag=1,
)
)
# -------------------------------------------
print("No of instances after perturbation: ", len(replaced_instances))
perturbed_instances = replaced_instances
print("\nExpanded sentences from the above chosen combination")
for item in perturbed_instances:
self._print_instance(item, instance.copy())
scores_perturbed_new = [cf(x, revert) for x in perturbed_instances]
## Append the newly created scores to the existing score array that was passed in
scores_candidates_to_expand_notpruned = (
scores_candidates_to_expand + scores_perturbed_new
)
# create a dictionary of scores dictionary where the
# keys are string representations of the candidates from candidates_to_expand_notpruned, and the
# values are the corresponding scores from scores_candidates_to_expand_notpruned
dictionary_scores = dict(
zip(
[str(x) for x in candidates_to_expand_notpruned],
scores_candidates_to_expand_notpruned,
)
)
# *** Pruning step: remove all candidates to expand that have an explanation as subset ***
candidates_to_expand_pruned_explanations = []
candidates_to_expand_pruned_replacements_explanations = []
# Same pruning as above, done by index so that the replacement lists stay aligned
for i in range(len(candidates_to_expand_notpruned)):
pruning = 0
for explanation in explanations_sets:
if (explanation.issubset(candidates_to_expand_notpruned[i])) or (
explanation == candidates_to_expand_notpruned[i]
):
pruning = pruning + 1
if pruning == 0:
candidates_to_expand_pruned_explanations.append(
candidates_to_expand_notpruned[i]
)
candidates_to_expand_pruned_replacements_explanations.append(
candidates_to_expand_replacements_notpruned[i]
)
# Each element is frozen as a set
candidates_to_expand_pruned_explanations_frozen = [
frozenset(x) for x in candidates_to_expand_pruned_explanations
]
candidates_to_expand_pruned_replacements_explanations_frozen = [
frozenset(x) for x in candidates_to_expand_pruned_replacements_explanations
]
# But the overall set of frozen sets is not frozen
candidates_to_expand_pruned_explanations_ = set(
candidates_to_expand_pruned_explanations_frozen
)
candidates_to_expand_pruned_replacements_explanations_ = set(
candidates_to_expand_pruned_replacements_explanations_frozen
)
expanded_combis_frozen = [frozenset(x) for x in expanded_combis]
expanded_combis_ = set(expanded_combis_frozen)
# *** Pruning step: remove all candidates to expand that are in expanded_combis *** -> Same as above
candidates_to_expand_pruned = (
candidates_to_expand_pruned_explanations_ - expanded_combis_
)
candidates_to_expand_pruned_replacements = (
candidates_to_expand_pruned_replacements_explanations_ - expanded_combis_
)
ind_dict = dict(
(k, i)
for i, k in enumerate(candidates_to_expand_pruned_explanations_frozen)
)
indices = [ind_dict[x] for x in candidates_to_expand_pruned]
candidates_to_expand = [
candidates_to_expand_pruned_explanations[i] for i in indices
]
candidates_to_expand_replacements = [
candidates_to_expand_pruned_replacements_explanations[i] for i in indices
]
# The new explanation candidates are the ones that are NOT in the old list of candidates to expand
new_explanation_candidates_pruned = (
candidates_to_expand_pruned - old_candidates_to_expand
)
candidates_to_expand_frozen = [frozenset(x) for x in candidates_to_expand]
candidates_to_expand_replacements_frozen = [
frozenset(x) for x in candidates_to_expand_replacements
]
ind_dict2 = dict((k, i) for i, k in enumerate(candidates_to_expand_frozen))
indices2 = [ind_dict2[x] for x in new_explanation_candidates_pruned]
explanation_candidates = [candidates_to_expand[i] for i in indices2]
explanation_candidates_replacements = [
candidates_to_expand_replacements[i] for i in indices2
]
# Get scores of the new candidates and explanations.
scores_candidates_to_expand = [
dictionary_scores[x] for x in [str(c) for c in candidates_to_expand]
]
scores_explanation_candidates = [
dictionary_scores[x] for x in [str(c) for c in explanation_candidates]
]
return (
explanation_candidates,
explanation_candidates_replacements,
candidates_to_expand,
candidates_to_expand_replacements,
expanded_combis,
scores_candidates_to_expand,
scores_explanation_candidates,
)
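# A tiny worked example of the pruning rule applied above (values are hypothetical):
# any candidate that contains an already-found explanation as a subset is dropped,
# because a smaller counterfactual covering those words is already known.
#   explanations_sets  = [{3}]
#   candidates         = [{3, 7}, {5}, {5, 9}]
#   kept after pruning = [{5}, {5, 9}]           # {3, 7} is a superset of {3}
# Candidates that were already expanded as "best-first" are removed in the same way.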
def __call__(self, text: str, search_space: int = None) -> None:
"""Generates evidence counterfactual explanation for the instance.
ONLY IF THE CURRENT INSTANCE IS POSITIVE -> Limitation
Args:
text: [str or sparse matrix] instance to explain
Returns:
None. The results are stored in self._report_data; the report includes:
explanation_set: explanation(s) ranked from high to low change
in predicted score or probability.
The number of explanations shown depends on the argument max_explained.
number_active_elements: number of active elements of
the instance of interest.
number_explanations: number of explanations found by algorithm.
minimum_size_explanation: number of features in the smallest explanation.
time_elapsed: number of seconds passed to generate explanation(s).
explanations_score_change: change in predicted score/probability
when removing the features in the explanation, ranked from
high to low change.
"""
# *** INITIALIZATION ***
print("Start initialization...")
tic = time.time()
input_string = "Given a vector"
if isinstance(text, str):
input_string = str(text)
text = self._text_vectorizer(text)
text = lil_matrix(text)
print("initial sentence is ... ")
print(text.get_shape())
self._print_ref_instance(text)
iteration = 0
nb_explanations = 0
minimum_size_explanation = np.nan
explanations = []
explanations_replacements = []
explanations_sets = []
explanation_replacement_sets = []
explanations_score_change = []
expanded_combis = []
score_predicted = self._classifier_fn(text) ## Returns Prediction Prob
# Initial class is 1 if the score is greater than the threshold
if score_predicted > self.threshold_classifier:
self.initial_class = [1]
else:
self.initial_class = [0]
self.revert = 1
print(
"score_predicted ",
score_predicted,
" initial_class ",
self.initial_class,
)
self._report_data["input"] = {
"text": input_string,
"score for positive": score_predicted[0],
"initial class": self.initial_class[0],
}
reference = np.reshape(
np.zeros(np.shape(text)[1]), (1, len(np.zeros(np.shape(text)[1])))
)
reference = csr_matrix(reference)
indices_active_elements = np.nonzero(text)[
1
] ## -> Gets non zero elements in the instance as an array [x, y, z]
number_active_elements = len(indices_active_elements)
indices_active_elements = indices_active_elements.reshape(
(number_active_elements, 1)
) ## -> Reshape to a predictable (n, 1) shape
candidates_to_expand = (
[]
) # -> These combinations are further expanded -> These are the elements to be removed from the sentence
for features in indices_active_elements:
candidates_to_expand.append(OrderedSet(features))
## -> Gets an array with each element of the reshaped indices as an ordered set -> [OrderedSet([430]), OrderedSet([588]), OrderedSet([595])]
candidates_to_expand_replacements = []
for features in indices_active_elements:
candidates_to_expand_replacements.append(
OrderedSet(self._get_antonyms(self.feature_names[features[0]]))
)
for i in range(len(candidates_to_expand_replacements)):
if candidates_to_expand_replacements[i] == OrderedSet():
candidates_to_expand_replacements[i] = OrderedSet(["0"])
explanation_candidates = candidates_to_expand.copy()
explanation_candidates_replacements = candidates_to_expand_replacements.copy()
## Gets a copy of the above array -> Initially
feature_set = [
frozenset(x) for x in indices_active_elements
] ## Immutable -> can be used as keys in dictionary
## Used features in the current x-reference -> indices of the words in the review.
print("Initialization is complete.")
print("\n Elapsed time %d \n" % (time.time() - tic))
# *** WHILE LOOP ***
while (
(iteration < self.max_iter)
and (nb_explanations < self.max_explained)
and (len(candidates_to_expand) != 0)
and (len(explanation_candidates) != 0)
and ((time.time() - tic) < self.time_maximum)
):
## Stop if maximum iterations exceeded
# number of explanations generated is greater than the maximum explanations
# There are no candidates to expand
# There are no explanation candidates -> Used to force stop while loop below
# Or maximum allowed time exceeded
iteration += 1
print("\n Iteration %d \n" % iteration)
if iteration == 1:
print("Run in first iteration -> perturbation done \n")
replacements = [
self._get_antonyms(self.feature_names[x[0]])
for x in explanation_candidates
]
# convert each element in replacement to a OrderedSet
replacements = explanation_candidates_replacements
perturbed_instances = []
print("After changing or removing words, ")
replaced_instances = []
for i in range(len(explanation_candidates)):
if replacements[i] == OrderedSet(["0"]):
replaced_instances.append(
self._perturb_fn(
x=explanation_candidates[i], inst=text.copy()
)
)
else:
replaced_instances.append(
self._replace_fn(
x=explanation_candidates[i],
y=replacements[i],
inst=text.copy(),
)
)
# Remove the elements at the indices given by the ordered set x and return an array of such elements
# Removes only one element in the first run -> Contains sentences with one word removed
perturbed_instances = replaced_instances
for instance_p in perturbed_instances:
self._print_instance(instance_p, text)
scores_explanation_candidates = [
self._classifier_fn(x, self.revert) for x in perturbed_instances
]
# Get predictions for each perturbed instance where one or more elements are removed from the initial instance
# It is in form of [[x], [y], [z]]
scores_candidates_to_expand = scores_explanation_candidates.copy()
scores_perturbed_new_combinations = [
x[0] for x in scores_explanation_candidates
]
# Therefore get it to the shape [x, y, z] by getting the [0] th element of each element array
# ***CHECK IF THERE ARE EXPLANATIONS***
new_explanations = list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
new_explanation_replacements = list(
compress(
explanation_candidates_replacements,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
# Get explanation candidates where their probability is less than the threshold classifier -> Positive becomes negative
explanations += list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
explanations_replacements += list(
compress(
explanation_candidates_replacements,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
nb_explanations += len(
list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
) # Update number of explanations which pass the required threshold
explanations_sets += list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
explanation_replacement_sets += list(
compress(
explanation_candidates_replacements,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
explanations_sets = [
set(x) for x in explanations_sets
] # Convert each array to a set -> to get the words
explanation_replacement_sets = [
set(x) for x in explanation_replacement_sets
]
explanations_score_change += list(
compress(
scores_explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
# Adjust max_length
if self.BB == True:
if len(explanations) != 0:
lengths = [] # Record length of each explanation found
for explanation in explanations:
lengths.append(len(explanation))
lengths = np.array(lengths)
max_length = lengths.min()
# Get minimum length of the found explanations as max length -> Do not search for explanations with longer length
else:
max_length = number_active_elements # Else can find maximum length equal to number of words in instance
else:
max_length = number_active_elements
print("\n-------------Max length updated to - ", max_length)
# Eliminate combinations from candidates_to_expand ("best-first" candidates) that can not be expanded
# Pruning based on Branch & Bound=True, max. features allowed and number of active features
candidates_to_expand_updated = []
candidates_to_expand_updated_replacements = []
scores_candidates_to_expand_updated = (
[]
) # enumerate -> iterate over the candidates together with their index
for j, combination in enumerate(candidates_to_expand):
if (
(len(combination) < number_active_elements)
and (len(combination) < max_length)
and (len(combination) < self.max_features)
):
# Combination length should be less than the words in the input and max length of the required explanation and required maximum features
candidates_to_expand_updated.append(
combination
) # If the combination matches, it is further expanded
scores_candidates_to_expand_updated.append(
scores_candidates_to_expand[j]
)
# Add the prediction score to the new array
# get the score from the scores_candidates_to_expand using the current index
candidates_to_expand_updated_replacements.append(
candidates_to_expand_replacements[j]
)
# Add the replacement to the new array
print(
"\nlen(candidates_to_expand_updated)",
len(candidates_to_expand_updated),
"--- If 0, loop terminates ",
)
print(
"\n If nb_explanations",
nb_explanations,
" >= self.max_explained, loop terminates ",
self.max_explained,
)
# *** IF LOOP ***
# stop when no candidate can be expanded further (max length reached) or enough explanations were found
if (len(candidates_to_expand_updated) == 0) or (
nb_explanations >= self.max_explained
):
## If the number of explanations exceeded the required number
## or no candidates
## no explanations present
# stop algorithm
explanation_candidates = []
## Found all the candidates
elif len(candidates_to_expand_updated) != 0:
## If there are possible candidates
explanation_candidates = []
it = 0 # Iteration of the while loop
indices = []
scores_candidates_to_expand2 = []
for score in scores_candidates_to_expand_updated:
if score[0] < self.threshold_classifier:
scores_candidates_to_expand2.append(2 * score_predicted)
else:
scores_candidates_to_expand2.append(score)
print(
len(explanation_candidates),
it,
"<",
len(scores_candidates_to_expand2),
)
# *** WHILE LOOP ***
while (
(len(explanation_candidates) == 0)
and (it < len(scores_candidates_to_expand2))
and ((time.time() - tic) < self.time_maximum)
):
# Stop if candidates are found or looped through more than there are candidates or maximum time reached
print("While loop iteration %d" % it)
if it != 0: # Because indices are not there in the first iteration
for index in indices:
scores_candidates_to_expand2[index] = 2 * score_predicted
# do elementwise subtraction between score_predicted and scores_candidates_to_expand2
subtractionList = []
for item in scores_candidates_to_expand2:
subtractionList.append(item - score_predicted)
# Do element wise subtraction between the prediction score of the x_ref and every element of the scores_candidates_to_expand2
index_combi_max = np.argmax(subtractionList)
if self.revert == 0:
index_combi_max = np.argmax(subtractionList)
else:
index_combi_max = np.argmin(subtractionList)
# Get the index of the maximum value -> Expand it
indices.append(index_combi_max)
expanded_combis.append(
candidates_to_expand_updated[index_combi_max]
)
# Add this combination to already expanded combinations as it will be expanded next by expand and prune function
comb_to_expand = candidates_to_expand_updated[index_combi_max]
replacement_comb_to_expand = (
candidates_to_expand_updated_replacements[index_combi_max]
)
words_comb_selected = []
for item in candidates_to_expand_updated[index_combi_max]:
words_comb_selected.append(self.feature_names[item])
print("The chosen combination of words is ", words_comb_selected)
self._print_instance(
self._conditional_replace_fn(
candidates_to_expand_updated[index_combi_max],
candidates_to_expand_updated_replacements[index_combi_max],
text.copy(),
),
text,
)
print(
"It has a score of ",
scores_candidates_to_expand_updated[index_combi_max][0],
)
# Expand the found combination with highest difference
func = self._expand_and_prune(
comb_to_expand,
replacement_comb_to_expand,
expanded_combis,
feature_set,
candidates_to_expand_updated,
candidates_to_expand_updated_replacements,
explanations_sets,
explanation_replacement_sets,
scores_candidates_to_expand_updated,
text,
self._classifier_fn,
self.revert,
replacements,
)
"""Returns:
- explanation_candidates: combinations of features that are explanation
candidates to be checked in the next iteration
- candidates_to_expand: combinations of features that are candidates to
expanded in next iterations or candidates for "best-first"
- expanded_combis: [list] list of combinations of features that are already
expanded as "best-first"
- scores_candidates_to_expand: scores after perturbation for the candidate
combinations of features to be expanded
- scores_explanation_candidates: scores after perturbation of explanation candidates"""
explanation_candidates = func[0]
explanation_candidates_replacements = func[1]
candidates_to_expand = func[2]
candidates_to_expand_replacements = func[3]
expanded_combis = func[4]
scores_candidates_to_expand = func[5]
scores_explanation_candidates = func[6]
it += 1
print(
"\n\n\niteration - ", iteration, " self.max_iter - ", self.max_iter
)
print("\n Elapsed time %d \n" % (time.time() - tic))
# *** FINAL PART OF ALGORITHM ***
print("Iterations are done.")
explanation_set = []
explanation_feature_names = []
index_of_min_length_explanation = -1
for i in range(len(explanations)):
explanation_feature_names = []
for features in explanations[i]:
explanation_feature_names.append(self.feature_names[features])
explanation_set.append(explanation_feature_names)
if len(explanations) == 0:
self._report_data["output"] = {
"Replacements": None,
"final_text": "No counterfactuals found",
"final score for positive": score_predicted[0],
"final class": self.initial_class[0],
}
self._report_data["process"] = {
"final_exp": None,
"number active elements": number_active_elements,
"number explanations found": 0,
"size smallest explanation": int(0),
"time elapsed": time.time() - tic,
"differences score": 0,
"iterations": iteration,
"final_sentence": None,
}
print("No explanations Found")
return
else:
lengths_explanation = []
for explanation in explanations:
l = len(explanation)
lengths_explanation.append(l)
minimum_size_explanation = np.min(lengths_explanation)
index_of_min_length_explanation = np.argmin(lengths_explanation)
try:
print("argmin", explanations[index_of_min_length_explanation])
except:
pass
print("Final sentence")
final_sentence = self._conditional_replace_fn(
explanations[index_of_min_length_explanation],
explanations_replacements[index_of_min_length_explanation],
text.copy(),
)
final_sentence = self._print_instance(final_sentence, text)
new_instance = text.copy()
new_replacements = []
replacement_features = []
for feature in explanations[index_of_min_length_explanation]:
feature_replacement = self._get_antonyms(self.feature_names[feature])
print("feature_replacement", feature_replacement)
new_replacements.append(feature_replacement)
# print("new_replacements", replacement_dict)
print("replacementfeature", self.feature_names[new_replacements[0]])
replacementsDict = {}
output_removed_words = []
for item in explanations[index_of_min_length_explanation]:
output_removed_words.append(self.feature_names[item])
try:
replacementWords = []
for item_ind in range(len(new_replacements)):
print("new_replacements[item_ind]", new_replacements[item_ind])
if new_replacements[item_ind]:
replacementWords.append(
{
"feature": self.feature_names[
explanations[index_of_min_length_explanation][item_ind]
],
"replacement": self.feature_names[
new_replacements[item_ind]
][0],
}
)
replacementsDict[
self.feature_names[
explanations[index_of_min_length_explanation][item_ind]
]
] = self.feature_names[new_replacements[item_ind]][0]
else:
replacementWords.append(
{
"feature": self.feature_names[
explanations[index_of_min_length_explanation][item_ind]
],
"replacement": "--",
}
)
replacementsDict[
self.feature_names[
explanations[index_of_min_length_explanation][item_ind]
]
] = "--"
print("replacementWords", replacementWords)
except:
pass
new_insatnce = text.copy()
for relpacement_feature_index in range(len(new_replacements)):
if new_replacements[relpacement_feature_index] != []:
new_insatnce = self._replace_fn(
x=explanations[index_of_min_length_explanation][
relpacement_feature_index
],
y=new_replacements[relpacement_feature_index],
inst=new_insatnce,
)
replacement_features.append(
self.feature_names[new_replacements[relpacement_feature_index]]
)
else:
new_insatnce = self._perturb_fn(
explanations[index_of_min_length_explanation][
relpacement_feature_index
],
new_insatnce,
)
replacement_features.append(
self.feature_names[relpacement_feature_index]
)
final_prob = self._classifier_fn(new_insatnce)
print("final_prob", final_prob)
final_exp = []
for i in range(len(explanations[index_of_min_length_explanation])):
if new_replacements[i] != []:
final_exp.append(
[
output_removed_words[i],
self.feature_names[new_replacements[i]][0],
]
)
else:
final_exp.append([output_removed_words[i], "---"])
number_explanations = len(explanations)
if np.size(explanations_score_change) > 1:
inds = np.argsort(explanations_score_change, axis=0)
inds = np.fliplr([inds])[0]
inds_2 = []
for i in range(np.size(inds)):
inds_2.append(inds[i][0])
explanation_set_adjusted = []
for i in range(np.size(inds)):
j = inds_2[i]
explanation_set_adjusted.append(explanation_set[j])
explanations_score_change_adjusted = []
for i in range(np.size(inds)):
j = inds_2[i]
explanations_score_change_adjusted.append(
float(explanations_score_change[j][0])
)
explanation_set = explanation_set_adjusted
explanations_score_change = explanations_score_change_adjusted
else:
explanations_score_change = [float(e[0]) for e in explanations_score_change]
time_elapsed = time.time() - tic
print("\n Total elapsed time %d \n" % time_elapsed)
indices_active_elements = np.nonzero(text)[1]
# Find the elements in indices_active_elements_explain that are not in indices_active_elements
print("indices_active_elements", indices_active_elements)
final_string = "Given a vector"
if input_string != "Given a vector":
words = input_string.split()
final_words = []
for word in words:
if word in replacementsDict:
final_words.append(replacementsDict[word])
else:
final_words.append(word)
# Join the words back into a final string
final_string = " ".join(final_words)
if final_prob > self.threshold_classifier:
final_class = [1]
else:
final_class = [0]
self._report_data["output"] = {
"Replacements": replacementWords,
"final_text": final_string,
"final score for positive": final_prob[0],
"final class": final_class[0],
}
self._report_data["process"] = {
"final_exp": final_exp,
"number active elements": number_active_elements,
"number explanations found": number_explanations,
"size smallest explanation": int(minimum_size_explanation),
"time elapsed": time_elapsed,
"differences score": explanations_score_change[0 : self.max_explained],
"iterations": iteration,
"final_sentence": final_sentence,
}
def explanation(self) -> str:
return json.dumps(self._report_data, indent=4)
def set_config(self, config: Dict[str, Any]) -> None:
"""
Config must contain the following keys:
{
"threshold_classifier": float,
"max_iter": int,
"time_maximum": int,
}
"""
self.threshold_classifier = np.float64(config["threshold_classifier"])
self.max_iter = config["max_iter"]
self.time_maximum = config["time_maximum"]
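A minimal usage sketch for LRAnalyzer, assuming a pickled scikit-learn logistic regression
model and the matching TF-IDF vectorizer are available at the paths below. The paths, the
threshold and the example sentence are placeholders, not values taken from the code above.

if __name__ == "__main__":
    analyzer = LRAnalyzer(
        model_path="models/lr.joblib",                      # hypothetical path
        vectorizer_path="models/tfidf_vectorizer.joblib",   # hypothetical path
        threshold_classifier=0.5,
        max_iter=50,
        time_maximum=120,
    )
    # the search parameters can also be changed later
    analyzer.set_config({"threshold_classifier": 0.5, "max_iter": 50, "time_maximum": 120})
    analyzer("the food was great and the service excellent")
    print(analyzer.explanation())  # JSON report with input, output and process sections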
import time
import numpy as np
from scipy.sparse import lil_matrix, csr_matrix
from ordered_set import OrderedSet
import joblib
from itertools import compress
from ..processors import TextVectorizer
from .base import BaseAnalyzer
import json
from typing import Dict, Any
class RFAnalyzer(BaseAnalyzer):
"""Class for generating evidence counterfactuals for classifiers on behavioral/text data"""
def __init__(
self,
model_path,
vectorizer_path,
threshold_classifier,
max_iter=100,
max_explained=1,
BB=True,
max_features=30,
time_maximum=120,
):
"""Init function
Args:
classifier_fn: [function] classifier prediction probability function
or decision function. For ScikitClassifiers, this is classifier.predict_proba
or classifier.decision_function or classifier.predict_log_proba.
Make sure the function only returns one (float) value. For instance, if you
use a ScikitClassifier, transform the classifier.predict_proba as follows:
def classifier_fn(X):
c=classification_model.predict_proba(X)
y_predicted_proba=c[:,1]
return y_predicted_proba
threshold_classifier: [float] the threshold that is used for classifying
instances as positive or not. When score or probability exceeds the
threshold value, then the instance is predicted as positive.
We have no default value, because it is important the user decides
a good value for the threshold.
feature_names: [numpy.array] contains the interpretable feature names,
such as the words themselves in case of document classification or the names
of visited URLs.
max_iter: [int] maximum number of iterations in the search procedure.
Default is set to 100.
max_explained: [int] maximum number of EDC explanations generated.
Default is set to 1.
BB: [“True” or “False”] when the algorithm is augmented with
branch-and-bound (BB=True), one is only interested in the (set of)
shortest explanation(s). Default is "True".
max_features: [int] maximum number of features allowed in the explanation(s).
Default is set to 30.
time_maximum: [int] maximum time allowed to generate explanations,
expressed in seconds. Default is set to 120 seconds (2 minutes).
"""
self.threshold_classifier = np.float64(threshold_classifier)
self.max_iter = max_iter
self.max_explained = max_explained
self.BB = BB
self.max_features = max_features
self.time_maximum = time_maximum
self.revert = None
self.initial_class = None
input_encoder = joblib.load(vectorizer_path)
feature_names = input_encoder.get_feature_names_out()
self.feature_names = feature_names
loaded_vocab = input_encoder.vocabulary_
self.loaded_vocab = loaded_vocab
model = joblib.load(model_path)
self._model = model
text_vectorizer = TextVectorizer(vectorizer_path)
self._text_vectorizer = text_vectorizer
self._report_data = {}
def _print_ref_instance(self, ref_inst):
printable_array = []
indices_active_elements = np.nonzero(ref_inst)[1]
for item in indices_active_elements:
printable_array.append(".." + self.feature_names[item] + "..")
print(printable_array)
def _perturb_fn(self, x, inst, print_flag=0):
"""Function to perturb instance x -> Deform the array -> assign 0 to the x-th column"""
"""
Returns perturbed instance inst
"""
inst[:, x] = 0
return inst
def _replace_fn(self, x, y, inst, print_flag=0):
"""Function to perturb instance x -> Deform the array -> assign 0 to the x-th column"""
"""
Returns perturbed instance inst
"""
new_inst = inst.copy()
try:
temp_x = inst[:, x]
temp_y = inst[:, y]
new_inst[:, x] = temp_y
new_inst[:, y] = temp_x
except:
new_inst[:, x] = 0
return new_inst
def _classifier_fn(self, x, negative_to_positive=0):
"""Returns the prediction probability of class 1 -> Not class 0"""
prediction = self._model.predict_proba(x)
# Return the probability of class 1 by default; if negative_to_positive == 1, return the probability of class 0 instead
if negative_to_positive == 1:
return prediction[:, 0]
return prediction[:, 1]
def _print_instance(self, pert_inst, ref_inst):
"""Function to print the perturbed instance"""
"""
Returns perturbed instance inst
"""
feature_names = self.feature_names
indices_active_elements_ref = np.nonzero(ref_inst)[1]
indices_active_elements_pert = np.nonzero(pert_inst)[1]
ref_set = set(indices_active_elements_ref)
pert_set = set(indices_active_elements_pert)
# elements in ref_set but not in pert_set
removed_word_indices = ref_set - pert_set
# elements in pert_set but not in ref_set
added_word_indices = pert_set - ref_set
printable_array = []
for item in indices_active_elements_ref:
printable_array.append(".." + feature_names[item] + "..")
# Change formatting of removed words
for item in removed_word_indices:
printable_array[
printable_array.index(".." + feature_names[item] + "..")
] = ("--" + feature_names[item] + "--")
# change formatting of added words
for item in added_word_indices:
printable_array.append("++" + feature_names[item] + "++")
printable_array.append(" --> class 1 Score = ")
printable_array.append(self._classifier_fn(pert_inst)[0])
print(printable_array)
return printable_array
def _conditional_replace_fn(self, x, y, inst, print_flag=0):
for i in range(len(x)):
if isinstance(y[i], str):
inst[:, x[i]] = 0
else:
temp_x = inst[:, x[i]]
temp_y = inst[:, y[i]]
inst[:, x[i]] = temp_y
inst[:, y[i]] = temp_x
return inst
def _get_featues_importances(self, instance):
"""Get feature importances with the sign of the change in prediction probability for a given instance.
Uses the gini impurity in the RF model.
Fast calculation as values are calculated during training period.
reference: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier.feature_importances_
Args:
antonyms_indices: indices of antonyms in the feature vector
model: trained model with feature_importances_
Returns:
tuple of features and their indices in the feature vector
"""
feature_importance = self._model.feature_importances_
initial_score = self._model.predict_proba(instance)[0][1]
indices_active_elements = np.array(np.nonzero(instance)[1]).reshape(
len(np.nonzero(instance)[1]), 1
)
feature_set = [frozenset(x) for x in indices_active_elements]
candidates_to_expand = []
for features in indices_active_elements:
candidates_to_expand.append(OrderedSet(features))
explanation_candidates = candidates_to_expand.copy()
perturbed_instances = [
self._perturb_fn(x, inst=instance.copy()) for x in explanation_candidates
]
scores_explanation_candidates = [
self._classifier_fn(x) for x in perturbed_instances
]
sign_change = [
1 if (x - initial_score) > 0 else -1 for x in scores_explanation_candidates
]
# take the absolute value of each importance (the sign_change computed above is not applied here)
feature_importance = [x if x > 0 else -x for x in feature_importance]
return feature_importance
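# Illustrative example of how the importances above are used (hypothetical values):
# for an instance whose non-zero TF-IDF columns are [12, 40, 87] with
#   feature_importance[12] = 0.004, feature_importance[40] = 0.019, feature_importance[87] = 0.001,
# the active indices are later sorted to [40, 12, 87], so the search starts from the
# word the random forest relies on the most.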
def _expand_and_prune(
self,
comb,
expanded_combis,
feature_set,
candidates_to_expand,
explanations_sets,
scores_candidates_to_expand,
instance,
cf,
feature_names,
revert=0,
):
"""Function to expand "best-first" feature combination and prune explanation_candidates and candidates_to_expand"""
comb = OrderedSet(comb)
expanded_combis.append(comb)
print("\n\n")
old_candidates_to_expand = [frozenset(x) for x in candidates_to_expand]
old_candidates_to_expand = set(old_candidates_to_expand)
feature_set_new = []
## If the feature is not in the current combination -> add it to a new list
for feature in feature_set:
if len(comb & feature) == 0: # set operation: intersection
feature_set_new.append(
feature
) # If the feature is not in the current combination to remove from the instance
# Add each element in the new set -> which were initially not present -> to the accepted combination -> create new combinations -> (EXPANSION)
new_explanation_candidates = []
for element in feature_set_new:
union = comb | element # set operation: union
new_explanation_candidates.append(
union
) # Create new combinations to remove from the instance
# Add new explanation candidates to the list of candidates to expand
candidates_to_expand_notpruned = candidates_to_expand.copy()
for new_candidate in new_explanation_candidates:
candidates_to_expand_notpruned.append(new_candidate)
# Calculate scores of new combinations and add to scores_candidates_to_expand
# perturb each new candidate and get the score for each.
perturbed_instances = [
self._perturb_fn(x, inst=instance.copy())
for x in new_explanation_candidates
]
for instance_p in perturbed_instances:
self._print_instance(instance_p, instance)
scores_perturbed_new = [cf(x, revert) for x in perturbed_instances]
## Append the newly created scores to the existing score array that was passed in
scores_candidates_to_expand_notpruned = (
scores_candidates_to_expand + scores_perturbed_new
)
# create a dictionary of scores dictionary where the
# keys are string representations of the candidates from candidates_to_expand_notpruned, and the
# values are the corresponding scores from scores_candidates_to_expand_notpruned
dictionary_scores = dict(
zip(
[str(x) for x in candidates_to_expand_notpruned],
scores_candidates_to_expand_notpruned,
)
)
# *** Pruning step: remove all candidates to expand that have an explanation as subset ***
candidates_to_expand_pruned_explanations = []
# take one combination from candidates
for combi in candidates_to_expand_notpruned:
pruning = 0
for (
explanation
) in (
explanations_sets
): # if an explanation is present as a subset of combi, combi is not added to the to-be-expanded list because a smaller solution already exists
if (explanation.issubset(combi)) or (explanation == combi):
pruning = pruning + 1
if (
pruning == 0
): # If it is not a superset of a present explanation -> add it to the list
candidates_to_expand_pruned_explanations.append(combi)
# Each element is frozen as a set
candidates_to_expand_pruned_explanations_frozen = [
frozenset(x) for x in candidates_to_expand_pruned_explanations
]
# But the overall set of frozen sets is not frozen
candidates_to_expand_pruned_explanations_ = set(
candidates_to_expand_pruned_explanations_frozen
)
expanded_combis_frozen = [frozenset(x) for x in expanded_combis]
expanded_combis_ = set(expanded_combis_frozen)
# *** Pruning step: remove all candidates to expand that are in expanded_combis *** -> Same as above
candidates_to_expand_pruned = (
candidates_to_expand_pruned_explanations_ - expanded_combis_
)
ind_dict = dict(
(k, i)
for i, k in enumerate(candidates_to_expand_pruned_explanations_frozen)
)
indices = [ind_dict[x] for x in candidates_to_expand_pruned]
candidates_to_expand = [
candidates_to_expand_pruned_explanations[i] for i in indices
]
# The new explanation candidates are the ones that are NOT in the old list of candidates to expand
new_explanation_candidates_pruned = (
candidates_to_expand_pruned - old_candidates_to_expand
)
candidates_to_expand_frozen = [frozenset(x) for x in candidates_to_expand]
ind_dict2 = dict((k, i) for i, k in enumerate(candidates_to_expand_frozen))
indices2 = [ind_dict2[x] for x in new_explanation_candidates_pruned]
explanation_candidates = [candidates_to_expand[i] for i in indices2]
# Get scores of the new candidates and explanations.
scores_candidates_to_expand = [
dictionary_scores[x] for x in [str(c) for c in candidates_to_expand]
]
scores_explanation_candidates = [
dictionary_scores[x] for x in [str(c) for c in explanation_candidates]
]
return (
explanation_candidates,
candidates_to_expand,
expanded_combis,
scores_candidates_to_expand,
scores_explanation_candidates,
)
def __call__(self, instance, search_space=None):
"""Generates evidence counterfactual explanation for the instance.
ONLY IF THE CURRENT INSTANCE IS POSITIVE -> Limitation
Args:
instance: [numpy.array or sparse matrix] instance to explain
Returns:
None
"""
# *** INITIALIZATION ***
print("Start initialization...")
tic = time.time()
input_string = "Given a vector"
if isinstance(instance, str):
input_string = str(instance)
text = self._text_vectorizer(instance)
text = lil_matrix(text)
print("initial sentence is ... ")
print(text.get_shape())
self._print_ref_instance(text)
iteration = 0
nb_explanations = 0
minimum_size_explanation = np.nan
explanations = []
explanations_sets = []
explanations_score_change = []
expanded_combis = []
score_predicted = self._classifier_fn(text) ## Returns Prediction Prob
# Initial class is 1 if the score is greater than the threshold
if score_predicted > self.threshold_classifier:
self.initial_class = [1]
else:
self.initial_class = [0]
self.revert = 1
print(
"score_predicted ",
score_predicted,
" initial_class ",
self.initial_class,
)
self._report_data["input"] = {
"text": input_string,
"score for positive": score_predicted[0],
"initial class": self.initial_class[0],
}
importances = self._get_featues_importances(text)
features = []
for ind in range(len(importances)):
if importances[ind] != 0:
features.append(
{
"feature": ind,
"word": self.feature_names[ind],
"importance": importances[ind],
}
)
sorted_data_in = sorted(features, key=lambda x: x["importance"], reverse=True)
inverse_sorted_data_in = sorted(features, key=lambda x: x["importance"])
self._report_data["feature_importances"] = features
if self.revert == 1:
sorted_data_in = inverse_sorted_data_in
indices_active_elements = np.nonzero(text)[
1
] ## -> Gets non zero elements in the instance as an array [x, y, z]
sorted_indices = sorted(
indices_active_elements, key=lambda x: importances[x], reverse=True
)
indices_active_elements = np.array(sorted_indices)
number_active_elements = len(indices_active_elements)
indices_active_elements = indices_active_elements.reshape(
(number_active_elements, 1)
) ## -> Reshape to a predictable (n, 1) shape
candidates_to_expand = (
[]
) # -> These combinations are further expanded -> These are the elements to be removed from the sentence
for features in indices_active_elements:
candidates_to_expand.append(OrderedSet(features))
print("candidates_to_expand ", candidates_to_expand)
## -> Gets an array with each element of the reshaped indices as an ordered set -> [OrderedSet([430]), OrderedSet([588]), OrderedSet([595])]
explanation_candidates = candidates_to_expand.copy()
print("explanation_candidates ", explanation_candidates)
## Gets a copy of the above array -> Initially
feature_set = [
frozenset(x) for x in indices_active_elements
] ## Immutable -> can be used as keys in dictionary
## Used features in the current x-reference -> indices of the words in the review.
print("Initialization is complete.")
print("\n Elapsed time %d \n" % (time.time() - tic))
# *** WHILE LOOP ***
while (
(iteration < self.max_iter)
and (nb_explanations < self.max_explained)
and (len(candidates_to_expand) != 0)
and (len(explanation_candidates) != 0)
and ((time.time() - tic) < self.time_maximum)
):
## Stop if maximum iterations exceeded
# number of explanations generated is greater than the maximum explanations
# There are no candidates to expand
# There are no explanation candidates -> Used to force stop while loop below
# Or maximum allowed time exceeded
iteration += 1
print("\n Iteration %d \n" % iteration)
if iteration == 1:
print("Run in first iteration -> perturbation done \n")
# Print the word in each index in the explanation candidates
# for item in explanation_candidates:
# print([self.feature_names[x] for x in item])
print("explanation_candidates \n", explanation_candidates, "\n")
perturbed_instances = [
self._perturb_fn(x, inst=text.copy())
for x in explanation_candidates
]
for instance_p in perturbed_instances:
self._print_instance(instance_p, text)
scores_explanation_candidates = [
self._classifier_fn(x, self.revert) for x in perturbed_instances
]
# Get predictions for each perturbed instance where one or more elements are removed from the initial instance
# It is in form of [[x], [y], [z]]
print(
"scores_explanation_candidates \n",
scores_explanation_candidates,
"\n",
)
scores_candidates_to_expand = scores_explanation_candidates.copy()
scores_perturbed_new_combinations = [
x[0] for x in scores_explanation_candidates
]
# Therefore get it to the shape [x, y, z] by getting the [0] th element of each element array
# print(
# "scores_perturbed_new_combinations ", scores_perturbed_new_combinations
# )
# ***CHECK IF THERE ARE EXPLANATIONS***
new_explanations = list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
# Get explanation candidates where their probability is less than the threshold classifier -> Positive becomes negative
# print("New Explanations \n", new_explanations)
explanations += list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
# print("\n explanations, explanations_score_change", explanations)
nb_explanations += len(
list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
) # Update number of explanations which pass the required threshold
explanations_sets += list(
compress(
explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
explanations_sets = [
set(x) for x in explanations_sets
] # Convert each array to a set -> to get the words
explanations_score_change += list(
compress(
scores_explanation_candidates,
scores_perturbed_new_combinations < self.threshold_classifier,
)
)
# print('explanations_score_change', explanations_score_change)
# Adjust max_length
if self.BB == True:
if len(explanations) != 0:
lengths = [] # Record length of each explanation found
for explanation in explanations:
lengths.append(len(explanation))
lengths = np.array(lengths)
max_length = lengths.min()
# Get minimum length of the found explanations as max length -> Do not search for explanations with longer length
else:
max_length = number_active_elements # Else can find maximum length equal to number of words in instance
else:
max_length = number_active_elements
print("\n-------------Max length updated to - ", max_length)
# Eliminate combinations from candidates_to_expand ("best-first" candidates) that can not be expanded
# Pruning based on Branch & Bound=True, max. features allowed and number of active features
candidates_to_expand_updated = []
scores_candidates_to_expand_updated = (
[]
) # enumerate -> iterate over the candidates together with their index
for j, combination in enumerate(candidates_to_expand):
if (
(len(combination) < number_active_elements)
and (len(combination) < max_length)
and (len(combination) < self.max_features)
):
# Combination length should be less than the words in the input and max length of the required explanation and required maximum features
candidates_to_expand_updated.append(
combination
) # If the combination matches, it is further expanded
scores_candidates_to_expand_updated.append(
scores_candidates_to_expand[j]
)
# Add the prediction score to the new array
# get the score from the scores_candidates_to_expand using the current index
print(
"\nlen(candidates_to_expand_updated)",
len(candidates_to_expand_updated),
" 0 ",
)
print(
"\nnb_explanations",
nb_explanations,
" >= self.max_explained ",
self.max_explained,
)
# *** IF LOOP ***
# stop when no candidate can be expanded further (max length reached) or enough explanations were found
if (len(candidates_to_expand_updated) == 0) or (
nb_explanations >= self.max_explained
):
## If the number of explanations exceeded the required number
## or no candidates
## no explanations present
print("nb_explanations Stop iterations...")
explanation_candidates = [] # stop algorithm
## Found all the candidates
print(
"scores_candidates_to_expand_updated ",
scores_candidates_to_expand_updated,
)
# print("candidates_to_expand_updated ", candidates_to_expand_updated)
elif len(candidates_to_expand_updated) != 0:
## If there are possible candidates
explanation_candidates = []
it = 0 # Iteration of the while loop
indices = []
scores_candidates_to_expand2 = []
for score in scores_candidates_to_expand_updated:
if score[0] < self.threshold_classifier:
scores_candidates_to_expand2.append(2 * score_predicted)
else:
scores_candidates_to_expand2.append(score)
# update candidate scores if they have score less than threshold -> To expand them further
shap_candidates_to_expand2 = []
for candidate in candidates_to_expand_updated:
importancess = 0
for word in candidate:
# find word in feature column in sorted_data
for ind in range(len(sorted_data_in)):
if sorted_data_in[ind]["feature"] == word:
importancess += sorted_data_in[ind]["importance"]
break
shap_candidates_to_expand2.append(importancess)
# print(
# "\n scores_candidates_to_expand2 before loop",
# scores_candidates_to_expand2,
# )
# *** WHILE LOOP ***
while (
(len(explanation_candidates) == 0)
and (it < len(scores_candidates_to_expand2))
and ((time.time() - tic) < self.time_maximum)
):
# Stop if candidates are found or looped through more than there are candidates or maximum time reached
print("While loop iteration %d" % it)
if it != 0: # Because indices are not there in the first iteration
for index in indices:
scores_candidates_to_expand2[index] = 2 * score_predicted
# print(
# "\n scores_candidates_to_expand2 after loop",
# scores_candidates_to_expand2,
# )
# print("\n indices", indices)
# do elementwise subtraction between score_predicted and scores_candidates_to_expand2
subtractionList = []
for x, y in zip(score_predicted, scores_candidates_to_expand2):
print("\n x, y", x - y)
subtractionList.append(x - y)
# Do element wise subtraction between the prediction score of the x_ref and every element of the scores_candidates_to_expand2
index_combi_max = np.argmax(subtractionList)
index_importance_max = np.argmax(shap_candidates_to_expand2)
index_importance_min = np.argmin(shap_candidates_to_expand2)
print(
"subtrac max ",
index_combi_max,
" index_shap_max ",
index_importance_max,
)
if iteration < 3:
print("---------USING IMPORTANCE----------")
if self.revert == 0:
index_combi_max = index_importance_max
else:
index_combi_max = index_importance_min
# Get the index of the maximum value -> Expand it
else:
print("++++++++USING DIFFERENCE+++++++++")
print(
"\n index_combi_max",
candidates_to_expand_updated[np.argmax(subtractionList)],
"\n index_importance_max",
candidates_to_expand_updated[index_importance_max],
"\n using combination",
candidates_to_expand_updated[index_combi_max],
)
indices.append(index_combi_max)
expanded_combis.append(
candidates_to_expand_updated[index_combi_max]
)
# Add this combination to already expanded combinations as it will be expanded next by expand and prune function
comb_to_expand = candidates_to_expand_updated[index_combi_max]
# Expand the found combination with highest difference
func = self._expand_and_prune(
comb_to_expand,
expanded_combis,
feature_set,
candidates_to_expand_updated,
explanations_sets,
scores_candidates_to_expand_updated,
text,
self._classifier_fn,
self.feature_names,
self.revert,
)
"""Returns:
- explanation_candidates: combinations of features that are explanation
candidates to be checked in the next iteration
- candidates_to_expand: combinations of features that are candidates to
expanded in next iterations or candidates for "best-first"
- expanded_combis: [list] list of combinations of features that are already
expanded as "best-first"
- scores_candidates_to_expand: scores after perturbation for the candidate
combinations of features to be expanded
- scores_explanation_candidates: scores after perturbation of explanation candidates"""
explanation_candidates = func[0]
candidates_to_expand = func[1]
expanded_combis = func[2]
scores_candidates_to_expand = func[3]
scores_explanation_candidates = func[4]
it += 1
print(
"\n\n\niteration - ", iteration, " self.max_iter - ", self.max_iter
)
print(
"\n\nlen(candidates_to_expand) - ",
len(candidates_to_expand),
" != 0 ",
)
print(
"\n\nlen(explanation_candidates) - ",
len(explanation_candidates),
" !=0 ",
)
print(
"\n\n(time.time() - tic) - ",
(time.time() - tic),
" self.time_maximum - ",
self.time_maximum,
)
print("\n Elapsed time %d \n" % (time.time() - tic))
# *** FINAL PART OF ALGORITHM ***
print("Iterations are done.")
explanation_set = []
explanation_feature_names = []
index_of_min_length_explanation = -1
for i in range(len(explanations)):
explanation_feature_names = []
for features in explanations[i]:
explanation_feature_names.append(self.feature_names[features])
explanation_set.append(explanation_feature_names)
if len(explanations) == 0:
self._report_data["output"] = {
"Removed_words": None,
"final_text": "No counterfactuals found",
"final score for positive": score_predicted[0],
"final class": self.initial_class[0],
}
self._report_data["process"] = {
"explanation set": None,
"number active elements": number_active_elements,
"number explanations found": 0,
"size smallest explanation": 0,
"time elapsed": time.time() - tic,
"differences score": 0,
"iterations": iteration,
"final_sentence": None,
}
print("No explanations Found")
return
if len(explanations) != 0:
lengths_explanation = []
for explanation in explanations:
l = len(explanation)
lengths_explanation.append(l)
minimum_size_explanation = np.min(lengths_explanation)
index_of_min_length_explanation = np.argmin(lengths_explanation)
final_sentence = text.copy()
print("len explanations", len(explanations))
if len(explanations) != 0:
final_sentence = self._perturb_fn(
explanations[index_of_min_length_explanation],
text.copy(),
)
final_prob = self._classifier_fn(final_sentence)
print("final_prob", final_prob)
number_explanations = len(explanations)
if np.size(explanations_score_change) > 1:
inds = np.argsort(explanations_score_change, axis=0)
inds = np.fliplr([inds])[0]
inds_2 = []
for i in range(np.size(inds)):
inds_2.append(inds[i][0])
explanation_set_adjusted = []
for i in range(np.size(inds)):
j = inds_2[i]
explanation_set_adjusted.append(explanation_set[j])
explanations_score_change_adjusted = []
for i in range(np.size(inds)):
j = inds_2[i]
explanations_score_change_adjusted.append(explanations_score_change[j])
explanation_set = explanation_set_adjusted
explanations_score_change = explanations_score_change_adjusted
time_elapsed = time.time() - tic
print("\n Total elapsed time %d \n" % time_elapsed)
removed_words = [
item
for sublist in explanation_set[0 : self.max_explained]
for item in sublist
]
print(
"If we remove the words ",
removed_words,
"From the review, the prediction will be reversed",
)
final_string = "Given a vector"
if input_string != "Given a vector":
words = input_string.split()
final_words = []
for word in words:
if word.lower() in removed_words:
final_words.append("-" * len(word))
else:
final_words.append(word)
# Join the words back into a final string
final_string = " ".join(final_words)
if final_prob > self.threshold_classifier:
final_class = [1]
else:
final_class = [0]
self._report_data["output"] = {
"Removed_words": removed_words,
"final_text": final_string,
"final score for positive": final_prob[0],
"final class": final_class[0],
}
print(self._report_data["input"])
print(self._report_data["output"])
diff_score = [
int(sc[0]) for sc in explanations_score_change[0 : self.max_explained]
]
self._report_data["process"] = {
"explanation set": explanation_set[0 : self.max_explained],
"number active elements": number_active_elements,
"number explanations found": number_explanations,
"size smallest explanation": int(minimum_size_explanation),
"time elapsed": time_elapsed,
"differences score": diff_score,
"iterations": iteration,
}
def explanation(self) -> str:
exp_dict = self._report_data
exp = json.dumps(exp_dict, indent=4)
return exp
def set_config(self, config: Dict[str, Any]) -> None:
"""
Config must contain the following keys:
{
"threshold_classifier": float,
"max_iter": int,
"time_maximum": int,
}
"""
self.threshold_classifier = np.float64(config["threshold_classifier"])
self.max_iter = config["max_iter"]
self.time_maximum = config["time_maximum"]
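# Example usage (a minimal sketch; the values below are hypothetical):
# analyzer.set_config(
#     {"threshold_classifier": 0.5, "max_iter": 50, "time_maximum": 120}
# )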
from sklearn.svm import SVC
import numpy as np
import os
from ..datasets import IMDBDataset
from ..processors import TextVectorizer
from ..cf_generators import T5Generator, WordFlippingGenerator
import json
from typing import Union, List, Dict
from scipy.sparse import csr_matrix, vstack
import yaml
from .base import BaseAnalyzer
import joblib
class SVMDistanceAnalyzer(BaseAnalyzer):
valid_sets = ["train", "val", "test"]
def __init__(
self,
model: SVC,
dataset: IMDBDataset,
buffer_path: str,
active_set: str = "test",
) -> None:
self.model = model
self.dataset = dataset
self.preprocessor = dataset.preprocessor
self.input_encoder = dataset.input_encoder
self.buffer_path = buffer_path
self.active_set = active_set
self._load_ds_distances()
def _load_ds_distances(self) -> None:
active_set = self.active_set
assert active_set in self.valid_sets
distances = []
buffer = {}
if os.path.exists(self.buffer_path):
with open(self.buffer_path) as handler:
buffer = json.load(handler)
if "svm_distances" in buffer.keys():
if active_set in buffer["svm_distances"].keys():
distances = buffer["svm_distances"][active_set]
if len(distances) == 0:
x = getattr(self.dataset, f"x_{active_set}")
distances = list(self.model.decision_function(x))
if "svm_distances" in buffer.keys():
buffer["svm_distances"][active_set] = distances
else:
buffer["svm_distances"] = {active_set: distances}
with open(self.buffer_path, "w") as handler:
json.dump(buffer, handler, indent=4)
self.ds_distances = np.array(buffer["svm_distances"][active_set])
def _get_distances(self, sentences: List[str]) -> np.ndarray:
processed_texts = [self.preprocessor(sentence) for sentence in sentences]
input_vectors = self.input_encoder.transform(processed_texts)
distances = self.model.decision_function(input_vectors)
return distances
def _get_indices_from_distances(self, distances: np.ndarray) -> np.ndarray:
n = distances.size
ds_distances = np.vstack([self.ds_distances] * n)
distances = distances.reshape(n, 1)
diff = (ds_distances - distances).__abs__()
indices = np.argmin(diff, axis=1)
return indices
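# Worked example (hypothetical numbers): with self.ds_distances = [0.1, -0.29, 2.0]
# and distances = np.array([-0.3]), the absolute differences are [0.4, 0.01, 2.3],
# so np.argmin returns index 1, i.e. the dataset sample whose decision-function
# distance is closest to the query's.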
def get_counterfactual_examples(
self, sentences: Union[str, List[str]]
) -> List[Dict[str, Dict[str, str]]]:
if type(sentences) == str:
sentences = [sentences]
distances = self._get_distances(sentences)
inverse_distances_query = -1 * distances
indices = self._get_indices_from_distances(inverse_distances_query)
inverse_sentences = self.dataset.get_reviews(self.active_set, indices)
inverse_sentences = [rev["original"] for rev in inverse_sentences]
inverse_distances = self._get_distances(inverse_sentences)
analysis = [
{
"original": {"sentence": sentence, "distance": distance},
"opposite": {
"sentence": inverse_sentence,
"distance": inverse_distance,
},
}
for (sentence, inverse_sentence, distance, inverse_distance) in zip(
sentences, inverse_sentences, distances, inverse_distances
)
]
return analysis
class SVMMirrorAnalyzer(BaseAnalyzer):
def __init__(
self,
svm_path: str,
vectorizer_path: str,
cf_generator_config: Union[str, dict] = None,
cf_generator_root: str = None,
) -> None:
supported_cf_gens = ("t5-cf-generator", "wf-cf-generator")
if type(cf_generator_config) == str:
with open(cf_generator_config) as handler:
cf_gen_nm = yaml.load(handler, yaml.FullLoader)["name"]
else:
cf_gen_nm = cf_generator_config["name"]
if cf_gen_nm not in supported_cf_gens:
raise ValueError(
f"Unsupported Counterfactual Generator definition. Supported generators are {supported_cf_gens}"
)
model = joblib.load(svm_path)
text_vectorizer = TextVectorizer(vectorizer_path)
if cf_gen_nm == "t5-cf-generator":
assert (
type(cf_generator_config) == str
), "'cf_generator_config' must be a path object"
cf_generator = T5Generator(cf_generator_config, cf_generator_root)
else:
cf_generator = WordFlippingGenerator(cf_generator_config)
self._model = model
self._text_vectorizer = text_vectorizer
self._cf_generator = cf_generator
self._report_data = {}
def _get_plane(self) -> List[float]:
svm = self._model
kernel = svm.kernel
if kernel == "rbf":
w = svm._dual_coef_.toarray()[0]
b = svm.intercept_[0]
plane = [*w, b]
else:
raise NotImplementedError(
"Plane extraction is only implemented for 'rbf' kernel"
)
return plane
def _project_to_kernel_space(self, vect: csr_matrix) -> np.ndarray:
"""
Applies the RBF kernel function
Args:
1. vect: Input vectors with shape (n, tfidf_n); where n is the number of vectors, tfidf_n is the size of a tfidf vector
2: svm
"""
svm = self._model
kernel = svm.kernel
if kernel == "rbf":
gamma = svm._gamma # 1/(2*(sigma**2))
sv = svm.support_vectors_ # shape: (m, tfidf_n)
vect = vect.toarray()
k_arr = []
for i in range(vect.shape[0]):
v = vect[i]
diff = np.array(sv - v)
k = np.exp(-gamma * (diff**2).sum(axis=1))
k_arr.append(k)
k_arr = np.array(k_arr)
else:
raise NotImplementedError(
"Vector projection is only implemented for 'rbf' kernel"
)
return k_arr
def _project_to_kernel_space_2(self, vect: csr_matrix) -> csr_matrix:
"""
Applies the RBF kernel function
Args:
1. vect: Input vectors with shape (n, tfidf_n); where n is the number of vectors, tfidf_n is the size of a tfidf vector
2: svm
"""
svm = self._model
kernel = svm.kernel
if kernel == "rbf":
gamma = svm._gamma # 1/(2*(sigma**2))
sv = svm.support_vectors_ # shape: (m, tfidf_n)
k_arr = []
for i in range(vect.shape[0]):
v = vect[i]
diff = sv - vstack([v] * sv.shape[0])
k = csr_matrix(np.exp(-gamma * diff.multiply(diff).sum(axis=1).T))
k_arr.append(k)
k_arr = vstack(k_arr)
else:
raise NotImplementedError(
"Vector projection is only implemented for 'rbf' kernel"
)
return k_arr
def _get_mirror_point(self, query_point: np.ndarray) -> List[float]:
plane = self._get_plane()
*w, b = plane
w = np.array(w)
t_0 = -(b + w.dot(query_point)) / (np.linalg.norm(w) ** 2)
mp = query_point + 2 * t_0 * w
return mp
def _get_dist(self, vector1, vector2, method="cos-sim"):
if method == "cos-sim":
dot_product = np.dot(vector1, vector2)
norm1 = np.linalg.norm(vector1)
norm2 = np.linalg.norm(vector2)
# convert cosine similarity to a distance so that smaller values mean closer points (argmin is used downstream)
similarity = 1 - dot_product / (norm1 * norm2)
else:
raise NotImplementedError(
"Similarity checking only implemented for cosine similarity ('cos-sim')"
)
return similarity
def explanation(self) -> str:
if type(self._report_data["output"]) == str:
report = self._report_data["output"]
else:
tabbed_newline = "\n\t"
report = f"""
======== Analysis Report ========
Input text : {self._report_data["input"]["text"]}
Generated contradictory texts : {tabbed_newline+tabbed_newline.join(self._report_data["output"]["generated_text"])}
Distances to the mirror point : {tabbed_newline+tabbed_newline.join([str(dist) for dist in self._report_data["output"]["distances"]])}
Closest contradictory text : {self._report_data["output"]["generated_text"][self._report_data["output"]["closest_id"]]}
"""
return report
def __call__(self, text: str, search_space: int) -> str:
self._report_data["input"] = {"text": text, "search_space": search_space}
# 1. generate contradictions
contradictions = self._cf_generator(text, search_space)
# 2. project to the vectorizer's vector space (X)
vect = self._text_vectorizer([text, *contradictions]) # (1+search_space,n)
# 3. project to the svm kernel's vector space (K)
vect_ks = self._project_to_kernel_space(vect)
(review_vect_ks, *contra_vect_ks) = vect_ks
# 4. find the mirror point of 'review' on the kernel space (C)
review_vect_ks_mp = self._get_mirror_point(review_vect_ks)
if len(contra_vect_ks) > 0:
# 5. find the closest point to C out of the contradictory points
dists = []
for cv in contra_vect_ks:
dist = self._get_dist(cv, review_vect_ks_mp)
dists.append(dist)
i = np.argmin(dists)
clst_txt = contradictions[i]
# reporting
self._report_data["output"] = {"generated_text": contradictions}
self._report_data["output"]["distances"] = dists
self._report_data["output"]["closest_id"] = i
return clst_txt
else:
# no contradictions were available for the given configuration
self._report_data[
"output"
] = "No contradictions possible for the given test case configuration"
def set_config(self, config) -> None:
self._cf_generator.set_config(config["generator_config"])
from .t5 import T5Generator
from .wf import WordFlippingGenerator
from typing import List, Dict, Any
from ..models import DownloadableModel
class BaseGenerator(DownloadableModel):
def __call__(self, inp: str, variations: int = 4) -> List[str]:
raise NotImplementedError("Method not implemented yet.")
def set_config(self, config: Dict[str, Any]) -> None:
raise NotImplementedError("Method not implemented yet.")
from .base import BaseGenerator
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch
from typing import List, Dict
import nltk
class T5Generator(BaseGenerator):
def __init__(
self, config_path: str, root: str = None, download: bool = False
) -> None:
super().__init__(config_path, root, download)
tokenizer = T5Tokenizer.from_pretrained(self.config["model_config"])
model = T5ForConditionalGeneration.from_pretrained(self.config["model_config"])
state = torch.load(self.config["paths"]["model"])
model.load_state_dict(state)
self.tokenizer = tokenizer
self.model = model
def __call__(self, inp: str, variations: int) -> List[str]:
# format input
inp = nltk.sent_tokenize(inp)
inp = ["contradict: " + sent for sent in inp]
# generate
sentence_sets = []
for sent in inp:
input_ids = self.tokenizer(sent, return_tensors="pt").input_ids
label_ids = self.model.generate(
input_ids, num_return_sequences=variations, num_beams=variations
)
sents = [
self.tokenizer.decode(label_id_row, skip_special_tokens=True)
for label_id_row in label_ids
]
sentence_sets.append(sents)
paras = []
for sentences in zip(*sentence_sets):
para = " ".join(sentences)
paras.append(para)
return paras
def set_config(self, config: Dict) -> None:
print("WARN: Nothing to do")
import numpy as np
import random
import nltk
from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
from collections import Counter
from typing import List, Tuple, Union, Dict, Any
import yaml
from .base import BaseGenerator
class WordFlippingGenerator(BaseGenerator):
"""
Randomly flips words with defined POS tags to their antonyms
"""
# This uses the following algorithm
# Note: the word "flip" will be used to define the operation of getting an antonym of a specific word
# 1. User should define the POS tags relevant to the words that must be flipped.
# 2. The user should invoke the functionality by specifying an original sentence and the number of variations they want
# 3. The algorithm will tokenize the words
# 4. The algorithm will generate a mask list corresponding to these tokens by referring to the POS tags previously defined by the user.
# A truth value in this mask will represent that the word must be flipped and false will mean otherwise
# 5. The algorithm will generate a set of lists of antonyms for the words to be flipped by referring to the mask above
# These lists will have words ordered in the descending order of the probabilities of their occurrence
# 6. New sentences will be generated by merging the original words and antonyms appropriately
# Throughout the implementation, following conventions will be used
# 1. Number of variations defined by the user: n
# 2. Number of flipping words in a given phrase: m
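# A minimal usage sketch (the config keys follow __init__/set_config below; the concrete
# values and tags are illustrative only):
#   gen = WordFlippingGenerator({
#       "name": "wf-cf-generator",
#       "flip_prob": 0.9,
#       "sample_prob_decay_factor": 0.5,
#       "flipping_tags": ["JJ", "JJR", "JJS", "RB"],
#   })
#   gen("The movie was good", variations=2)  # e.g. ["The movie was bad", ...]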
def __init__(self, config: str) -> None:
# read the config file
if type(config) == str:
with open(config) as handler:
config = yaml.load(handler, yaml.FullLoader)
sample_prob_decay_factor = config["sample_prob_decay_factor"]
flip_prob = config["flip_prob"]
flipping_tags = config["flipping_tags"]
name = config["name"]
# validate the values
if (flip_prob < 0) or (flip_prob > 1):
raise ValueError("'flip_prob' should be in range [0,1]")
if sample_prob_decay_factor < 0:
raise ValueError("'sample_prob_decay_factor' must be a positive value")
if len(flipping_tags) == 0:
raise ValueError("'flipping_tags' array should be longer than 0")
self.name = name
self.flip_prob = flip_prob
self.sample_prob_decay_factor = sample_prob_decay_factor
self.flipping_tags = flipping_tags
def _get_antonyms(self, word: str) -> List[str]:
"""
Returns a set of antonyms of a given word.
These words are ordered in the descending order of the probability of their occurrence
"""
antonym_count = Counter()
synsets = wordnet.synsets(word)
for syn in synsets:
for lemma in syn.lemmas():
if lemma.antonyms():
for ant in lemma.antonyms():
antonym_count[ant.name()] += 1
sorted_antonyms = [ant for ant, _ in antonym_count.most_common()]
sorted_antonyms = [ant.replace("_", " ") for ant in sorted_antonyms]  # e.g., "is_not" -> "is not"
return sorted_antonyms
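# e.g., _get_antonyms("happy") would typically return something like ["unhappy"];
# the exact list depends on WordNet's coverage of the word.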
def _get_flip_mask(self, tag: str) -> bool:
"""
Refers to a lookup table of tags and returns if the current tag corresponds to a
flipping word or not
"""
can_flip = tag in self.flipping_tags
return can_flip
def _sample_list(self, word_list: List[str], m: int) -> List[str]:
"""
Samples words from the given word list randomly m times and returns them
This sampling is done by referring to an exponentially decaying distribution
Arguments:
1. word_list: List[str]: The word list in the decreasing probabilistic order
2. m: int: The number of times the sampling must be done
Returns
1. m number of sampled words
"""
num_words = len(word_list)
probabilities = [
np.exp(-self.sample_prob_decay_factor * i) for i in range(num_words)
]
total_prob = sum(probabilities)
normalized_probs = [p / total_prob for p in probabilities]
sampled_indices = [
np.random.choice(num_words, p=normalized_probs) for i in range(m)
]
sampled_words = [word_list[idx] for idx in sampled_indices]
return sampled_words
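# Worked example: with sample_prob_decay_factor = 1.0 and a 3-word list, the raw
# weights are [e^0, e^-1, e^-2] ≈ [1.0, 0.37, 0.14]; after normalisation the first
# (most probable) antonym is drawn with probability ≈ 0.66 on every sample.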
def _find_nth_true_index(self, boolean_list: List[bool], n: int) -> int:
"""
Finds the position of the nth truth in a given boolean array
"""
count = 0
for index, value in enumerate(boolean_list):
if value:
count += 1
if count == n:
return index
return -1
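# e.g., _find_nth_true_index([False, True, True], 2) == 2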
def _clean_masks(
self, masks: List[bool], flipped_words: List[List[str]]
) -> Tuple[List[bool], List[List[str]]]:
"""
Cleans the mask list
When the antonym lists are generated, some of them might be empty.
In such cases, the relevant word will not be flipped.
This method will remove such empty lists and then update the mask list.
"""
new_masks = masks.copy()
new_flipped_words = []
for i, words in enumerate(flipped_words):
if len(words) > 0:
new_flipped_words.append(words)
else:
mask_i = self._find_nth_true_index(masks, i + 1)
new_masks[mask_i] = False
return (new_masks, new_flipped_words)
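# Worked example: masks = [True, False, True] with flipped_words = [[], ["bad"]]
# (no antonyms found for the first flip candidate) returns
# ([False, False, True], [["bad"]]), so the first candidate word is kept as-is.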
def _merge_words(
self,
originals: List[str],
opposites: List[List[str]], # has the shape (m, z), z ∈ {non-negative integers}
masks: List[bool],
variations: int, # n
) -> List[str]:
"""
Merges original words and antonyms appropriately to create sentences
"""
opposites = [
self._sample_list(single_list, variations) for single_list in opposites
] # has the shape (m, n)
weights = (self.flip_prob, 1 - self.flip_prob)
new_sentences = []
# iterate parallely over all the opposite word lists
for opposite in zip(*opposites):
opposite = list(opposite)
new_sent = []
for i, mask in enumerate(masks):
if mask:
word = opposite.pop(0)
word = random.choices((word, originals[i]), weights=weights, k=1)[0]
else:
word = originals[i]
new_sent.append(word)
new_sent = " ".join(new_sent)
new_sentences.append(new_sent)
return new_sentences
def __call__(self, inp: str, variations: int = 4) -> List[str]:
words = word_tokenize(inp)
word_tags = pos_tag(words)
masks = [self._get_flip_mask(tag) for (word, tag) in word_tags]
flipping_words = [word for (word, mask) in zip(words, masks) if mask]
flipped_words = [self._get_antonyms(word) for word in flipping_words]
masks, flipped_words = self._clean_masks(masks, flipped_words)
opposite_sentences = self._merge_words(words, flipped_words, masks, variations)
return opposite_sentences
def describe_tags(self) -> None:
nltk.help.upenn_tagset()
def set_tags(self, flipping_tags: List[str]) -> None:
if len(flipping_tags) == 0:
raise ValueError("'flipping_tags' array should be longer than 0")
self.flipping_tags = flipping_tags
def set_flipping_prob(self, flip_prob: float) -> None:
if (flip_prob < 0) or (flip_prob > 1):
raise ValueError("'flip_prob' should be in range [0,1]")
self.flip_prob = flip_prob
def set_sample_prob_decay_factor(self, sample_prob_decay_factor: float) -> None:
if sample_prob_decay_factor < 0:
raise ValueError("'sample_prob_decay_factor' must be a positive value")
self.sample_prob_decay_factor = sample_prob_decay_factor
def set_config(self, config: Dict[str, Any]) -> None:
required_keys = ["sample_prob_decay_factor", "flip_prob", "flipping_tags"]
if any([val not in config.keys() for val in required_keys]):
raise ValueError(
f"Invalid configuration file. The keys {required_keys} are required in the configuration."
)
sample_prob_decay_factor = config["sample_prob_decay_factor"]
flip_prob = config["flip_prob"]
flipping_tags = config["flipping_tags"]
# validate the values
if (flip_prob < 0) or (flip_prob > 1):
raise ValueError("'flip_prob' should be in range [0,1]")
if sample_prob_decay_factor < 0:
raise ValueError("'sample_prob_decay_factor' must be a positive value")
if len(flipping_tags) == 0:
raise ValueError("'flipping_tags' array should be longer than 0")
self.flip_prob = flip_prob
self.sample_prob_decay_factor = sample_prob_decay_factor
self.flipping_tags = flipping_tags
import os, shutil
import yaml
import joblib
import pandas as pd
import numpy as np
import wget
from typing import List, Dict
from sklearn.feature_extraction.text import TfidfVectorizer
from torch.utils.data import Dataset
from transformers import T5Tokenizer
from torch import Tensor
from typing import Tuple
try:
from .processors import Preprocessor, LUTLabelEncoder
except ImportError:
from processors import Preprocessor, LUTLabelEncoder
class BaseDataset(Dataset):
def __init__(
self, config_path: str, root: str = None, download: bool = False
) -> None:
with open(config_path) as handler:
config = yaml.load(handler, Loader=yaml.FullLoader)
self.name = config["name"]
if root is None:
self.root = os.path.abspath(os.path.split(config_path)[0])
else:
self.root = os.path.abspath(root)
self.source_url = config["source_url"]
self.config = config
self._parse_paths()
if not self._validate_dataset():
if download:
self._download_n_extract()
else:
raise FileNotFoundError(
f"Dataset files not found in {self.root}. Use 'download=True' to download from source"
)
def _parse_paths(self) -> None:
for k, v in self.config["paths"].items():
self.config["paths"][k] = os.path.join(self.root, v)
def _validate_dataset(self) -> bool:
paths = self.config["paths"].values()
paths_exist = [os.path.exists(path) for path in paths]
valid = all(paths_exist)
return valid
def _download_n_extract(self) -> List[str]:
os.makedirs(self.root, exist_ok=True)
download_path = os.path.join(self.root, os.path.split(self.source_url)[1])
print(f"Downloading from source ({self.source_url}) to {download_path}")
download_path = wget.download(self.source_url, download_path)
shutil.unpack_archive(download_path, self.root)
class IMDBDataset(BaseDataset):
valid_splits = ["train", "val", "test"]
def _encode_n_split(self):
# encode
x = self.processed_data["review"]
y = self.processed_data["sentiment"]
print("Encoding")
feature_names = self.input_encoder.get_feature_names_out()
x = self.input_encoder.transform(x)
y = self.label_encoder.transform(y)
# split
ds_size = len(y)
start_idx = 0
split_ids = {"train": start_idx}
end_idx = int(ds_size * self.config["split"]["train"])
x_train = x[start_idx:end_idx]
y_train = y[start_idx:end_idx]
start_idx = int(ds_size * self.config["split"]["train"])
split_ids["val"] = start_idx
end_idx = start_idx + int(ds_size * self.config["split"]["val"])
x_val = x[start_idx:end_idx]
y_val = y[start_idx:end_idx]
start_idx = int(
ds_size * (self.config["split"]["train"] + self.config["split"]["val"])
)
split_ids["test"] = start_idx
end_idx = -1
x_test = x[start_idx:end_idx]
y_test = y[start_idx:end_idx]
self.x_train = x_train
self.y_train = y_train
self.x_val = x_val
self.y_val = y_val
self.x_test = x_test
self.y_test = y_test
self.feature_names = feature_names
self.split_ids = split_ids
def _fit_input_encoder(self):
print("Fitting encoder")
x = self.processed_data.review
self.input_encoder.fit(x)
def get_reviews(self, split: str, ids: np.ndarray) -> List[Dict[str, str]]:
ids = ids + self.split_ids[split]
originals = []
for id in ids:
originals.append(
{
"original": self.original_data.review[id],
"preprocessed": self.processed_data.review[id],
}
)
return originals
def __init__(
self,
config_path: str,
root: str = None,
download: bool = False,
vectorizer_fitted: bool = True,
):
print("Creating dataset")
super().__init__(config_path, root, download)
self.config["extras"]["input_encoder_path"] = os.path.join(
self.root, self.config["extras"]["input_encoder_path"]
)
# initialize
print("Initializing objects")
self.preprocessor = Preprocessor()
self.label_encoder = LUTLabelEncoder(self.config["labels"])
if vectorizer_fitted:
self.input_encoder = joblib.load(
self.config["extras"]["input_encoder_path"]
)
else:
self.input_encoder = TfidfVectorizer(min_df=self.config["extras"]["min_df"])
csv_path = self.config["paths"]["data"]
self.original_data = pd.read_csv(csv_path)
preproc_csv_path = csv_path.replace(".csv", "-preproc.csv")
# check for preprocessed data.
if os.path.exists(preproc_csv_path):
processed_data = pd.read_csv(preproc_csv_path)
else:
processed_data = pd.read_csv(csv_path)
print("Preprocessing")
processed_data.review = processed_data.review.apply(self.preprocessor)
processed_data.to_csv(preproc_csv_path, index=False)
self.processed_data = processed_data
if not vectorizer_fitted:
self._fit_input_encoder()
self._encode_n_split()
print("Dataset created")
def set_split(self, split: str) -> None:
if split not in self.valid_splits:
raise ValueError("Unsupported split definition")
if split == "train":
self.active_x = self.x_train.toarray()
self.active_y = self.y_train
elif split == "val":
self.active_x = self.x_val.toarray()
self.active_y = self.y_val
else:
self.active_x = self.x_test.toarray()
self.active_y = self.y_test
def __len__(self):
try:
y = self.active_y
return len(y)
except:
raise RuntimeError("Please set the split to use this method")
def __getitem__(self, idx: int) -> Tuple:
try:
x = self.active_x[idx]
y = self.active_y[idx]
return x, y
except:
raise RuntimeError("Please set the split to use this method")
class CFGenerativeDataset(BaseDataset):
supported_splits = ["train", "val", "test"]
def __init__(
self,
config_path: str,
root: str = None,
download: bool = False,
split: str = "train",
) -> None:
super().__init__(config_path, root, download)
if split not in self.supported_splits:
raise ValueError("Unsupported split")
tokenizer = T5Tokenizer.from_pretrained(self.config["model_name"])
text_data = pd.read_csv(self.config["paths"][split])
sent1_inp_ids = tokenizer(
list("contradict: " + text_data.sentence1),
return_tensors="pt",
padding="max_length",
truncation=True,
max_length=self.config["max_token_len"],
).input_ids
sent2_inp_ids = tokenizer(
list(text_data.sentence2),
return_tensors="pt",
padding="max_length",
truncation=True,
max_length=self.config["max_token_len"],
).input_ids
data = list(zip(sent1_inp_ids, sent2_inp_ids))
self.tokenizer = tokenizer
self.text_data = text_data
self.data = data
self.len = len(data)
def __len__(self) -> int:
return self.len
def __getitem__(self, idx) -> List[Tensor]:
return self.data[idx]
# class ParaphraseNMTDataset(BaseDataset):
# def __init__(
# self,
# config_path: str,
# root: str = None,
# download: bool = False,
# config_overrides: Dict = {},
# ):
# super().__init__(config_path, root, download)
# dataframe = pd.read_csv(self.config["paths"]["data"])
# tgt_col_nm = self.config["columns"]["tgt"]
# src_col_nm = self.config["columns"]["src"]
# tokenizer_name = (
# config_overrides["tokenizer"]
# if "tokenizer" in config_overrides
# else self.config["default_params"]["tokenizer"]
# )
# source_len = (
# config_overrides["source_len"]
# if "source_len" in config_overrides
# else self.config["default_params"]["source_len"]
# )
# target_len = (
# config_overrides["target_len"]
# if "target_len" in config_overrides
# else self.config["default_params"]["target_len"]
# )
# tokenizer = T5Tokenizer.from_pretrained(tokenizer_name)
# self.tokenizer = tokenizer
# self.data = dataframe
# self.source_len = source_len
# self.summ_len = target_len
# self.target_text = self.data[tgt_col_nm]
# self.source_text = self.data[src_col_nm]
# def __len__(self):
# return len(self.target_text)
# def __getitem__(self, index):
# source_text = str(self.source_text[index])
# target_text = str(self.target_text[index])
# # cleaning data so as to ensure data is in string type
# source_text = " ".join(source_text.split())
# target_text = " ".join(target_text.split())
# source = self.tokenizer.batch_encode_plus(
# [source_text],
# max_length=self.source_len,
# pad_to_max_length=True,
# truncation=True,
# padding="max_length",
# return_tensors="pt",
# )
# target = self.tokenizer.batch_encode_plus(
# [target_text],
# max_length=self.summ_len,
# pad_to_max_length=True,
# truncation=True,
# padding="max_length",
# return_tensors="pt",
# )
# source_ids = source["input_ids"].squeeze()
# source_mask = source["attention_mask"].squeeze()
# target_ids = target["input_ids"].squeeze()
# target_mask = target["attention_mask"].squeeze()
# return {
# "source_ids": source_ids.to(dtype=long),
# "source_mask": source_mask.to(dtype=long),
# "target_ids": target_ids.to(dtype=long),
# "target_ids_y": target_ids.to(dtype=long),
# }
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
import wget
import os, shutil
import yaml
import joblib
from typing import List, Tuple, Dict
try:
from .processors import Preprocessor, LUTLabelEncoder
except ImportError:
from processors import Preprocessor, LUTLabelEncoder
class BaseAnalysisModel:
def fit(self, x, y):
self.model.fit(x, y)
def save(self, save_path: str):
joblib.dump(self.model, save_path)
class RFModel(BaseAnalysisModel):
def __init__(self) -> None:
self.model = RandomForestClassifier(
bootstrap=False,
ccp_alpha=0.0,
class_weight=None,
criterion="gini",
max_depth=None,
max_features="sqrt",
max_leaf_nodes=None,
max_samples=None,
min_impurity_decrease=0.0,
min_samples_leaf=4,
min_samples_split=5,
min_weight_fraction_leaf=0.0,
n_estimators=100,
n_jobs=None,
oob_score=False,
random_state=None,
verbose=0,
warm_start=False,
)
class SVCModel(BaseAnalysisModel):
def __init__(self) -> None:
self.model = SVC(
C=10,
break_ties=False,
cache_size=200,
class_weight=None,
coef0=0.0,
decision_function_shape="ovr",
degree=3,
gamma=1,
kernel="rbf",
max_iter=-1,
probability=True,
random_state=None,
shrinking=True,
tol=0.001,
verbose=False,
)
class LRModel(BaseAnalysisModel):
def __init__(self) -> None:
self.model = LogisticRegression(
C=4.281332398719396,
class_weight=None,
dual=False,
fit_intercept=True,
intercept_scaling=1,
l1_ratio=None,
max_iter=100,
multi_class="auto",
n_jobs=None,
penalty="l2",
random_state=None,
solver="lbfgs",
tol=0.0001,
verbose=0,
warm_start=False,
)
class KNNModel(BaseAnalysisModel):
def __init__(self) -> None:
self.model = KNeighborsClassifier(
algorithm="auto",
leaf_size=30,
metric="minkowski",
metric_params=None,
n_jobs=None,
n_neighbors=90,
p=2,
weights="distance",
)
class AnalysisModelWrapper:
def __init__(self, config: Dict, model) -> None:
self.model = model
self.preprocessor = Preprocessor()
input_encoder_name = config["encoders"]["input_encoder_name"]
input_encoder_path = config["paths"][input_encoder_name]
self.input_encoder = joblib.load(input_encoder_path)
self.output_decoder = LUTLabelEncoder(config["encoders"]["output_labels"])
def __call__(self, txt_lst: List[str]) -> Tuple[List[float], List[str]]:
if type(txt_lst) == str:
txt_lst = [txt_lst]
txt_lst = [self.preprocessor(txt) for txt in txt_lst]
input_arr = self.input_encoder.transform(txt_lst)
prob = self.model.predict_proba(input_arr)
scores = prob[:, 1].tolist()
pred = prob.argmax(axis=1)
output = self.output_decoder.inverse_tranform(pred)
return scores, output
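# Minimal usage sketch (the model path and outputs are hypothetical):
#   wrapper = AnalysisModelWrapper(config, joblib.load("models/rf.pkl"))
#   scores, labels = wrapper(["What a great movie!"])  # e.g. ([0.83], ["Positive"])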
class DownloadableModel:
def __init__(
self, config_path: str, root: str = None, download: bool = False
) -> None:
with open(config_path) as handler:
config = yaml.load(handler, Loader=yaml.FullLoader)
self.name = config["name"]
if root is None:
self.root = os.path.abspath(os.path.split(config_path)[0])
else:
self.root = os.path.abspath(root)
self.source_url = config["source_url"]
self.config = config
self._parse_paths()
if not self._validate():
if download:
self._download_n_extract()
else:
raise FileNotFoundError(
f"Files not found in {self.root}. Use 'download=True' to download from source"
)
def _parse_paths(self) -> None:
for k, v in self.config["paths"].items():
self.config["paths"][k] = os.path.join(self.root, v)
def _validate(self) -> bool:
paths = self.config["paths"].values()
paths_exist = [os.path.exists(path) for path in paths]
valid = all(paths_exist)
return valid
def _download_n_extract(self) -> List[str]:
os.makedirs(self.root, exist_ok=True)
download_path = os.path.join(self.root, os.path.split(self.source_url)[1])
print(f"Downloading from source ({self.source_url}) to {download_path}")
download_path = wget.download(self.source_url, download_path)
shutil.unpack_archive(download_path, self.root)
class AnalysisModels(DownloadableModel):
def __init__(
self, config_path: str, root: str = None, download: bool = False
) -> None:
super().__init__(config_path, root, download)
self._load_models()
def _load_models(self) -> None:
for name in self.config["models"].keys():
path = self.config["paths"][name]
model = joblib.load(path)
model = AnalysisModelWrapper(self.config, model)
setattr(self, name, model)
def __str__(self) -> str:
return f"A collection of pretrained sklearn models.\nContains the models {list(self.config['models'].keys())}"
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import re
import nltk
from nltk.tokenize import word_tokenize, toktok
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
from nltk import pos_tag
from typing import List, Union
import joblib
from scipy.sparse._csr import csr_matrix
class LUTLabelEncoder:
def __init__(self, labels: List[str]) -> None:
self.lut = labels
def transform(self, df: pd.DataFrame) -> np.array:
enc_lbls = df.apply(lambda st: self.lut.index(st)).to_numpy()
return enc_lbls
def inverse_tranform(self, labels: List[int]) -> List[str]:
labels = [self.lut[lbl] for lbl in labels]
return labels
class Preprocessor:
def _strip_html(self, text):
soup = BeautifulSoup(text, "html.parser")
text = soup.get_text()
return text
def _remove_special_characters(self, text, remove_digits=True):
pattern = r"[^a-zA-Z0-9\s]"
text = re.sub(pattern, "", text)
return text
def _remove_stopwords(self, text, is_lower_case=False):
tokens = self.tokenizer.tokenize(text)
tokens = [token.strip() for token in tokens]
if is_lower_case:
filtered_tokens = [
token for token in tokens if token not in self.stop_words
]
else:
filtered_tokens = [
token for token in tokens if token.lower() not in self.stop_words
]
filtered_text = " ".join(filtered_tokens)
return filtered_text
def _get_wordnet_pos(self, treebank_tag):
if treebank_tag.startswith("J"):
return "a" # Adjective
elif treebank_tag.startswith("V"):
return "v" # Verb
elif treebank_tag.startswith("N"):
return "n" # Noun
elif treebank_tag.startswith("R"):
return "r" # Adverb
else:
return "n" # Default to noun
def _lemmatize_text(self, text):
words = word_tokenize(text)
pos_tags = pos_tag(words) # Perform POS tagging
lemmatized_words = [
self.lemmatizer.lemmatize(word, pos=self._get_wordnet_pos(pos_tag))
for word, pos_tag in pos_tags
] # Lemmatize words with their respective POS tags
lemmatized_text = " ".join(lemmatized_words)
# cleaned_text = re.sub(
# r"\s*([.,!?:;])", r"\1", lemmatized_text
# ) # Apply regex to clean the text ("Hello world !" -> "Hello world!")
return lemmatized_text
def __init__(self):
# download the corpus
try:
nltk.download("wordnet", quiet=True)
nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)
nltk.download("tagsets", quiet=True)
except:
pass
# initialize
self.stop_words = set(stopwords.words("english"))
self.tokenizer = toktok.ToktokTokenizer()
self.lemmatizer = WordNetLemmatizer()
def __call__(self, txt: str) -> str:
processed_txt = txt.lower()
processed_txt = self._strip_html(processed_txt)
processed_txt = self._remove_special_characters(processed_txt)
processed_txt = self._remove_stopwords(processed_txt)
processed_txt = self._lemmatize_text(processed_txt)
return processed_txt
class TextVectorizer:
def __init__(self, tfidf_path: str) -> None:
self.preproc = Preprocessor()
self.vectorizer = joblib.load(tfidf_path)
def __call__(self, txts: Union[List[str], str]) -> csr_matrix:
if type(txts) == str:
txts = [txts]
preprocs = [self.preproc(txt) for txt in txts] # l sentences
vects = self.vectorizer.transform(preprocs) # matrix of shape (l, n)
return vects
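# Minimal usage sketch (the pickle path is hypothetical):
#   vectorizer = TextVectorizer("models/analysis-models/tfidf.pkl")
#   vects = vectorizer(["What a great movie!"])  # csr_matrix of shape (1, n)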
bs4
scikit-learn
nltk
wget
sentencepiece
transformers
accelerate>=0.20.3
\ No newline at end of file
import os
import joblib
from datasets import IMDBDataset
from models import SVCModel, RFModel, KNNModel, LRModel
if __name__ == "__main__":
# Directory variables
data_dir = os.environ["SM_CHANNEL_TRAIN"]
intermediate_data_dir = os.environ["SM_OUTPUT_INTERMEDIATE_DIR"]
model_output_dir = os.environ["SM_MODEL_DIR"]
output_data_dir = os.environ["SM_OUTPUT_DATA_DIR"]
ds_config_path = f"{data_dir}/imdb.yaml"
ds = IMDBDataset(ds_config_path, vectorizer_fitted=False)
print("Dataset instantiated")
rf_model = RFModel()
svc_model = SVCModel()
knn_model = KNNModel()
lr_model = LRModel()
print("Models instantiated")
rf_model.fit(ds.x_train, ds.y_train)
print("RF completed")
svc_model.fit(ds.x_train, ds.y_train)
print("SVC completed")
knn_model.fit(ds.x_train, ds.y_train)
print("KNN completed")
lr_model.fit(ds.x_train, ds.y_train)
print("LR completed")
rf_model.save(f"{model_output_dir}/rf.pkl")
svc_model.save(f"{model_output_dir}/svm.pkl")
knn_model.save(f"{model_output_dir}/knn.pkl")
lr_model.save(f"{model_output_dir}/lr.pkl")
joblib.dump(ds.input_encoder, f"{model_output_dir}/tfidf.pkl")
print("Models saved")
from train.t5 import fit
from datasets import CFGenerativeDataset
from torch.utils.data import DataLoader
from transformers import T5ForConditionalGeneration
import os
if __name__ == "__main__":
# Directory variables
data_dir = os.environ["SM_CHANNEL_TRAIN"]
intermediate_data_dir = os.environ["SM_OUTPUT_INTERMEDIATE_DIR"]
model_output_dir = os.environ["SM_MODEL_DIR"]
output_data_dir = os.environ["SM_OUTPUT_DATA_DIR"]
BATCH_SIZE = 4
EPOCHS = 100
PATIENCE = 20
SAVE_DIR = model_output_dir
MODEL_NAME = "t5-small"
train_ds = CFGenerativeDataset(f"{data_dir}/snli_1.0_contra.yaml", split="train")
val_ds = CFGenerativeDataset(f"{data_dir}/snli_1.0_contra.yaml", split="val")
train_dl = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=32)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
fit(
train_dl,
val_dl,
model,
epochs=EPOCHS,
patience=PATIENCE,
save_dir=SAVE_DIR,
)
import argparse
import yaml
import boto3
import pandas as pd
import os
import wget
import shutil
from typing import Dict
def download_n_extract(dir: str) -> pd.DataFrame:
url = "https://nlp.stanford.edu/projects/snli/snli_1.0.zip"
down_path = wget.download(url, f"{dir}/snli_1.0.zip")
shutil.unpack_archive(down_path, dir)
return f"{dir}/snli_1.0"
def get_sentence_pairs(dir: str) -> pd.DataFrame:
dfs = [
{"path": os.path.join(dir, fname), "name": fname.split(".")[0]}
for fname in os.listdir(dir)
if fname.endswith(".txt") and fname != "README.txt"
]
dfs = [{**df, "df": pd.read_csv(df["path"], delimiter="\t")} for df in dfs]
sentence_pairs = pd.DataFrame({"sentence1": [], "sentence2": []})
for df_obj in dfs:
df = df_obj["df"]
df = df[df.gold_label == "contradiction"]
df = df[["sentence1", "sentence2"]]
sentence_pairs = pd.concat([sentence_pairs, df], ignore_index=True)
sentence_pairs.dropna(inplace=True)
return sentence_pairs
def split(
df: pd.DataFrame, train_f: float = 0.8, val_f: float = 0.1, test_f: float = 0.1
) -> Dict[str, pd.DataFrame]:
assert train_f + val_f + test_f == 1.0
dfs = {}
sz = len(df)
start_idx = 0
end_idx = int(sz * train_f)
dfs["train"] = df[start_idx:end_idx].reset_index(drop=True)
start_idx = end_idx
end_idx = start_idx + int(sz * val_f)
dfs["val"] = df[start_idx:end_idx].reset_index(drop=True)
start_idx = end_idx
dfs["test"] = df[start_idx:].reset_index(drop=True)
return dfs
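# e.g., with 1,000 sentence pairs and the default fractions this yields
# 800 train / 100 val / 100 test rows (indices 0-799, 800-899, 900-999).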
def save(splitted_dfs: Dict[str, pd.DataFrame], dir: str) -> None:
splits = ["train", "val", "test"]
for split in splits:
df = splitted_dfs[split]
path = f"{dir}/snli_1.0_contra_{split}.csv"
df.to_csv(path, index=False)
def make_ds_config(dir: str) -> None:
datasets_key = "datasets"
ds_name = os.path.split(dir)[1]
paths = os.listdir(dir)
paths = {path.split("_")[-1].split(".")[0]: path for path in paths}
source_url = (
f"https://sliit-xai.s3.ap-south-1.amazonaws.com/{datasets_key}/{ds_name}.zip"
)
model_name = "t5-small"
max_token_len = 64
config = {
"name": ds_name,
"source_url": source_url,
"paths": paths,
"model_name": model_name,
"max_token_len": max_token_len,
}
with open(f"{dir}/{ds_name}.yaml", "w") as handler:
yaml.dump(config, handler)
def upload_to_s3(dir: str) -> None:
s3_bucket = "sliit-xai"
datasets_key = "datasets"
ds_name = os.path.split(dir)[1]
s3 = boto3.client("s3")
# Upload files
for root, _, filenames in os.walk(dir):
for filename in filenames:
src_path = os.path.join(root, filename)
dst_key = src_path.replace(dir, f"{datasets_key}/{ds_name}", 1)
s3.upload_file(src_path, s3_bucket, dst_key)
# Upload archive
arch_path = shutil.make_archive(ds_name, "zip", dir)
s3.upload_file(arch_path, s3_bucket, f"{datasets_key}/{ds_name}.zip")
os.remove(arch_path)
def main(dir: str) -> None:
if os.path.exists(dir):
shutil.rmtree(dir)
os.makedirs(dir)
print("Downloading....")
extract_dir = download_n_extract(dir)
print("Processing....")
sentence_pairs = get_sentence_pairs(extract_dir)
splitted_dfs = split(sentence_pairs)
shutil.rmtree(dir)
os.makedirs(dir)
print("Exporting....")
save(splitted_dfs, dir)
make_ds_config(dir)
print("Uploading....")
upload_to_s3(dir)
print("Done!")
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("-d", "--directory", required=True)
args = parser.parse_args()
dir = args.directory
main(dir)
from .analyzers import KNNAnalyzer, SVMAnalyzer, RFAnalyzer, LRAnalyzer
import os
from typing import Dict, List
import yaml
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import metrics
from scipy.sparse._csr import csr_matrix
class TestBench:
valid_analyzers = ["knn", "svm", "lr", "rf"]
def __init__(
self,
model_path: str,
vectorizer_path: str,
analyzer_name: str,
**kwargs,
) -> None:
# validate
if analyzer_name not in self.valid_analyzers:
raise ValueError(
f"Unsupported analyzer definition. Supported analyzer names are {self.valid_analyzers}"
)
if analyzer_name == "knn":
analyzer = KNNAnalyzer(
knn_path=model_path,
vectorizer_path=vectorizer_path,
**kwargs,
)
elif analyzer_name == "svm":
analyzer = SVMAnalyzer(
svm_path=model_path,
vectorizer_path=vectorizer_path,
**kwargs,
)
elif analyzer_name == "rf":
analyzer = RFAnalyzer(
model_path=model_path,
vectorizer_path=vectorizer_path,
**kwargs,
)
elif analyzer_name == "lr":
analyzer = LRAnalyzer(
model_path=model_path,
vectorizer_path=vectorizer_path,
**kwargs,
)
self._analyzer = analyzer
self._analyzer_name = analyzer_name
def __call__(
self,
configurations: Dict,
text: str,
variations: int = None,
log_dir: str = None,
) -> List[str]:
reports = []
for i, config in enumerate(configurations):
config_name = config["name"]
self._analyzer.set_config(config)
self._analyzer(text, variations)
report = self._analyzer.explanation()
if log_dir is not None:
config_log_dir = os.path.join(log_dir, f"{i+1}-config-{config_name}")
os.makedirs(config_log_dir, exist_ok=True)
with open(os.path.join(config_log_dir, "config.yaml"), "w") as handler:
yaml.dump(config, handler)
with open(os.path.join(config_log_dir, "report.txt"), "w") as handler:
handler.write(report)
report = f"==== Configuration {config_name} ({i+1}) ====\n" + report
reports.append(report)
return reports
def evaluate(self, x: csr_matrix, y: csr_matrix, save_dir: str = None) -> None:
model = self._analyzer._model
model_name = self._analyzer_name.upper()
prob = model.predict_proba(x)
pred = prob.argmax(axis=1)
report = metrics.classification_report(y, pred)
fig, ax = plt.subplots(1, 2, figsize=(8, 4))
plot_names = ["Confusion Matrix", "Label Correlogram"]
for i, name in enumerate(plot_names):
axis = ax[i]
axis.set_title(name)
if i == 0:
# confusion matrix
confusion_matrix = metrics.confusion_matrix(y, pred)
cm_display = metrics.ConfusionMatrixDisplay(
confusion_matrix=confusion_matrix,
display_labels=["Negative", "Positive"],
)
cm_display.plot(ax=axis)
else:
# label correlogram
correlation_matrix = pd.DataFrame(prob).corr().to_numpy()
cm_display = metrics.ConfusionMatrixDisplay(
confusion_matrix=correlation_matrix,
display_labels=["Negative", "Positive"],
)
cm_display.plot(ax=axis, cmap="coolwarm")
plt.tight_layout()
report = f" ---- Classification report for {model_name} ----\n{report}\n"
if save_dir is None:
print(report)
plt.show()
else:
os.makedirs(save_dir, exist_ok=True)
with open(os.path.join(save_dir, "evaluation.txt"), "w") as handler:
handler.write(report)
plt.savefig(os.path.join(save_dir, "evaluation.jpg"))
plt.close()
import torch
from torch.utils.data import DataLoader
from torch.optim import Adam
from transformers import T5ForConditionalGeneration
from tqdm.auto import tqdm
import os
def train_loop(
model: T5ForConditionalGeneration,
train_dl: DataLoader,
optimizer: Adam,
device: str,
) -> float:
avg_loss = 0
for X, y in tqdm(train_dl):
X, y = X.to(device), y.to(device)
loss = model(input_ids=X, labels=y).loss
avg_loss += loss.item()
optimizer.zero_grad()
loss.backward()
optimizer.step()
avg_loss /= len(train_dl)
return avg_loss
def val_loop(
model: T5ForConditionalGeneration, val_dl: DataLoader, device: str
) -> float:
avg_loss = 0
with torch.no_grad():
for X, y in tqdm(val_dl):
X, y = X.to(device), y.to(device)
loss = model(input_ids=X, labels=y).loss
avg_loss += loss.item()
avg_loss /= len(val_dl)
return avg_loss
def save(
model,
optimizer,
best_epoch,
best_loss,
best_model_weights,
best_optimizer_weights,
epoch,
final_loss,
save_dir,
):
save_obj = {
"best": {
"epoch": best_epoch,
"loss": best_loss,
"model": best_model_weights,
"optimizer": best_optimizer_weights,
},
"final": {
"epoch": epoch,
"loss": final_loss,
"model": model.state_dict(),
"optimizer": optimizer.state_dict(),
},
}
torch.save(save_obj, f"{save_dir}/states.tar")
def fit(
train_dl: DataLoader,
val_dl: DataLoader,
model: T5ForConditionalGeneration,
epochs: int = 100,
patience: int = 10,
save_dir: str = ".",
):
assert os.path.exists(save_dir), f"Save directory ({save_dir}) not found"
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
model = model.to(device)
for param in model.encoder.parameters():
param.requires_grad = False
optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()))
# fine-tune the model (the encoder parameters are frozen above)
best_epoch = -1
best_loss = float("inf")
best_model_weights = model.state_dict()
best_optimizer_weights = optimizer.state_dict()
for epoch in range(epochs):
print(f"----- Epoch {str(epoch+1).rjust(len(str(epochs)), '0')}/{epochs} -----")
train_loss = train_loop(model, train_dl, optimizer, device)
val_loss = val_loop(model, val_dl, device)
# update the best states
if val_loss < best_loss:
best_loss = val_loss
best_model_weights = model.state_dict()
best_optimizer_weights = optimizer.state_dict()
best_epoch = epoch
# save checkpoint
save(
model,
optimizer,
best_epoch,
best_loss,
best_model_weights,
best_optimizer_weights,
epoch,
val_loss,
save_dir,
)
# check for overfitting
if best_epoch + patience < epoch:
print(
f"Stoping early at epoch {epoch+1}, best loss ({best_loss}) observed at epcoh {best_epoch+1}"
)
break
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "UyNc81DoHzOU"
},
"source": [
"# Generate Counterfactuals by Distance"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "EJjH3xcvHzOY"
},
"outputs": [],
"source": [
"import numpy as np\n",
"from src.datasets import IMDBDataset\n",
"from src.models import AnalysisModels\n",
"from src.analyzers.svm import SVMDistanceAnalyzer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "C0A-GJtyHzOZ",
"outputId": "afb57bcf-3879-4f4d-c245-0783c8c41747"
},
"outputs": [],
"source": [
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"models = AnalysisModels(config_path=\"./configs/models/analysis-models.yaml\", root=\"models/analysis-models\")\n",
"model = models.svm.model\n",
"analyzer = SVMDistanceAnalyzer(model, ds, \"datasets/imdb/svm_buffer.json\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "i7HzsoGOHzOa",
"outputId": "b99d72cc-64b4-4bb7-c373-ed48fcc9e406"
},
"outputs": [],
"source": [
"import json\n",
"print(json.dumps(analyzer.get_counterfactual_examples([\n",
" \"I would like to remind that this movie was advertised as a real-life story. But what is this?. A waste of my good money!\",\n",
" \"This is the best movie I had watched so far. The marvelous CGI and super story line successfully kept the eyes of the audience fixed.\"\n",
"]), indent=4))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "rYvdVcodHzOb"
},
"source": [
"# Generate Counterfactuals by Opposite Neighbourhood"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "XH3KTuE_HzOb"
},
"source": [
"## SVM Theory\n",
"What SVM does\n",
"$$\n",
"\\boxed{\"prompt\"}\\rightarrow\\boxed{Vector_{TFIDF}\\ (i.e., x)}\\rightarrow\\boxed{Vector_{SVM}\\ (i.e., \\phi(x))}\\\\\n",
"x\\in R^m,\\ \\phi(x)\\in R^n.\n",
"$$\n",
"Note that here $m$ is the number of dimensions in the vector space of the TFIDF Vectorizer and $n$ is the number of dimensions in the vector space learnt by the SVM. $\\phi(x)$ is known as the kernel function. Hence, the vector space learnt by the SVM is commonly known as the **output vector space of the kernel function**.\n",
"\n",
"Once the SVM learns the vector space of $\\phi(x)$, it finds the best hyperplane that satisfies $w^T.\\phi(x)+b=0$. Note that $w$ are the coefficients with size $n$. This equation can be expanded as $w_1\\phi_1(x)+w_2\\phi_2(x)+w_3\\phi_3(x)...+w_n\\phi_n(x)+b$\n",
"\n",
"## Our method\n",
"We will be using the following method to generate counter factuals for a given $prompt$.\n",
"1. Generate a set of contradictory prompts for the given prompt.\n",
"2. Vectorize all the prompts using the TFIDF vetorizer into the vector space $X$.\n",
"3. Project all the vectors into the SVM's kernel space $K$.\n",
"4. Find the mirror point of the given prompt's TFIDF vector on the hyperplane of the SVM ($C$).\n",
"5. Out of the vectors of the contradictory prompts, find the closest point to $C$. Then the prompt corresponding to this point will be returned\n",
"\n",
"### 1. Contradictory prompt generation\n",
"\n",
"We will generate contradictory prompts for a given prompt $prompt_0$ using a finetuned T5 model/ custom WordFlippingGenerator. These new prompts will be $[contradictory\\_prompt_i]$\n",
"\n",
"\n",
"### 2. Vectorize into $X$\n",
"$$\n",
"\\boxed{\"prompt_0\"}\\rightarrow\\boxed{Vector_{TFIDF}\\ (i.e., x_0)}\n",
"$$\n",
"The $prompt_0$ will be mapped to $x_0$ from the TFIFT vectorizer. The $[contradictory\\_prompt_i]$ s will be mapped to $[x_{c,i}]$\n",
"\n",
"### 3. Project into $K$\n",
"SVM is already learnt. i.e., we know the kernel function ($\\phi(.)$), coefficients ($w$), and the bias ($b$). Hence, we will project $x_0$ and $[x_{c,i}]$ into $K$\n",
"\n",
"$$\n",
"\\boxed{Vector_{TFIDF}\\ (i.e., x_0)}\\rightarrow\\boxed{Vector_{SVM}\\ (i.e., \\phi(x_0)\\in K)}\n",
"$$\n",
"We will call this $\\phi(x_0)$ as $A$ for simplicity\n",
"\n",
"##### $\\phi(.)$ when the kernel is RBF\n",
"Assume that a single TFIDF vector will have the size $n$ and the number of support vectors will be $m$. The RBF kernel is given by\n",
"$$\n",
"K( \\overrightarrow{x}, \\overrightarrow{l^m})=e^{-\\gamma{||\\overrightarrow{x}-\\overrightarrow{l^m}||}^2}\n",
"$$\n",
"Here $x$ is a vector in the TFIDF vector space with size $m$. $l^m$ is a collection of $m$ vectors (i.e., the support vectors) with each of size $n$.\n",
"\n",
"### 4. Find $A$'s mirror point ($C$)\n",
"Once we have $\\phi(x_0)$ for the given prompt, we find its opposite projection on the hyperplane of the SVM characterised by $w$ and $b$. For simplicity, we'll call $\\phi(x_0)$ as $A$ and $hyperplane$ as $h$.\n",
"$$\n",
"hyperplane=h\\equiv(w_1, w_2,...w_n, b) \\\\\n",
"\\phi(x_0)=A=(a_1,a_2,...a_n)\n",
"$$\n",
"Any line $l$ which is normal to the $h$ through $A$ will be given by the parametric equation\n",
"$$\n",
"l\\equiv A+tw=0\\\\\n",
"l\\equiv (a_1+tw_1, a_2+tw_2,...,a_n+tw_n)\n",
"$$\n",
"Let $t$ take the value $t_0$ at the point $B$ that lies on this line and the hyperplane.\n",
"$$\n",
"B\\equiv (a_1+t_0w_1, a_2+t_0w_2,...,a_n+t_0w_n)\n",
"$$\n",
"Since this point would also satisfy the hyperplane,\n",
"$$\n",
"w^T.l_0+b=0 \\\\\n",
"(w_1(a_1+t_0w_1), w_2(a_2+t_0w_2),...,w_n(a_n+t_0w_n))+b=0\\\\\n",
"t_0=-\\frac{(b+w^T.A)}{||w||_2^2}\n",
"$$\n",
"The mirror point $C$ will exist where $t=2t_0$. Hence,\n",
"$$\n",
"C\\equiv (a_1+2t_0w_1, a_2+2t_0w_2,...,a_n+2t_0w_n)\n",
"$$\n",
"\n",
"##### Example\n",
"\n",
"Reflection of point $A(3,1,2)$ on the hyperplane $x+2y+z=1$ (Note that here, $(3,1,2)=(a_1,a_2,a_3)$, $(1,2,1)=(w_1,w_2,w_3)$, and $-1=b$)\n",
"1. Construct the line normal to the plane that intersects point $A(3,1,2)$:\n",
" $$\n",
" line (x,y,z)=(3,1,2)+t(1,2,1)\n",
" $$\n",
" (Any line normal to the plane $x+2y+z=c$ will move in the direction $(1,2,1)$)\n",
"\n",
"2. Find the point B on the normal line that intersects the plane:\n",
" $$\n",
" (3+t)+2(1+2t)+(2+t)=1\n",
" $$\n",
"\n",
" Solving, we get $t=-1$. Hence the intersection point $B$ (on the plane) is at $(2,-1,1)$.\n",
"\n",
"3. Point $A$ is at $t=0$, and point $B$ is at $t=-1$, so the mirror image of $A$, say $\\hat{A}$ will be twice the distance, at $t=-2$:\n",
" $$\n",
" \\boxed{ \\hat{A} \\equiv (1,-3,0)}\n",
" $$\n",
"\n",
"### 5. Find the closest point to C and retreive the contradictory prompt\n",
"Now that the mirror point and contradictory points are all in the kerrnel space, the distance to the contradictory points from the point $C$ will be found. The prompt will be selected as the one which yields the smallest distance to the point $C$.\n"
]
},
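{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Quick numerical check of the worked example above (a minimal sketch, independent of the SVM code):\n",
"# reflect A(3,1,2) across the hyperplane x + 2y + z = 1, i.e. w = (1, 2, 1), b = -1.\n",
"import numpy as np\n",
"\n",
"w, b = np.array([1.0, 2.0, 1.0]), -1.0\n",
"A = np.array([3.0, 1.0, 2.0])\n",
"t0 = -(b + w.dot(A)) / (np.linalg.norm(w) ** 2)  # parameter value at the foot point B\n",
"A_hat = A + 2 * t0 * w  # mirror point\n",
"print(t0, A_hat)  # expected: -1.0 and [ 1. -3.  0.]"
]
},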
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tests"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "liq_5eWNHzOb",
"outputId": "d05f17cc-1946-492b-daaf-92557bc717fc"
},
"outputs": [],
"source": [
"import numpy as np\n",
"from src.models import AnalysisModels\n",
"from src.datasets import IMDBDataset\n",
"from tqdm.auto import tqdm\n",
"\n",
"models = AnalysisModels(\"./configs/models/analysis-models.yaml\", \"./models/analysis-models\")\n",
"dataset = IMDBDataset(\"./configs/datasets/imdb.yaml\", \"./datasets/imdb/\")\n",
"\n",
"svc_rbf = models.svm.model\n",
"x = dataset.x_test.toarray()\n",
"p = np.random.randn(x.shape[1])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "55XDlY_MHzOc"
},
"source": [
"### Linear kernel"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "rNSyZMDVHzOe",
"outputId": "5ab4ee92-a2d9-4d75-80bc-9bf41369b9b8"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Your code here (with the provided function)\n",
"from sklearn.svm import SVC\n",
"import numpy as np\n",
"\n",
"X_train = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])\n",
"y_train = np.array([0, 0, 1, 1])\n",
"\n",
"svm = SVC(kernel='linear', degree=3)\n",
"svm.fit(X_train, y_train)\n",
"\n",
"def get_mirror_point_linear(svm, qp):\n",
" w = svm.coef_[0]\n",
" b = svm.intercept_[0]\n",
" t = -(b+w.dot(qp))/(np.linalg.norm(w)**2)\n",
" mp = qp + 2*t*w\n",
" return mp\n",
"\n",
"query_point = np.array([13, 5])\n",
"mirror_point = get_mirror_point_linear(svm, query_point)\n",
"\n",
"print(f\"Query point: {query_point}, Distance: {svm.decision_function([query_point])}\")\n",
"print(f\"Mirror point: {mirror_point}, Distance: {svm.decision_function([mirror_point])}\")\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "hwMPqhByHzOe"
},
"source": [
"### Polynomial Kernel"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "lvqutm_9HzOe",
"outputId": "3425d936-793f-41b8-8822-cb9045319f7c"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Your code here (with the provided function)\n",
"from sklearn.svm import SVC\n",
"import numpy as np\n",
"\n",
"X_train = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])\n",
"y_train = np.array([0, 0, 1, 1])\n",
"\n",
"# Linear kernel SVM\n",
"svm_linear = SVC(kernel='linear', degree=3)\n",
"svm_linear.fit(X_train, y_train)\n",
"\n",
"# Polynomial kernel SVM\n",
"svm_poly = SVC(kernel='poly', degree=3)\n",
"svm_poly.fit(X_train, y_train)\n",
"\n",
"def get_k_poly(svm, x):\n",
" svs = svm.support_vectors_\n",
" d = svm.degree\n",
" gamma = svm._gamma\n",
" print(svs.shape, x.shape)\n",
" k = (gamma * np.dot(svs, x.T) + 1) ** d\n",
" return k\n",
"\n",
"def distance_to_hyperplane(svm_model, point):\n",
" # Get the support vectors, coefficients, and bias from the trained SVM model\n",
" support_vectors = svm_model.support_vectors_\n",
" dual_coefficients = svm_model.dual_coef_[0]\n",
" bias = svm_model.intercept_\n",
"\n",
" # Get the kernel coefficient and degree from the trained SVM model\n",
" gamma = svm_model._gamma\n",
" degree = svm_model.degree\n",
"\n",
" # Compute the distance to the hyperplane\n",
" distance = 0.0\n",
" for i in range(len(support_vectors)):\n",
" # Calculate the polynomial kernel between the support vector and the given point\n",
" kernel_value = (gamma * np.dot(support_vectors[i], point) + 1) ** degree\n",
"\n",
" # Update the distance using the support vector and kernel value\n",
" distance += dual_coefficients[i] * kernel_value\n",
"\n",
" # Add the bias term to the distance\n",
" distance += bias\n",
"\n",
" return distance\n",
"\n",
"p = np.array([\n",
" [-8,2],\n",
" [-3,3],\n",
" [-2,4],\n",
" [-1,5]\n",
"])\n",
"# svm_poly.decision_function(p), distance_to_hyperplane(svm_poly, p[0])\n",
"get_k_poly(svm_poly, p).shape\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "KfOXK5toHzOf"
},
"source": [
"### RBF Kernel Analysis"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "eHYR85aqHzOf"
},
"source": [
"#### Implementation"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "ijGzCIE_HzOf"
},
"source": [
"##### Method 1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mhZRjBr4HzOf"
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"from sklearn.datasets import make_circles\n",
"from sklearn.svm import SVC\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"X, y = make_circles(n_samples=500, noise=0.06, random_state=42)\n",
"\n",
"df = pd.DataFrame(dict(x1=X[:, 0], x2=X[:, 1], y=y))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LzkSp168HzOf",
"outputId": "3ef68dd3-8ff6-4078-d0df-6f3d7b763311"
},
"outputs": [],
"source": [
"colors = {0:'blue', 1:'yellow'}\n",
"fig, ax = plt.subplots()\n",
"grouped = df.groupby('y')\n",
"for key, group in grouped:\n",
" group.plot(ax=ax, kind='scatter', x='x1', y='x2', label=key, color = colors[key])\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XoYE52mcHzOg"
},
"outputs": [],
"source": [
"def RBF(X, gamma):\n",
"\n",
" # Free parameter gamma\n",
" if gamma == None:\n",
" gamma = 1.0/X.shape[1]\n",
"\n",
" # RBF kernel Equation\n",
" K = np.exp(-gamma * np.sum((X - X[:,np.newaxis])**2, axis = -1))\n",
"\n",
" return K"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "x30blrlKHzOg"
},
"outputs": [],
"source": [
"clf_rbf = SVC(kernel=\"rbf\")\n",
"clf_rbf.fit(X, y)\n",
"K = RBF(X, gamma=clf_rbf._gamma)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "GOzY_G2WHzOg",
"outputId": "947e6722-de29-4745-db34-e893657e2738"
},
"outputs": [],
"source": [
"clf = SVC(kernel=\"linear\")\n",
"\n",
"clf.fit(K, y)\n",
"\n",
"pred = clf.predict(K)\n",
"\n",
"print(\"Accuracy: \",accuracy_score(pred, y))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "Y5n0Ksg8HzOg"
},
"source": [
"##### Method 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1ycJT7iHHzOg",
"outputId": "3145d339-be04-4014-db1a-e77d7df34e81"
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"from sklearn.svm import SVC\n",
"import numpy as np\n",
"\n",
"X_train = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])\n",
"y_train = np.array([0, 0, 1, 1])\n",
"\n",
"svm = SVC(kernel='rbf', gamma='scale', degree=3)\n",
"svm.fit(X_train, y_train)\n",
"\n",
"def get_distance_rbf(svm, qp):\n",
" # Calculate the distance from the query point to each support vector\n",
" sv = svm.support_vectors_\n",
" distances = np.linalg.norm(sv - qp, axis=1)\n",
"\n",
" # Use the decision function to get the weight (distance) for each support vector\n",
" decision_values = svm.decision_function([qp])[0]\n",
" weights = np.exp(-svm._gamma * (distances ** 2)) * decision_values\n",
"\n",
" # Calculate the weighted average of the support vectors to obtain the approximate mirror point\n",
" weighted_average = np.average(sv, axis=0, weights=weights)\n",
"\n",
" return weighted_average\n",
"\n",
"query_point = np.array([1, 5])\n",
"mirror_point = get_distance_rbf(svm, query_point)\n",
"\n",
"print(f\"Query point: {query_point}, Distance: {svm.decision_function([query_point])}\")\n",
"print(f\"Mirror point: {mirror_point}, Distance: {svm.decision_function([mirror_point])}\")\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "P58vL9l-HzOg"
},
"source": [
"##### Method 3"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "k1zjwQxYHzOh",
"outputId": "90ee7d90-abac-43cd-fc42-dc452f3da68e"
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.svm import SVC\n",
"\n",
"def rbf_kernel(x, y, gamma):\n",
" return np.exp(-gamma * np.linalg.norm(x - y) ** 2)\n",
"\n",
"def distance_to_hyperplane(clf, x_i):\n",
" # Get support vectors and dual coefficients\n",
" support_vectors = clf.support_vectors_\n",
" dual_coefficients = clf.dual_coef_.ravel()\n",
" gamma = clf._gamma\n",
"\n",
" # Compute the decision function value\n",
" decision_function_value = 0\n",
" for i in range(len(support_vectors)):\n",
" decision_function_value += dual_coefficients[i] * rbf_kernel(support_vectors[i], x_i, gamma)\n",
"\n",
" decision_function_value += clf.intercept_\n",
"\n",
" # Compute the distance\n",
" norm_w = np.linalg.norm(clf.dual_coef_ @ support_vectors)\n",
" distance = decision_function_value / norm_w\n",
" return distance\n",
"\n",
"X_train = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])\n",
"y_train = np.array([0, 0, 1, 1])\n",
"x_i = [-1,1]\n",
"\n",
"gamma = 0.1\n",
"clf = SVC(kernel='rbf', gamma=gamma)\n",
"clf.fit(X_train, y_train)\n",
"\n",
"distance_to_x_i = distance_to_hyperplane(clf, x_i)\n",
"distance_to_x_i, clf.decision_function([x_i])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "xobI4RCaHzOh"
},
"source": [
"### RBF Kernel"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"referenced_widgets": [
"79a306f1966c4dc4bd12a14f802fbe03"
]
},
"id": "t0x3ISf4HzOh",
"outputId": "ef70837b-ffa5-49af-fcdf-3d74edf5f67b"
},
"outputs": [],
"source": [
"def split_matrix(mat, max_sz):\n",
" num_rows = mat.shape[0]\n",
" num_splits = (num_rows - 1) // max_sz + 1\n",
" split_matrices = []\n",
"\n",
" for i in range(num_splits):\n",
" start_idx = i * max_sz\n",
" end_idx = min((i + 1) * max_sz, num_rows)\n",
" split_matrices.append(mat[start_idx:end_idx])\n",
"\n",
" return tuple(split_matrices)\n",
"\n",
"def calc_dif_norms(l_mat, s_mat, axis=1, max_sz=1000, show_prog=False):\n",
" mat_tup = split_matrix(l_mat, max_sz)\n",
" norms = []\n",
"\n",
" if show_prog: print(\"Calculating norms...\")\n",
" for m in tqdm(mat_tup, disable=not show_prog):\n",
" n_m = np.expand_dims(m, axis=1)\n",
" norm_batch = np.linalg.norm(n_m-s_mat, axis=1)\n",
" norms.extend(norm_batch)\n",
" norms = np.array(norms)\n",
" return norms\n",
"\n",
"def rbf(x, model):\n",
" gamma = model._gamma\n",
" svs = model.support_vectors_.toarray()\n",
" norms = calc_dif_norms(svs, x, show_prog=True)\n",
" k = np.exp(-gamma*norms)\n",
" return k\n",
"\n",
"\n",
"k = rbf(x[:2], svc_rbf)\n",
"k.shape"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Final Implementation"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Counterfactual Generator: T5"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from src.analyzers import SVMAnalyzer\n",
"analyzer = SVMAnalyzer(\n",
" svm_path=\"./models/analysis-models/svm.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config_path=\"./configs/models/t5-cf-generator.yaml\",\n",
" cf_generator_root=\"./models/cf-generator\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"review = \"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me. The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"search_space = 2\n",
"cf = analyzer(review, search_space)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"explanation = analyzer.explanation()\n",
"print(explanation)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Counterfactual Generator: WordFlipping"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Predefined configuration"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me. The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"\n",
"Generated contradictory texts : \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll differ undercharge . They are right , as this differ exactly what dematerialize with me . The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show repel no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that differ the nickname take to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy differ not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ unmake n't mess around . The first episode I ever saw miss me as so nasty it differ surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be buy out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being unturned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can end in touch with your darker side .\n",
"\tOne of the other reviewers abstain mentioned that after watching just 1 Oz episode you 'll be hooked . They differ right , as this differ exactly what dematerialise with me . The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname starve to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells abstain glass fronts and face inwards , so privacy is not high on the agenda . Em City is away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it be born where other shows would n't dare . Forget pretty pictures unpainted for mainstream audiences , forget charm , remember romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it differ surreal , I could n't say I differ ready for it , but as I watched more , I undeveloped a taste for Oz , and take away accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can leave in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll differ unhook . They differ right , as this differ exactly what dematerialize with me . The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show push no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , remember romance ... OZ unmake n't mess around . The first episode I ever saw miss me as so nasty it differ surreal , I could n't say I differ ready for it , but as I watched more , I developed a taste for Oz , and got unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be buy out for a nickel , inmates who 'll kill on order and take away away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what dematerialise with me . The first thing that miss me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show push no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that differ the nickname starve to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy differ not high on the agenda . Em City differ home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements differ never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , mind charm , mind romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it differ surreal , I could n't say I differ ready for it , but as I watched more , I undeveloped a taste for Oz , and end unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be unsold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what differ uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be unhook . They differ right , as this differ exactly what happened with me . The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells abstain glass fronts and face inwards , so privacy is not high on the agenda . Em City differ away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it stop where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , mind charm , remember romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it differ surreal , I could n't say I was ready for it , but as I watched more , I undeveloped a taste for Oz , and leave accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be buy out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates nonexistence unturned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what differ uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\n",
"Distances to the mirror point : \n",
"\t0.9996843344271062\n",
"\t0.999649545631532\n",
"\t0.9997204943525393\n",
"\t0.9996656959955381\n",
"\t0.9997478679173792\n",
"\n",
"Closest contradictory text : One of the other reviewers abstain mentioned that after watching just 1 Oz episode you 'll be hooked . They differ right , as this differ exactly what dematerialise with me . The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which rise in right from the word GO . Trust me , this differ not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname starve to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells abstain glass fronts and face inwards , so privacy is not high on the agenda . Em City is away to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it be born where other shows would n't dare . Forget pretty pictures unpainted for mainstream audiences , forget charm , remember romance ... OZ unmake n't mess around . The first episode I ever saw struck me as so nasty it differ surreal , I could n't say I differ ready for it , but as I watched more , I undeveloped a taste for Oz , and take away accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can leave in touch with your darker side .\n",
"\n"
]
}
],
"source": [
"from src.analyzers import SVMAnalyzer\n",
"analyzer = SVMAnalyzer(\n",
" svm_path=\"./models/analysis-models/svm.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" cf_generator_config=\"./configs/models/wf-cf-generator.yaml\"\n",
")\n",
"review = \"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me. The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"search_space = 5\n",
"cf = analyzer(review, search_space)\n",
"explanation = analyzer.explanation()\n",
"print(explanation)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Test bench"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[nltk_data] Error loading wordnet: <urlopen error [SSL:\n",
"[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:\n",
"[nltk_data] Hostname mismatch, certificate is not valid for\n",
"[nltk_data] 'raw.githubusercontent.com'. (_ssl.c:1129)>\n",
"[nltk_data] Error loading punkt: <urlopen error [SSL:\n",
"[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:\n",
"[nltk_data] Hostname mismatch, certificate is not valid for\n",
"[nltk_data] 'raw.githubusercontent.com'. (_ssl.c:1129)>\n",
"[nltk_data] Error loading stopwords: <urlopen error [SSL:\n",
"[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:\n",
"[nltk_data] Hostname mismatch, certificate is not valid for\n",
"[nltk_data] 'raw.githubusercontent.com'. (_ssl.c:1129)>\n",
"[nltk_data] Error loading averaged_perceptron_tagger: <urlopen error\n",
"[nltk_data] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify\n",
"[nltk_data] failed: Hostname mismatch, certificate is not valid\n",
"[nltk_data] for 'raw.githubusercontent.com'. (_ssl.c:1129)>\n",
"[nltk_data] Error loading tagsets: <urlopen error [SSL:\n",
"[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed:\n",
"[nltk_data] Hostname mismatch, certificate is not valid for\n",
"[nltk_data] 'raw.githubusercontent.com'. (_ssl.c:1129)>\n"
]
}
],
"source": [
"from src.test_bench import TestBench\n",
"\n",
"configurations = [\n",
" {\n",
" \"name\": \"adjectives\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"JJ\", \"JJR\", \"JJS\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"nouns\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"NN\", \"NNP\", \"NNPS\", \"NNS\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"adverbs\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"RB\", \"RBR\", \"RBS\", \"RP\"],\n",
" },\n",
" },\n",
" {\n",
" \"name\": \"verbs\",\n",
" \"generator_config\": {\n",
" \"sample_prob_decay_factor\": 0.2,\n",
" \"flip_prob\": 0.5,\n",
" \"flipping_tags\": [\"VB\", \"VBD\", \"VBG\", \"VBN\", \"VBP\", \"VBZ\"],\n",
" },\n",
" },\n",
"]\n",
"text=\"One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\"\n",
"\n",
"tb = TestBench(\n",
" model_path=\"./models/analysis-models/svm.pkl\",\n",
" vectorizer_path=\"./models/analysis-models/tfidf.pkl\",\n",
" analyzer_name=\"svm\",\n",
" cf_generator_config=\"./configs/models/wf-cf-generator.yaml\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"reports = tb(configurations, text, 2)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==== Configuration adjectives (1) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"\n",
"Generated contradictory texts : \n",
"\tOne of the same reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are improperly , as this is exactly what happened with me. < br / > < br / > The last thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or confident . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to few .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and less .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the dependent appeal of the show is due to the fact that it goes where other shows would n't dare . Forget immoderately pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nice it was surreal , I could n't say I was unready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , late class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is comfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the same reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are falsify , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or bold . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and fewer .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the dependent appeal of the show is undue to the fact that it goes where same shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The middle episode I ever saw struck me as so nice it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the low levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches undue to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is comfortable viewing .... thats if you can get in touch with your darker side .\n",
"\n",
"Distances to the mirror point : \n",
"\t0.9999217045458116\n",
"\t0.9998999295788169\n",
"\n",
"Closest contradictory text : One of the same reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are falsify , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or bold . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and fewer .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the dependent appeal of the show is undue to the fact that it goes where same shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The middle episode I ever saw struck me as so nice it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the low levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates being turned into prison bitches undue to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is comfortable viewing .... thats if you can get in touch with your darker side .\n",
"\n",
"\n",
"==== Configuration nouns (2) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"\n",
"Generated contradictory texts : \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in wrongly from the word come . distrust me , this is not a show for the faint hearted or timid . This show pulls no punches with disesteem to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald minimal Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass back and back outward , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , birth stares , dodgy dealings and shady disagreement are never far away. < br / > < br / > I would say the main repel of the disprove is due to the fact that it goes where other hide would n't dare . mind pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high raise of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , outpatient who 'll kill on disorder and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison inexperience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker top .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in left from the word stay in place . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with inattentiveness to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass back and back outward , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main repel of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , outpatient who 'll kill on deregulate and get away with it , well mannered , middle class inmates being turned into prison bitches due to their have of street skills or prison inexperience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your light bottom .\n",
"\n",
"Distances to the mirror point : \n",
"\t0.999850010167032\n",
"\t0.9998984151053242\n",
"\n",
"Closest contradictory text : One of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in wrongly from the word come . distrust me , this is not a show for the faint hearted or timid . This show pulls no punches with disesteem to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald minimal Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass back and back outward , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , birth stares , dodgy dealings and shady disagreement are never far away. < br / > < br / > I would say the main repel of the disprove is due to the fact that it goes where other hide would n't dare . mind pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched more , I developed a taste for Oz , and got accustomed to the high raise of graphic violence . Not just violence , but injustice ( crooked guards who 'll be sold out for a nickel , outpatient who 'll kill on disorder and get away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison inexperience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker top .\n",
"\n",
"\n",
"==== Configuration adverbs (3) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"\n",
"Generated contradictory texts : \n",
"\tOne of the other reviewers has mentioned that after watching unfair 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are ever near away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I never saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched fewer , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not unfair violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get home with it , badly mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll be hooked . They are right , as this is imprecisely what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never near away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I never saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched less , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not unjust violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get home with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\n",
"Distances to the mirror point : \n",
"\t0.9999299795522538\n",
"\t0.9999347792427672\n",
"\n",
"Closest contradictory text : One of the other reviewers has mentioned that after watching unfair 1 Oz episode you 'll be hooked . They are right , as this is exactly what happened with me. < br / > < br / > The first thing that struck me about Oz was its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are ever near away. < br / > < br / > I would say the main appeal of the show is due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ does n't mess around . The first episode I never saw struck me as so nasty it was surreal , I could n't say I was ready for it , but as I watched fewer , I developed a taste for Oz , and got accustomed to the high levels of graphic violence . Not unfair violence , but injustice ( crooked guards who 'll be sold out for a nickel , inmates who 'll kill on order and get home with it , badly mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\n",
"\n",
"==== Configuration verbs (4) ====\n",
"\n",
"======== Analysis Report ========\n",
"\n",
"Input text : One of the other reviewers has mentioned that after watching just 1 Oz episode you'll be hooked. They are right, as this is exactly what happened with me.<br /><br />The first thing that struck me about Oz was its brutality and unflinching scenes of violence, which set in right from the word GO. Trust me, this is not a show for the faint hearted or timid. This show pulls no punches with regards to drugs, sex or violence. Its is hardcore, in the classic use of the word.<br /><br />It is called OZ as that is the nickname given to the Oswald Maximum Security State Penitentary. It focuses mainly on Emerald City, an experimental section of the prison where all the cells have glass fronts and face inwards, so privacy is not high on the agenda. Em City is home to many..Aryans, Muslims, gangstas, Latinos, Christians, Italians, Irish and more....so scuffles, death stares, dodgy dealings and shady agreements are never far away.<br /><br />I would say the main appeal of the show is due to the fact that it goes where other shows wouldn't dare. Forget pretty pictures painted for mainstream audiences, forget charm, forget romance...OZ doesn't mess around. The first episode I ever saw struck me as so nasty it was surreal, I couldn't say I was ready for it, but as I watched more, I developed a taste for Oz, and got accustomed to the high levels of graphic violence. Not just violence, but injustice (crooked guards who'll be sold out for a nickel, inmates who'll kill on order and get away with it, well mannered, middle class inmates being turned into prison bitches due to their lack of street skills or prison experience) Watching Oz, you may become comfortable with what is uncomfortable viewing....thats if you can get in touch with your darker side.\n",
"\n",
"Generated contradictory texts : \n",
"\tOne of the other reviewers has mentioned that after watching just 1 Oz episode you 'll differ hooked . They are right , as this differ exactly what dematerialize with me. < br / > < br / > The first thing that miss me about Oz differ its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its is hardcore , in the classic use of the word. < br / > < br / > It is called OZ as that differ the nickname given to the Oswald Maximum Security State Penitentary . It blur mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy differ not high on the agenda . Em City is home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it malfunction where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , forget charm , forget romance ... OZ unmake n't mess around . The first episode I ever saw miss me as so nasty it differ surreal , I could n't say I differ ready for it , but as I watched more , I developed a taste for Oz , and end unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be unsold out for a nickel , inmates who 'll kill on order and end away with it , well mannered , middle class inmates being turned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what is uncomfortable viewing .... thats if you can get in touch with your darker side .\n",
"\tOne of the other reviewers refuse mentioned that after watching just 1 Oz episode you 'll differ unhook . They are right , as this is exactly what dematerialize with me. < br / > < br / > The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that is the nickname starve to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City differ home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , remember charm , remember romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I differ ready for it , but as I watched more , I developed a taste for Oz , and end unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be unsold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates nonexistence unturned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what differ uncomfortable viewing .... thats if you can take away in touch with your darker side .\n",
"\n",
"Distances to the mirror point : \n",
"\t0.999715995308004\n",
"\t0.99969777110124\n",
"\n",
"Closest contradictory text : One of the other reviewers refuse mentioned that after watching just 1 Oz episode you 'll differ unhook . They are right , as this is exactly what dematerialize with me. < br / > < br / > The first thing that struck me about Oz differ its brutality and unflinching scenes of violence , which set in right from the word GO . Trust me , this is not a show for the faint hearted or timid . This show pulls no punches with regards to drugs , sex or violence . Its differ hardcore , in the classic use of the word. < br / > < br / > It differ called OZ as that is the nickname starve to the Oswald Maximum Security State Penitentary . It focuses mainly on Emerald City , an experimental section of the prison where all the cells have glass fronts and face inwards , so privacy is not high on the agenda . Em City differ home to many .. Aryans , Muslims , gangstas , Latinos , Christians , Italians , Irish and more .... so scuffles , death stares , dodgy dealings and shady agreements are never far away. < br / > < br / > I would say the main appeal of the show differ due to the fact that it goes where other shows would n't dare . Forget pretty pictures painted for mainstream audiences , remember charm , remember romance ... OZ does n't mess around . The first episode I ever saw struck me as so nasty it was surreal , I could n't say I differ ready for it , but as I watched more , I developed a taste for Oz , and end unaccustomed to the high levels of graphic violence . Not just violence , but injustice ( crooked guards who 'll be unsold out for a nickel , inmates who 'll kill on order and get away with it , well mannered , middle class inmates nonexistence unturned into prison bitches due to their lack of street skills or prison experience ) Watching Oz , you may become comfortable with what differ uncomfortable viewing .... thats if you can take away in touch with your darker side .\n",
"\n",
"\n"
]
}
],
"source": [
"for report in reports:\n",
" print(report)\n",
" print()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Creating dataset\n",
"Initializing objects\n",
"Encoding\n",
"Dataset created\n"
]
}
],
"source": [
"from src.datasets import IMDBDataset\n",
"\n",
"ds = IMDBDataset(config_path=\"./configs/datasets/imdb.yaml\", root=\"datasets/imdb\")\n",
"tb.evaluate(ds.x_test, ds.y_test, save_dir=\"evaluations/svm\")"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [
"UyNc81DoHzOU"
],
"provenance": []
},
"kernelspec": {
"display_name": "xai",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 0
}
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sagemaker.pytorch import PyTorch\n",
"from sagemaker.inputs import TrainingInput\n",
"\n",
"def train()->None:\n",
" estimator = PyTorch(\n",
" entry_point=f\"sagemaker_main.py\",\n",
" role=\"arn:aws:iam::065257926712:role/SagemakerRole\",\n",
" framework_version=\"2.0\",\n",
" py_version=\"py310\",\n",
" source_dir=\"src\",\n",
" output_path=f\"s3://sliit-xai/training-jobs/results\",\n",
" code_location=f\"s3://sliit-xai/training-jobs/code\",\n",
" instance_count=1,\n",
" instance_type=\"ml.c5.2xlarge\",\n",
" max_run=5 * 24 * 60 * 60\n",
" )\n",
" # Setting the input channels for tuning job\n",
" s3_input_train = TrainingInput(s3_data=\"s3://sliit-xai/datasets/imdb\", s3_data_type=\"S3Prefix\")\n",
"\n",
" # Start job\n",
" estimator.fit(inputs={\"train\": s3_input_train})\n",
"\n",
"train()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "xai-env",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}