Metrics

Elliot provides 36 evaluation metrics, partitioned into seven families: Accuracy, Rating-Error, Coverage, Novelty, Diversity, Bias, and Fairness. Notably, Elliot exposes the largest number of metrics among comparable frameworks and is the only one that includes bias and fairness measures. Moreover, the user can choose any metric to drive model selection and hyperparameter tuning.

All the metrics inherit from a common abstract class:

base_metric.BaseMetric(recommendations, …)

This abstract class defines the interface shared by every metric: each concrete metric is constructed from the recommendation lists (together with the evaluation configuration) and implements its own evaluation logic on top of them.
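
To make this contract concrete, the following is a minimal sketch of a hypothetical user-defined metric that subclasses BaseMetric. The constructor arguments mirror the signatures shown in the listing below, while the name()/eval() hooks and the attribute names used in the body (_recommendations, _relevant_items) are assumptions about the interface, not the verbatim API.

from elliot.evaluation.metrics.base_metric import BaseMetric  # module path assumed from the listing


class CustomHitRate(BaseMetric):
    """Hypothetical custom metric; a sketch, not the framework's actual code."""

    def __init__(self, recommendations, config, params, eval_objects):
        super().__init__(recommendations, config, params, eval_objects)
        self._cutoff = 10  # fixed top-k cutoff for this sketch

    @staticmethod
    def name():
        # Identifier under which the metric could be referenced in a configuration file.
        return "CustomHitRate"

    def eval(self):
        # Fraction of users with at least one relevant item in their top-k list.
        hits = []
        for user, ranked_items in self._recommendations.items():   # assumed attribute
            relevant = self._relevant_items.get(user, set())        # assumed attribute
            top_k = [item for item, _ in ranked_items[: self._cutoff]]
            hits.append(1.0 if any(item in relevant for item in top_k) else 0.0)
        return sum(hits) / len(hits) if hits else 0.0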

  • Accuracy

accuracy.AUC.auc.AUC(recommendations, …)

Area Under the Curve

accuracy.AUC.gauc.GAUC(recommendations, …)

Group Area Under the Curve

accuracy.AUC.lauc.LAUC(recommendations, …)

Limited Area Under the Curve

accuracy.DSC.dsc.DSC(recommendations, …)

Sørensen–Dice coefficient

accuracy.f1.f1.F1(recommendations, config, …)

F-Measure

accuracy.f1.extended_f1.ExtendedF1(…)

Extended F-Measure

accuracy.hit_rate.hit_rate.HR(…)

Hit Rate

accuracy.map.map.MAP(recommendations, …)

Mean Average Precision

accuracy.mar.mar.MAR(recommendations, …)

Mean Average Recall

accuracy.mrr.mrr.MRR(recommendations, …)

Mean Reciprocal Rank

accuracy.ndcg.ndcg.nDCG(recommendations, …)

normalized Discounted Cumulative Gain

accuracy.ndcg.ndcg_rendle2020.nDCGRendle2020(…)

normalized Discounted Cumulative Gain (Rendle 2020 formulation)
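
As a reference for this family, here is a minimal, framework-agnostic sketch of binary-relevance nDCG@k; it illustrates the quantity being measured and is not Elliot's implementation. Function and argument names are illustrative.

import math

def ndcg_at_k(ranked_items, relevant_items, k=10):
    # Discounted gain of the relevant items that appear in the top-k list.
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, item in enumerate(ranked_items[:k])
              if item in relevant_items)
    # Ideal DCG: all relevant items (up to k) ranked at the top.
    idcg = sum(1.0 / math.log2(rank + 2)
               for rank in range(min(len(relevant_items), k)))
    return dcg / idcg if idcg > 0 else 0.0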

  • Bias

bias.aclt.aclt.ACLT(recommendations, config, …)

Average coverage of long tail items

bias.aplt.aplt.APLT(recommendations, config, …)

Average percentage of long tail items

bias.arp.arp.ARP(recommendations, config, …)

Average Recommendation Popularity

bias.pop_reo.pop_reo.PopREO(recommendations, …)

Popularity-based Ranking-based Equal Opportunity

bias.pop_reo.extended_pop_reo.ExtendedPopREO(…)

Extended Popularity-based Ranking-based Equal Opportunity

bias.pop_rsp.pop_rsp.PopRSP(recommendations, …)

Popularity-based Ranking-based Statistical Parity

bias.pop_rsp.extended_pop_rsp.ExtendedPopRSP(…)

Extended Popularity-based Ranking-based Statistical Parity
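
To illustrate the popularity-bias family, below is a generic sketch of APLT (Average Percentage of Long-Tail items): for each user, the fraction of recommended items that fall in the long tail, averaged over users. The long-tail item set is assumed to be provided (e.g., items outside the short head of most popular items); this is an illustration, not Elliot's code.

def aplt(recommendations, long_tail_items):
    # recommendations: dict mapping each user to their recommended item list.
    per_user = [
        sum(1 for item in items if item in long_tail_items) / len(items)
        for items in recommendations.values()
        if items
    ]
    return sum(per_user) / len(per_user) if per_user else 0.0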

  • Coverage

coverage.item_coverage.item_coverage.ItemCoverage(…)

Item Coverage

coverage.num_retrieved.num_retrieved.NumRetrieved(…)

Number of Recommendations Retrieved

coverage.user_coverage.user_coverage.UserCoverage(…)

User Coverage

coverage.user_coverage.user_coverage_at_n.UserCoverageAtN(…)

User Coverage on Top-N recommendations
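
A sketch of the idea behind Item Coverage: the share of catalog items that appear in at least one recommendation list. The framework may report the raw count rather than the ratio; this is a generic illustration with illustrative names, not Elliot's code.

def item_coverage(recommendations, catalog):
    # Distinct items that were recommended to at least one user.
    recommended = {item for items in recommendations.values() for item in items}
    return len(recommended & set(catalog)) / len(catalog)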

  • Diversity

diversity.gini_index.gini_index.GiniIndex(…)

Gini Index

diversity.shannon_entropy.shannon_entropy.ShannonEntropy(…)

Shannon Entropy

diversity.SRecall.srecall.SRecall(…)

Subtopic Recall
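
For the diversity family, here is a generic sketch of the Gini index computed on the item-exposure distribution (how evenly recommendations are spread over items): 0 means perfectly even exposure, 1 means exposure concentrated on a single item. Elliot's implementation may differ, for instance by also counting catalog items that were never recommended; this is an illustration only.

from collections import Counter

def gini_index(recommendations):
    # Exposure count per item, sorted in ascending order.
    counts = sorted(Counter(item
                            for items in recommendations.values()
                            for item in items).values())
    n, total = len(counts), sum(counts)
    if total == 0:
        return 0.0
    # Standard Gini formula over the sorted counts.
    return sum((2 * (idx + 1) - n - 1) * c for idx, c in enumerate(counts)) / (n * total)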

  • Fairness

fairness.BiasDisparity.BiasDisparityBD(…)

Bias Disparity - Standard

fairness.BiasDisparity.BiasDisparityBR(…)

Bias Disparity - Bias Recommendations

fairness.BiasDisparity.BiasDisparityBS(…)

Bias Disparity - Bias Source

fairness.MAD.ItemMADranking.ItemMADranking(…)

Item MAD Ranking-based

fairness.MAD.ItemMADrating.ItemMADrating(…)

Item MAD Rating-based

fairness.MAD.UserMADranking.UserMADranking(…)

User MAD Ranking-based

fairness.MAD.UserMADrating.UserMADrating(…)

User MAD Rating-based

fairness.reo.reo.REO(recommendations, …)

Ranking-based Equal Opportunity

fairness.rsp.rsp.RSP(recommendations, …)

Ranking-based Statistical Parity
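
To make the MAD-style fairness metrics concrete, here is a generic sketch of a rating-based MAD across user groups: the mean absolute difference between the groups' average predicted scores. The actual metrics also cover item groups and ranking-based variants; the inputs and names here are illustrative assumptions, not Elliot's code.

from statistics import mean

def user_mad_rating(user_scores, user_group):
    # user_scores: dict user -> mean predicted score; user_group: dict user -> group label.
    groups = {}
    for user, score in user_scores.items():
        groups.setdefault(user_group[user], []).append(score)
    group_means = [mean(scores) for scores in groups.values()]
    # Mean absolute difference over all pairs of group means
    # (for two groups this is simply |m1 - m2|).
    pairs = [abs(a - b)
             for i, a in enumerate(group_means)
             for b in group_means[i + 1:]]
    return mean(pairs) if pairs else 0.0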

  • Novelty

novelty.EFD.efd.EFD(recommendations, config, …)

Expected Free Discovery (EFD)

novelty.EFD.extended_efd.ExtendedEFD(…)

Extended EFD

novelty.EPC.epc.EPC(recommendations, config, …)

Expected Popularity Complement (EPC)

novelty.EPC.extended_epc.ExtendedEPC(…)

Extended EPC
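
For the novelty family, a simplified sketch of the Expected Popularity Complement: the average, over recommended items, of 1 - p(seen | i), where p(seen | i) is the fraction of training users who interacted with item i. Variants of the metric also apply a rank discount and relevance weighting; this is a generic illustration, not Elliot's code.

def epc(recommendations, item_popularity, num_users, k=10):
    # item_popularity: dict item -> number of training users who interacted with it.
    novelty_scores = [
        1.0 - item_popularity.get(item, 0) / num_users
        for items in recommendations.values()
        for item in items[:k]
    ]
    return sum(novelty_scores) / len(novelty_scores) if novelty_scores else 0.0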

  • Rating

rating.mae.mae.MAE(recommendations, config, …)

Mean Absolute Error

rating.mse.mse.MSE(recommendations, config, …)

Mean Squared Error

rating.rmse.rmse.RMSE(recommendations, …)

Root Mean Squared Error
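
Finally, for the rating-error family, a minimal sketch of RMSE over (user, item) pairs; it assumes a prediction exists for every ground-truth pair and is a generic illustration, not Elliot's code.

import math

def rmse(predicted, ground_truth):
    # Both dicts are keyed by (user, item); only pairs with a ground-truth rating count.
    squared_errors = [(predicted[key] - rating) ** 2 for key, rating in ground_truth.items()]
    return math.sqrt(sum(squared_errors) / len(squared_errors))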