RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods

Shay Vargaftik, Isaac Keslassy, Ariel Orda, Yaniv Ben-Itzhak

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

The capability to perform anomaly detection in a resource-constrained setting, such as an edge device or a loaded server, is increasingly needed due to emerging on-premises computation constraints as well as security, privacy, and profitability reasons. Yet, the increasing size of datasets often makes current anomaly detection methods, and decision-tree based ensemble classifiers in particular, too resource-consuming. To address this need, we present RADE, a new resource-efficient anomaly detection framework that augments standard decision-tree based ensemble classifiers to perform well in a resource-constrained setting. The key idea behind RADE is to first train a small model that suffices to correctly classify the majority of the queries. Then, using only subsets of the training data, RADE trains expert models for the fewer, harder cases in which the small model is at high risk of making a classification mistake. We implement RADE as a scikit-learn classifier. Our evaluation indicates that RADE offers competitive anomaly detection capabilities compared to standard methods while significantly improving memory footprint by up to 12×, training time by up to 20×, and classification time by up to 16×.
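The two-stage idea in the abstract — a small coarse model for easy queries plus expert models trained only on the hard cases — can be sketched roughly as follows. This is a hypothetical illustration using scikit-learn's RandomForestClassifier; the confidence threshold, model sizes, and routing rule are assumptions for exposition, not RADE's actual design (see the paper for that).

```python
# Hypothetical sketch of a coarse-model / expert-model cascade in the spirit
# of the abstract. Thresholds and model choices here are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Imbalanced toy data standing in for an anomaly detection task.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# 1. Train a small "coarse" model on all of the data.
coarse = RandomForestClassifier(n_estimators=5, max_depth=4,
                                random_state=0).fit(X, y)

# 2. Identify training samples the coarse model is unsure about.
confidence = coarse.predict_proba(X).max(axis=1)
hard = confidence < 0.9          # low-confidence "hard" samples
if not hard.any():
    hard[:] = True               # fallback: no hard samples found

# 3. Train a larger expert model on the hard subset only.
expert = RandomForestClassifier(n_estimators=50,
                                random_state=0).fit(X[hard], y[hard])

def predict(Xq, tau=0.9):
    """Answer with the coarse model; route low-confidence queries to the expert."""
    p = coarse.predict_proba(Xq)
    out = coarse.classes_[p.argmax(axis=1)]
    unsure = p.max(axis=1) < tau
    if unsure.any():
        out[unsure] = expert.predict(Xq[unsure])
    return out
```

Because most queries are answered by the tiny coarse model, classification cost stays low, while the expert model is both trained and queried only on the small hard subset — which is the source of the memory and time savings the abstract reports.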

Original language: English
Pages (from-to): 2835-2866
Number of pages: 32
Journal: Machine Learning
Volume: 110
Issue number: 10
DOIs
State: Published - Oct 2021

Keywords

  • Anomaly detection
  • Decision-tree based ensemble methods
  • Fast machine learning
  • Resource efficient machine learning
  • Supervised learning

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
