Publication:
An ensemble approach for multi-label classification of item click sequences

Date

2015

Publisher

Association for Computing Machinery, Inc

Abstract

In this paper, we describe our approach to the RecSys 2015 Challenge problem. Given a dataset of item click sessions, the problem is to predict whether a session results in a purchase and, if so, which items are purchased. We define a simpler analogous problem: given an item and its session, we predict the probability that the item is purchased. For each session, the predictions yield a set of purchased items, which is often empty. We apply monthly time windows over the dataset. For each item in a session, we engineer features regarding the session, the item properties, and the time window. Then, a balanced random forest classifier is trained to make predictions on the test set. The dataset is particularly challenging due to the privacy-preserving definition of a session, the class imbalance problem, and the volume of data. We report our findings with respect to feature engineering, the choice of sampling schemes, and classifier ensembles. Experimental results, together with the benefits and shortcomings of the proposed approach, are discussed. The solution is efficient and practical on commodity computers. © 2017 Elsevier B.V., All rights reserved.
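
The reformulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 threshold, the `"YYYY-MM"` window key, and the simple undersampling scheme are all assumptions made for illustration, and the abstract does not specify the actual features or sampling schemes. The sketch shows three ideas: assigning a click to a monthly window, drawing a class-balanced training sample of the kind a balanced ensemble might use, and turning per-(session, item) purchase probabilities into one predicted item set per session.

```python
import random
from collections import defaultdict
from datetime import datetime


def month_window(ts):
    """Map a click timestamp to a monthly window key such as '2014-04'.

    The key format is an assumption; the abstract only says that
    monthly time windows are applied.
    """
    return ts.strftime("%Y-%m")


def balanced_sample(rows, label_key, rng):
    """Undersample the larger class so both classes are equally represented.

    This is one simple balancing scheme (assumed here), not necessarily
    the sampling scheme used in the paper.
    """
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    n = min(len(pos), len(neg))
    return rng.sample(pos, n) + rng.sample(neg, n)


def sessions_to_predictions(scores, threshold=0.5):
    """Convert per-(session, item) purchase probabilities into item sets.

    `scores` maps (session_id, item_id) to a predicted probability.
    Sessions in which no item reaches the (hypothetical) threshold
    map to an empty set, i.e. "no purchase".
    """
    purchased = defaultdict(set)
    for (session_id, item_id), prob in scores.items():
        items = purchased[session_id]  # creates an empty set if absent
        if prob >= threshold:
            items.add(item_id)
    return dict(purchased)
```

In this sketch, a session is predicted as a buying session exactly when its item set is non-empty, which mirrors the abstract's observation that the per-item predictions often produce an empty set.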
