Publication: An ensemble approach for multi-label classification of item click sequences
| dc.contributor.author | Yağcı, A. Murat | |
| dc.contributor.author | Aytekin, Tevfik | |
| dc.contributor.author | Gürgen, Fikret S. | |
| dc.contributor.institution | Yağcı, A. Murat, Department of Computer Engineering, Boğaziçi Üniversitesi, Bebek, Turkey | |
| dc.contributor.institution | Aytekin, Tevfik, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey | |
| dc.contributor.institution | Gürgen, Fikret S., Department of Computer Engineering, Boğaziçi Üniversitesi, Bebek, Turkey | |
| dc.date.accessioned | 2025-10-05T16:30:25Z | |
| dc.date.issued | 2015 | |
| dc.description.abstract | In this paper, we describe our approach to the RecSys 2015 challenge problem. Given a dataset of item click sessions, the problem is to predict whether a session results in a purchase and which items are purchased if the answer is yes. We define a simpler analogous problem where, given an item and its session, we try to predict the probability of purchase for the given item. For each session, the predictions result in a set of purchased items or, often, an empty set. We apply monthly time windows over the dataset. For each item in a session, we engineer features regarding the session, the item properties, and the time window. Then, a balanced random forest classifier is trained to perform predictions on the test set. The dataset is particularly challenging due to the privacy-preserving definition of a session, the class imbalance problem, and the volume of data. We report our findings with respect to feature engineering, the choice of sampling schemes, and classifier ensembles. Experimental results together with benefits and shortcomings of the proposed approach are discussed. The solution is efficient and practical on commodity computers. © 2017 Elsevier B.V., All rights reserved. | |
| dc.description.sponsorship | YOOCHOOSE | |
| dc.identifier.conferenceName | International ACM Recommender Systems Challenge, RecSys 2015 | |
| dc.identifier.conferencePlace | Vienna | |
| dc.identifier.doi | 10.1145/2813448.2813516 | |
| dc.identifier.isbn | 9781450336659 | |
| dc.identifier.scopus | 2-s2.0-84960924553 | |
| dc.identifier.uri | https://doi.org/10.1145/2813448.2813516 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14719/12699 | |
| dc.language.iso | en | |
| dc.publisher | Association for Computing Machinery, Inc | |
| dc.subject.authorkeywords | Recommender Systems | |
| dc.subject.authorkeywords | Sequence Classification | |
| dc.subject.authorkeywords | Web Mining | |
| dc.subject.authorkeywords | Data Privacy | |
| dc.subject.authorkeywords | Decision Trees | |
| dc.subject.authorkeywords | Sales | |
| dc.subject.authorkeywords | Vehicle Routing | |
| dc.subject.authorkeywords | Classifier Ensembles | |
| dc.subject.authorkeywords | Ensemble Approaches | |
| dc.subject.authorkeywords | Feature Engineerings | |
| dc.subject.authorkeywords | Multi Label Classification | |
| dc.subject.authorkeywords | Privacy Preserving | |
| dc.subject.authorkeywords | Random Forest Classifier | |
| dc.subject.authorkeywords | Classification (of Information) | |
| dc.subject.indexkeywords | Data privacy | |
| dc.subject.indexkeywords | Decision trees | |
| dc.subject.indexkeywords | Recommender systems | |
| dc.subject.indexkeywords | Sales | |
| dc.subject.indexkeywords | Vehicle routing | |
| dc.subject.indexkeywords | Classifier ensembles | |
| dc.subject.indexkeywords | Ensemble approaches | |
| dc.subject.indexkeywords | Feature engineerings | |
| dc.subject.indexkeywords | Multi label classification | |
| dc.subject.indexkeywords | Privacy preserving | |
| dc.subject.indexkeywords | Random forest classifier | |
| dc.subject.indexkeywords | Sequence classification | |
| dc.subject.indexkeywords | Web mining | |
| dc.subject.indexkeywords | Classification (of information) | |
| dc.title | An ensemble approach for multi-label classification of item click sequences | |
| dc.type | Conference Paper | |
| dcterms.references | Ben-Shimon, David, RecSys challenge 2015 and the YOOCHOOSE dataset, pp. 357-358, (2015); Breiman, Leo, Random forests, Machine Learning, 45, 1, pp. 5-32, (2001); Chawla, Nitesh Vinay, SMOTE: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research, 16, pp. 321-357, (2002); Using Random Forest to Learn Imbalanced Data, (2004); Galar, Mikel, A review on ensembles for the class imbalance problem: Bagging-, boosting-, and hybrid-based approaches, IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and Reviews, 42, 4, pp. 463-484, (2012); Louppe, Gilles C., Understanding variable importances in Forests of randomized trees, Advances in Neural Information Processing Systems, (2013); Pedregosa, Fabián, Scikit-learn: Machine learning in Python, Journal of Machine Learning Research, 12, pp. 2825-2830, (2011); Data Mining and Knowledge Discovery Handbook, (2010) | |
| dspace.entity.type | Publication | |
| local.indexed.at | Scopus | |
| person.identifier.scopus-author-id | 35325932900 | |
| person.identifier.scopus-author-id | 35793449500 | |
| person.identifier.scopus-author-id | 6603953162 |
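
The reduction described in the abstract (one binary prediction per item in a session, aggregated into a possibly empty set of purchased items) can be illustrated with a minimal standard-library sketch. Everything named here is an assumption for illustration only: the feature names, the 0.5 threshold, the `build_rows`, `balanced_bootstrap`, and `predict_sessions` helpers, and the equal-size class sampling (one common way to balance the per-tree sample in a balanced random forest). The abstract does not specify the authors' actual features or sampling scheme.

```python
import random
from collections import Counter, defaultdict
from datetime import datetime


def build_rows(clicks):
    """Turn click events into one feature row per (session, item) pair.

    clicks: iterable of (session_id, timestamp, item_id) tuples.
    The features are illustrative placeholders, not the paper's feature set.
    """
    by_session = defaultdict(list)
    for session_id, ts, item_id in clicks:
        by_session[session_id].append((ts, item_id))

    rows = []
    for session_id, events in by_session.items():
        events.sort(key=lambda e: e[0])          # order clicks by time
        counts = Counter(item for _, item in events)
        first_ts = events[0][0]
        for item_id, n in counts.items():
            rows.append({
                "session": session_id,
                "item": item_id,
                "item_clicks": n,                 # clicks on this item
                "session_clicks": len(events),    # clicks in the session
                "window": (first_ts.year, first_ts.month),  # monthly window
            })
    return rows


def balanced_bootstrap(labels, rng):
    """Draw equally many positive and negative indices, with replacement.

    Assumed sampling scheme: each tree of the ensemble would be trained
    on one such sample so that the rare 'purchase' class is not swamped.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(pos), len(neg))
    return ([rng.choice(pos) for _ in range(k)]
            + [rng.choice(neg) for _ in range(k)])


def predict_sessions(rows, scores, threshold=0.5):
    """Aggregate per-item purchase probabilities into a set per session."""
    predicted = {}
    for row, p in zip(rows, scores):
        items = predicted.setdefault(row["session"], set())
        if p >= threshold:
            items.add(row["item"])
    return predicted


clicks = [
    (1, datetime(2014, 4, 1, 10, 0), 10),
    (1, datetime(2014, 4, 1, 10, 5), 11),
    (1, datetime(2014, 4, 1, 10, 6), 10),
    (2, datetime(2014, 5, 2, 9, 0), 12),
]
rows = build_rows(clicks)
scores = [0.9, 0.2, 0.1]   # stand-in for classifier output, one per row
result = predict_sessions(rows, scores)
sample = balanced_bootstrap([1, 0, 0, 0, 0, 0], random.Random(0))
```

In this toy run, session 1 is predicted to buy item 10 and session 2 maps to an empty set, which corresponds to a "no purchase" prediction for that session.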
