Anomaly Detection in High-Dimensional Data Using Self-Supervised Representation Learning
Keywords:
Anomaly Detection, High-Dimensional Data, Self-Supervised Learning, Representation Learning, Contrastive Learning, Unsupervised Learning
Abstract
Anomaly detection in high-dimensional data is challenging because of the curse of dimensionality and the scarcity of labeled anomalies. Recent advances in self-supervised learning (SSL) offer a promising solution: they learn compact, informative representations without requiring labeled data. In this study, we explore self-supervised representation learning techniques for anomaly detection in high-dimensional spaces, using contrastive and predictive methods. We demonstrate their effectiveness on synthetic and real-world datasets and provide a comparative analysis. Our results show that SSL-based models outperform traditional methods in both accuracy and robustness.
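To make the contrastive idea concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss for a single anchor embedding. It is an illustration only, not the model used in this study: the function names `cosine` and `info_nce`, the temperature of 0.5, and the toy two-dimensional embeddings are all assumptions introduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor (hypothetical helper, assumed setup).

    The loss is the negative log-softmax of the anchor-positive similarity
    against the anchor-negative similarities, each scaled by 1/temperature.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy embeddings (assumed): the positive is an augmented view of the anchor.
anchor = [1.0, 0.0]
aligned_loss = info_nce(anchor, [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
misaligned_loss = info_nce(anchor, [-1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < misaligned_loss)
```

The loss is small when the anchor lies close to its positive view and far from the negatives. One common way to turn such a representation into an anomaly detector, stated here as an assumption rather than as this study's procedure, is to score test points by how poorly they fit the embedding structure learned from normal data.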
