Unsupervised Representation Learning for Multimodal Data Fusion in High-Dimensional Cognitive Computing Architectures
Keywords:
Unsupervised learning, multimodal data fusion, high-dimensional data, cognitive computing, autoencoders, graph neural networks, hyperdimensional computing
Abstract
Integrating multimodal data in high-dimensional cognitive computing architectures is difficult, particularly when labeled datasets are unavailable. Unsupervised representation learning addresses this problem by extracting meaningful features from diverse data modalities without manual annotation. This paper surveys contemporary methods for unsupervised learning in multimodal data fusion, with emphasis on their applicability to cognitive computing systems. We discuss techniques including autoencoders, graph-based models, and hyperdimensional computing, highlighting the strengths and limitations of each. Through a literature review and analysis, we summarize the current state of the field and identify directions for future research.