| Issue | Int. J. Simul. Multidisci. Des. Optim., Volume 16, 2025: Multi-modal Information Learning and Analytics on Cross-Media Data Integration |
|---|---|
| Article Number | 15 |
| Number of page(s) | 16 |
| DOI | https://doi.org/10.1051/smdo/2025017 |
| Published online | 03 October 2025 |
Research Article
Perception and sharing optimization mechanism of digital media art interactive device driven by adaptive AI
Cong Liu*
Engineering Training Center, Taiyuan University, Taiyuan, 030032 Shanxi, China
* e-mail: liucong@tyu.edu.cn
Received: 13 June 2025
Accepted: 18 August 2025
Current digital media art interactive devices struggle to adapt to complex environments because their perception mechanisms are fixed: noise interference and low light readily degrade recognition accuracy and increase interaction latency, undermining the immersive experience. This study constructs a perception and sharing optimization mechanism for digital media art interactive devices driven by adaptive Artificial Intelligence (AI). A multi-channel data fusion perception network is proposed, in which a Transformer encoder fuses Red-Green-Blue (RGB) image, infrared depth map, and speech spectrogram features to improve perceptual robustness. A user behavior temporal modeling module based on a Bidirectional Graph Recurrent Transformer (Bi-GRT) is designed to dynamically recognize continuous actions, gestures, and speech emotions. Environmental simulation and adaptive control mechanisms, combined with reinforcement learning, dynamically adjust sensor parameters to follow environmental changes. A multi-user interactive sharing framework is built on federated learning, preserving privacy while improving model consistency. Finally, a real-time feedback optimization and content rendering collaborative engine based on a Graph Attention Network (GAT) is developed to dynamically schedule and optimize multi-dimensional output content. Experiments show that the system achieves over 80% accuracy in multi-action sample recognition; the environment-adaptive control attains a response latency of 210 ms in a noisy environment; the sequence similarity of each terminal under federated learning exceeds 0.91; and peak user immersion reaches 9.2 points. Compared with existing systems, the framework delivers marked improvements in environmental adaptability, multi-user collaboration, and privacy protection, providing a quantifiable performance improvement path for digital media art interactive devices.
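To make the fusion step above concrete, the sketch below shows one plausible reading of the multi-channel perception network: pre-extracted RGB, infrared-depth, and speech-spectrogram features are projected into a shared token space and fused by a Transformer encoder. All module names and layer sizes (`MultiModalFusion`, the 512/256/128 input widths, `d_model=128`) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the multi-channel fusion idea from the abstract:
# RGB, infrared-depth, and speech-spectrogram features are projected
# into a shared token space and fused by a Transformer encoder.
# All layer sizes and names here are illustrative assumptions.
import torch
import torch.nn as nn

class MultiModalFusion(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        # One linear projection per modality onto the shared d_model space
        # (assumed input feature sizes: 512 RGB, 256 depth, 128 speech).
        self.proj_rgb = nn.Linear(512, d_model)
        self.proj_depth = nn.Linear(256, d_model)
        self.proj_speech = nn.Linear(128, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, rgb, depth, speech):
        # Each modality contributes one token here for brevity.
        tokens = torch.stack([
            self.proj_rgb(rgb),
            self.proj_depth(depth),
            self.proj_speech(speech),
        ], dim=1)                      # (batch, 3, d_model)
        fused = self.encoder(tokens)   # cross-modal self-attention
        return fused.mean(dim=1)       # pooled joint representation

# Example: a batch of 8 pre-extracted feature vectors per modality.
model = MultiModalFusion()
out = model(torch.randn(8, 512), torch.randn(8, 256), torch.randn(8, 128))
print(out.shape)  # torch.Size([8, 128])
```

In a full system each modality would contribute a token sequence (image patches, depth patches, spectrogram frames) rather than a single pooled vector, letting the encoder attend across time as well as across modalities.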
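The environmental adaptation component can likewise be sketched. The abstract states only that reinforcement learning dynamically adjusts sensor parameters; the bandit-style tuner below is an assumed simplification that rewards whichever sensor preset yields the best recognition quality under current conditions. The preset names and the epsilon-greedy rule are illustrative, not the paper's method.

```python
# Sketch of the adaptive-control idea: an RL-style agent picks a
# sensor configuration (e.g., camera gain presets) and is rewarded by
# recognition quality under the current environment. An epsilon-greedy
# bandit stands in here for the paper's unspecified RL method.
import random

class SensorTuner:
    def __init__(self, presets, epsilon=0.1):
        self.presets = presets                    # candidate configurations
        self.epsilon = epsilon
        self.value = {p: 0.0 for p in presets}    # running reward estimates
        self.count = {p: 0 for p in presets}

    def select(self):
        if random.random() < self.epsilon:        # explore
            return random.choice(self.presets)
        return max(self.presets, key=self.value.get)  # exploit

    def update(self, preset, reward):
        # Incremental mean of observed rewards for this preset.
        self.count[preset] += 1
        self.value[preset] += (reward - self.value[preset]) / self.count[preset]

# Usage: reward could be recognition confidence under current lighting.
tuner = SensorTuner(presets=["low_gain", "mid_gain", "high_gain"])
choice = tuner.select()
tuner.update(choice, reward=0.82)
```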
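For the multi-user sharing framework, the abstract states that federated learning keeps user data private while aligning terminal models. A standard FedAvg aggregation, shown below as an assumed stand-in for the paper's unspecified scheme, captures the core idea: terminals exchange model weights, never raw interaction data.

```python
# FedAvg-style sketch of the federated sharing framework: each
# terminal trains locally, and only model weights (never raw user
# data) are averaged on the server, weighted by local sample count.
import copy
import torch

def federated_average(client_states, client_sizes):
    """Weighted average of client state_dicts by local sample count."""
    total = sum(client_sizes)
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(
            state[key] * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return avg

# Usage: three terminals with differing amounts of local data.
global_model = torch.nn.Linear(16, 4)
clients = [copy.deepcopy(global_model) for _ in range(3)]
# ... each client would run local SGD on its own interaction data ...
new_state = federated_average(
    [c.state_dict() for c in clients], client_sizes=[120, 80, 200])
global_model.load_state_dict(new_state)
```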
Key words: Digital media art interaction / adaptive artificial intelligence / multimodal data fusion / federated learning / graph neural networks
© C. Liu, Published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.