Abstract
The “noise resistance” of an AI model determines its usability under non-ideal real-world conditions (in technology products, scientific research, and similar settings), where a few characters in text or pixels in images are often inaccurate. Prior to this study, the noise resistance of classification models and text-based large language models (LLMs) had been investigated, but the noise resistance of multimodal LLMs (MLLMs) had not. I therefore studied MLLMs’ noise resistance against both textual noise (misspellings) and image noise (Gaussian, salt-and-pepper, and speckle). I also employed two denoising algorithms, spell-check (“aspell”) for textual prompts and OpenCV’s “Fast NL Means” for image prompts, to test whether such pre-processing improves MLLM accuracy. I developed 10 textual prompts and 30 image-based prompts, each of which was noised and then denoised. I tested two MLLMs (LLaVA and GPT-4o) and, for comparison on the textual prompts, a traditional LLM (GPT-3.5). I hypothesized that MLLMs would have poor noise resistance (even worse than traditional LLMs) and would be helped by denoising algorithms. The data supported the first hypothesis but refuted the second: traditional denoising algorithms generally hurt model performance. I also predicted, though it was not central to my study, that lower-parameter models would fare worse, which the data supported; however, since model size was not a factor I set out to measure, future controlled studies should confirm this. Future studies should employ larger sample sizes to reduce variability and experiment with using smaller AI models as denoisers. MLLM users should put effort into crafting clean prompts and should avoid traditional algorithmic denoisers.
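As a rough illustration of the three image-noise types named above (not the study’s actual scripts; the noise parameters here are assumed values), the noising step can be sketched in NumPy, with the Fast NL Means denoising step indicated in a comment since it requires OpenCV:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma=15.0):
    # Additive zero-mean Gaussian noise, clipped back to the valid 8-bit range.
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.02):
    # Flip a random fraction of pixels to pure black ("pepper") or white ("salt").
    noisy = img.copy()
    mask = rng.random(img.shape[:2])
    noisy[mask < amount / 2] = 0
    noisy[(mask >= amount / 2) & (mask < amount)] = 255
    return noisy

def add_speckle_noise(img, sigma=0.1):
    # Multiplicative noise: each pixel is scaled by (1 + n), with n ~ N(0, sigma).
    noisy = img.astype(np.float64) * (1.0 + rng.normal(0.0, sigma, img.shape))
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Denoising step (illustrative; requires opencv-python, and the filter
# strengths below are assumptions, not the study's settings):
# import cv2
# denoised = cv2.fastNlMeansDenoisingColored(noisy_bgr, None, 10, 10, 7, 21)
```

On the text side, the spell-check step corresponds to passing each noised prompt through GNU Aspell and accepting its suggested corrections.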
Supplementary weblinks
GitHub Repository With Code & Images: The repository contains the 30 images tested and the code used for the textual-noise and image-noise scripts, providing transparency and enabling replicability.

