Abstract
In the standard transformer architecture, increasing the number of model parameters leads to linear growth in computational cost and activation memory. To address this issue, we propose a novel Infinite Parameter Large Language Model (IP-LLM) architecture that decouples model size from computational cost and device memory. Existing large language models are all fixed-parameter models, whereas human knowledge is unbounded and expands daily; a finite set of parameters is inherently limited in its capacity to accommodate this boundless knowledge. Our IP-LLM architecture can potentially accommodate unbounded knowledge, resolving this issue and laying the foundation for a truly omniscient and omnipotent artificial general intelligence in the future. Our architecture surpasses Mixture-of-Experts (MoE) models in performance while requiring significantly less memory.

Figure 1: Parameters A, B, C, and D store knowledge.
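To make the central claim concrete, the sketch below illustrates one plausible way a parameter store could be decoupled from per-query compute and device memory: keep a large bank of parameter blocks off-device and activate only a small, fixed number of blocks per input, similar in spirit to MoE routing. The mechanism, names, and sizes here are illustrative assumptions, not the method described in this paper.

```python
# Minimal sketch (assumed mechanism, not the paper's actual IP-LLM design):
# a growing bank of parameter blocks, of which only K are loaded per input,
# so per-query compute and device memory stay O(K) as the bank grows.
import numpy as np

D_MODEL = 64         # hidden size of each block (assumed)
BLOCK_COUNT = 1_000  # total blocks in the bank; can keep growing
ACTIVE_K = 4         # blocks actually used per input

rng = np.random.default_rng(0)

# Stand-in for off-device storage (disk / remote): growing this dict does not
# change the cost of a single forward pass.
parameter_bank = {i: rng.standard_normal((D_MODEL, D_MODEL)) * 0.02
                  for i in range(BLOCK_COUNT)}
routing_keys = rng.standard_normal((BLOCK_COUNT, D_MODEL))

def forward(x: np.ndarray) -> np.ndarray:
    """Select the K most relevant parameter blocks and apply only those."""
    scores = routing_keys @ x                # similarity of input to every block key
    top_k = np.argsort(scores)[-ACTIVE_K:]   # indices of the K best-matching blocks
    out = np.zeros_like(x)
    for i in top_k:                          # only K matrix products, regardless of bank size
        out += parameter_bank[i] @ x
    return out / ACTIVE_K

if __name__ == "__main__":
    token = rng.standard_normal(D_MODEL)
    print(forward(token).shape)              # (64,): cost independent of BLOCK_COUNT
```

Under this assumed scheme, total capacity scales with the number of stored blocks while per-token FLOPs and resident memory are fixed by ACTIVE_K, which is the decoupling property the abstract claims.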