Transformer Performance: Hopfield Theory & Cross-Entropy Loss Data

This work contextualizes large language model dynamics using a review of Hopfield network models and empirical data on Transformer cross-entropy loss.


This content originally appeared on HackerNoon and was authored by Reinforcement Technology Advancements

Abstract and 1 Introduction

2 Related Work

3 Model and 3.1 Associative memories

3.2 Transformer blocks

4 A New Energy Function

4.1 The layered structure

5 Cross-Entropy Loss

6 Empirical Results and 6.1 Empirical evaluation of the radius

6.2 Training GPT-2

6.3 Training Vanilla Transformers

7 Conclusion and Acknowledgments

Appendix A. Deferred Tables

Appendix B. Some Properties of the Energy Functions

Appendix C. Deferred Proofs from Section 5

Appendix D. Transformer Details: Using GPT-2 as an Example

References

Appendix A. Deferred Tables

Table 1: Selected related work on Hopfield networks, listing the domain, energy function, and memory capacity of each model. In all entries, n is the dimension of the input vector, W is the outer product of the patterns, M is the matrix of patterns, r is the order of the polynomial F(·), d is the number of patterns, and c is a positive constant.
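To make the symbols in the Table 1 caption concrete, the following is a minimal sketch of the classical Hopfield setting, in which W is built from outer products of the stored patterns and the energy is E(x) = -½ xᵀWx. The function names, the two example patterns, and the zero-diagonal convention are assumptions made for illustration; they are not taken from the paper, and the other energy functions listed in the table differ from this one.

```python
# Illustrative sketch only: names and example data are hypothetical.

def outer_product_weights(patterns):
    """Build W as the sum of outer products of the patterns.

    The diagonal is left at zero, a common convention that is assumed here.
    """
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W


def energy(W, x):
    """Classical Hopfield energy E(x) = -1/2 * x^T W x."""
    n = len(x)
    return -0.5 * sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# M: the matrix of patterns (d = 2 patterns, each of dimension n = 4).
M = [[1, -1, 1, -1],
     [1, 1, -1, -1]]
W = outer_product_weights(M)

stored = M[0]               # a stored pattern
corrupted = [1, -1, 1, 1]   # the same pattern with its last entry flipped

e_stored = energy(W, stored)
e_corrupted = energy(W, corrupted)
```

In this toy case the stored pattern has lower energy than its corrupted copy, which is the property that lets energy minimization act as pattern retrieval.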

Table 2: Large transformer-based language models and their reported cross-entropy loss.
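As a reminder of what the quantity in Table 2 measures, here is a minimal sketch of cross-entropy loss for next-token prediction: the mean negative log-probability the model assigns to the correct token. The probabilities below are made up for illustration, and the use of natural logarithms (nats) is an assumption; the unit used for the reported values is not stated in this excerpt.

```python
import math


def cross_entropy(probs_of_target):
    """Mean negative log-likelihood (in nats) of the correct tokens."""
    return -sum(math.log(p) for p in probs_of_target) / len(probs_of_target)


# Hypothetical probabilities a model assigned to the correct next token
# at three positions of a sequence.
probs = [0.5, 0.25, 0.125]

loss = cross_entropy(probs)
perplexity = math.exp(loss)  # perplexity is the exponential of the loss
```

A lower loss means the model puts more probability mass on the tokens that actually occur.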


:::info Authors:

(1) Xueyan Niu, Theory Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd.;

(2) Bo Bai (baibo8@huawei.com);

(3) Lei Deng (deng.lei2@huawei.com);

(4) Wei Han (harvey.hanwei@huawei.com).

:::


:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed license.

:::

APA

Reinforcement Technology Advancements | Sciencx (2025-06-24T02:00:05+00:00) Transformer Performance: Hopfield Theory & Cross-Entropy Loss Data. Retrieved from https://www.scien.cx/2025/06/24/transformer-performance-hopfield-theory-cross-entropy-loss-data/
