Empirical Results: GPT-2 Analysis of Transformer Memorization & Loss

We test the hypothesis about the radius r stated in Section 5 using a pre-trained GPT-2 medium model. We also train several GPT-2 small models and vanilla Transformer models and analyze their cross-entropy losses.
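For readers who want the measured quantity made concrete, here is a minimal sketch of next-token cross-entropy, computed from raw logits in pure Python. It is not the authors' training code. The function name `cross_entropy`, the four-token vocabulary, and the toy logits are illustrative assumptions, and the loss is reported in nats (natural logarithm).

```python
import math

def cross_entropy(logits, targets):
    """Mean next-token cross-entropy in nats.

    logits: one list of unnormalized vocabulary scores per position.
    targets: the correct next-token id for each position.
    """
    total = 0.0
    for scores, target in zip(logits, targets):
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability assigned to the correct token.
        total += log_z - scores[target]
    return total / len(targets)

# Toy inputs (made-up numbers): 2 positions, vocabulary of 4 tokens.
uniform = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
confident = [[0.0, 10.0, 0.0, 0.0], [0.0, 0.0, 0.0, 10.0]]

loss_uniform = cross_entropy(uniform, [1, 3])      # uninformed prediction
loss_confident = cross_entropy(confident, [1, 3])  # mass on the right token
```

A model that spreads probability uniformly over a vocabulary of size V has a loss of log V, while a model that concentrates probability on the correct token drives the loss toward zero.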

6.1 Empirical evaluation of the radius

Figure 3: Cross-entropy loss of the GPT-2 small model trained on (left) 100%, (middle) 1%, and (right) 0.1% of the OpenWebText-9B dataset with a typical training time.

:::info Authors:

(1) Xueyan Niu, Theory Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd.;

(2) Bo Bai (baibo8@huawei.com);

(3) Lei Deng (deng.lei2@huawei.com);

(4) Wei Han (harvey.hanwei@huawei.com).

:::

:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

:::

1. available at https://github.com/openai/gpt-2

This content originally appeared on HackerNoon and was authored by Reinforcement Technology Advancements
