How Fast Is PyJuice? Testing Compilation Speed Across GPUs and Batch Sizes

This article presents additional benchmarks for PyJuice, covering both compilation time and training runtime. A model with nearly 1 billion parameters compiles in about 30 seconds, and PyJuice remains faster than the baselines on a second GPU (an NVIDIA A40, in addition to the RTX 4090). A further experiment reports runtime across batch sizes from 8 to 512.


This content originally appeared on HackerNoon and was authored by Probabilistic

Abstract and 1. Introduction

  2. Preliminaries and Related Work

  3. Key Bottlenecks in PC Parallelization

  4. Harnessing Block-Based PC Parallelization

    4.1. Fully Connected Sum Layers

    4.2. Generalizing To Practical Sum Layers

    4.3. Efficient Implementations by Compiling PC Layers

    4.4. Analysis: IO and Computation Overhead

  5. Optimizing Backpropagation with PC Flows

  6. Experiments

    6.1. Faster Models with PyJuice

    6.2. Better PCs At Scale

    6.3. Benchmarking Existing PCs

  7. Conclusion, Acknowledgements, Impact Statement, and References

A. Algorithm Details

B. Additional Technical Details

C. Experimental Details

D. Additional Experiments


D. Additional Experiments

D.1. Speed of the Compilation Process

In Table 5, we report the compilation time of PCs with different structures and sizes. Experiments were conducted on a server with an AMD EPYC 7763 64-Core Processor and 8 RTX 4090 GPUs, of which we use only one. The results demonstrate the efficiency of the compilation process: even the PD model, with close to 1B parameters, compiles in around 30 seconds.

Table 5. Average (± standard deviation over 3 runs) runtime (in seconds) of the compilation process for four PCs.
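A measurement of this kind (mean ± standard deviation over a few runs) can be sketched as follows. This is a minimal illustration, not the authors' benchmarking script: `time_runs` and `fake_compile` are hypothetical names, the workload is a placeholder for the real compilation call, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics
import time


def time_runs(fn, n_runs=3):
    """Call fn() n_runs times; return (mean, sample std) of wall-clock seconds."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.mean(durations), statistics.stdev(durations)


def fake_compile():
    # Hypothetical stand-in for compiling a PC; swap in the real call.
    sum(i * i for i in range(200_000))


mean_s, std_s = time_runs(fake_compile, n_runs=3)
print(f"compile time: {mean_s:.3f} ± {std_s:.3f} s")
```

With only three runs the standard deviation is a rough indicator of spread rather than a tight estimate.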

D.2. Runtime on Different GPUs

In addition to the RTX 4090 GPU used for the experiments in Table 1, we compare the runtime of PyJuice with the baselines on an NVIDIA A40 GPU. As shown in Table 6, PyJuice remains significantly faster than all baselines for PCs of different sizes.

Table 6. Average (± standard deviation over 5 runs) runtime (in seconds) per training epoch of 60K samples for PyJuice and the baselines on five RAT-SPNs (Peharz et al., 2020b) of different sizes. All other settings are the same as described in Section 6.1.

D.3. Runtime on Different Batch Sizes

As a supplement to Table 1, we report the runtime for a RAT-SPN (Peharz et al., 2020b) with 465K nodes and 33.4M edges using batch sizes {8, 16, 32, 64, 128, 256, 512}. To isolate the cost that depends on the batch size, we record only the time spent in the forward and backward passes, not the time used for EM updates. Results are shown in Table 7.

Table 7. Average (± standard deviation over 5 runs) runtime (in seconds) per training epoch (excluding EM updates) of 60K samples for PyJuice and the baselines on a RAT-SPN (Peharz et al., 2020b) with 465K nodes and 33.4M edges. All other settings are the same as described in Section 6.1. OOM denotes out-of-memory.
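The measurement protocol described above can be sketched as a timing loop that accumulates only the forward/backward time and marks a configuration as OOM when it does not fit in memory. Everything here is illustrative: `time_epoch`, `fake_forward_backward`, `fake_em_update`, and `FAKE_MEMORY_LIMIT` are hypothetical, the limit is an arbitrary value rather than a measured one, and `MemoryError` stands in for whatever out-of-memory exception the actual framework raises.

```python
import math
import time

EPOCH_SAMPLES = 60_000
BATCH_SIZES = [8, 16, 32, 64, 128, 256, 512]
FAKE_MEMORY_LIMIT = 256  # arbitrary, for illustration only


def steps_per_epoch(batch_size, n_samples=EPOCH_SAMPLES):
    # The last batch may be partial, hence the ceiling.
    return math.ceil(n_samples / batch_size)


def time_epoch(forward_backward, em_update, batch_size):
    """Return seconds spent in forward_backward over one epoch, or "OOM"."""
    elapsed = 0.0
    try:
        for _ in range(steps_per_epoch(batch_size)):
            # Note: GPU kernels launch asynchronously, so a real GPU
            # measurement must synchronize the device before reading the clock.
            start = time.perf_counter()
            forward_backward(batch_size)
            elapsed += time.perf_counter() - start
            em_update()  # still executed, but outside the timed region
    except MemoryError:
        return "OOM"
    return elapsed


def fake_forward_backward(batch_size):
    # Hypothetical stand-in: pretend large batches exhaust memory.
    if batch_size > FAKE_MEMORY_LIMIT:
        raise MemoryError


def fake_em_update():
    pass  # hypothetical stand-in for the parameter update


for b in BATCH_SIZES:
    result = time_epoch(fake_forward_backward, fake_em_update, b)
    print(b, steps_per_epoch(b), result)
```

Because the epoch size is fixed at 60K samples, smaller batches mean more iterations per epoch (7,500 at batch size 8 versus 118 at batch size 512), which is why per-epoch runtime is sensitive to batch size.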


:::info Authors:

(1) Anji Liu, Department of Computer Science, University of California, Los Angeles, USA (liuanji@cs.ucla.edu);

(2) Kareem Ahmed, Department of Computer Science, University of California, Los Angeles, USA;

(3) Guy Van den Broeck, Department of Computer Science, University of California, Los Angeles, USA.

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
