Modern Bazel with Python-Module 4: Caching and Dependencies

This content originally appeared on DEV Community and was authored by Sushil Baligar

Learning Objectives

By the end of this module, you will:

  • Master Bazel's caching mechanisms (local and remote)
  • Implement efficient dependency management for Python projects
  • Set up remote caching for team collaboration
  • Optimize build performance through smart caching strategies
  • Handle complex dependency scenarios and version conflicts

4.1 Understanding Bazel Caching

Local Caching Fundamentals

Bazel automatically caches build outputs locally. Understanding how this works is crucial for optimization:

# Show where Bazel keeps its local state
bazel info output_base
bazel info repository_cache

# Clean specific caches
bazel clean --expunge  # Nuclear option: deletes this workspace's entire output base
bazel clean            # Removes build outputs; the repository cache and any --disk_cache are kept

Cache Key Components

Bazel creates cache keys based on:

  • Input file contents (not timestamps)
  • Build command and flags
  • Toolchain configuration
  • Environment variables that affect the build

# //BUILD - This will be cached efficiently
py_library(
    name = "utils",
    srcs = ["utils.py"],
    deps = ["@pypi//requests"],
)

# Changes to utils.py content will invalidate cache
# Changes to modification time won't affect cache
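
To make the content-versus-timestamp point concrete, here is a minimal Python sketch. It is not Bazel's implementation; it only assumes that a cache key is derived from a SHA-256 digest of the file's bytes, and the helper name `content_digest` is made up for illustration.

```python
import hashlib
import os
import tempfile


def content_digest(path):
    """Return the SHA-256 hex digest of a file's bytes (metadata is ignored)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "utils.py")
    with open(src, "w") as f:
        f.write("def helper():\n    return 1\n")

    before = content_digest(src)

    # Simulate `touch`: change the modification time without changing the bytes.
    os.utime(src, (0, 0))
    after_touch = content_digest(src)

    # A real edit changes the bytes, and therefore the digest.
    with open(src, "a") as f:
        f.write("# edited\n")
    after_edit = content_digest(src)

print(before == after_touch)  # True: same content, same key
print(before == after_edit)   # False: new content, new key
```

A key built this way survives `git checkout`, `touch`, and fresh clones, which is what lets different machines share cache entries.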

4.2 Remote Caching Setup

Basic Remote Cache Configuration

Set up a simple HTTP remote cache:

# .bazelrc
build --remote_cache=https://storage.googleapis.com/my-bazel-cache
build --remote_upload_local_results=true
build --remote_timeout=60
build --remote_retries=3

# For authentication, use one of the following:
build --google_default_credentials=true
# OR send a bearer token (the header format is NAME=VALUE):
# build --remote_header="Authorization=Bearer your-token"

Advanced Remote Cache Setup

Configure a more sophisticated remote cache with build event publishing:

# .bazelrc.remote
# Remote cache configuration
build:remote --remote_cache=grpc://cache.example.com:443
build:remote --remote_timeout=600
build:remote --remote_retries=5
build:remote --remote_upload_local_results=true

# Build event publishing
build:remote --bes_backend=grpc://analytics.example.com:443
build:remote --bes_results_url=https://analytics.example.com/build/

# Remote execution (if available)
build:remote --remote_executor=grpc://executor.example.com:443
build:remote --jobs=50

# Platform configuration
# (//platforms:linux_x86_64 is a placeholder for a platform() target defined in your workspace)
build:remote --host_platform=//platforms:linux_x86_64
build:remote --platforms=//platforms:linux_x86_64

# Enable the remote config by default
build --config=remote

Docker-based Remote Cache

Set up a local remote cache using Docker for team development:

# cache-server.dockerfile
FROM nginx:alpine

COPY nginx.conf /etc/nginx/nginx.conf
COPY cache.conf /etc/nginx/conf.d/default.conf

EXPOSE 8080

# cache.conf
server {
    listen 8080;
    client_max_body_size 2G;

    location / {
        root /var/cache/bazel;
        dav_methods PUT DELETE;
        create_full_put_path on;
        dav_access user:rw group:rw all:r;
    }
}

# docker-compose.yml for team cache
version: '3.8'
services:
  bazel-cache:
    build:
      context: .
      dockerfile: cache-server.dockerfile
    ports:
      - "8080:8080"
    volumes:
      - bazel-cache-data:/var/cache/bazel
    restart: unless-stopped

volumes:
  bazel-cache-data:
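
For experimenting without Docker, the same idea can be sketched in a few lines of Python. Bazel's HTTP cache protocol downloads blobs with GET and uploads them with PUT, keeping action results under /ac/ and output files under /cas/. The sketch below is an in-memory toy under that assumption: the names `CacheHandler` and `start_cache` are invented here, nothing is persisted, and it is not a substitute for nginx or a dedicated cache server.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CacheHandler(BaseHTTPRequestHandler):
    """Stores blobs in memory, keyed by request path (e.g. /cas/<sha256>)."""

    store = {}  # shared by all requests; lost when the process exits

    def do_GET(self):
        blob = self.store.get(self.path)
        if blob is None:
            # Cache miss: the client falls back to building locally.
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(blob)))
        self.end_headers()
        self.wfile.write(blob)

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        self.store[self.path] = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def start_cache(port=0):
    """Start the toy cache on localhost; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), CacheHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_cache()
    base = f"http://127.0.0.1:{server.server_address[1]}"
    put = urllib.request.Request(base + "/cas/abc123", data=b"blob", method="PUT")
    urllib.request.urlopen(put)
    print(urllib.request.urlopen(base + "/cas/abc123").read())
    server.shutdown()
```

Pointing a build at such a server would look like `--remote_cache=http://127.0.0.1:<port>`; treat that as an experiment for understanding the protocol rather than a team setup.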

4.3 Python Dependency Management

Advanced pip_parse Configuration

Set up sophisticated Python dependency management:

# WORKSPACE
load("@rules_python//python:repositories.bzl", "python_register_toolchains")
load("@rules_python//python:pip.bzl", "pip_parse")

python_register_toolchains(
    name = "python3_11",
    python_version = "3.11.4",
)

# Main dependencies
pip_parse(
    name = "pypi",
    requirements_lock = "//third_party:requirements.lock",
    python_interpreter_target = "@python3_11_host//:python",
)

load("@pypi//:requirements.bzl", "install_deps")
install_deps()

# Development dependencies (separate namespace)
pip_parse(
    name = "pypi_dev", 
    requirements_lock = "//third_party:requirements-dev.lock",
    python_interpreter_target = "@python3_11_host//:python",
)

load("@pypi_dev//:requirements.bzl", dev_install_deps = "install_deps")
dev_install_deps()

# Testing dependencies
pip_parse(
    name = "pypi_test",
    requirements_lock = "//third_party:requirements-test.lock", 
    python_interpreter_target = "@python3_11_host//:python",
)

load("@pypi_test//:requirements.bzl", test_install_deps = "install_deps")
test_install_deps()

Dependency Lock Files

Create comprehensive lock files for reproducible builds:

# //third_party/requirements.txt
# Production dependencies
fastapi>=0.104.0,<0.105.0
uvicorn[standard]>=0.24.0,<0.25.0
pydantic>=2.5.0,<3.0.0
sqlalchemy>=2.0.0,<2.1.0
alembic>=1.13.0,<1.14.0
redis>=5.0.0,<6.0.0
celery>=5.3.0,<5.4.0

# //third_party/requirements.lock
# Auto-generated - DO NOT EDIT
# This file was generated by pip-compile with Python 3.11
# To update, run: pip-compile --output-file=requirements.lock requirements.txt

fastapi==0.104.1
    # via -r requirements.txt
starlette==0.27.0
    # via fastapi
pydantic==2.5.2
    # via fastapi
pydantic-core==2.14.5
    # via pydantic
typing-extensions==4.8.0
    # via pydantic
uvicorn==0.24.0
    # via -r requirements.txt
# ... complete locked versions
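
A lock file only gives reproducible builds if every entry is pinned exactly. The following Python sketch shows one way to check that before committing. The function name `unpinned_requirements` and the regular expression are assumptions made for this example, and the check is deliberately simple: it does not handle URL or VCS requirements.

```python
import re

# A pinned entry looks like "name==version", optionally with extras: "name[extra]==version".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==\S+")


def unpinned_requirements(lock_text):
    """Return the requirement lines in lock_text that are not pinned with '=='."""
    bad = []
    for raw in lock_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments such as "# via fastapi"
        if not line or line.startswith("-"):  # skip blanks and options like --hash=...
            continue
        if not PINNED.match(line):
            bad.append(line)
    return bad


sample = """\
fastapi==0.104.1
    # via -r requirements.txt
uvicorn>=0.24.0
pydantic==2.5.2
"""
print(unpinned_requirements(sample))  # ['uvicorn>=0.24.0']
```

Run against the real lock file, an empty list means every dependency is pinned.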

Multi-Platform Dependencies

Handle platform-specific dependencies:

# //third_party/BUILD
load("@rules_python//python:defs.bzl", "py_library")

# Platform-specific dependencies
py_library(
    name = "platform_deps",
    deps = select({
        "@platforms//os:linux": [
            "@pypi//psutil",
            "@pypi//linux_specific_lib",
        ],
        "@platforms//os:macos": [
            "@pypi//psutil", 
            "@pypi//pyobjc_framework_cocoa",
        ],
        "@platforms//os:windows": [
            "@pypi//psutil",
            "@pypi//pywin32",
        ],
        "//conditions:default": ["@pypi//psutil"],
    }),
)
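
The way `select()` picks a branch can be modelled roughly in Python. This is a simplified sketch, not Bazel's resolution algorithm: it assumes that exactly one condition matches or that the default applies, and it ignores Bazel's rule for choosing the most specialized match. The function name `resolve_select` is hypothetical.

```python
DEFAULT = "//conditions:default"


def resolve_select(branches, active_conditions):
    """Return the value of the single matching branch, or the default branch."""
    matches = [key for key in branches if key in active_conditions]
    if len(matches) == 1:
        return branches[matches[0]]
    if not matches and DEFAULT in branches:
        return branches[DEFAULT]
    raise ValueError(f"select() is ambiguous or has no match: {matches}")


platform_deps = {
    "@platforms//os:linux": ["@pypi//psutil", "@pypi//linux_specific_lib"],
    "@platforms//os:macos": ["@pypi//psutil", "@pypi//pyobjc_framework_cocoa"],
    "@platforms//os:windows": ["@pypi//psutil", "@pypi//pywin32"],
    DEFAULT: ["@pypi//psutil"],
}

print(resolve_select(platform_deps, {"@platforms//os:macos"}))
print(resolve_select(platform_deps, {"@platforms//os:freebsd"}))  # falls back to the default
```

Because the chosen branch changes the target's dependencies, each platform ends up with its own cache entries.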

4.4 Advanced Caching Strategies

Repository Caching

Configure repository-level caching for external dependencies:

# .bazelrc
build --repository_cache=/home/user/.cache/bazel/repos
build --experimental_repository_cache_hardlinks=true

# To force a re-fetch of external repositories, run from the command line:
#   bazel sync --only=pypi    (WORKSPACE-based projects)

Action Caching vs Output Caching

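The action cache maps a digest of each action (its command line, inputs, and environment) to metadata about the result, while the output files themselves are stored separately, addressed by their content hash. An action only hits the cache when its inputs are byte-identical, so keep rule outputs deterministic: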

# //tools/cache_optimization.bzl
def cache_friendly_genrule(name, srcs, cmd, **kwargs):
    """Genrule optimized for caching."""
    native.genrule(
        name = name,
        srcs = srcs,
        cmd = cmd,
        # Ensure deterministic output
        stamp = 0,
        # Add cache-friendly attributes
        **kwargs
    )
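
The split between the two stores can be illustrated with a toy model in Python. It is a sketch under simplifying assumptions, not Bazel's data structures: each action has a single output, and the class name `ToyCache` and its methods are invented for this example.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyCache:
    def __init__(self):
        self.action_cache = {}  # action key -> digest of the action's output
        self.cas = {}           # content digest -> output bytes
        self.executions = 0     # how many times an action actually ran

    def action_key(self, command, inputs):
        """Hash the command together with the name and content digest of every input."""
        h = hashlib.sha256(command.encode())
        for name in sorted(inputs):
            h.update(name.encode())
            h.update(digest(inputs[name]).encode())
        return h.hexdigest()

    def run(self, command, inputs, execute):
        key = self.action_key(command, inputs)
        out_digest = self.action_cache.get(key)
        if out_digest is not None and out_digest in self.cas:
            return self.cas[out_digest]  # cache hit: nothing is executed
        self.executions += 1
        output = execute(inputs)
        out_digest = digest(output)
        self.cas[out_digest] = output
        self.action_cache[key] = out_digest
        return output


def concat(inputs):
    return b"".join(inputs[name] for name in sorted(inputs))


cache = ToyCache()
cache.run("cat", {"a.txt": b"hello"}, concat)   # miss: runs the action
cache.run("cat", {"a.txt": b"hello"}, concat)   # hit: same command, same bytes
cache.run("cat", {"a.txt": b"hello!"}, concat)  # miss: input content changed
print(cache.executions)  # 2
```

A non-deterministic action, for example one that embeds a timestamp, would produce a different output digest on every run and make downstream actions miss as well.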

Cache Warming Strategies

Implement cache warming for CI/CD:

#!/bin/bash
# scripts/warm_cache.sh

# Warm cache with common targets
bazel build //... --keep_going
bazel test //... --test_tag_filters=-slow --keep_going

# Pre-build common development targets
bazel build //src/main:app //src/tests:unit_tests

# Cache commonly used external dependencies
bazel build @pypi//requests @pypi//fastapi @pypi_test//pytest

echo "Cache warming complete"

4.5 Dependency Resolution and Conflicts

Version Conflict Resolution

Handle complex version conflicts systematically:

# //third_party/overrides.bzl
def apply_dependency_overrides():
    """Apply necessary dependency version overrides."""

    # Override conflicting versions
    override_targets = {
        # Force specific numpy version across all deps
        "@pypi//numpy": "@pypi_pinned//numpy",
        # Use our patched version of requests
        "@pypi//requests": "//third_party/patched:requests",
    }

    return override_targets

# //third_party/patched/BUILD
py_library(
    name = "requests",
    srcs = ["requests_patched.py"],
    deps = [
        "@pypi//urllib3", 
        "@pypi//certifi",
        "@pypi//charset_normalizer",
    ],
    visibility = ["//visibility:public"],
)
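
The override map above only returns a dictionary; something still has to rewrite each target's `deps` with it. The logic is small enough to sketch in Python (a Starlark macro would look almost the same). The helper name `apply_overrides` is hypothetical and not part of rules_python.

```python
def apply_overrides(deps, overrides):
    """Swap overridden labels into deps, keeping order and dropping duplicates."""
    result = []
    for dep in deps:
        resolved = overrides.get(dep, dep)
        if resolved not in result:
            result.append(resolved)
    return result


overrides = {
    "@pypi//numpy": "@pypi_pinned//numpy",
    "@pypi//requests": "//third_party/patched:requests",
}

deps = ["@pypi//requests", "@pypi//fastapi", "@pypi//numpy"]
print(apply_overrides(deps, overrides))
# ['//third_party/patched:requests', '@pypi//fastapi', '@pypi_pinned//numpy']
```

Note that a rewrite like this only affects targets that pass their deps through it; transitive dependencies declared inside `@pypi` itself are untouched.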

Custom Dependency Resolution

Implement custom resolution for complex scenarios:

# //tools/custom_deps.bzl
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def custom_python_deps():
    """Install Python dependencies with custom resolution."""

    # Custom ML library not available on PyPI
    http_archive(
        name = "custom_ml_lib",
        urls = ["https://github.com/company/ml-lib/archive/v2.1.0.tar.gz"],
        sha256 = "abcd1234...",
        strip_prefix = "ml-lib-2.1.0",
        build_file = "//third_party:custom_ml_lib.BUILD",
    )

    # Forked dependency with patches
    http_archive(
        name = "patched_fastapi",
        urls = ["https://github.com/our-org/fastapi/archive/patched-v0.104.1.tar.gz"],
        sha256 = "efgh5678...",
        strip_prefix = "fastapi-patched-v0.104.1", 
        build_file = "//third_party:patched_fastapi.BUILD",
    )

4.6 Performance Monitoring and Optimization

Build Performance Analysis

Monitor and analyze build performance:

# Generate build profile
bazel build //... --profile=build_profile.json
bazel analyze-profile build_profile.json

# Memory usage analysis  
bazel build //... --memory_profile=memory_profile.json

# Detailed timing: label each profiled action with its target
bazel build //... --profile=build_profile.json --experimental_profile_include_target_label

Cache Hit Rate Monitoring

Track cache effectiveness:

#!/bin/bash
# scripts/monitor_cache.sh

echo "Cache Statistics:"
echo "=================="

# Local cache info
echo "Local cache location: $(bazel info output_base)"
echo "Repository cache: $(bazel info repository_cache)"

# Build with cache stats
bazel build //... --profile=cache_profile.json 2>&1 | tee build.log

# Extract cache hit statistics from Bazel's end-of-build summary line, e.g.
# "INFO: 120 processes: 80 remote cache hit, 10 disk cache hit, 5 internal, 25 linux-sandbox."
echo "Process summary: $(grep -E 'INFO: [0-9]+ process(es)?:' build.log | tail -n 1)"
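
Bazel ends a build with a summary line such as `INFO: 10 processes: 6 remote cache hit, 2 internal, 2 linux-sandbox.` A hit rate can be computed from that line. The Python sketch below assumes this line format and makes one judgment call that is ours, not Bazel's: "internal" actions are excluded from the denominator because they are never looked up in the cache. The function names are invented for this example.

```python
import re

SUMMARY = re.compile(r"INFO: (\d+) process(?:es)?: (.*?)\.?\s*$")


def parse_process_summary(line):
    """Return (total, {label: count}) for a summary line, or None if it does not match."""
    match = SUMMARY.search(line)
    if not match:
        return None
    counts = {}
    for part in match.group(2).split(", "):
        number, _, label = part.partition(" ")
        counts[label] = int(number)
    return int(match.group(1)), counts


def cache_hit_rate(line):
    """Fraction of non-internal actions that were served from a cache."""
    parsed = parse_process_summary(line)
    if parsed is None:
        return None
    total, counts = parsed
    hits = sum(n for label, n in counts.items() if label.endswith("cache hit"))
    cacheable = total - counts.get("internal", 0)
    return hits / cacheable if cacheable else 0.0


line = "INFO: 10 processes: 6 remote cache hit, 2 internal, 2 linux-sandbox."
print(parse_process_summary(line))
print(cache_hit_rate(line))  # 0.75
```

Feeding it the last matching line of `build.log` after each CI run gives a number that can be tracked over time.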

Optimizing for Cache Efficiency

Best practices for cache-friendly builds:

# //BUILD - Cache-optimized targets
py_library(
    name = "stable_utils",
    srcs = ["utils.py"],
    # Stable dependencies cache better
    deps = [
        "@pypi//requests",  # Pinned version
        "//common:constants",  # Rarely changing
    ],
)

py_library(
    name = "feature_code", 
    srcs = ["feature.py"],
    # Separate frequently changing code
    deps = [
        ":stable_utils",  # Reuse cached stable parts
        "//config:dynamic_config",  # Accept this changes often
    ],
)
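
Why this split helps can be seen by asking which targets are affected when one input changes. The Python sketch below walks the reverse dependencies of a changed target, using the graph from the BUILD file above. It gives a conservative upper bound rather than Bazel's exact behaviour, since Bazel can stop early when a re-run action produces identical output. The function name `invalidated_by` is made up for this example.

```python
from collections import deque

# Direct dependencies, taken from the BUILD file above.
DEPS = {
    "//:feature_code": ["//:stable_utils", "//config:dynamic_config"],
    "//:stable_utils": ["@pypi//requests", "//common:constants"],
    "//config:dynamic_config": [],
    "//common:constants": [],
    "@pypi//requests": [],
}


def invalidated_by(changed, deps):
    """Return the changed target plus everything that transitively depends on it."""
    rdeps = {}
    for target, direct in deps.items():
        for dep in direct:
            rdeps.setdefault(dep, set()).add(target)

    dirty, queue = {changed}, deque([changed])
    while queue:
        current = queue.popleft()
        for parent in rdeps.get(current, ()):
            if parent not in dirty:
                dirty.add(parent)
                queue.append(parent)
    return dirty


print(sorted(invalidated_by("//config:dynamic_config", DEPS)))
# ['//:feature_code', '//config:dynamic_config']
print(sorted(invalidated_by("//common:constants", DEPS)))
# ['//:feature_code', '//:stable_utils', '//common:constants']
```

A change to the volatile config leaves `stable_utils` cached, while a change to a low-level constant invalidates everything above it, which is the argument for keeping rarely changing code at the bottom of the graph.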

4.7 Practical Examples

Complete Web Application Setup

Real-world example with optimized caching and dependencies:

# //web_app/BUILD
load("@rules_python//python:defs.bzl", "py_binary", "py_library", "py_test")

# Core application library (stable, caches well)
py_library(
    name = "app_core",
    srcs = [
        "core/__init__.py",
        "core/models.py", 
        "core/database.py",
        "core/auth.py",
    ],
    deps = [
        "@pypi//fastapi",
        "@pypi//sqlalchemy", 
        "@pypi//pydantic",
        "@pypi//passlib",
        "//common:config",
    ],
)

# API routes (changes more frequently)
py_library(
    name = "api_routes",
    srcs = [
        "api/__init__.py",
        "api/users.py",
        "api/posts.py", 
        "api/auth.py",
    ],
    deps = [
        ":app_core",
        "@pypi//fastapi",
        "//common:validators",
    ],
)

# Main application
py_binary(
    name = "web_app",
    srcs = ["main.py"],
    deps = [
        ":app_core",
        ":api_routes", 
        "@pypi//uvicorn",
    ],
    main = "main.py",
)

# Comprehensive test suite
py_test(
    name = "integration_test",
    srcs = ["test_integration.py"],
    deps = [
        ":web_app",
        "@pypi_test//pytest",
        "@pypi_test//httpx",
        "@pypi_test//pytest_asyncio",
    ],
    data = ["test_data.json"],
)

CI/CD Cache Configuration

Optimize CI/CD with proper cache configuration:

# .github/workflows/build.yml
name: Build and Test

on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4

    - name: Mount bazel cache
      uses: actions/cache@v3
      with:
        path: |
          ~/.cache/bazel
          ~/.cache/bazel-repo
        key: bazel-${{ runner.os }}-${{ hashFiles('WORKSPACE', '**/*.bzl', 'third_party/requirements*') }}
        restore-keys: |
          bazel-${{ runner.os }}-

    - name: Configure Bazel
      run: |
        # Define the "ci" config that the build and test steps select with --config=ci
        echo "build:ci --repository_cache=/home/runner/.cache/bazel-repo" >> .bazelrc.ci
        echo "build:ci --disk_cache=/home/runner/.cache/bazel" >> .bazelrc.ci
        echo "build:ci --remote_cache=${{ secrets.REMOTE_CACHE_URL }}" >> .bazelrc.ci
        echo "build:ci --remote_upload_local_results=true" >> .bazelrc.ci

    - name: Build
      run: bazel --bazelrc=.bazelrc.ci build //... --config=ci

    - name: Test
      run: bazel --bazelrc=.bazelrc.ci test //... --config=ci --test_output=errors

4.8 Troubleshooting Common Issues

Cache Invalidation Problems

Debug cache invalidation issues:

# Check why a target was rebuilt (--verbose_explanations only takes effect together with --explain)
bazel build //target:name --explain=explain.log --verbose_explanations

# Compare action keys
bazel aquery //target:name --output=textproto > action1.txt
# Make change and run again
bazel aquery //target:name --output=textproto > action2.txt
diff action1.txt action2.txt

Dependency Resolution Issues

Debug complex dependency problems:

# Analyze dependency graph
bazel query "deps(//your:target)" --output=graph | dot -Tpng > deps.png

# Find the targets that depend on a problematic package
bazel query "rdeps(//..., @pypi//problematic_package)"

# Check why a dependency was selected
bazel query --output=build //external:pypi_problematic_package

Remote Cache Issues

Troubleshoot remote cache problems:

# Test remote cache connectivity
bazel build //simple:target --remote_cache=your-cache-url --execution_log_json_file=exec.json

# Check cache upload/download
bazel build //target --remote_cache=your-cache-url --experimental_remote_cache_eviction_retries=3 --verbose_failures

# Verify authentication
curl -H "Authorization: Bearer $TOKEN" https://your-cache-url/status

4.9 Best Practices Summary

Caching Best Practices

  • Use remote caching for team collaboration
  • Separate stable and volatile dependencies
  • Monitor cache hit rates regularly
  • Implement cache warming in CI/CD
  • Use repository caching for external dependencies

Dependency Management Best Practices

  • Pin all dependency versions in lock files
  • Separate production, development, and test dependencies
  • Handle platform-specific dependencies explicitly
  • Use custom resolution for complex scenarios
  • Monitor for security vulnerabilities in dependencies

Performance Optimization

  • Profile builds regularly to identify bottlenecks
  • Structure targets to maximize cache reuse
  • Use hermetic builds for reproducibility
  • Implement incremental build strategies
  • Monitor and optimize resource usage

Module 4 Exercises

Exercise 1: Remote Cache Setup

Set up a remote cache using Google Cloud Storage or AWS S3 and measure the cache hit rate improvement.

Exercise 2: Dependency Conflict Resolution

Create a scenario with conflicting dependency versions and resolve it using custom overrides.

Exercise 3: Cache Performance Analysis

Profile a complex build, identify cache inefficiencies, and implement optimizations.

Exercise 4: CI/CD Cache Integration

Set up cache optimization in a CI/CD pipeline and measure build time improvements.

Next Steps

In Module 5, we'll cover "Advanced Python Rules and Toolchains" where you'll learn to:

  • Create custom Python rules and macros
  • Configure multiple Python toolchains
  • Implement hermetic Python builds
  • Use aspects for code analysis

Key Takeaways

  • Bazel's caching is content-based, not timestamp-based
  • Remote caching enables massive build speedups for teams
  • Proper dependency management prevents version conflicts
  • Cache performance should be monitored and optimized
  • Structured targets maximize cache reuse efficiency
  • Lock files ensure reproducible builds across environments

https://www.linkedin.com/in/sushilbaligar/
https://github.com/sushilbaligar
https://dev.to/sushilbaligar
https://medium.com/@sushilbaligar


Sushil Baligar | Sciencx (2025-07-26T06:40:53+00:00) Modern Bazel with Python-Module 4: Caching and Dependencies. Retrieved from https://www.scien.cx/2025/07/26/modern-bazel-with-python-module-4-caching-and-dependencies/
