This content originally appeared on DEV Community and was authored by SIDDHARTH PATIL
Pothole Detection Model - Complete Training & Implementation Guide
Table of Contents
- Dataset Acquisition
- Dataset Preparation with Roboflow
- Model Training Process
- Understanding the .pt Weight File
- YOLOv8 Implementation Code
- Complete Workflow
- Key Concepts Explained
- Troubleshooting
- Performance Metrics
1. Dataset Acquisition
1.1 Source: Kaggle
Platform: Kaggle (kaggle.com)
Dataset Selection Process:
- Searched for "pothole detection dataset" on Kaggle
- Selected dataset with diverse pothole images
- Typical dataset characteristics:
- 500-2000+ images of roads with potholes
- Various lighting conditions
- Different road types and pothole sizes
- Mix of annotated and unannotated images
Download Process:
# Method 1: Manual Download
1. Navigate to Kaggle dataset page
2. Click "Download" button
3. Extract ZIP file to local directory
# Method 2: Kaggle API
kaggle datasets download -d <owner>/<dataset-name>
unzip <dataset-name>.zip
1.2 Dataset Structure
pothole_dataset/
│
├── images/
│   ├── image001.jpg
│   ├── image002.jpg
│   └── ...
│
└── annotations/
    ├── image001.txt
    ├── image002.txt
    └── ...
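Before uploading, it can help to confirm that every image has a matching annotation file. The following is a minimal sketch that assumes the layout above, where an image and its annotation share the same file stem. The function name `find_unpaired` is hypothetical and not part of any library.

```python
from pathlib import Path


def find_unpaired(image_names, annotation_names):
    """Return (images without annotations, annotations without images).

    Files are matched by stem, so image001.jpg pairs with image001.txt.
    """
    image_stems = {Path(name).stem for name in image_names}
    label_stems = {Path(name).stem for name in annotation_names}
    missing_labels = sorted(image_stems - label_stems)
    orphan_labels = sorted(label_stems - image_stems)
    return missing_labels, orphan_labels


# Usage against the folder layout above (skipped if the folder is absent).
root = Path("pothole_dataset")
if root.is_dir():
    images = [p.name for p in (root / "images").glob("*.jpg")]
    labels = [p.name for p in (root / "annotations").glob("*.txt")]
    missing, orphans = find_unpaired(images, labels)
    print(f"{len(missing)} image(s) without annotations")
    print(f"{len(orphans)} annotation(s) without images")
```

Images that turn up in the first list are the ones that would need manual annotation in Roboflow.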
2. Dataset Preparation with Roboflow
2.1 Why Roboflow?
Roboflow is a computer vision platform that simplifies:
- Dataset organization and management
- Image annotation and labeling
- Data augmentation
- Format conversion (to YOLO format)
- Train/Validation/Test splitting
- Model training integration
2.2 Roboflow Workflow
Step 1: Create Project
- Sign up at roboflow.com
- Create new project: "Pothole Detection"
- Select project type: Object Detection
- Choose annotation group: Single Class (Pothole)
Step 2: Upload Dataset
1. Click "Upload" → Select images from Kaggle dataset
2. Roboflow automatically processes images
3. Wait for upload completion (shows progress bar)
Step 3: Annotation
- If images are pre-annotated (COCO/PASCAL VOC format):
  - Roboflow auto-imports annotations
  - Review and verify bounding boxes
- If manual annotation needed:
  - Use Roboflow's annotation tool
  - Draw bounding boxes around each pothole
  - Label as "pothole"
  - Save annotations
Step 4: Dataset Augmentation
Applied augmentation techniques:
Preprocessing:
- Auto-Orient: Correct image orientation
- Resize: 640x640 pixels (the default YOLOv8 input size)
Augmentation:
- Rotation: ±15 degrees
- Brightness: ±25%
- Exposure: ±25%
- Blur: Up to 2px
- Flip: Horizontal
These settings generate several variants of each image, expanding the dataset roughly 3-5x.
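Any augmentation that moves pixels also has to move the labels. As an illustration (not Roboflow's actual code), here is how a horizontal flip would change one bounding box, assuming the normalized YOLO format `(x_center, y_center, width, height)`. The function name and the sample values are made up for the example.

```python
def flip_box_horizontal(box):
    """Mirror a normalized YOLO box (x_center, y_center, width, height).

    Only the x-center changes: a box centered 30% from the left edge ends up
    30% from the right edge. Width and height are unaffected.
    """
    x_center, y_center, width, height = box
    return (1.0 - x_center, y_center, width, height)


original = (0.30, 0.60, 0.20, 0.10)  # illustrative values
flipped = flip_box_horizontal(original)
print(flipped)
```

Roboflow handles this bookkeeping automatically when it generates the augmented images.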
Step 5: Generate Dataset Version
1. Split data:
- Train: 70%
- Validation: 20%
- Test: 10%
2. Export format: YOLOv8
3. Generate → Roboflow creates downloadable dataset
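Roboflow performs the split for you, but the idea is easy to reproduce locally. A minimal sketch, assuming a plain list of file names and a fixed seed so the split is repeatable. The function name `split_dataset` and the placeholder file names are hypothetical.

```python
import random


def split_dataset(filenames, train=0.7, val=0.2, seed=42):
    """Shuffle filenames reproducibly and split into train/val/test lists.

    The test split receives whatever remains after train and val.
    """
    shuffled = sorted(filenames)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


files = [f"image{i:03d}.jpg" for i in range(1, 101)]  # placeholder names
train_files, val_files, test_files = split_dataset(files)
print(len(train_files), len(val_files), len(test_files))
```

Shuffling before splitting matters when consecutive images come from the same stretch of road; otherwise one split could end up with all the images of a single scene.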
Step 6: Download Training Code
Roboflow provides ready-to-use code snippet:
from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("workspace-name").project("pothole-detection")
dataset = project.version(1).download("yolov8")
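The downloaded folder includes a `data.yaml` file, which is what the training script later passes to YOLOv8. The sketch below prints what a minimal single-class config could look like. The keys (`train`, `val`, `test`, `nc`, `names`) follow the Ultralytics dataset format, but the folder names are assumptions; check the file Roboflow actually generated.

```python
def build_data_yaml(root, class_names):
    """Return the text of a minimal Ultralytics-style dataset config."""
    lines = [
        f"train: {root}/train/images",
        f"val: {root}/valid/images",
        f"test: {root}/test/images",
        f"nc: {len(class_names)}",       # number of classes
        f"names: {class_names}",         # class index -> class name
    ]
    return "\n".join(lines) + "\n"


# "Pothole-Detection-1" is a placeholder for the downloaded folder name.
print(build_data_yaml("Pothole-Detection-1", ["pothole"]))
```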
3. Model Training Process
3.1 Training Script: potholewrightfile.py
Purpose: This Python script trains the YOLOv8 model on your custom pothole dataset.
Complete Training Code:
"""
potholewrightfile.py
Custom YOLOv8 Pothole Detection Model Training Script
"""
from ultralytics import YOLO
from roboflow import Roboflow
import os
# ====================
# 1. DATASET DOWNLOAD
# ====================
print("📥 Downloading dataset from Roboflow...")
rf = Roboflow(api_key="YOUR_ROBOFLOW_API_KEY")
project = rf.workspace("your-workspace").project("pothole-detection")
dataset = project.version(1).download("yolov8")
print(f"✅ Dataset downloaded to: {dataset.location}")
# ====================
# 2. MODEL INITIALIZATION
# ====================
print("\n🤖 Initializing YOLOv8 model...")
# Start with pre-trained YOLOv8 base model
model = YOLO('yolov8n.pt') # Options: yolov8n/s/m/l/x (nano to extra-large)
print("✅ Model loaded successfully")
# ====================
# 3. TRAINING CONFIGURATION
# ====================
print("\n⚙️ Configuring training parameters...")
training_config = {
    'data': f'{dataset.location}/data.yaml',  # Path to dataset config
    'epochs': 100,                  # Number of full passes over the training set
    'imgsz': 640,                   # Image size (640x640)
    'batch': 16,                    # Batch size (adjust based on GPU memory)
    'name': 'pothole_detection',    # Experiment name
    'patience': 20,                 # Early stopping patience (epochs without improvement)
    'save': True,                   # Save checkpoints
    'device': 0,                    # GPU device (0 = first GPU, 'cpu' for CPU)
    'workers': 8,                   # Number of data loader workers
    'project': 'runs/detect',       # Project directory
    'exist_ok': True,               # Reuse the run directory if it already exists
    'pretrained': True,             # Use pretrained weights
    'optimizer': 'auto',            # Optimizer (auto/SGD/Adam/AdamW)
    'verbose': True,                # Verbose output
    'seed': 42,                     # Random seed for reproducibility
    'deterministic': True,          # Deterministic training
    'single_cls': False,            # Treat all classes as a single class
    'rect': False,                  # Rectangular training
    'cos_lr': False,                # Cosine learning rate scheduler
    'close_mosaic': 10,             # Disable mosaic augmentation in last N epochs
    'resume': False,                # Resume from checkpoint
    'amp': True,                    # Automatic Mixed Precision training
    'fraction': 1.0,                # Fraction of dataset to train on
    'profile': False,               # Profile ONNX and TensorRT speeds
    'lr0': 0.01,                    # Initial learning rate
    'lrf': 0.01,                    # Final learning rate factor (final lr = lr0 * lrf)
    'momentum': 0.937,              # SGD momentum/Adam beta1
    'weight_decay': 0.0005,         # Optimizer weight decay
    'warmup_epochs': 3.0,           # Warmup epochs
    'warmup_momentum': 0.8,         # Warmup initial momentum
    'warmup_bias_lr': 0.1,          # Warmup initial bias learning rate
    'box': 7.5,                     # Box loss gain
    'cls': 0.5,                     # Class loss gain
    'dfl': 1.5,                     # DFL loss gain
    'pose': 12.0,                   # Pose loss gain (unused for detection)
    'kobj': 2.0,                    # Keypoint object loss gain (unused for detection)
    'label_smoothing': 0.0,         # Label smoothing epsilon
    'nbs': 64,                      # Nominal batch size
    'hsv_h': 0.015,                 # HSV-Hue augmentation
    'hsv_s': 0.7,                   # HSV-Saturation augmentation
    'hsv_v': 0.4,                   # HSV-Value augmentation
    'degrees': 0.0,                 # Rotation augmentation
    'translate': 0.1,               # Translation augmentation
    'scale': 0.5,                   # Scaling augmentation
    'shear': 0.0,                   # Shear augmentation
    'perspective': 0.0,             # Perspective augmentation
    'flipud': 0.0,                  # Flip up-down augmentation probability
    'fliplr': 0.5,                  # Flip left-right augmentation probability
    'mosaic': 1.0,                  # Mosaic augmentation probability
    'mixup': 0.0,                   # MixUp augmentation probability
    'copy_paste': 0.0,              # Copy-paste augmentation probability
}
# ====================
# 4. START TRAINING
# ====================
print("\n🚀 Starting training process...\n")
print("="*60)
results = model.train(**training_config)
print("\n" + "="*60)
print("✅ Training completed successfully!")
# ====================
# 5. SAVE MODEL
# ====================
print("\n💾 Saving trained model...")
# The best weights are automatically saved as 'best.pt'
# Rename to 'pothole.pt' for clarity
import shutil
best_model_path = 'runs/detect/pothole_detection/weights/best.pt'
output_path = 'pothole.pt'
if os.path.exists(best_model_path):
    shutil.copy(best_model_path, output_path)
    print(f"✅ Model saved as: {output_path}")
else:
    print("⚠️ Best model not found. Check training directory.")
# ====================
# 6. MODEL VALIDATION
# ====================
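print("\n🔍 Validating model on the validation split...")
# model.val() evaluates the 'val' split by default;
# pass split='test' to evaluate on the held-out test images instead
validation_results = model.val()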
print("\n📊 Validation Metrics:")
print(f"mAP50: {validation_results.box.map50:.4f}")
print(f"mAP50-95: {validation_results.box.map:.4f}")
print(f"Precision: {validation_results.box.mp:.4f}")
print(f"Recall: {validation_results.box.mr:.4f}")
# ====================
# 7. EXPORT MODEL
# ====================
print("\n📦 Exporting model to different formats...")
# Export to ONNX (for deployment)
model.export(format='onnx')
print("✅ ONNX model exported")
# Export to TensorRT (for NVIDIA devices - optional)
# model.export(format='engine')
# Export to TensorFlow Lite (optional)
# model.export(format='tflite')
print("\n" + "="*60)
print("🎉 Training pipeline completed successfully!")
print("="*60)
print(f"\n📁 Trained model location: {output_path}")
print(f"📊 Training results: runs/detect/pothole_detection/")
print(f"📈 View results using: tensorboard --logdir runs/detect/pothole_detection/")
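The `lr0` and `lrf` entries in the training configuration work together: `lrf` is a factor, so the final learning rate is `lr0 * lrf`. The sketch below shows the linear decay that, as far as I understand the Ultralytics trainer, applies when `cos_lr` is False. It ignores warmup, and the function name `learning_rate` is made up for this example.

```python
def learning_rate(epoch, epochs=100, lr0=0.01, lrf=0.01):
    """Learning rate at a given epoch under a linear decay from lr0 to lr0 * lrf.

    Warmup is ignored here for simplicity.
    """
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


for epoch in (0, 50, 100):
    print(f"epoch {epoch:3d}: lr = {learning_rate(epoch):.6f}")
```

One caveat: with `'optimizer': 'auto'`, Ultralytics may pick the optimizer and its initial learning rate itself, in which case `lr0` would not apply as written. Setting the optimizer explicitly (for example `'SGD'`) should make the configured values take effect.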
3.2 Running the Training Script
# Install required packages
pip install ultralytics roboflow opencv-python torch torchvision
# Run training
python potholewrightfile.py
3.3 Training Output
The script creates the following structure:
runs/detect/pothole_detection/
│
├── weights/
│ ├── best.pt # Best performing model
│ └── last.pt # Last epoch model
│
├── confusion_matrix.png # Classification confusion matrix
├── results.csv # Training metrics per epoch
├── results.png # Training curves (loss, mAP, etc.)
├── F1_curve.png # F1 score curve
├── P_curve.png # Precision curve
├── R_curve.png # Recall curve
├── PR_curve.png # Precision-Recall curve
└── val_batch0_pred.jpg # Validation predictions sample
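The `results.csv` file is handy for checking which epoch performed best without opening the plots. A minimal sketch using only the standard library, assuming the file has an `epoch` column and a `metrics/mAP50(B)` column (column names may differ between Ultralytics versions). The rows below are synthetic stand-ins, not real training output.

```python
import csv


def best_epoch(lines, metric="metrics/mAP50(B)"):
    """Return (epoch, value) for the row with the highest `metric`.

    `lines` is any iterable of CSV text lines, such as an open file.
    Header names are stripped because they may be padded with spaces.
    """
    reader = csv.reader(lines)
    header = [name.strip() for name in next(reader)]
    epoch_col = header.index("epoch")
    metric_col = header.index(metric)
    best = None
    for row in reader:
        if not row:
            continue
        epoch, value = int(float(row[epoch_col])), float(row[metric_col])
        if best is None or value > best[1]:
            best = (epoch, value)
    return best


# Synthetic rows standing in for a real results.csv
sample = [
    "   epoch,   metrics/mAP50(B)",
    "       1,   0.41",
    "       2,   0.63",
    "       3,   0.58",
]
print(best_epoch(sample))
```

For a real run, open `runs/detect/pothole_detection/results.csv` with `open(path, newline="")` and pass the file object to `best_epoch`.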
4. Understanding the .pt Weight File
4.1 What is a .pt File?
.pt = PyTorch file extension
- Contains trained neural network weights (parameters)
- Stores model architecture information
- Includes the training configuration; mid-training checkpoints also hold optimizer state, which Ultralytics strips from the final best.pt and last.pt
- Binary format (not human-readable)
- Typical size: 6MB (nano) to 140MB (extra-large)
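The size range follows from the parameter counts. A rough estimate, assuming the final weights are stored in half precision (2 bytes per parameter) and ignoring metadata. The parameter counts are the approximate figures from the Ultralytics YOLOv8 model table.

```python
# Approximate parameter counts (millions) for each YOLOv8 detection model.
PARAMS_MILLIONS = {"yolov8n": 3.2, "yolov8s": 11.2, "yolov8m": 25.9,
                   "yolov8l": 43.7, "yolov8x": 68.2}


def estimated_size_mb(params_millions, bytes_per_param=2):
    """Estimate weight-file size in MB, assuming FP16 storage by default."""
    return params_millions * bytes_per_param


for name, params in PARAMS_MILLIONS.items():
    print(f"{name}: ~{estimated_size_mb(params):.0f} MB")
```

Checkpoints saved during training are larger because they also carry optimizer state.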
4.2 What's Inside pothole.pt?
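import torch
# Load the .pt file (requires the ultralytics package to be installed).
# PyTorch 2.6+ defaults to weights_only=True, which rejects this pickled
# checkpoint, so pass weights_only=False, and only for files you trust.
model_data = torch.load('pothole.pt', map_location='cpu', weights_only=False)
print(model_data.keys())
# Typical contents (exact keys vary by Ultralytics version):
# {
#     'model': <model object holding the trained weights>,
#     'optimizer': <optimizer state; None in the final, stripped weights>,
#     'train_results': <loss, mAP, metrics>,
#     'epoch': <last trained epoch number; -1 in the final, stripped weights>,
#     'date': <training completion date>,
#     'version': <Ultralytics package version>,
# }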
4.3 How YOLOv8 Uses pothole.pt
When you load the model:
model = YOLO('pothole.pt')
YOLOv8 does the following:
- Reads file: Loads binary weights from disk
- Reconstructs architecture: Builds neural network layers
- Applies weights: Sets each neuron's learned parameters
- Prepares for inference: Model ready to detect potholes
Think of it like a brain transplant:
- yolov8n.pt = Generic brain (knows common objects)
- pothole.pt = Specialized brain (expert at finding potholes)
5. YOLOv8 Implementation Code
5.1 Basic Detection Script
"""
pothole_detector.py
Real-time pothole detection using trained YOLOv8 model
"""
from ultralytics import YOLO
import cv2
import numpy as np
from datetime import datetime
# ====================
# 1. LOAD TRAINED MODEL
# ====================
print("🔄 Loading pothole detection model...")
model = YOLO('pothole.pt') # Load your custom trained weights
print("✅ Model loaded successfully\n")
# ====================
# 2. CONFIGURATION
# ====================
CONFIDENCE_THRESHOLD = 0.5 # Minimum confidence for detection
VIDEO_SOURCE = 'road_video.mp4' # Video file or 0 for webcam
OUTPUT_VIDEO = 'pothole_detected.mp4'
SHOW_CONFIDENCE = True
SAVE_VIDEO = True
# ====================
# 3. VIDEO CAPTURE
# ====================
cap = cv2.VideoCapture(VIDEO_SOURCE)
if not cap.isOpened():
    print("❌ Error: Cannot open video source")
    raise SystemExit(1)
# Get video properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS)) or 30  # Some sources (e.g., webcams) report 0; fall back to 30
print(f"📹 Video Info:")
print(f" Resolution: {frame_width}x{frame_height}")
print(f" FPS: {fps}\n")
# ====================
# 4. VIDEO WRITER (Optional)
# ====================
if SAVE_VIDEO:
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(OUTPUT_VIDEO, fourcc, fps, (frame_width, frame_height))
# ====================
# 5. DETECTION LOOP
# ====================
frame_count = 0
total_potholes_detected = 0
print("🚀 Starting detection...\n")
print("Press 'q' to quit, 'p' to pause\n")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        print("\n📹 End of video or error reading frame")
        break
    frame_count += 1
    # ---------------------
    # Run YOLOv8 Detection
    # ---------------------
    results = model(frame, conf=CONFIDENCE_THRESHOLD)
    # Extract detection information
    detections = results[0].boxes
    num_potholes = len(detections)
    # Running sum of per-frame detections: a pothole visible in several
    # frames is counted once per frame, not once overall
    total_potholes_detected += num_potholes
    # ---------------------
    # Draw Bounding Boxes
    # ---------------------
    annotated_frame = frame.copy()
    for detection in detections:
        # Get bounding box coordinates
        x1, y1, x2, y2 = map(int, detection.xyxy[0])
        # Get confidence score
        confidence = float(detection.conf[0])
        # Get class (should be 'pothole')
        class_id = int(detection.cls[0])
        class_name = model.names[class_id]
        # Draw bounding box
        cv2.rectangle(annotated_frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
        # Create label
        if SHOW_CONFIDENCE:
            label = f"{class_name}: {confidence:.2f}"
        else:
            label = class_name
        # Draw label background
        label_size, _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 2)
        cv2.rectangle(annotated_frame,
                      (x1, y1 - label_size[1] - 10),
                      (x1 + label_size[0], y1),
                      (0, 0, 255), -1)
        # Draw label text
        cv2.putText(annotated_frame, label, (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)
    # ---------------------
    # Add Info Overlay
    # ---------------------
    info_text = [
        f"Frame: {frame_count}",
        f"Potholes in frame: {num_potholes}",
        f"Total detected: {total_potholes_detected}",
        f"Source FPS: {fps}"  # Frame rate of the input video, not processing speed
    ]
    y_offset = 30
    for text in info_text:
        cv2.putText(annotated_frame, text, (10, y_offset),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        y_offset += 30
    # ---------------------
    # Display Frame
    # ---------------------
    cv2.imshow('Pothole Detection', annotated_frame)
    # Save frame to output video
    if SAVE_VIDEO:
        out.write(annotated_frame)
    # ---------------------
    # Keyboard Controls
    # ---------------------
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):
        print("\n⏹️ Stopping detection...")
        break
    elif key == ord('p'):
        print("\n⏸️ Paused. Press any key to continue...")
        cv2.waitKey(0)
    # Print progress every 30 frames
    if frame_count % 30 == 0:
        print(f"Processed {frame_count} frames | Potholes: {total_potholes_detected}")
# ====================
# 6. CLEANUP
# ====================
cap.release()
if SAVE_VIDEO:
out.release()
cv2.destroyAllWindows()
# ====================
# 7. SUMMARY
# ====================
print("\n" + "="*60)
print("📊 DETECTION SUMMARY")
print("="*60)
print(f"Total frames processed: {frame_count}")
print(f"Total potholes detected: {total_potholes_detected}")
if frame_count:  # Avoid division by zero when no frames were read
    print(f"Average potholes per frame: {total_potholes_detected/frame_count:.2f}")
if SAVE_VIDEO:
    print(f"Output saved to: {OUTPUT_VIDEO}")
print("="*60)
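The detection script overlays the frame rate reported by the video source, which says nothing about how fast the model is actually processing frames. One way to measure real throughput is a small rolling-average meter. This is a sketch, not part of the original script; the class name `FPSMeter` is hypothetical, and the timestamps here are synthetic so the example runs without a camera.

```python
from collections import deque


class FPSMeter:
    """Average frames per second over the most recent `window` frames."""

    def __init__(self, window=30):
        self.timestamps = deque(maxlen=window)

    def tick(self, timestamp):
        """Record a frame timestamp (seconds) and return the current FPS."""
        self.timestamps.append(timestamp)
        if len(self.timestamps) < 2:
            return 0.0
        elapsed = self.timestamps[-1] - self.timestamps[0]
        if elapsed <= 0:
            return 0.0
        return (len(self.timestamps) - 1) / elapsed


meter = FPSMeter(window=5)
for i in range(10):
    fps_now = meter.tick(i * 0.05)  # synthetic: one frame every 50 ms
print(f"{fps_now:.1f}")
```

Inside the detection loop you would call `meter.tick(time.perf_counter())` once per frame and draw the returned value in the overlay.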
5.2 Image Detection Script
"""
detect_image.py
Detect potholes in a single image
"""
from ultralytics import YOLO
import cv2
# Load model
model = YOLO('pothole.pt')
# Load image
image_path = 'road_image.jpg'
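image = cv2.imread(image_path)
if image is None:  # cv2.imread returns None instead of raising on a bad path
    raise FileNotFoundError(f"Cannot read image: {image_path}")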
# Run detection
results = model(image, conf=0.5)
# Get annotated image
annotated_image = results[0].plot()
# Display
cv2.imshow('Pothole Detection', annotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
# Save result
cv2.imwrite('detected_potholes.jpg', annotated_image)
print("✅ Detection complete. Result saved as 'detected_potholes.jpg'")
5.3 Batch Processing Script
"""
batch_detect.py
Process multiple images from a folder
"""
from ultralytics import YOLO
import cv2
import os
from pathlib import Path
# Configuration
INPUT_FOLDER = 'input_images/'
OUTPUT_FOLDER = 'output_images/'
model = YOLO('pothole.pt')
# Create output folder
Path(OUTPUT_FOLDER).mkdir(parents=True, exist_ok=True)
# Get all images
image_files = [f for f in os.listdir(INPUT_FOLDER)
               if f.lower().endswith(('.jpg', '.jpeg', '.png'))]
print(f"📁 Found {len(image_files)} images to process\n")
# Process each image
for idx, filename in enumerate(image_files, 1):
    print(f"Processing {idx}/{len(image_files)}: {filename}")
    # Read image
    image_path = os.path.join(INPUT_FOLDER, filename)
    image = cv2.imread(image_path)
    if image is None:  # Skip files OpenCV cannot decode
        print("  ⚠️ Skipped (could not read image)\n")
        continue
    # Detect
    results = model(image, conf=0.5)
    annotated = results[0].plot()
    # Save
    output_path = os.path.join(OUTPUT_FOLDER, f"detected_{filename}")
    cv2.imwrite(output_path, annotated)
    # Count detections
    num_potholes = len(results[0].boxes)
    print(f"  ✅ Detected {num_potholes} pothole(s)\n")
print("🎉 Batch processing complete!")
6. Complete Workflow
Step-by-Step Process:
1. DATASET ACQUISITION (Kaggle)
├── Search for pothole dataset
├── Download dataset (images + annotations)
└── Extract to local folder
↓
2. DATASET PREPARATION (Roboflow)
├── Create project
├── Upload images
├── Annotate/verify annotations
├── Apply augmentations
├── Split train/val/test
└── Export as YOLOv8 format
↓
3. MODEL TRAINING (potholewrightfile.py)
├── Download dataset from Roboflow
├── Initialize YOLOv8 base model
├── Configure training parameters
├── Train for 100 epochs
├── Validate performance
└── Save best weights as pothole.pt
↓
4. MODEL DEPLOYMENT (YOLOv8 + OpenCV)
├── Load pothole.pt weights
├── Initialize video capture
├── Process frames in loop:
│ ├── Read frame
│ ├── Run YOLOv8 inference
│ ├── Extract bounding boxes
│ ├── Draw annotations
│ └── Display/save results
└── Generate detection summary
Technical Flow Diagram:
Kaggle Dataset → Roboflow Processing → YOLOv8 Training → pothole.pt
                                                              ↓
                                                          Inference
                                                              ↓
                          Video/Image Input → OpenCV → YOLOv8 → Detections
7. Key Concepts Explained
7.1 Transfer Learning
- Start with yolov8n.pt (pre-trained on COCO dataset)
- Fine-tune on pothole-specific images
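- Model reuses the general visual features learned on COCO (edges, textures, shapes) while learning pothole-specific ones; the fine-tuned model outputs only the classes in your dataset, not the original COCO classes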
7.2 Data Augmentation
- Creates artificial variations of training images
- Reduces overfitting
- Improves model generalization
- Examples: rotation, brightness, flipping
7.3 Epochs
- One complete pass through entire training dataset
- 100 epochs = model sees all training images 100 times
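- More epochs = better learning up to a point; beyond it the model starts to overfit, which is why the training config sets patience=20 for early stopping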
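The `patience` setting in the training config is what cuts training short when extra epochs stop helping. A simplified sketch of the idea, assuming one fitness score per epoch (higher is better). The function name and the fitness values are illustrative, and the real Ultralytics implementation may differ in detail.

```python
def stopping_epoch(fitness_per_epoch, patience=20):
    """Return the epoch (0-based) at which training would stop early.

    Training stops once `patience` epochs have passed without a new best
    fitness value. Returns None if that never happens.
    """
    best_fitness = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch):
        if fitness > best_fitness:
            best_fitness, best_epoch = fitness, epoch
        if epoch - best_epoch >= patience:
            return epoch
    return None


# Illustrative curve: improves for five epochs, then plateaus slightly lower.
curve = [0.1, 0.2, 0.3, 0.4, 0.5] + [0.45] * 10
print(stopping_epoch(curve, patience=3))
```

The weights from the best epoch are the ones kept as `best.pt`, so stopping early does not lose the best model.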
7.4 Confidence Threshold
- Minimum score for detection to be considered valid
- 0.5 = 50% confidence
- Higher threshold = fewer false positives, more missed detections
- Lower threshold = more detections, more false positives
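The trade-off is easiest to see with numbers. The detections below are made up for illustration: each entry is a confidence score plus whether the detection was a real pothole.

```python
# Illustrative (made-up) detections: (confidence, is_real_pothole)
detections = [(0.92, True), (0.81, True), (0.64, False), (0.55, True),
              (0.43, True), (0.38, False), (0.21, False)]


def count_at_threshold(detections, threshold):
    """Return (true positives, false positives, missed potholes) at a threshold."""
    kept = [is_real for conf, is_real in detections if conf >= threshold]
    true_pos = sum(kept)
    false_pos = len(kept) - true_pos
    missed = sum(is_real for _, is_real in detections) - true_pos
    return true_pos, false_pos, missed


for threshold in (0.3, 0.5, 0.7):
    tp, fp, missed = count_at_threshold(detections, threshold)
    print(f"conf>={threshold}: {tp} correct, {fp} false alarms, {missed} missed")
```

Raising the threshold removes false alarms but also drops the real potholes the model was less sure about.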
7.5 Bounding Box
- Rectangle drawn around detected pothole
- Defined by coordinates: (x1, y1, x2, y2)
- (x1, y1) = top-left corner
- (x2, y2) = bottom-right corner
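The corner format above is what the detection script reads from `detection.xyxy`. YOLO label files use a different convention: center point plus width and height, all divided by the image size. A short conversion sketch (the function name is hypothetical, the sample box is arbitrary):

```python
def xyxy_to_yolo(x1, y1, x2, y2, image_width, image_height):
    """Convert pixel corner coordinates to normalized YOLO (xc, yc, w, h)."""
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return x_center, y_center, width, height


print(xyxy_to_yolo(160, 240, 480, 400, image_width=640, image_height=640))
```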
8. Troubleshooting
Common Issues:
Issue: "CUDA out of memory" error
Solution: Reduce batch size in training config (e.g., batch=8)
Issue: Low detection accuracy
Solution:
- Increase training epochs
- Add more diverse training images
- Adjust confidence threshold
- Check annotation quality
Issue: Model detects non-potholes
Solution:
- Add hard negative examples to training set
- Increase training epochs
- Adjust confidence threshold higher
Issue: Slow inference speed
Solution:
- Use smaller model (yolov8n instead of yolov8x)
- Reduce input image size
- Use GPU instead of CPU
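On the "CUDA out of memory" fix: lowering the batch size does not have to change the effective batch the optimizer sees. As far as I understand the Ultralytics trainer, it accumulates gradients over several small batches until it reaches the nominal batch size (`nbs`, 64 in the training config). The sketch below shows that arithmetic; the function name is made up.

```python
def effective_batch(batch, nbs=64):
    """Return (accumulation steps, effective batch size) for a given batch size."""
    accumulate = max(round(nbs / batch), 1)
    return accumulate, accumulate * batch


for batch in (16, 8, 4):
    steps, effective = effective_batch(batch)
    print(f"batch={batch}: accumulate {steps} step(s), effective batch {effective}")
```

So dropping from batch=16 to batch=8 mostly costs training time, not training quality.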
9. Performance Metrics
Training Metrics Explained:
- mAP50: Mean Average Precision at 50% IoU threshold (higher is better)
- mAP50-95: mAP averaged over IoU thresholds 50%-95% (more strict)
- Precision: Percentage of correct detections out of all detections
- Recall: Percentage of actual potholes that were detected
- F1 Score: Harmonic mean of precision and recall
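These definitions can be checked by hand. The sketch below computes IoU for two boxes in `(x1, y1, x2, y2)` format, then precision, recall, and F1 from counts of true positives, false positives, and false negatives. The boxes and counts are arbitrary example values.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from detection counts."""
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Two 100x100 boxes overlapping by half: IoU is 1/3, below the 0.5 cutoff
# that mAP50 uses to count a detection as correct.
print(f"IoU: {iou((0, 0, 100, 100), (50, 0, 150, 100)):.3f}")

# 8 correct detections, 2 false alarms, 4 missed potholes.
p, r, f1 = precision_recall_f1(true_pos=8, false_pos=2, false_neg=4)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f1:.3f}")
```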
Target Performance (rough rules of thumb; acceptable values depend on the dataset and use case):
- mAP50: > 0.70 (Good)
- Precision: > 0.75 (Good)
- Recall: > 0.70 (Good)
SIDDHARTH PATIL | Sciencx (2025-10-06T04:15:04+00:00) smart suspension- raj patil. Retrieved from https://www.scien.cx/2025/10/06/smart-suspension-raj-patil/