UrbanVision Segmentation Model v1

📋 Model Description

UrbanVision Segmentation v1 is a segmentation model based on the YOLO11x-seg architecture, trained specifically to segment the elements of urban environments. The model recognizes and segments 48 classes and targets applications such as:

🚗 Autonomous Driving
🏙️ Urban Planning & Analysis
🚦 Traffic Monitoring
🛣️ Road Infrastructure Assessment
📊 Smart City Solutions

🎯 Key Features

✅ 48 segmentation classes covering urban elements
✅ YOLO11x-seg architecture - Extra-large YOLO11 segmentation variant
✅ 640x640 input resolution - Balance between speed & accuracy
✅ Pretrained weights - Ready for inference or fine-tuning
✅ 100 epochs training - Full schedule, no early stopping
✅ MIT License - Free for commercial use

📊 Dataset Information

Dataset: khanhromvn/urbanvision

Split	Images	Percentage
Train	3,000	93.0%
Validation	151	4.7%
Test	74	2.3%
Total	3,225	100%
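
The percentages in the table follow directly from the image counts. A minimal sketch that recomputes them; the `splits` dictionary only restates the table, and the variable names are illustrative:

```python
# Image counts per split, copied from the dataset table.
splits = {"Train": 3000, "Validation": 151, "Test": 74}

total = sum(splits.values())

# Percentage share of each split, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

for name, share in shares.items():
    print(f"{name}: {splits[name]} images ({share}%)")
print(f"Total: {total} images")
```

The validation and test splits are small (151 and 74 images), so per-class metrics on rare classes may be noisy.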

🏷️ Classes (48 Categories)

Vehicles & Transportation

car, bus, truck, motorcycle, motorbike, bicycle, train, trailer, caravan, autorickshaw

Road Infrastructure

road, drivable fallback, non-drivable fallback, paved path, footpath, sidewalk, curb, pothole

Road Elements

traffic light, traffic sign, pole, polegroup, guard rail, rail track, license plate

Structures

building, bridge, tunnel, wall, fence, billboard

Pedestrians & Animals

person, rider, animal

Environment

vegetation, sky, ground, open area, shallow, stairs

Special Categories

parking, ego vehicle, vehicle fallback, obs-str-bar-fallback, fallback background, out of roi, rectification border, unlabeled
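
For downstream filtering it can help to keep these groups in code. The sketch below restates the class lists as a plain dictionary and checks that they add up to 48. The names `CLASS_GROUPS` and `GROUP_OF` are illustrative and not part of the released model. The numeric class IDs are not listed here; read them from `model.names` after loading the weights.

```python
# Class names grouped as in the class list of this model card.
CLASS_GROUPS = {
    "Vehicles & Transportation": [
        "car", "bus", "truck", "motorcycle", "motorbike", "bicycle",
        "train", "trailer", "caravan", "autorickshaw",
    ],
    "Road Infrastructure": [
        "road", "drivable fallback", "non-drivable fallback", "paved path",
        "footpath", "sidewalk", "curb", "pothole",
    ],
    "Road Elements": [
        "traffic light", "traffic sign", "pole", "polegroup",
        "guard rail", "rail track", "license plate",
    ],
    "Structures": ["building", "bridge", "tunnel", "wall", "fence", "billboard"],
    "Pedestrians & Animals": ["person", "rider", "animal"],
    "Environment": ["vegetation", "sky", "ground", "open area", "shallow", "stairs"],
    "Special Categories": [
        "parking", "ego vehicle", "vehicle fallback", "obs-str-bar-fallback",
        "fallback background", "out of roi", "rectification border", "unlabeled",
    ],
}

# Reverse lookup: class name -> group name.
GROUP_OF = {name: group for group, names in CLASS_GROUPS.items() for name in names}

total_classes = sum(len(names) for names in CLASS_GROUPS.values())
print(total_classes)
```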

⚙️ Training Configuration

Parameter	Value
Architecture	YOLO11x-seg
Pretrained	Yes
Epochs	100
Image Size	640x640
Batch Size	Auto
Device	Auto (GPU/CPU)
Patience	100
Cache	None
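
One consequence of this configuration: with Patience equal to Epochs, patience-based early stopping cannot end the run before the final epoch, so all 100 epochs are trained. The sketch below is a simplified model of patience-based stopping, not the Ultralytics implementation; the function name `epochs_run` and the synthetic fitness values are made up for illustration.

```python
def epochs_run(fitness_per_epoch, patience):
    """Return how many epochs run under patience-based early stopping.

    Training stops once `patience` epochs have passed since the epoch
    with the best fitness value.
    """
    best_fitness = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch):
        if fitness > best_fitness:
            best_fitness, best_epoch = fitness, epoch
        if epoch - best_epoch >= patience:
            return epoch + 1  # stopped early after this epoch
    return len(fitness_per_epoch)


# Synthetic worst case: fitness peaks at the first epoch and never improves.
flat_run = [1.0] + [0.5] * 99

print(epochs_run(flat_run, patience=100))  # 100: the full schedule runs
print(epochs_run(flat_run, patience=10))   # 11: stops ten epochs after the best one
```

Even in the worst case, where the first epoch is the best, the gap to the best epoch reaches at most 99 within a 100-epoch run, which is below a patience of 100.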

🚀 Quick Start

Installation

pip install ultralytics
pip install huggingface_hub

Inference

from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Download model from Hugging Face
model_path = hf_hub_download(
    repo_id="khanhromvn/urbanvision-seg-v1",
    filename="best.pt"
)

# Load model
model = YOLO(model_path)

# Run inference
results = model("path/to/your/image.jpg")

# Display results
results[0].show()

# Save results
results[0].save("output.jpg")

Batch Inference

# Process multiple images
results = model(["image1.jpg", "image2.jpg", "image3.jpg"])

for i, result in enumerate(results):
    result.save(f"output_{i}.jpg")

Advanced Usage

# Custom inference parameters
results = model.predict(
    source="image.jpg",
    conf=0.25,        # Confidence threshold
    iou=0.7,          # IoU threshold
    imgsz=640,        # Image size
    device="cuda:0",  # GPU device
    save=True,        # Save results
    show_labels=True, # Show class labels
    show_conf=True    # Show confidence scores
)

# Access segmentation masks (masks is None when nothing is detected)
if results[0].masks is not None:
    masks = results[0].masks.data  # Segmentation masks
boxes = results[0].boxes.data  # Bounding boxes
classes = results[0].boxes.cls # Class IDs
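
The `iou` threshold in the call above is applied during non-maximum suppression: when two predicted boxes overlap by more than the threshold, the lower-confidence one is dropped. As a rough illustration of the quantity being compared, here is a minimal sketch of intersection-over-union for two axis-aligned boxes in `(x1, y1, x2, y2)` format. The helper `box_iou` is hypothetical and is not the routine Ultralytics uses internally.

```python
def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes, the second shifted right by half a box width.
overlap = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"IoU = {overlap:.3f}")
```

With `iou=0.7`, two boxes that overlap this little would both be kept; only near-duplicates are suppressed.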

📈 Performance Metrics

Note: Performance metrics will be updated soon with comprehensive evaluation results.

Expected capabilities (not yet benchmarked):

High accuracy on urban scene segmentation
Real-time inference capability
Robust performance across diverse lighting conditions
Accurate detection of small objects (traffic signs, poles)
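
Until measured numbers are published, the real-time expectation can be checked on your own hardware. The sketch below times any callable and reports throughput in calls per second. `measure_fps` and the stand-in workload are hypothetical; in practice you would pass something like `lambda: model(image)`.

```python
import time


def measure_fps(run_once, warmup=3, iterations=20):
    """Return the average number of `run_once` calls per second."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; replace with a real inference call.
fps = measure_fps(lambda: sum(range(10_000)))
print(f"{fps:.1f} calls per second")
```

The warm-up iterations matter for GPU inference, where the first few calls include one-time initialization costs.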

🎓 Model Training

Train from Pretrained YOLO11x-seg Weights

from ultralytics import YOLO

# Load pretrained YOLO11x-seg
model = YOLO("yolo11x-seg.pt")

# Train on custom dataset
results = model.train(
    data="urbanvision.yaml",
    epochs=100,
    imgsz=640,
    patience=100,
    batch=-1,  # Auto batch size
    # device is omitted so Ultralytics picks a GPU if available, else CPU
)

Fine-tuning

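from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Download the weights from Hugging Face, then load them for fine-tuning
model = YOLO(hf_hub_download(repo_id="khanhromvn/urbanvision-seg-v1", filename="best.pt"))

# Fine-tune on your dataset
results = model.train(
    data="your_dataset.yaml",
    epochs=50,
    imgsz=640
)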

📁 Model Files

urbanvision-seg-v1/
├── best.pt              # Best model weights
├── last.pt              # Last epoch weights
├── config.yaml          # Training configuration
└── README.md            # This file

🔧 Requirements

ultralytics>=8.3.0
torch>=2.0.0
torchvision>=0.15.0
opencv-python>=4.8.0
numpy>=1.23.0
Pillow>=9.5.0

📝 Citation

If you use this model in your research or projects, please cite:

@misc{urbanvision-seg-v1,
  author = {khanhromvn},
  title = {UrbanVision Segmentation Model v1},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/khanhromvn/urbanvision-seg-v1}},
}

🤝 Contributing

Contributions are welcome! If you find any issues or have suggestions:

Open an issue on the model repository
Submit a pull request with improvements
Share your results and use cases

📧 Contact

Author: khanhromvn
Dataset: khanhromvn/urbanvision
Model: khanhromvn/urbanvision-seg-v1

📜 License

This model is released under the MIT License. See LICENSE file for details.

You are free to:

✅ Use commercially
✅ Modify
✅ Distribute
✅ Private use

🙏 Acknowledgments

Ultralytics for the amazing YOLO11 framework
Hugging Face for the model hosting platform
Community contributors for dataset preparation and feedback

📊 Use Cases Examples

Autonomous Driving

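from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Real-time road scene segmentation
model = YOLO(hf_hub_download(repo_id="khanhromvn/urbanvision-seg-v1", filename="best.pt"))
class_id = {name: idx for idx, name in model.names.items()}  # class name -> ID

for result in model("dashcam_video.mp4", stream=True):
    if result.masks is None:  # nothing detected in this frame
        continue
    # Extract drivable area
    road_mask = result.masks[result.boxes.cls == class_id["road"]]
    # Process for path planning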

Traffic Monitoring

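# Count vehicles and pedestrians
results = model("traffic_camera.jpg")
# Classes counted as vehicles here; adjust to your needs
vehicle_names = {"car", "bus", "truck", "motorcycle", "motorbike", "bicycle", "autorickshaw"}
names = [model.names[int(c)] for c in results[0].boxes.cls]
vehicles = sum(name in vehicle_names for name in names)
pedestrians = sum(name == "person" for name in names)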

Urban Planning

# Analyze urban infrastructure
results = model("city_aerial.jpg")
# Extract building footprints, road networks, vegetation coverage
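
One way to turn instance masks into the per-class coverage mentioned in the comment above is sketched below. It assumes each mask is a 2D grid of 0/1 values (for example, obtained from `results[0].masks.data` via `.tolist()`) paired with a class name. `class_coverage` is a hypothetical helper, and plain Python loops are used for clarity rather than speed.

```python
def class_coverage(masks, class_names):
    """Return {class_name: fraction of the mask grid covered by that class}.

    Overlapping instances of the same class are counted once per pixel.
    """
    covered = {}  # class name -> set of (row, col) pixels
    height = width = 0
    for mask, name in zip(masks, class_names):
        height, width = len(mask), len(mask[0])
        pixels = covered.setdefault(name, set())
        for r, row in enumerate(mask):
            for c, value in enumerate(row):
                if value:
                    pixels.add((r, c))
    total = height * width
    if total == 0:
        return {}
    return {name: len(pixels) / total for name, pixels in covered.items()}


# Toy 2x2 example: two overlapping vegetation instances and one building.
toy_masks = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[0, 0], [0, 1]],
]
toy_names = ["vegetation", "vegetation", "building"]
coverage = class_coverage(toy_masks, toy_names)
print(coverage)
```

The fractions are relative to the mask grid, which is at the inference resolution rather than the original image size.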

⭐ If you find this model useful, please give it a star! ⭐
