UrbanVision Segmentation Model v1
Model Description
UrbanVision Segmentation v1 is an advanced segmentation model based on the YOLO11x-seg architecture, trained specifically to segment elements of urban environments. The model can detect and segment 48 distinct classes, supporting applications such as:
- Autonomous Driving
- Urban Planning & Analysis
- Traffic Monitoring
- Road Infrastructure Assessment
- Smart City Solutions
Key Features
- 48 segmentation classes covering urban elements
- YOLO11x-seg architecture - the largest segmentation variant in the Ultralytics YOLO11 family
- 640x640 input resolution - a balance between speed and accuracy
- Pretrained weights - ready for inference or fine-tuning
- 100 training epochs, with early-stopping patience also set to 100
- MIT License - free for commercial use
Dataset Information
Dataset: khanhromvn/urbanvision
| Split | Images | Percentage |
|---|---|---|
| Train | 3,000 | 93.0% |
| Validation | 151 | 4.7% |
| Test | 74 | 2.3% |
| Total | 3,225 | 100% |
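The percentages in the table follow directly from the image counts. A minimal sketch that recomputes them, using only the counts listed above (the helper name `split_percentages` is illustrative, not part of any library):

```python
# Split sizes taken from the dataset table
splits = {"Train": 3000, "Validation": 151, "Test": 74}
total = sum(splits.values())


def split_percentages(counts):
    """Return each split's share of the total in percent, rounded to one decimal."""
    n = sum(counts.values())
    return {name: round(100 * count / n, 1) for name, count in counts.items()}


percentages = split_percentages(splits)
print(total, percentages)
```

Note that the validation and test splits are small (151 and 74 images), so per-class metrics on rare classes may be noisy.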
Classes (48 Categories)
Vehicles & Transportation
car,bus,truck,motorcycle,motorbike,bicycle,train,trailer,caravan,autorickshaw
Road Infrastructure
road,drivable fallback,non-drivable fallback,paved path,footpath,sidewalk,curb,pothole
Road Elements
traffic light,traffic sign,pole,polegroup,guard rail,rail track,license plate
Structures
building,bridge,tunnel,wall,fence,billboard
Pedestrians & Animals
person,rider,animal
Environment
vegetation,sky,ground,open area,shallow,stairs
Special Categories
parking,ego vehicle,vehicle fallback,obs-str-bar-fallback,fallback background,out of roi,rectification border,unlabeled
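For downstream filtering it can help to have the groups above as a lookup table. The sketch below copies the class names from this list; the grouping is a convenience only and implies nothing about the numeric class IDs stored in the checkpoint. The names `CLASS_GROUPS`, `GROUP_OF`, and `group_counts` are illustrative.

```python
from collections import Counter

# Class names copied from the list above, keyed by group name
CLASS_GROUPS = {
    "Vehicles & Transportation": [
        "car", "bus", "truck", "motorcycle", "motorbike", "bicycle",
        "train", "trailer", "caravan", "autorickshaw",
    ],
    "Road Infrastructure": [
        "road", "drivable fallback", "non-drivable fallback", "paved path",
        "footpath", "sidewalk", "curb", "pothole",
    ],
    "Road Elements": [
        "traffic light", "traffic sign", "pole", "polegroup",
        "guard rail", "rail track", "license plate",
    ],
    "Structures": ["building", "bridge", "tunnel", "wall", "fence", "billboard"],
    "Pedestrians & Animals": ["person", "rider", "animal"],
    "Environment": ["vegetation", "sky", "ground", "open area", "shallow", "stairs"],
    "Special Categories": [
        "parking", "ego vehicle", "vehicle fallback", "obs-str-bar-fallback",
        "fallback background", "out of roi", "rectification border", "unlabeled",
    ],
}

# Flat list of all class names and a reverse lookup from class name to group
ALL_CLASSES = [name for names in CLASS_GROUPS.values() for name in names]
GROUP_OF = {name: group for group, names in CLASS_GROUPS.items() for name in names}


def group_counts(detected_names):
    """Count detections per group, given an iterable of class-name strings."""
    return dict(Counter(GROUP_OF[name] for name in detected_names))
```

With a loaded model, the detected names can be obtained as `[model.names[int(c)] for c in results[0].boxes.cls]` and passed to `group_counts`.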
Training Configuration
| Parameter | Value |
|---|---|
| Architecture | YOLO11x-seg |
| Pretrained | Yes |
| Epochs | 100 |
| Image Size | 640x640 |
| Batch Size | Auto |
| Device | Auto (GPU/CPU) |
| Patience | 100 |
| Cache | None |
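One consequence of this configuration is worth spelling out: with patience equal to the number of epochs, patience-based early stopping has no room to trigger, so the run lasts the full 100 epochs. The sketch below illustrates this with a simplified stopping rule (stop once `patience` epochs pass without a new best fitness). It is an assumption-laden illustration and is not claimed to match the exact Ultralytics implementation; `stopping_epoch` is a hypothetical helper.

```python
def stopping_epoch(fitness_per_epoch, patience):
    """Return the epoch index at which training would stop early, or None.

    Simplified rule: stop once `patience` epochs have passed since the best fitness.
    """
    best_fitness = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch):
        if fitness > best_fitness:
            best_fitness, best_epoch = fitness, epoch
        if epoch - best_epoch >= patience:
            return epoch
    return None


# Worst case for early stopping: the best fitness occurs at epoch 0, then never improves
history = [1.0] + [0.5] * 99  # 100 epochs in total

print(stopping_epoch(history, patience=100))  # never stops early under this rule
print(stopping_epoch(history, patience=10))   # stops once 10 epochs pass without improvement
```

A smaller patience value would be needed if early stopping is actually wanted when fine-tuning.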
Quick Start
Installation
pip install ultralytics
pip install huggingface_hub
Inference
from ultralytics import YOLO
from huggingface_hub import hf_hub_download
# Download model from Hugging Face
model_path = hf_hub_download(
    repo_id="khanhromvn/urbanvision-seg-v1",
    filename="best.pt",
)
# Load model
model = YOLO(model_path)
# Run inference
results = model("path/to/your/image.jpg")
# Display results
results[0].show()
# Save results
results[0].save("output.jpg")
Batch Inference
# Process multiple images
results = model(["image1.jpg", "image2.jpg", "image3.jpg"])
for i, result in enumerate(results):
    result.save(f"output_{i}.jpg")
Advanced Usage
# Custom inference parameters
results = model.predict(
    source="image.jpg",
    conf=0.25,          # Confidence threshold
    iou=0.7,            # IoU threshold for non-maximum suppression
    imgsz=640,          # Image size
    device="cuda:0",    # GPU device; use "cpu" if no GPU is available
    save=True,          # Save results
    show_labels=True,   # Show class labels
    show_conf=True,     # Show confidence scores
)
# Access segmentation masks (masks is None when nothing is detected)
result = results[0]
if result.masks is not None:
    masks = result.masks.data    # Segmentation masks, one per instance
    boxes = result.boxes.data    # Bounding boxes
    classes = result.boxes.cls   # Class IDs
Performance Metrics
Note: Quantitative evaluation results have not been published yet. This section will be updated with comprehensive metrics.
Expected capabilities (design goals, not measured results):
- High accuracy on urban scene segmentation
- Real-time inference capability
- Robust performance across diverse lighting conditions
- Accurate detection of small objects (traffic signs, poles)
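Until official numbers are published, the real-time claim can be checked on your own hardware. The sketch below times any callable that takes an input source, so it works with the `model` object from Quick Start. The helper names `mean_latency_ms` and `fps` are illustrative, and wall-clock timing like this is only a rough estimate.

```python
import time


def mean_latency_ms(predict, source, warmup=3, runs=10):
    """Average wall-clock time of predict(source) in milliseconds."""
    # Warm-up calls are excluded so one-time setup costs do not skew the average
    for _ in range(warmup):
        predict(source)
    start = time.perf_counter()
    for _ in range(runs):
        predict(source)
    return (time.perf_counter() - start) * 1000 / runs


def fps(latency_ms):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms
```

For example, `fps(mean_latency_ms(model, "path/to/your/image.jpg"))` gives an approximate throughput figure. Ultralytics also reports per-stage timings in `results[0].speed`.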
Model Training
Train from Pretrained YOLO11x-seg Weights
from ultralytics import YOLO
# Load pretrained YOLO11x-seg
model = YOLO("yolo11x-seg.pt")
# Train on custom dataset
results = model.train(
    data="urbanvision.yaml",
    epochs=100,
    imgsz=640,
    patience=100,
    batch=-1,  # Auto batch size
    # device is omitted so Ultralytics picks a GPU if available, otherwise the CPU
)
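The training call above expects a dataset description file, `urbanvision.yaml`, which is not included in this card. As a rough sketch, the helper below builds the text of an Ultralytics-style dataset YAML (`path`, `train`, `val`, `test`, `names`). The directory layout, the helper name `dataset_yaml`, and the two example class names are assumptions for illustration. The real class order must match the label files of the khanhromvn/urbanvision dataset.

```python
def dataset_yaml(root, names, train="images/train", val="images/val", test="images/test"):
    """Build the text of an Ultralytics-style dataset YAML file.

    `root` and the split sub-directories are hypothetical paths; `names` must be
    listed in the same order as the class IDs used in the label files.
    """
    lines = [
        f"path: {root}",
        f"train: {train}",
        f"val: {val}",
        f"test: {test}",
        "names:",
    ]
    lines += [f"  {idx}: {name}" for idx, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Example with a hypothetical root directory and a two-class subset
yaml_text = dataset_yaml("datasets/urbanvision", ["road", "car"])
print(yaml_text)
```

Writing `yaml_text` to `urbanvision.yaml` would give a file that can be passed as the `data` argument, provided the paths and class order reflect the actual dataset.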
Fine-tuning
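from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Load this model for fine-tuning (download the weights first, as in Quick Start)
model = YOLO(hf_hub_download(repo_id="khanhromvn/urbanvision-seg-v1", filename="best.pt"))
# Fine-tune on your dataset
results = model.train(
    data="your_dataset.yaml",
    epochs=50,
    imgsz=640,
)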
Model Files
urbanvision-seg-v1/
├── best.pt        # Best model weights
├── last.pt        # Last epoch weights
├── config.yaml    # Training configuration
└── README.md      # This file
Requirements
ultralytics>=8.3.0
torch>=2.0.0
torchvision>=0.15.0
opencv-python>=4.8.0
numpy>=1.23.0
Pillow>=9.5.0
Citation
If you use this model in your research or projects, please cite:
@misc{urbanvision-seg-v1,
author = {khanhromvn},
title = {UrbanVision Segmentation Model v1},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/khanhromvn/urbanvision-seg-v1}},
}
Contributing
Contributions are welcome! If you find any issues or have suggestions:
- Open an issue on the model repository
- Submit a pull request with improvements
- Share your results and use cases
Contact
- Author: khanhromvn
- Dataset: khanhromvn/urbanvision
- Model: khanhromvn/urbanvision-seg-v1
License
This model is released under the MIT License. See LICENSE file for details.
You are free to:
- Use commercially
- Modify
- Distribute
- Private use
Acknowledgments
- Ultralytics for the amazing YOLO11 framework
- Hugging Face for the model hosting platform
- Community contributors for dataset preparation and feedback
Use Case Examples
Autonomous Driving
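from ultralytics import YOLO
from huggingface_hub import hf_hub_download

# Real-time road scene segmentation
model = YOLO(hf_hub_download(repo_id="khanhromvn/urbanvision-seg-v1", filename="best.pt"))
class_id = {name: idx for idx, name in model.names.items()}  # class name -> class ID
results = model("dashcam_video.mp4", stream=True)
for result in results:
    if result.masks is None:
        continue  # No detections in this frame
    # Extract drivable area
    road_mask = result.masks[result.boxes.cls == class_id["road"]]
    # Process for path planning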
Traffic Monitoring
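# Count vehicles and pedestrians
results = model("traffic_camera.jpg")
vehicle_names = {"car", "bus", "truck", "motorcycle", "motorbike", "bicycle",
                 "train", "trailer", "caravan", "autorickshaw"}
detected = [model.names[int(cls)] for cls in results[0].boxes.cls]  # class IDs -> names
vehicles = sum(name in vehicle_names for name in detected)
pedestrians = sum(name == "person" for name in detected)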
Urban Planning
# Analyze urban infrastructure
results = model("city_aerial.jpg")
# Extract building footprints, road networks, vegetation coverage
If you find this model useful, please give it a star!