---
license: apache-2.0
tags:
- multimodal
- vision-language
- video understanding
- visuospatial cognition
- spatial reasoning
- vlm
- llava
- qwen
- siglip
- hiera
- sam2
- dual-encoder
datasets:
- nkkbr/ViCA-thinking-2.68k
language:
- en
library_name: transformers
pipeline_tag: video-text-to-text
model_name: ViCA2-7B-Thinking
---
## Usage and Full Documentation
For a detailed model description, training setup, datasets, evaluation results, and inference code, **please refer to the following links**:
[![GitHub](https://img.shields.io/badge/GitHub-ViCA2-181717?logo=github&logoColor=white)](https://github.com/nkkbr/ViCA)
[![Hugging Face Models](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-ViCA2-blue)](https://huggingface.co/nkkbr/ViCA2)
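The official inference code lives in the repositories linked above. As a quick orientation only, here is a minimal loading sketch. It assumes the checkpoint can be loaded through the standard `transformers` Auto classes with `trust_remote_code=True` and uses the repo id `nkkbr/ViCA2` from the badge above; both are assumptions. Since ViCA2 is LLaVA-based, the model classes shipped in the linked GitHub repository may be required instead.

```python
# Hypothetical loading sketch for ViCA2. This is NOT the official inference
# path: ViCA2 is LLaVA-based, and the loader in the linked GitHub repo may be
# required. The repo id "nkkbr/ViCA2" is taken from the badge above.

def load_vica2(repo_id: str = "nkkbr/ViCA2"):
    """Load model and processor. Requires `pip install transformers torch`."""
    # Imports are deferred so this module can be inspected without
    # the heavy dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # half precision to fit a 7B model on one GPU
        device_map="auto",
        trust_remote_code=True,
    )
    return model, processor


if __name__ == "__main__":
    # Downloads the checkpoint on first run; see the GitHub repo for the
    # full video preprocessing and prompting pipeline.
    model, processor = load_vica2()
```

For end-to-end video inference (frame sampling, prompt formatting, generation), follow the scripts in the GitHub repository rather than this sketch.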