VLLM when have more then 3 concurrent connection to do OCR it will fail

#20
by CHONGYOEYAT - opened

when have more then 3 concurrent connection to do OCR it will fail

when max_num_seqs > 2 will have error

source ~/venv-vllm-nightly-3_13/bin/activate &&
vllm serve HunyuanOCR
--served-model-name ocr_model
--no-enable-prefix-caching
--mm-processor-cache-gb 0
--gpu-memory-utilization 0.2
--max_num_batched_tokens 65536
--max_num_seqs 3
--port 14651

log

Dis 01 12:43:24 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Started server process [131645]
Dis 01 12:43:24 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Waiting for application startup.
Dis 01 12:43:25 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Application startup complete.
Dis 01 12:43:34 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
Dis 01 12:43:35 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO 12-01 12:43:35 [chat_utils.py:574] Detected the chat template content format to be 'openai'. You can set --chat-template-content-format to override this.
Dis 01 12:43:35 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) use_fast is set to True but the image processor class does not have a fast version. Falling back to the slow version.
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:72] Dumping input data for V1 LLM engine (v0.11.2.dev412+g8c363ed66.d20251130) with config: model='HunyuanOCR', speculative_config=None, tokenizer='HunyuanOCR', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=ocr_model, enable_prefix_caching=False, enable_chunked_prefill=True, pooler_config=None, compilation_config={'level': None, 'mode': <CompilationMode.VLLM_COMPILE: 3>, 'debug_dump_path': None, 'cache_dir': '/home/vrs-ai1/.cache/vllm/torch_compile_cache/a54ff506e1', 'compile_cache_save_format': 'binary', 'backend': 'inductor', 'custom_ops': ['none'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 'vllm::kda_attention', 'vllm::sparse_attn_indexer'], 'compile_mm_encoder': False, 'compile_sizes': [], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.FULL_AND_PIECEWISE: (2, 1)>, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'enable_fusion': False, 'enable_attn_fusion': False, 'enable_noop': True, 'enable_sequence_parallelism': False, 'enable_async_tp': False, 'enable_fi_allreduce_fusion': False}, 'max_cudagraph_capture_size': 4, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>}, 'local_cache_dir': '/home/vrs-ai1/.cache/vllm/torch_compile_cache/a54ff506e1/rank_0_0/backbone'},
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=chatcmpl-9c644ff201792e2e,prompt_token_ids_len=339,mm_features=[MultiModalFeatureSpec(data={'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] ...,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406]],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 1080, None)]], dim=0)), 'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 20, 54]), field=MultiModalBatchedField())}, modality='image', identifier='chatcmpl-9c644ff201792e2e-image-0', mm_position=PlaceholderRange(offset=1, length=282, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.03, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[120007], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-998e2e3bd92f353c,prompt_token_ids_len=329,mm_features=[MultiModalFeatureSpec(data={'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] ...,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766]],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 1056, None)]], dim=0)), 'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 12, 88]), field=MultiModalBatchedField())}, modality='image', identifier='chatcmpl-998e2e3bd92f353c-image-0', mm_position=PlaceholderRange(offset=1, length=272, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.03, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[120007], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=['chatcmpl-af751dc5b5bb611b'], resumed_req_ids=[], new_token_ids=[], all_token_ids={}, new_block_ids=[null], num_computed_tokens=[349], num_output_tokens=[1]), num_scheduled_tokens={chatcmpl-af751dc5b5bb611b: 1, chatcmpl-998e2e3bd92f353c: 329, chatcmpl-9c644ff201792e2e: 339}, total_num_scheduled_tokens=669, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={chatcmpl-9c644ff201792e2e: [0], chatcmpl-998e2e3bd92f353c: [0]}, num_common_prefix_blocks=[0], finished_req_ids=[], free_encoder_mm_hashes=[], preempted_req_ids=[], pending_structured_output_tokens=false, kv_connector_metadata=null, ec_connector_metadata=null)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:81] Dumping scheduler stats: SchedulerStats(num_running_reqs=3, num_waiting_reqs=2, step_counter=0, current_wave=0, kv_cache_usage=0.009046624913013224, prefix_cache_stats=PrefixCacheStats(reset=False, requests=0, queries=0, hits=0, preempted_requests=0, preempted_queries=0, preempted_hits=0), connector_prefix_cache_stats=None, spec_decoding_stats=None, kv_connector_stats=None, waiting_lora_adapters={}, running_lora_adapters={})
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] EngineCore encountered a fatal error.
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 836, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] engine_core.run_busy_loop()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 863, in run_busy_loop
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._process_engine_step()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 892, in _process_engine_step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] outputs, model_executed = self.step_fn()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 346, in step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] model_output = future.result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 449, in result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return self.__get_result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] raise self._exception
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] result = run_method(self.driver_worker, method, args, kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/serial_utils.py", line 479, in run_method
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/worker_base.py", line 369, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return self.worker.execute_model(scheduler_output, *args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_worker.py", line 591, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] output = self.model_runner.execute_model(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] scheduler_output, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2970, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ) = self._preprocess(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] scheduler_output, num_tokens_padded, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2450, in _preprocess
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._execute_mm_encoder(scheduler_output)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2146, in _execute_mm_encoder
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] curr_group_outputs = model.embed_multimodal(**mm_kwargs_group) # type: ignore[assignment]
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 971, in embed_multimodal
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] mm_input_by_modality = self._parse_and_validate_multimodal_inputs(**kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 962, in _parse_and_validate_multimodal_inputs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] mm_input_by_modality["image"] = self._parse_and_validate_image_input(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] **kwargs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 922, in _parse_and_validate_image_input
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return HunYuanVLImagePixelInputs(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] type="pixel_values",
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] pixel_values=pixel_values,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] image_grid_thw=image_grid_thw,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 63, in init
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self.validate()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 237, in validate
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._validate_tensor_shape_expected(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] actual_shape,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ...<3 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] arg.dynamic_dims,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 166, in _validate_tensor_shape_expected
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] raise ValueError(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ...<4 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ValueError: image_grid_thw has rank 3 but expected 2. Expected shape: ('ni', 3), but got torch.Size([2, 1, 3])
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) Process EngineCore_DP0:
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/multiprocessing/process.py", line 313, in _bootstrap
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self.run()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/multiprocessing/process.py", line 108, in run
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._target(*self._args, **self._kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 847, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise e
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 836, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) engine_core.run_busy_loop()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 863, in run_busy_loop
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._process_engine_step()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 892, in _process_engine_step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) outputs, model_executed = self.step_fn()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 346, in step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) model_output = future.result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 449, in result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return self.__get_result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise self._exception
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) result = run_method(self.driver_worker, method, args, kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/serial_utils.py", line 479, in run_method
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/worker_base.py", line 369, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return self.worker.execute_model(scheduler_output, *args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_worker.py", line 591, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) output = self.model_runner.execute_model(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) scheduler_output, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2970, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ) = self._preprocess(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) scheduler_output, num_tokens_padded, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2450, in _preprocess
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._execute_mm_encoder(scheduler_output)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2146, in _execute_mm_encoder
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) curr_group_outputs = model.embed_multimodal(**mm_kwargs_group) # type: ignore[assignment]
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 971, in embed_multimodal
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) mm_input_by_modality = self._parse_and_validate_multimodal_inputs(**kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 962, in _parse_and_validate_multimodal_inputs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) mm_input_by_modality["image"] = self._parse_and_validate_image_input(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) **kwargs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 922, in _parse_and_validate_image_input
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return HunYuanVLImagePixelInputs(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) type="pixel_values",
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) pixel_values=pixel_values,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) image_grid_thw=image_grid_thw,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 63, in init
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self.validate()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 237, in validate
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._validate_tensor_shape_expected(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) actual_shape,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ...<3 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) arg.dynamic_dims,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 166, in _validate_tensor_shape_expected
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise ValueError(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ...<4 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ValueError: image_grid_thw has rank 3 but expected 2. Expected shape: ('ni', 3), but got torch.Size([2, 1, 3])
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] AsyncLLM output_handler failed.
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/async_llm.py", line 497, in output_handler
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] outputs = await engine_core.get_output_async()
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core_client.py", line 883, in get_output_async
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] raise self._format_exception(outputs) from None
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.

Sign up or log in to comment