VLLM when have more then 3 concurrent connection to do OCR it will fail

#20

by CHONGYOEYAT - opened 6 days ago

6 days ago

when have more then 3 concurrent connection to do OCR it will fail

when max_num_seqs > 2 will have error

source ~/venv-vllm-nightly-3_13/bin/activate &&
vllm serve HunyuanOCR
--served-model-name ocr_model
--no-enable-prefix-caching
--mm-processor-cache-gb 0
--gpu-memory-utilization 0.2
--max_num_batched_tokens 65536
--max_num_seqs 3
--port 14651

log

Dis 01 12:43:24 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Started server process [131645]
Dis 01 12:43:24 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Waiting for application startup.
Dis 01 12:43:25 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO: Application startup complete.
Dis 01 12:43:34 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
Dis 01 12:43:35 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) INFO 12-01 12:43:35 [chat_utils.py:574] Detected the chat template content format to be 'openai'. You can set --chat-template-content-format to override this.
Dis 01 12:43:35 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) use_fast is set to True but the image processor class does not have a fast version. Falling back to the slow version.
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:72] Dumping input data for V1 LLM engine (v0.11.2.dev412+g8c363ed66.d20251130) with config: model='HunyuanOCR', speculative_config=None, tokenizer='HunyuanOCR', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, structured_outputs_config=StructuredOutputsConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_parser='', reasoning_parser_plugin='', enable_in_reasoning=False), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=ocr_model, enable_prefix_caching=False, enable_chunked_prefill=True, pooler_config=None, compilation_config={'level': None, 'mode': <CompilationMode.VLLM_COMPILE: 3>, 'debug_dump_path': None, 'cache_dir': '/home/vrs-ai1/.cache/vllm/torch_compile_cache/a54ff506e1', 'compile_cache_save_format': 'binary', 'backend': 'inductor', 'custom_ops': ['none'], 'splitting_ops': ['vllm::unified_attention', 'vllm::unified_attention_with_output', 'vllm::unified_mla_attention', 'vllm::unified_mla_attention_with_output', 'vllm::mamba_mixer2', 'vllm::mamba_mixer', 'vllm::short_conv', 'vllm::linear_attention', 'vllm::plamo2_mamba_mixer', 'vllm::gdn_attention_core', 'vllm::kda_attention', 'vllm::sparse_attn_indexer'], 'compile_mm_encoder': False, 'compile_sizes': [], 'inductor_compile_config': {'enable_auto_functionalized_v2': False, 'combo_kernels': True, 'benchmark_combo_kernel': True}, 'inductor_passes': {}, 'cudagraph_mode': <CUDAGraphMode.FULL_AND_PIECEWISE: (2, 1)>, 'cudagraph_num_of_warmups': 1, 'cudagraph_capture_sizes': [1, 2, 4], 'cudagraph_copy_inputs': False, 'cudagraph_specialize_lora': True, 'use_inductor_graph_partition': False, 'pass_config': {'enable_fusion': False, 'enable_attn_fusion': False, 'enable_noop': True, 'enable_sequence_parallelism': False, 'enable_async_tp': False, 'enable_fi_allreduce_fusion': False}, 'max_cudagraph_capture_size': 4, 'dynamic_shapes_config': {'type': <DynamicShapesType.BACKED: 'backed'>}, 'local_cache_dir': '/home/vrs-ai1/.cache/vllm/torch_compile_cache/a54ff506e1/rank_0_0/backbone'},
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] Dumping scheduler output for model execution: SchedulerOutput(scheduled_new_reqs=[NewRequestData(req_id=chatcmpl-9c644ff201792e2e,prompt_token_ids_len=339,mm_features=[MultiModalFeatureSpec(data={'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] ...,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [1.9297, 1.9297, 1.9297, ..., 2.1406, 2.1406, 2.1406]],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 1080, None)]], dim=0)), 'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 20, 54]), field=MultiModalBatchedField())}, modality='image', identifier='chatcmpl-9c644ff201792e2e-image-0', mm_position=PlaceholderRange(offset=1, length=282, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.03, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[120007], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None), NewRequestData(req_id=chatcmpl-998e2e3bd92f353c,prompt_token_ids_len=329,mm_features=[MultiModalFeatureSpec(data={'pixel_values': MultiModalFieldElem(modality='image', key='pixel_values', data=tensor([[-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] ...,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] [-1.7891, -1.7891, -1.7891, ..., -1.4766, -1.4766, -1.4766]],
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:79] dtype=torch.bfloat16), field=MultiModalFlatField(slices=[[slice(0, 1056, None)]], dim=0)), 'image_grid_thw': MultiModalFieldElem(modality='image', key='image_grid_thw', data=tensor([ 1, 12, 88]), field=MultiModalBatchedField())}, modality='image', identifier='chatcmpl-998e2e3bd92f353c-image-0', mm_position=PlaceholderRange(offset=1, length=272, is_embed=None))],sampling_params=SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.03, temperature=0.0, top_p=1.0, top_k=0, min_p=0.0, seed=None, stop=[], stop_token_ids=[120007], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2048, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, structured_outputs=None, extra_args=None),block_ids=([45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65],),num_computed_tokens=0,lora_request=None,prompt_embeds_shape=None)], scheduled_cached_reqs=CachedRequestData(req_ids=['chatcmpl-af751dc5b5bb611b'], resumed_req_ids=[], new_token_ids=[], all_token_ids={}, new_block_ids=[null], num_computed_tokens=[349], num_output_tokens=[1]), num_scheduled_tokens={chatcmpl-af751dc5b5bb611b: 1, chatcmpl-998e2e3bd92f353c: 329, chatcmpl-9c644ff201792e2e: 339}, total_num_scheduled_tokens=669, scheduled_spec_decode_tokens={}, scheduled_encoder_inputs={chatcmpl-9c644ff201792e2e: [0], chatcmpl-998e2e3bd92f353c: [0]}, num_common_prefix_blocks=[0], finished_req_ids=[], free_encoder_mm_hashes=[], preempted_req_ids=[], pending_structured_output_tokens=false, kv_connector_metadata=null, ec_connector_metadata=null)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [dump_input.py:81] Dumping scheduler stats: SchedulerStats(num_running_reqs=3, num_waiting_reqs=2, step_counter=0, current_wave=0, kv_cache_usage=0.009046624913013224, prefix_cache_stats=PrefixCacheStats(reset=False, requests=0, queries=0, hits=0, preempted_requests=0, preempted_queries=0, preempted_hits=0), connector_prefix_cache_stats=None, spec_decoding_stats=None, kv_connector_stats=None, waiting_lora_adapters={}, running_lora_adapters={})
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] EngineCore encountered a fatal error.
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 836, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] engine_core.run_busy_loop()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 863, in run_busy_loop
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._process_engine_step()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 892, in _process_engine_step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] outputs, model_executed = self.step_fn()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 346, in step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] model_output = future.result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 449, in result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return self.__get_result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] raise self._exception
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] result = run_method(self.driver_worker, method, args, kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/serial_utils.py", line 479, in run_method
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/worker_base.py", line 369, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return self.worker.execute_model(scheduler_output, *args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_worker.py", line 591, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] output = self.model_runner.execute_model(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] scheduler_output, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2970, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ) = self._preprocess(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] scheduler_output, num_tokens_padded, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2450, in _preprocess
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._execute_mm_encoder(scheduler_output)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2146, in _execute_mm_encoder
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] curr_group_outputs = model.embed_multimodal(**mm_kwargs_group) # type: ignore[assignment]
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 971, in embed_multimodal
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] mm_input_by_modality = self._parse_and_validate_multimodal_inputs(**kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 962, in _parse_and_validate_multimodal_inputs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] mm_input_by_modality["image"] = self._parse_and_validate_image_input(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] **kwargs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 922, in _parse_and_validate_image_input
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] return HunYuanVLImagePixelInputs(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] type="pixel_values",
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] pixel_values=pixel_values,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] image_grid_thw=image_grid_thw,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 63, in init
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self.validate()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 237, in validate
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] self._validate_tensor_shape_expected(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] actual_shape,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ...<3 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] arg.dynamic_dims,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 166, in _validate_tensor_shape_expected
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] raise ValueError(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ...<4 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ERROR 12-01 12:43:37 [core.py:845] ValueError: image_grid_thw has rank 3 but expected 2. Expected shape: ('ni', 3), but got torch.Size([2, 1, 3])
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) Process EngineCore_DP0:
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/multiprocessing/process.py", line 313, in _bootstrap
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self.run()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/multiprocessing/process.py", line 108, in run
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._target(*self._args, **self._kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 847, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise e
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 836, in run_engine_core
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) engine_core.run_busy_loop()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 863, in run_busy_loop
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._process_engine_step()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 892, in _process_engine_step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) outputs, model_executed = self.step_fn()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core.py", line 346, in step
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) model_output = future.result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 449, in result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return self.__get_result()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/.local/share/uv/python/cpython-3.13.7-linux-x86_64-gnu/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise self._exception
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/executor/uniproc_executor.py", line 79, in collective_rpc
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) result = run_method(self.driver_worker, method, args, kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/serial_utils.py", line 479, in run_method
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/worker_base.py", line 369, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return self.worker.execute_model(scheduler_output, *args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_worker.py", line 591, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) output = self.model_runner.execute_model(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) scheduler_output, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/venv-vllm-nightly-3_13/lib/python3.13/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return func(*args, **kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2970, in execute_model
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ) = self._preprocess(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) scheduler_output, num_tokens_padded, intermediate_tensors
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2450, in _preprocess
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._execute_mm_encoder(scheduler_output)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/worker/gpu_model_runner.py", line 2146, in _execute_mm_encoder
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) curr_group_outputs = model.embed_multimodal(**mm_kwargs_group) # type: ignore[assignment]
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 971, in embed_multimodal
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) mm_input_by_modality = self._parse_and_validate_multimodal_inputs(**kwargs)
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 962, in _parse_and_validate_multimodal_inputs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) mm_input_by_modality["image"] = self._parse_and_validate_image_input(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) **kwargs
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/model_executor/models/hunyuan_vision.py", line 922, in _parse_and_validate_image_input
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) return HunYuanVLImagePixelInputs(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) type="pixel_values",
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) pixel_values=pixel_values,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) image_grid_thw=image_grid_thw,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 63, in init
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self.validate()
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 237, in validate
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) self._validate_tensor_shape_expected(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) actual_shape,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ...<3 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) arg.dynamic_dims,
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ^
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) File "/home/vrs-ai1/vllm_py_3_13/vllm/utils/tensor_schema.py", line 166, in _validate_tensor_shape_expected
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) raise ValueError(
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ...<4 lines>...
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) )
Dis 01 12:43:37 vrsai1-Super-Server bash[131724]: (EngineCore_DP0 pid=131724) ValueError: image_grid_thw has rank 3 but expected 2. Expected shape: ('ni', 3), but got torch.Size([2, 1, 3])
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] AsyncLLM output_handler failed.
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] Traceback (most recent call last):
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/async_llm.py", line 497, in output_handler
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] outputs = await engine_core.get_output_async()
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] File "/home/vrs-ai1/vllm_py_3_13/vllm/v1/engine/core_client.py", line 883, in get_output_async
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] raise self._format_exception(outputs) from None
Dis 01 12:43:37 vrsai1-Super-Server bash[131645]: (APIServer pid=131645) ERROR 12-01 12:43:37 [async_llm.py:545] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment