INFO 10-29 02:19:13 [__init__.py:216] Automatically detected platform cuda.
(APIServer pid=1486408) INFO 10-29 02:19:14 [api_server.py:1896] vLLM API server version 0.10.2
(APIServer pid=1486408) INFO 10-29 02:19:14 [utils.py:328] non-default args: {'model': '/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real'}
(APIServer pid=1486408) Traceback (most recent call last):
(APIServer pid=1486408)   File "/root/miniconda3/lib/python3.10/runpy.py", line 196, in _run_module_as_main
(APIServer pid=1486408)     return _run_code(code, main_globals, None,
(APIServer pid=1486408)   File "/root/miniconda3/lib/python3.10/runpy.py", line 86, in _run_code
(APIServer pid=1486408)     exec(code, run_globals)
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 2011, in <module>
(APIServer pid=1486408)     uvloop.run(run_server(args))
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/uvloop/__init__.py", line 82, in run
(APIServer pid=1486408)     return loop.run_until_complete(wrapper())
(APIServer pid=1486408)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/uvloop/__init__.py", line 61, in wrapper
(APIServer pid=1486408)     return await main
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 1941, in run_server
(APIServer pid=1486408)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 1961, in run_server_worker
(APIServer pid=1486408)     async with build_async_engine_client(
(APIServer pid=1486408)   File "/root/miniconda3/lib/python3.10/contextlib.py", line 199, in __aenter__
(APIServer pid=1486408)     return await anext(self.gen)
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 179, in build_async_engine_client
(APIServer pid=1486408)     async with build_async_engine_client_from_engine_args(
(APIServer pid=1486408)   File "/root/miniconda3/lib/python3.10/contextlib.py", line 199, in __aenter__
(APIServer pid=1486408)     return await anext(self.gen)
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 205, in build_async_engine_client_from_engine_args
(APIServer pid=1486408)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 1119, in create_engine_config
(APIServer pid=1486408)     model_config = self.create_model_config()
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 963, in create_model_config
(APIServer pid=1486408)     return ModelConfig(
(APIServer pid=1486408)   File "/data_storage/shared/gjc/models/vllm/lib/python3.10/site-packages/pydantic/_internal/_dataclasses.py", line 123, in __init__
(APIServer pid=1486408)     s.__pydantic_validator__.validate_python(ArgsKwargs(args, kwargs), self_instance=s)
(APIServer pid=1486408) pydantic_core._pydantic_core.ValidationError: 1 validation error for ModelConfig
(APIServer pid=1486408) Value error, Invalid repository ID or local directory specified: '/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real'.
(APIServer pid=1486408) Please verify the following requirements:
(APIServer pid=1486408) 1. Provide a valid Hugging Face repository ID.
(APIServer pid=1486408) 2. Specify a local directory that contains a recognized configuration file.
(APIServer pid=1486408) - For Hugging Face models: ensure the presence of a 'config.json'.
(APIServer pid=1486408) - For Mistral models: ensure the presence of a 'params.json'.
(APIServer pid=1486408) 3. For GGUF: pass the local path of the GGUF checkpoint.
(APIServer pid=1486408) Loading GGUF from a remote repo directly is not yet supported.
(APIServer pid=1486408) [type=value_error, input_value=ArgsKwargs((), {'model': ...rocessor_plugin': None}), input_type=ArgsKwargs]
(APIServer pid=1486408) For further information visit https://errors.pydantic.dev/2.11/v/value_error
INFO 10-29 02:20:11 [__init__.py:216] Automatically detected platform cuda.
(APIServer pid=1486881) INFO 10-29 02:20:12 [api_server.py:1896] vLLM API server version 0.10.2
(APIServer pid=1486881) INFO 10-29 02:20:12 [utils.py:328] non-default args: {'host': '0.0.0.0', 'port': 8003, 'model': '/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938', 'max_model_len': 4096}
(APIServer pid=1486881) INFO 10-29 02:20:19 [__init__.py:742] Resolved architecture: LlamaForCausalLM
(APIServer pid=1486881) `torch_dtype` is deprecated! Use `dtype` instead!
(APIServer pid=1486881) INFO 10-29 02:20:19 [__init__.py:2764] Downcasting torch.float32 to torch.bfloat16.
(APIServer pid=1486881) INFO 10-29 02:20:19 [__init__.py:1815] Using max model len 4096
(APIServer pid=1486881) INFO 10-29 02:20:19 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=2048.
INFO 10-29 02:20:24 [__init__.py:216] Automatically detected platform cuda.
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:26 [core.py:654] Waiting for init message from front-end.
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:26 [core.py:76] Initializing a V1 LLM engine (v0.10.2) with config: model='/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938', speculative_config=None, tokenizer='/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=4096, download_dir=None, load_format=auto, tensor_parallel_size=1, pipeline_parallel_size=1, data_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=0, served_model_name=/data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=True, pooler_config=None, compilation_config={"level":3,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":[],"splitting_ops":["vllm.unified_attention","vllm.unified_attention_with_output","vllm.mamba_mixer2","vllm.mamba_mixer","vllm.short_conv","vllm.linear_attention","vllm.plamo2_mamba_mixer","vllm.gdn_attention"],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"cudagraph_mode":1,"use_cudagraph":true,"cudagraph_num_of_warmups":1,"cudagraph_capture_sizes":[512,504,496,488,480,472,464,456,448,440,432,424,416,408,400,392,384,376,368,360,352,344,336,328,320,312,304,296,288,280,272,264,256,248,240,232,224,216,208,200,192,184,176,168,160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"pass_config":{},"max_capture_size":512,"local_cache_dir":null}
[W1029 02:20:28.261298969 ProcessGroupNCCL.cpp:981] Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
[Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:28 [parallel_state.py:1165] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
(EngineCore_DP0 pid=1487215) WARNING 10-29 02:20:28 [topk_topp_sampler.py:69] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer.
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:28 [gpu_model_runner.py:2338] Starting to load model /data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938...
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:28 [gpu_model_runner.py:2370] Loading model from scratch...
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:29 [cuda.py:362] Using Flash Attention backend on V1 engine.
(EngineCore_DP0 pid=1487215) Loading safetensors checkpoint shards: 0% Completed | 0/3 [00:00<?, ?it/s]
(EngineCore_DP0 pid=1487215) Loading safetensors checkpoint shards: 33% Completed | 1/3 [00:04<00:09, 4.54s/it]
(EngineCore_DP0 pid=1487215) Loading safetensors checkpoint shards: 67% Completed | 2/3 [00:07<00:03, 3.50s/it]
(EngineCore_DP0 pid=1487215) Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:12<00:00, 4.05s/it]
(EngineCore_DP0 pid=1487215) Loading safetensors checkpoint shards: 100% Completed | 3/3 [00:12<00:00, 4.00s/it]
(EngineCore_DP0 pid=1487215)
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:41 [default_loader.py:268] Loading weights took 12.23 seconds
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:41 [gpu_model_runner.py:2392] Model loading took 6.0160 GiB and 12.619037 seconds
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:47 [backends.py:539] Using cache directory: /root/.cache/vllm/torch_compile_cache/c93b9f65cd/rank_0_0/backbone for vLLM's torch.compile
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:47 [backends.py:550] Dynamo bytecode transform time: 5.29 s
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:47 [backends.py:194] Cache the graph for dynamic shape for later use
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:51 [backends.py:215] Compiling a graph for dynamic shape takes 4.00 s
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:53 [monitor.py:34] torch.compile takes 9.29 s in total
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:54 [gpu_worker.py:298] Available KV cache memory: 64.02 GiB
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:54 [kv_cache_utils.py:864] GPU KV cache size: 599,328 tokens
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:54 [kv_cache_utils.py:868] Maximum concurrency for 4,096 tokens per request: 146.32x
(EngineCore_DP0 pid=1487215) Capturing CUDA graphs (mixed prefill-decode, PIECEWISE): 100%|██████████| 67/67 [00:02<00:00, 25.27it/s]
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:58 [gpu_model_runner.py:3118] Graph capturing finished in 3 secs, took 1.56 GiB
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:58 [gpu_worker.py:391] Free memory on device (78.66/79.15 GiB) on startup. Desired GPU memory utilization is (0.9, 71.24 GiB). Actual usage is 6.02 GiB for weight, 1.19 GiB for peak activation, 0.02 GiB for non-torch memory, and 1.56 GiB for CUDAGraph memory. Replace gpu_memory_utilization config with `--kv-cache-memory=66899016704` to fit into requested memory, or `--kv-cache-memory=74869963776` to fully utilize gpu memory. Current kv cache memory in use is 68736121856 bytes.
(EngineCore_DP0 pid=1487215) INFO 10-29 02:20:58 [core.py:218] init engine (profile, create kv cache, warmup model) took 16.11 seconds
(APIServer pid=1486881) INFO 10-29 02:20:58 [loggers.py:142] Engine 000: vllm cache_config_info with initialization after num_gpu_blocks is: 37458
(APIServer pid=1486881) INFO 10-29 02:20:58 [async_llm.py:180] Torch profiler disabled. AsyncLLM CPU traces will not be collected.
(APIServer pid=1486881) INFO 10-29 02:20:58 [api_server.py:1692] Supported_tasks: ['generate']
(APIServer pid=1486881) WARNING 10-29 02:20:58 [__init__.py:1695] Default sampling parameters have been overridden by the model's Hugging Face generation config recommended from the model creator. If this is not intended, please relaunch vLLM instance with `--generation-config vllm`.
(APIServer pid=1486881) INFO 10-29 02:20:58 [serving_responses.py:130] Using default chat sampling params from model: {'temperature': 0.6, 'top_p': 0.9}
(APIServer pid=1486881) INFO 10-29 02:20:58 [serving_chat.py:137] Using default chat sampling params from model: {'temperature': 0.6, 'top_p': 0.9}
(APIServer pid=1486881) INFO 10-29 02:20:58 [serving_completion.py:76] Using default completion sampling params from model: {'temperature': 0.6, 'top_p': 0.9}
(APIServer pid=1486881) INFO 10-29 02:20:58 [api_server.py:1971] Starting vLLM API server 0 on http://0.0.0.0:8003
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:36] Available routes are:
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /openapi.json, Methods: GET, HEAD
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /docs, Methods: GET, HEAD
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /docs/oauth2-redirect, Methods: GET, HEAD
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /redoc, Methods: GET, HEAD
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /health, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /load, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /ping, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /ping, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /tokenize, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /detokenize, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/models, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /version, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/responses, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/responses/{response_id}, Methods: GET
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/responses/{response_id}/cancel, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/chat/completions, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/completions, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/embeddings, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /pooling, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /classify, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /score, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/score, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/audio/transcriptions, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/audio/translations, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /rerank, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v1/rerank, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /v2/rerank, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /scale_elastic_ep, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /is_scaling_elastic_ep, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /invocations, Methods: POST
(APIServer pid=1486881) INFO 10-29 02:20:58 [launcher.py:44] Route: /metrics, Methods: GET
(APIServer pid=1486881) INFO: Started server process [1486881]
(APIServer pid=1486881) INFO: Waiting for application startup.
(APIServer pid=1486881) INFO: Application startup complete.
(APIServer pid=1486881) INFO 10-29 02:23:08 [chat_utils.py:538] Detected the chat template content format to be 'string'. You can set `--chat-template-content-format` to override this.
(APIServer pid=1486881) INFO: 127.0.0.1:44610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44616 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:09 [loggers.py:123] Engine 000: Avg prompt throughput: 117.0 tokens/s, Avg generation throughput: 11.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 38.3%
(APIServer pid=1486881) INFO: 127.0.0.1:44626 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44636 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44648 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44674 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41720 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41730 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41752 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41768 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41770 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41772 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41788 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41794 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41808 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41822 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41834 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1024.4 tokens/s, Avg generation throughput: 67.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 52.4%
(APIServer pid=1486881) INFO: 127.0.0.1:41836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41846 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41860 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41864 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41866 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47774 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47776 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47788 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47790 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47794 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47798 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47812 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47824 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1067.5 tokens/s, Avg generation throughput: 72.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 49.6%
(APIServer pid=1486881) INFO: 127.0.0.1:47842 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47852 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47878 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47902 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47918 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47926 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47934 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47970 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47980 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47988 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48008 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48018 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59544 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59558 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59586 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59598 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1480.3 tokens/s, Avg generation throughput: 83.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:59602 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59606 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59612 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59620 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59622 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59632 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59640 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59650 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48826 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48844 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48850 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48868 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48872 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48884 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48898 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48912 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1337.2 tokens/s, Avg generation throughput: 119.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 53.1%
(APIServer pid=1486881) INFO: 127.0.0.1:48948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48964 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48988 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49002 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49020 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59780 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59782 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59808 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59816 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59828 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59838 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59846 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:23:59 [loggers.py:123] Engine 000: Avg prompt throughput: 953.5 tokens/s, Avg generation throughput: 94.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 52.2%
(APIServer pid=1486881) INFO: 127.0.0.1:59852 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59878 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59888 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59904 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59918 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59932 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56582 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56594 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56614 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56618 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56632 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:09 [loggers.py:123] Engine 000: Avg prompt throughput: 795.9 tokens/s, Avg generation throughput: 102.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 51.5%
(APIServer pid=1486881) INFO: 127.0.0.1:56646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56668 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56682 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56694 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56706 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56716 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56728 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56736 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56750 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37086 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37088 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37100 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37112 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37120 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37124 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:19 [loggers.py:123] Engine 000: Avg prompt throughput: 717.3 tokens/s, Avg generation throughput: 135.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 51.6%
(APIServer pid=1486881) INFO: 127.0.0.1:37136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37152 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37168 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37176 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37184 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37196 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52026 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52046 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52060 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52068 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52092 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52102 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52110 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52118 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:29 [loggers.py:123] Engine 000: Avg prompt throughput: 880.4 tokens/s, Avg generation throughput: 118.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 51.4%
(APIServer pid=1486881) INFO: 127.0.0.1:52134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52144 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52156 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52168 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52176 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52182 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52198 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52210 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35100 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35114 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35130 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35152 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35168 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35182 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35190 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35204 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35238 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35248 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35260 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35272 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35292 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35314 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1824.5 tokens/s, Avg generation throughput: 87.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:35324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35332 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35342 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35344 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35374 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35376 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35392 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35404 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35412 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42018 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42022 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42028 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42034 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42046 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42054 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42090 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42106 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42114 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42146 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42148 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:49 [loggers.py:123] Engine 000: Avg prompt throughput: 1262.3 tokens/s, Avg generation throughput: 118.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.6%
(APIServer pid=1486881) INFO: 127.0.0.1:42150 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42164 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42172 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42180 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48014 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48016 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48024 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48050 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48066 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48078 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48086 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48094 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48122 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:24:59 [loggers.py:123] Engine 000: Avg prompt throughput: 853.7 tokens/s, Avg generation throughput: 101.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:48158 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48172 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48182 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48186 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48202 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48210 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48220 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60850 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60852 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60864 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60884 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60918 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60922 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60930 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60932 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60936 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60946 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60952 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60968 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1332.3 tokens/s, Avg generation throughput: 98.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.5%
(APIServer pid=1486881) INFO: 127.0.0.1:60976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60996 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32780 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32808 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32818 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32834 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32846 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32850 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:32868 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48300 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48304 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48316 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48318 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48332 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48338 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48346 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48360 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48366 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48372 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48380 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48390 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1522.3 tokens/s, Avg generation throughput: 103.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.5%
(APIServer pid=1486881) INFO: 127.0.0.1:48404 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:48412 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43792 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:29 [loggers.py:123] Engine 000: Avg prompt throughput: 164.2 tokens/s, Avg generation throughput: 148.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:43796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:39 [loggers.py:123] Engine 000: Avg prompt throughput: 164.0 tokens/s, Avg generation throughput: 148.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:44134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44146 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:49 [loggers.py:123] Engine 000: Avg prompt throughput: 109.4 tokens/s, Avg generation throughput: 149.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.6%
(APIServer pid=1486881) INFO: 127.0.0.1:47200 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47220 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:25:59 [loggers.py:123] Engine 000: Avg prompt throughput: 164.2 tokens/s, Avg generation throughput: 148.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 57.1%
(APIServer pid=1486881) INFO: 127.0.0.1:46994 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47008 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47022 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:09 [loggers.py:123] Engine 000: Avg prompt throughput: 164.0 tokens/s, Avg generation throughput: 148.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 57.5%
(APIServer pid=1486881) INFO: 127.0.0.1:52514 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52516 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52522 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39926 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:19 [loggers.py:123] Engine 000: Avg prompt throughput: 300.4 tokens/s, Avg generation throughput: 131.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.8%
(APIServer pid=1486881) INFO: 127.0.0.1:39940 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39942 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39946 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39962 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39970 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39986 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40000 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40002 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40016 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40020 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60140 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60152 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60166 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60168 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60184 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60196 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60210 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1077.6 tokens/s, Avg generation throughput: 101.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.3%
(APIServer pid=1486881) INFO: 127.0.0.1:60220 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60260 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60264 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60272 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60284 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60290 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:39 [loggers.py:123] Engine 000: Avg prompt throughput: 517.3 tokens/s, Avg generation throughput: 113.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.1%
(APIServer pid=1486881) INFO: 127.0.0.1:43924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43934 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43944 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51394 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51398 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51410 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51424 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:49 [loggers.py:123] Engine 000: Avg prompt throughput: 417.5 tokens/s, Avg generation throughput: 123.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.0%
(APIServer pid=1486881) INFO: 127.0.0.1:51440 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51452 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51458 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51468 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51484 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56798 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56804 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56820 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56846 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56870 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:26:59 [loggers.py:123] Engine 000: Avg prompt throughput: 677.6 tokens/s, Avg generation throughput: 117.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.1%
(APIServer pid=1486881) INFO: 127.0.0.1:56878 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56886 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56900 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56902 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56916 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56932 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56944 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56958 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56970 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56972 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56990 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56998 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57014 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57018 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57028 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57030 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57044 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57058 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40264 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40278 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40292 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40298 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40312 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40340 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40344 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40354 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40366 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40376 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1878.4 tokens/s, Avg generation throughput: 87.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 59.0%
(APIServer pid=1486881) INFO: 127.0.0.1:40392 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40394 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40404 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40408 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40420 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40428 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40444 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40456 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40466 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40482 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46838 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46856 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:19 [loggers.py:123] Engine 000: Avg prompt throughput: 746.6 tokens/s, Avg generation throughput: 135.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.9%
(APIServer pid=1486881) INFO: 127.0.0.1:46864 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46890 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46900 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46922 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41034 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41056 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41060 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41072 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41078 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41094 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41106 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41120 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41128 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:29 [loggers.py:123] Engine 000: Avg prompt throughput: 931.9 tokens/s, Avg generation throughput: 64.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 58.3%
(APIServer pid=1486881) INFO: 127.0.0.1:41134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60284 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:39 [loggers.py:123] Engine 000: Avg prompt throughput: 278.5 tokens/s, Avg generation throughput: 109.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 58.1%
(APIServer pid=1486881) INFO: 127.0.0.1:60288 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60302 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60314 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60318 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:49 [loggers.py:123] Engine 000: Avg prompt throughput: 382.8 tokens/s, Avg generation throughput: 110.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.9%
(APIServer pid=1486881) INFO: 127.0.0.1:43910 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43926 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43944 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43954 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60768 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60776 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60788 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60796 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60798 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60812 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60828 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60844 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60860 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60864 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60868 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:27:59 [loggers.py:123] Engine 000: Avg prompt throughput: 1339.4 tokens/s, Avg generation throughput: 109.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.3%
(APIServer pid=1486881) INFO: 127.0.0.1:60890 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60904 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60912 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60942 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60950 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60954 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47100 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47110 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47128 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47138 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47148 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47184 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47186 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47196 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47202 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47206 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47218 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1023.9 tokens/s, Avg generation throughput: 91.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.0%
(APIServer pid=1486881) INFO: 127.0.0.1:47232 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36188 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36200 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36222 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36234 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36236 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36268 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:19 [loggers.py:123] Engine 000: Avg prompt throughput: 776.9 tokens/s, Avg generation throughput: 51.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.7%
(APIServer pid=1486881) INFO: 127.0.0.1:36276 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36290 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39020 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39026 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39046 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39072 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39096 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39112 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39124 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39150 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:29 [loggers.py:123] Engine 000: Avg prompt throughput: 766.2 tokens/s, Avg generation throughput: 105.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.7%
(APIServer pid=1486881) INFO: 127.0.0.1:39156 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39164 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36360 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36372 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36408 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36422 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36430 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36446 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36450 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36462 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36474 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:39 [loggers.py:123] Engine 000: Avg prompt throughput: 918.4 tokens/s, Avg generation throughput: 84.4 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:36482 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36508 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50936 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50946 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50962 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50982 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50994 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:49 [loggers.py:123] Engine 000: Avg prompt throughput: 560.2 tokens/s, Avg generation throughput: 126.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:50998 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51000 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39964 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39972 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39980 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:28:59 [loggers.py:123] Engine 000: Avg prompt throughput: 257.2 tokens/s, Avg generation throughput: 145.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:39992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:09 [loggers.py:123] Engine 000: Avg prompt throughput: 103.0 tokens/s, Avg generation throughput: 149.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.5%
(APIServer pid=1486881) INFO: 127.0.0.1:57914 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57936 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:19 [loggers.py:123] Engine 000: Avg prompt throughput: 155.0 tokens/s, Avg generation throughput: 148.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.8%
(APIServer pid=1486881) INFO: 127.0.0.1:60920 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60932 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60934 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:29 [loggers.py:123] Engine 000: Avg prompt throughput: 154.0 tokens/s, Avg generation throughput: 148.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.0%
(APIServer pid=1486881) INFO: 127.0.0.1:39770 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39792 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:39 [loggers.py:123] Engine 000: Avg prompt throughput: 155.0 tokens/s, Avg generation throughput: 148.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
(APIServer pid=1486881) INFO: 127.0.0.1:52754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52762 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53812 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53828 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:49 [loggers.py:123] Engine 000: Avg prompt throughput: 214.3 tokens/s, Avg generation throughput: 132.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.3%
(APIServer pid=1486881) INFO: 127.0.0.1:53844 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53852 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53856 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53870 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53892 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36024 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36072 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36078 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36090 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36092 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36106 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36118 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:29:59 [loggers.py:123] Engine 000: Avg prompt throughput: 1050.8 tokens/s, Avg generation throughput: 82.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 57.5%
(APIServer pid=1486881) INFO: 127.0.0.1:36132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44362 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44376 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44386 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44400 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44414 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44426 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44440 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44448 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44450 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44458 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44464 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44472 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44476 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44488 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:09 [loggers.py:123] Engine 000: Avg prompt throughput: 973.2 tokens/s, Avg generation throughput: 90.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.3%
(APIServer pid=1486881) INFO: 127.0.0.1:44504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44524 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:19 [loggers.py:123] Engine 000: Avg prompt throughput: 254.7 tokens/s, Avg generation throughput: 145.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.2%
(APIServer pid=1486881) INFO: 127.0.0.1:46006 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46008 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46020 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46030 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46044 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46050 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54346 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54350 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54366 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:29 [loggers.py:123] Engine 000: Avg prompt throughput: 514.0 tokens/s, Avg generation throughput: 138.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 57.1%
(APIServer pid=1486881) INFO: 127.0.0.1:54370 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54374 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54380 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54390 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52206 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:39 [loggers.py:123] Engine 000: Avg prompt throughput: 395.0 tokens/s, Avg generation throughput: 84.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.9%
(APIServer pid=1486881) INFO: 127.0.0.1:52228 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52284 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:49 [loggers.py:123] Engine 000: Avg prompt throughput: 516.6 tokens/s, Avg generation throughput: 133.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.8%
(APIServer pid=1486881) INFO: 127.0.0.1:49546 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49570 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49576 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57178 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57190 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57192 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:30:59 [loggers.py:123] Engine 000: Avg prompt throughput: 533.5 tokens/s, Avg generation throughput: 133.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.6%
(APIServer pid=1486881) INFO: 127.0.0.1:57198 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57202 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57404 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:09 [loggers.py:123] Engine 000: Avg prompt throughput: 230.8 tokens/s, Avg generation throughput: 141.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.5%
(APIServer pid=1486881) INFO: 127.0.0.1:57410 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57414 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57430 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57436 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45820 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45824 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:19 [loggers.py:123] Engine 000: Avg prompt throughput: 614.4 tokens/s, Avg generation throughput: 132.7 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:45826 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45832 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45848 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45858 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45866 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:45876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:29 [loggers.py:123] Engine 000: Avg prompt throughput: 508.7 tokens/s, Avg generation throughput: 119.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:56938 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56948 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56958 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56980 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57008 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57022 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57028 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42126 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42132 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42148 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42154 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42156 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42166 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42174 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42178 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42190 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42198 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42200 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42216 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42228 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42232 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42234 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1499.2 tokens/s, Avg generation throughput: 85.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:42238 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42240 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42256 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42260 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53172 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53174 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53186 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53202 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53218 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53224 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53234 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53248 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53252 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53256 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53268 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:49 [loggers.py:123] Engine 000: Avg prompt throughput: 830.6 tokens/s, Avg generation throughput: 118.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:53270 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53292 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53300 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53306 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53322 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53334 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53338 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53340 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:53344 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34234 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34248 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34252 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34260 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34272 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34280 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:31:59 [loggers.py:123] Engine 000: Avg prompt throughput: 998.3 tokens/s, Avg generation throughput: 104.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:34288 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34300 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34308 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34312 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34324 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34360 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34370 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34376 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60780 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60792 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60820 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60836 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60838 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60854 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60866 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60882 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60886 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:09 [loggers.py:123] Engine 000: Avg prompt throughput: 1095.0 tokens/s, Avg generation throughput: 62.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:60902 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60914 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43958 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43972 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43984 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:19 [loggers.py:123] Engine 000: Avg prompt throughput: 469.6 tokens/s, Avg generation throughput: 133.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.5%
(APIServer pid=1486881) INFO: 127.0.0.1:43994 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44008 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44020 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44056 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44070 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44094 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44106 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55650 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55674 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55684 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55692 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55706 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55722 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55732 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55742 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55752 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55768 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55776 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55792 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:29 [loggers.py:123] Engine 000: Avg prompt throughput: 1280.3 tokens/s, Avg generation throughput: 73.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:55802 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55810 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55820 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55828 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55838 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55840 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55850 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55862 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55870 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35274 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35276 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35278 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35284 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:39 [loggers.py:123] Engine 000: Avg prompt throughput: 641.6 tokens/s, Avg generation throughput: 115.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:35288 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35296 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35300 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35314 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41252 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41266 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41294 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41302 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41318 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41330 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41334 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:49 [loggers.py:123] Engine 000: Avg prompt throughput: 725.5 tokens/s, Avg generation throughput: 104.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:41342 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41346 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41372 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41390 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41392 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41400 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41412 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41424 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41428 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:32:59 [loggers.py:123] Engine 000: Avg prompt throughput: 527.1 tokens/s, Avg generation throughput: 124.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:40724 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40740 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40752 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40770 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:40772 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58622 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:09 [loggers.py:123] Engine 000: Avg prompt throughput: 416.6 tokens/s, Avg generation throughput: 70.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:58628 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58676 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49122 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49128 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49142 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49158 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49168 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49176 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49190 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49206 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49222 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:19 [loggers.py:123] Engine 000: Avg prompt throughput: 844.3 tokens/s, Avg generation throughput: 92.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:49236 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49254 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49266 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49278 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49292 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49302 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38606 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38614 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:29 [loggers.py:123] Engine 000: Avg prompt throughput: 558.3 tokens/s, Avg generation throughput: 121.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.5%
(APIServer pid=1486881) INFO: 127.0.0.1:38620 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38624 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38628 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38636 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38642 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41610 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41622 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41624 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41634 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41648 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41654 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:39 [loggers.py:123] Engine 000: Avg prompt throughput: 752.7 tokens/s, Avg generation throughput: 133.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.4%
(APIServer pid=1486881) INFO: 127.0.0.1:41658 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41668 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41682 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41694 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41696 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:41698 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35342 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35360 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35364 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35370 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:49 [loggers.py:123] Engine 000: Avg prompt throughput: 717.6 tokens/s, Avg generation throughput: 134.3 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:35378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35384 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35390 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35398 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35402 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35414 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35418 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35424 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35430 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35436 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55236 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55240 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55246 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55260 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55272 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55296 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55300 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55308 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:33:59 [loggers.py:123] Engine 000: Avg prompt throughput: 1047.4 tokens/s, Avg generation throughput: 109.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.1%
(APIServer pid=1486881) INFO: 127.0.0.1:55320 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55346 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55362 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55372 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55382 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56114 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:09 [loggers.py:123] Engine 000: Avg prompt throughput: 453.5 tokens/s, Avg generation throughput: 113.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:56124 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56128 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44914 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44920 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44936 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:19 [loggers.py:123] Engine 000: Avg prompt throughput: 382.1 tokens/s, Avg generation throughput: 116.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.1%
(APIServer pid=1486881) INFO: 127.0.0.1:44950 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44972 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44974 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44978 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49924 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49940 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49944 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49954 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:29 [loggers.py:123] Engine 000: Avg prompt throughput: 619.3 tokens/s, Avg generation throughput: 132.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:49966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49982 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49990 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:49992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50000 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59640 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59662 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:39 [loggers.py:123] Engine 000: Avg prompt throughput: 833.9 tokens/s, Avg generation throughput: 127.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:59670 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59680 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59688 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:59694 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56420 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56424 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56440 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56446 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56448 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:49 [loggers.py:123] Engine 000: Avg prompt throughput: 461.1 tokens/s, Avg generation throughput: 126.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:56458 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56460 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54148 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54178 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54192 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54198 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:34:59 [loggers.py:123] Engine 000: Avg prompt throughput: 449.8 tokens/s, Avg generation throughput: 124.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:54226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54254 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:09 [loggers.py:123] Engine 000: Avg prompt throughput: 219.3 tokens/s, Avg generation throughput: 146.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.2%
(APIServer pid=1486881) INFO: 127.0.0.1:52504 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52508 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52522 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52524 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37878 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37886 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37896 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37904 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37906 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:19 [loggers.py:123] Engine 000: Avg prompt throughput: 546.6 tokens/s, Avg generation throughput: 138.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:37920 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37930 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37934 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56974 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56990 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:29 [loggers.py:123] Engine 000: Avg prompt throughput: 285.5 tokens/s, Avg generation throughput: 145.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:56998 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57006 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57022 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57034 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57062 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57086 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57094 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33348 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33354 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33356 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33372 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33374 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33380 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33382 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:39 [loggers.py:123] Engine 000: Avg prompt throughput: 853.0 tokens/s, Avg generation throughput: 93.3 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:33394 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33396 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33410 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33418 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33420 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33432 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33446 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33462 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39744 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39760 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39768 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39780 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39790 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:49 [loggers.py:123] Engine 000: Avg prompt throughput: 878.5 tokens/s, Avg generation throughput: 128.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:39798 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39818 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39830 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52148 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52160 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52182 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52190 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52206 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52222 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52228 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:35:59 [loggers.py:123] Engine 000: Avg prompt throughput: 744.7 tokens/s, Avg generation throughput: 120.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.1%
(APIServer pid=1486881) INFO: 127.0.0.1:52254 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37946 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:09 [loggers.py:123] Engine 000: Avg prompt throughput: 265.3 tokens/s, Avg generation throughput: 146.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.1%
(APIServer pid=1486881) INFO: 127.0.0.1:37976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37984 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60688 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60702 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60714 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:19 [loggers.py:123] Engine 000: Avg prompt throughput: 265.7 tokens/s, Avg generation throughput: 146.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:60720 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60736 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60744 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60750 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58778 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:29 [loggers.py:123] Engine 000: Avg prompt throughput: 426.2 tokens/s, Avg generation throughput: 142.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:58782 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58792 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58794 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58804 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58806 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58808 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58494 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58498 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58508 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58514 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:39 [loggers.py:123] Engine 000: Avg prompt throughput: 621.5 tokens/s, Avg generation throughput: 104.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 56.3%
(APIServer pid=1486881) INFO: 127.0.0.1:58524 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58554 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58568 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58576 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58582 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38024 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38036 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:49 [loggers.py:123] Engine 000: Avg prompt throughput: 519.0 tokens/s, Avg generation throughput: 123.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:38052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38054 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60550 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60576 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60588 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60590 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:36:59 [loggers.py:123] Engine 000: Avg prompt throughput: 439.8 tokens/s, Avg generation throughput: 141.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:60608 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60616 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60636 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60654 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60676 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60692 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55808 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55822 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:09 [loggers.py:123] Engine 000: Avg prompt throughput: 610.5 tokens/s, Avg generation throughput: 136.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:55834 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55844 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55860 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:55874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35578 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35582 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:19 [loggers.py:123] Engine 000: Avg prompt throughput: 248.6 tokens/s, Avg generation throughput: 126.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:35584 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35586 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35602 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60428 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60452 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60456 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60460 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60462 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:29 [loggers.py:123] Engine 000: Avg prompt throughput: 465.8 tokens/s, Avg generation throughput: 138.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:60470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60488 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60494 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60508 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60514 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60528 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60538 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60554 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60566 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60572 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60574 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60590 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60600 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60606 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52466 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52480 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1014.4 tokens/s, Avg generation throughput: 101.6 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:52484 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52490 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37564 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37574 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37580 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37588 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37600 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37606 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37614 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37622 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37624 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:49 [loggers.py:123] Engine 000: Avg prompt throughput: 765.2 tokens/s, Avg generation throughput: 128.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:37634 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37660 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33438 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33452 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33454 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33458 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33486 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33490 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:37:59 [loggers.py:123] Engine 000: Avg prompt throughput: 681.7 tokens/s, Avg generation throughput: 133.4 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:33500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33514 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33518 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:33530 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60238 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60246 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60256 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60270 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60278 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:09 [loggers.py:123] Engine 000: Avg prompt throughput: 563.0 tokens/s, Avg generation throughput: 119.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:60294 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60296 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35024 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35026 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35038 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:19 [loggers.py:123] Engine 000: Avg prompt throughput: 402.2 tokens/s, Avg generation throughput: 141.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:35064 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35078 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35086 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35092 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35102 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35114 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35124 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35134 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57616 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57620 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:29 [loggers.py:123] Engine 000: Avg prompt throughput: 490.1 tokens/s, Avg generation throughput: 73.5 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:57628 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57632 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57638 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57652 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57658 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57666 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57680 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57682 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57694 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57700 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57710 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57718 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:57726 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60006 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60014 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60030 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60046 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60060 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60066 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60074 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60080 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:39 [loggers.py:123] Engine 000: Avg prompt throughput: 1747.7 tokens/s, Avg generation throughput: 93.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.5%
(APIServer pid=1486881) INFO: 127.0.0.1:60084 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60090 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60098 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60100 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60118 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60130 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60138 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:60150 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38204 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38212 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38238 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38244 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:49 [loggers.py:123] Engine 000: Avg prompt throughput: 890.2 tokens/s, Avg generation throughput: 77.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:38246 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38268 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38282 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38288 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38304 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38314 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38328 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38332 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:38344 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:38:59 [loggers.py:123] Engine 000: Avg prompt throughput: 754.5 tokens/s, Avg generation throughput: 126.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:42554 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42556 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42564 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42580 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42592 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42608 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42614 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42618 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42624 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42630 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42632 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42636 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39236 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39258 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39262 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:09 [loggers.py:123] Engine 000: Avg prompt throughput: 920.8 tokens/s, Avg generation throughput: 107.0 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:39278 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39292 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39294 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39304 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39312 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39326 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58644 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:19 [loggers.py:123] Engine 000: Avg prompt throughput: 375.1 tokens/s, Avg generation throughput: 143.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:58656 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58668 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58682 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58684 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58696 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50866 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50874 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50876 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50880 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50890 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50898 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50902 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:29 [loggers.py:123] Engine 000: Avg prompt throughput: 602.9 tokens/s, Avg generation throughput: 139.5 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:50908 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50912 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50916 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50928 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:50942 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37098 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37106 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37114 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37116 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37124 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:39 [loggers.py:123] Engine 000: Avg prompt throughput: 462.6 tokens/s, Avg generation throughput: 129.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:37136 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37150 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37164 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37174 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37184 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37198 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37206 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37208 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37222 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39228 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:49 [loggers.py:123] Engine 000: Avg prompt throughput: 556.1 tokens/s, Avg generation throughput: 138.7 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:39230 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39234 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:39250 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42398 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42412 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42428 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42440 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42452 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:39:59 [loggers.py:123] Engine 000: Avg prompt throughput: 510.4 tokens/s, Avg generation throughput: 139.9 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:42468 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42474 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44032 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44048 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:09 [loggers.py:123] Engine 000: Avg prompt throughput: 157.6 tokens/s, Avg generation throughput: 148.1 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:44052 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44056 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44066 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44082 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44086 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44096 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44108 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44122 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44138 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44154 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44170 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:44182 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46944 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46952 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46966 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46986 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:46994 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47000 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47004 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:19 [loggers.py:123] Engine 000: Avg prompt throughput: 1026.8 tokens/s, Avg generation throughput: 48.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 56.0%
(APIServer pid=1486881) INFO: 127.0.0.1:47018 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47028 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:47042 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36956 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36960 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36976 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36992 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:36998 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37010 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37012 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37016 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:29 [loggers.py:123] Engine 000: Avg prompt throughput: 656.8 tokens/s, Avg generation throughput: 131.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.8%
(APIServer pid=1486881) INFO: 127.0.0.1:37018 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:37022 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52672 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52674 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52690 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52706 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52718 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52734 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:39 [loggers.py:123] Engine 000: Avg prompt throughput: 498.1 tokens/s, Avg generation throughput: 137.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.6%
(APIServer pid=1486881) INFO: 127.0.0.1:52738 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52754 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52766 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:52770 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43442 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43456 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43472 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43480 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43494 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:49 [loggers.py:123] Engine 000: Avg prompt throughput: 558.9 tokens/s, Avg generation throughput: 136.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:43510 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43516 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:43532 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34492 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34508 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34528 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34540 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34550 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34564 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34570 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34586 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34594 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34604 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:40:59 [loggers.py:123] Engine 000: Avg prompt throughput: 742.8 tokens/s, Avg generation throughput: 113.1 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.9%
(APIServer pid=1486881) INFO: 127.0.0.1:34608 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34618 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34632 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34646 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34660 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:34676 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35158 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35162 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35172 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:09 [loggers.py:123] Engine 000: Avg prompt throughput: 497.9 tokens/s, Avg generation throughput: 140.2 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:35176 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35192 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35196 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35200 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35214 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35226 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35240 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:35242 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56286 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56298 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:19 [loggers.py:123] Engine 000: Avg prompt throughput: 522.1 tokens/s, Avg generation throughput: 92.6 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:56314 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56330 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56342 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56348 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56358 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56374 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56378 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56402 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56404 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:56408 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42310 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42326 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42336 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42338 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42352 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:29 [loggers.py:123] Engine 000: Avg prompt throughput: 984.1 tokens/s, Avg generation throughput: 106.9 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.5%
(APIServer pid=1486881) INFO: 127.0.0.1:42354 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42362 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42374 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42388 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42400 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:42406 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58470 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58482 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:39 [loggers.py:123] Engine 000: Avg prompt throughput: 578.8 tokens/s, Avg generation throughput: 137.6 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.2%, Prefix cache hit rate: 55.6%
(APIServer pid=1486881) INFO: 127.0.0.1:58496 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:58500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51414 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51422 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:49 [loggers.py:123] Engine 000: Avg prompt throughput: 187.0 tokens/s, Avg generation throughput: 130.8 tokens/s, Running: 1 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.1%, Prefix cache hit rate: 55.7%
(APIServer pid=1486881) INFO: 127.0.0.1:51424 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51428 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51432 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51436 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51450 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:51452 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54500 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54506 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54510 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54520 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54532 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO: 127.0.0.1:54534 - "POST /v1/chat/completions HTTP/1.1" 200 OK
(APIServer pid=1486881) INFO 10-29 02:41:59 [loggers.py:123] Engine 000: Avg prompt throughput: 651.7 tokens/s, Avg generation throughput: 126.8 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.6%
(APIServer pid=1486881) INFO 10-29 02:42:09 [loggers.py:123] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 55.6%
(APIServer pid=1486881) WARNING 10-29 02:57:01 [launcher.py:98] port 8003 is used by process psutil.Process(pid=1486881, name='python', status='running') launched with command:
(APIServer pid=1486881) WARNING 10-29 02:57:01 [launcher.py:98] python -m vllm.entrypoints.openai.api_server --model /data_storage/shared/gjc/models/ALFWorld-Llama3.2-3B-Real/actor/global_step_938 --host 0.0.0.0 --port 8003 --max-model-len 4096
(APIServer pid=1486881) INFO 10-29 02:57:01 [launcher.py:101] Shutting down FastAPI HTTP server.
(APIServer pid=1486881) INFO: Shutting down
(APIServer pid=1486881) INFO: Waiting for application shutdown.
(APIServer pid=1486881) INFO: Application shutdown complete.