Asher committed · Commit 146ac84 · Parent: c864579
doc: minor fix.
README.md CHANGED
```diff
@@ -294,7 +294,7 @@ You can build and run vLLM from source after merging this pull request into your
 
 ### Model Context Length Support
 
-The Hunyuan A13B model supports a maximum context length of **256K tokens (262,144
+The Hunyuan A13B model supports a maximum context length of **256K tokens (262,144 tokens)**. However, due to GPU memory constraints on most hardware setups, the default configuration in `config.json` limits the context length to **32K tokens** to prevent out-of-memory (OOM) errors.
 
 #### Extending Context Length to 256K
 
```
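The added paragraph is effectively a how-to hint: the shipped `config.json` caps the window at 32K, so reaching the full 256K requires overriding that limit at load time. As a minimal sketch (an assumption for illustration, not part of this commit), one way to do that with vLLM's Python API; the model path, parallelism degree, and prompt are placeholders:

```python
# Sketch (assumption, not from this commit): override the 32K default from
# config.json by passing an explicit max_model_len when constructing the
# vLLM engine. Serving at 262,144 tokens needs enough GPU memory for the
# KV cache; otherwise engine initialization fails with an OOM error.
from vllm import LLM, SamplingParams

llm = LLM(
    model="tencent/Hunyuan-A13B-Instruct",  # placeholder model path
    max_model_len=262144,                   # 256K tokens (256 * 1024)
    tensor_parallel_size=8,                 # assumption: shard across 8 GPUs
)

outputs = llm.generate(
    "Summarize the following document: ...",
    SamplingParams(max_tokens=256),
)
print(outputs[0].outputs[0].text)
```

The same override is available on the CLI via vLLM's `--max-model-len` flag when launching a server, so editing `config.json` itself is not required.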