tests : update for LLAMA_SET_ROWS=1 #14961

ggerganov · 2025-07-30T07:39:20Z

target #14960

Extract the test updates from #14959 into a separate PR, to be merged before enabling LLAMA_SET_ROWS=1 by default.

Test updates:

- test-thread-safety
  - limit the number of CPU threads per context to avoid hanging the system in some cases
  - set cparams.n_seq_max = 1
- embedding
  - default to unified KV cache if -np is not specified
- save-load-state
  - default to unified KV cache if -np is not specified
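The unified KV cache fallback listed for embedding and save-load-state can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the struct `demo_params`, its fields, and the helper `apply_kv_cache_default` are hypothetical, and it assumes that -np defaults to 1, so a value of 1 is treated as "not specified".

```cpp
// Hypothetical stand-in for the example's parameter struct; the field names are assumptions.
struct demo_params {
    int  n_parallel = 1;     // value of -np; 1 is treated here as "not specified"
    bool kv_unified = false; // whether one KV cache is shared across all sequences
};

// Fall back to a unified KV cache when no parallel sequence count was given.
void apply_kv_cache_default(demo_params & params) {
    if (params.n_parallel == 1) {
        params.kv_unified = true;
    }
}
```

With this shape, an explicit -np greater than 1 leaves the cache setting untouched, so only the unconfigured case changes behavior.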

ggml-ci

ggerganov · 2025-07-30T10:59:45Z

tests/test-thread-safety.cpp

+    // each context has a single sequence
+    cparams.n_seq_max = 1;
+
+    // prevent from launching too many threads
+    cparams.n_threads = std::min<int>(std::max(2u, std::thread::hardware_concurrency()/params.n_parallel), cparams.n_threads);
+
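To make the clamp in the diff above easier to follow, here is the same arithmetic as a standalone function. The helper name `clamp_threads_per_context` and its parameters are hypothetical, introduced only for illustration. It assumes `n_parallel` is positive, and takes the hardware thread count as an argument instead of calling `std::thread::hardware_concurrency()` so that the result is reproducible.

```cpp
#include <algorithm>

// Hypothetical helper (not part of llama.cpp): caps the per-context thread count.
//   hw_threads - value reported by std::thread::hardware_concurrency() (may be 0)
//   n_parallel - number of contexts running concurrently (assumed > 0)
//   requested  - thread count requested by the user
int clamp_threads_per_context(unsigned hw_threads, int n_parallel, int requested) {
    // each context's share of the hardware threads, but never fewer than 2
    const unsigned share = std::max(2u, hw_threads / static_cast<unsigned>(n_parallel));
    // never exceed what the user asked for
    return std::min<int>(static_cast<int>(share), requested);
}
```

For example, with 8 hardware threads, 4 contexts, and 16 requested threads, each context is capped at 2 threads. A request of 1 thread stays at 1, since the clamp only lowers the count.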


ggerganov: @slaren Small change to the test to make it compatible with the split KV cache. I reduced the number of CPU threads because on the MacBook the process takes a long time (several minutes) to terminate. I think it's some resource congestion when the process starts many threads, but I'm not sure.

slaren: This is a known issue with the thread pool implementation: using more threads than are available results in the threads spending more time spinning than doing work.

slaren: I am not convinced that it is good to ignore the user's parameters to work around what is essentially a bug. Can this be solved by running the test with -t 1?

ggerganov: Yes, -t 1 works. I was thinking of using -t 2 so we have context-level concurrency too. With -t 2 the test also runs cleanly on my devices.
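The oversubscription concern in this exchange comes down to simple arithmetic. The sketch below assumes that the test runs several contexts at once and that each context uses its own -t threads. The helper `is_oversubscribed` is hypothetical and not part of the test.

```cpp
// Hypothetical helper: true when the contexts together request more threads
// than the hardware provides.
//   n_contexts            - number of contexts running at the same time
//   n_threads_per_context - value passed with -t
//   hw_threads            - available hardware threads
bool is_oversubscribed(int n_contexts, int n_threads_per_context, unsigned hw_threads) {
    const unsigned total = static_cast<unsigned>(n_contexts) * static_cast<unsigned>(n_threads_per_context);
    return total > hw_threads;
}
```

Under these assumptions, 4 contexts with -t 2 on an 8-thread machine request exactly 8 threads and do not oversubscribe, while 4 contexts with -t 16 on the same machine request 64.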

ggml-ci

ggerganov mentioned this pull request Jul 30, 2025

llama : enable LLAMA_SET_ROWS=1 by default #14959

Merged

github-actions bot added the testing and examples labels Jul 30, 2025

Base automatically changed from gg/graph-fix-stack-use-after-return to master July 30, 2025 10:52

ggerganov added 3 commits July 30, 2025 13:53

test-thread-safety : each context uses a single sequence

07d4b29

embedding : handle --parallel argument

d90b20d

ggml-ci

save-load : handle -np 1

d6233d6

ggml-ci

ggerganov force-pushed the gg/tests-update-for-set-rows branch from e1ebdea to d6233d6 Compare July 30, 2025 10:53

ggerganov commented Jul 30, 2025

thread-safety : avoid overriding threads, reduce test case arg

4e4c6a7

ggml-ci

ggerganov requested a review from slaren July 30, 2025 11:46

slaren approved these changes Jul 30, 2025

ggerganov merged commit 00131d6 into master Jul 30, 2025
54 of 55 checks passed

ggerganov deleted the gg/tests-update-for-set-rows branch July 30, 2025 12:12
