Tags: CodeLinaro/llama.cpp

b6029

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
embeddings: fix extraction of CLS pooling results (ggml-org#14927)

* embeddings: fix extraction of CLS pooling results

* merge RANK pooling into CLS case for inputs

b5797

ci : disable fast-math for Metal GHA CI (ggml-org#14478)

* ci : disable fast-math for Metal GHA CI

ggml-ci

* cont : remove -g flag

ggml-ci

b5752

batch : fix check for empty sequences in memory (ggml-org#14364)

* batch : fix check for empty sequences in memory

ggml-ci

* cont : reuse the var

ggml-ci

b5689

cmake: remove shader-gen step-targets from ggml-vulkan (ggml-org#14226)

* Remove step-targets from vulkan-shaders-gen

* Unset DESTDIR when building vulkan-shaders-gen

b5686

common : suggest --jinja when autodetection fails (ggml-org#14222)

b5627

llama : support GEGLU for jina-bert-v2 (ggml-org#14090)

b5548

CUDA: fix typo in FlashAttention code (ggml-org#13926)

b5460

release : fix windows hip release (ggml-org#13707)

* release : fix windows hip release

* make single hip release with multiple targets

b5255

ci: fix cross-compile sync issues (ggml-org#12804)

b5098

convert : ability to lazy-load safetensors remotely without downloading to disk (ggml-org#12820)

* gguf util : add SafetensorRemote

* fix style

* convert: add --remote option

* convert : allow using lazy remote tensors

It's a bit slow for now since everything is blocking and single-threaded.

* correct metadata.name

* small style fix

* support HF_TOKEN

* convert : use writeable buffer for remote lazy tensors

* convert : fix flake8 lint regarding lambda assignment

* multithreaded download

* multithread: print debug

* fix style

* Revert "multithreaded download"

This reverts commit 42fc895.

* bring back _get_request_headers

---------

Co-authored-by: Francis Couture-Harpin
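
For readers curious how tensors can be lazy-loaded remotely without downloading the whole file, the sketch below shows the general idea using only the published safetensors layout: an 8-byte little-endian header length, a JSON header, then raw tensor data. It is not the SafetensorRemote code from this commit. The function names (`parse_safetensors_header`, `tensor_byte_range`, `range_header`, `build_demo_file`) are hypothetical, and the demo works on an in-memory file rather than a real HTTP endpoint.

```python
import json
import struct


def parse_safetensors_header(prefix: bytes) -> dict:
    """Parse the JSON header from the leading bytes of a safetensors file."""
    # The first 8 bytes hold the header length as a little-endian unsigned 64-bit integer.
    (header_len,) = struct.unpack("<Q", prefix[:8])
    if len(prefix) < 8 + header_len:
        raise ValueError("need more bytes to read the full header")
    header = json.loads(prefix[8:8 + header_len])
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return {"data_start": 8 + header_len, "tensors": header}


def tensor_byte_range(parsed: dict, name: str) -> tuple[int, int]:
    """Return absolute (first, last) byte positions, inclusive, for a non-empty tensor."""
    # data_offsets are relative to the start of the data section; the end is exclusive.
    begin, end = parsed["tensors"][name]["data_offsets"]
    start = parsed["data_start"]
    return start + begin, start + end - 1


def range_header(first: int, last: int) -> dict:
    """Build the HTTP Range header a remote loader could send for those bytes."""
    return {"Range": f"bytes={first}-{last}"}


def build_demo_file() -> bytes:
    """Assemble a tiny in-memory safetensors-style file with two F32 tensors (demo only)."""
    header = {
        "a": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
        "b": {"dtype": "F32", "shape": [1], "data_offsets": [8, 12]},
    }
    raw = json.dumps(header).encode("utf-8")
    data = struct.pack("<3f", 1.0, 2.0, 3.0)
    return struct.pack("<Q", len(raw)) + raw + data


if __name__ == "__main__":
    blob = build_demo_file()
    parsed = parse_safetensors_header(blob)
    first, last = tensor_byte_range(parsed, "b")
    print(range_header(first, last))
    print(struct.unpack("<f", blob[first:last + 1])[0])
```

A remote loader built along these lines would fetch the header once, then request each tensor's bytes only when it is actually needed. Issuing those requests one at a time would be consistent with the commit's note that the approach is slow while everything is blocking and single-threaded.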