Dicts: check maxSymbolValue < 255 failed, returning ERROR(dictionary_corrupted) #3724

@klauspost

Describe the bug

I am building a custom dictionary builder.

Loading its current output with zstd fails:
dictionary.bin.gz

To Reproduce

Gunzip above. Place file at github_users_sample_set\user.0E6FBohGzg.json

Execute:

λ zstd -D dictionary.bin github_users_sample_set\user.0E6FBohGzg.json
zstd: error 11 : Allocation error : not enough memory

I have an older debug build of zstd, which gives more detail:

compress\zstd_compress.c: ZSTD_compress_insertDictionary (dictSize=4272)
compress\zstd_compress.c:4337: ERROR!: check maxSymbolValue < 255 failed, returning ERROR(dictionary_corrupted):
compress\zstd_compress.c:4435: ERROR!: forwarding error in eSize: Dictionary is corrupted: ZSTD_loadCEntropy failed
compress\zstd_compress.c:4835: ERROR!: forwarding error in dictID: Dictionary is corrupted: ZSTD_compress_insertDictionary failed
compress\zstd_cwksp.h: cwksp: freeing workspace
compress\zstd_compress.c:1123: ERROR!: check !dl->cdict failed, returning ERROR(memory_allocation): ZSTD_createCDict_advanced failed

So it seems there is a restriction on the Huffman tables: maxSymbolValue must be 255 in the provided table.

What is the reason for that? I would kind of expect a table to be more efficient if it does not have to represent every symbol. Could you explain to me why this (undocumented?) limitation exists?
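
To make the restriction concrete, here is a minimal sketch of how a builder could check its literal histogram before building the Huffman table. The function names (`literal_histogram`, `max_symbol_value`) and the `floor` parameter are hypothetical and not part of zstd. The log above only shows that maxSymbolValue is compared against 255, so the idea that giving every byte value a nonzero count is enough to pass is an assumption.

```python
def literal_histogram(samples, floor=0):
    """Count byte occurrences over all samples.

    `floor` is a hypothetical knob: the starting count for every byte value.
    With floor=1, bytes that never occur still get a nonzero count.
    """
    counts = [floor] * 256
    for sample in samples:
        for byte in sample:
            counts[byte] += 1
    return counts


def max_symbol_value(counts):
    """Return the highest byte value with a nonzero count, or -1 if none."""
    for symbol in range(len(counts) - 1, -1, -1):
        if counts[symbol] > 0:
            return symbol
    return -1


# Illustrative sample data, not taken from the real sample set.
samples = [b'{"login": "octocat", "id": 1}']

sparse = literal_histogram(samples)            # only bytes that occur
padded = literal_histogram(samples, floor=1)   # every byte gets a count

print(max_symbol_value(sparse))   # below 255 for plain ASCII JSON
print(max_symbol_value(padded))   # 255
```

A table built from `sparse` would stop at the highest byte actually seen, which matches the case rejected in the log. Padding costs some compression on the literals, which is the trade-off I am asking about.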

Expected behavior

I would kinda expect it to work?
