Description
Proposal:
Cryptographic modules are among the most annoying components to maintain because they are deeply interconnected yet spread across multiple C modules. Most CPython modules are well-contained, but everything related to cryptography is scattered. At some point, I would like to suggest a single package holding the cryptographic primitives, similar to the compression package we now have. Currently we have:
hashlib: high-level API for getting message digests and others.
hmac: high-level API for HMAC.
_hmac: C implementation of HMAC, backend is HACL*.
_hashlib: C implementation of hash functions and HMAC, backend is OpenSSL.
_md5, _sha1, _sha2, _sha3: C implementation of hash functions, backend is HACL*.
_blake2: C implementation of BLAKE-2, backend is HACL*.
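To see how this spread looks on a given build, here is a minimal sketch that checks which of the modules listed above can be located by the import system. The helper name `available_crypto_modules` is made up for illustration; availability depends on the Python version and build options (for example, `_hashlib` needs OpenSSL).

```python
import importlib.util

# Module names taken from the list above. Some may be absent depending on
# the Python version and on how the interpreter was built.
CRYPTO_MODULES = [
    "hashlib", "hmac", "_hmac", "_hashlib",
    "_md5", "_sha1", "_sha2", "_sha3", "_blake2",
]


def available_crypto_modules():
    """Map each module name to whether the import system can locate it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in CRYPTO_MODULES
    }


if __name__ == "__main__":
    for name, found in available_crypto_modules().items():
        print(f"{name:10} {'available' if found else 'missing'}")
```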
The reason I singled out BLAKE-2 is that we always prefer our own implementation: it is more versatile than OpenSSL's, which lacks personalization & co. In addition, HACL* BLAKE-2 supports SIMD instructions (as does HACL* HMAC), so it has different configuration options as well.
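As a small illustration of the extra versatility, the sketch below uses the `person` keyword of `hashlib.blake2b` (limited to 16 bytes for BLAKE2b). The application labels are made up for the example.

```python
import hashlib

data = b"same input"

# `person` is a keyword-only parameter of hashlib.blake2b.
# The two labels below are hypothetical application identifiers.
h_one = hashlib.blake2b(data, person=b"app-one")
h_two = hashlib.blake2b(data, person=b"app-two")
h_plain = hashlib.blake2b(data)

if __name__ == "__main__":
    # Same input, different personalization: the digests differ.
    print(h_one.hexdigest() != h_two.hexdigest())
    print(h_one.hexdigest() != h_plain.hexdigest())
```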
For testing, this becomes quite messy. I recently added an interface for blocking message digests for specific backends; it works well but is still incomplete. I'm thinking about how to configure the GIL_MINSIZE of hashes (see #91331), but this also requires me to identify hash functions by their family in order to distinguish them by module: for instance, SHA-224 and SHA-256 are both implemented in _sha2, as _sha2.SHA224Type and _sha2.SHA256Type respectively, or in _hashlib, where both are _hashlib.HASH instances.
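To make the "family" idea concrete, here is a rough sketch of a lookup from canonical hash names to the built-in module implementing them. `FAMILY_BY_NAME`, `family_of` and `describe` are hypothetical names for illustration, not the actual structure of `hashlib_helper`.

```python
import hashlib

# Hypothetical mapping from canonical hash names to their built-in module
# ("family"), following the module list given earlier in this issue.
FAMILY_BY_NAME = {
    "md5": "_md5",
    "sha1": "_sha1",
    "sha224": "_sha2",
    "sha256": "_sha2",
    "sha384": "_sha2",
    "sha512": "_sha2",
    "sha3_224": "_sha3",
    "sha3_256": "_sha3",
    "sha3_384": "_sha3",
    "sha3_512": "_sha3",
    "blake2b": "_blake2",
    "blake2s": "_blake2",
}


def family_of(name):
    """Return the built-in module name implementing the given hash."""
    return FAMILY_BY_NAME[name.lower()]


def describe(name):
    """Show the family next to the type actually returned by hashlib."""
    cls = type(hashlib.new(name))
    return f"{name}: family={family_of(name)}, type={cls.__module__}.{cls.__qualname__}"


if __name__ == "__main__":
    # The runtime type depends on which backend hashlib picked.
    print(describe("sha224"))
    print(describe("sha256"))
```

The runtime type alone cannot tell SHA-224 from SHA-256 when OpenSSL provides both, which is why a separate name-to-family lookup is sketched here.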
Because of that, I end up changing hashlib_helper every so often, and it tires me out. So I try to plan ahead and introduce helpers and a more extensible structure. The provided interface is essentially based on imported objects. For instance, to create a SHA224 object, I can use hashlib.sha224, _sha2.sha224, _hashlib.openssl_sha224, _hashlib.new("sha224") or hashlib.new("sha224"). When using new(), the string used is what I call a "canonical hash name"; when using named constructor functions, I just need to know <module_name>.<method_name> so that I can directly import the functions.
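The different entry points mentioned above can be compared with a short sketch. The helper names `resolve` and `sha224_constructors` are made up for illustration; the private modules are imported defensively since `_sha2` and `_hashlib` may be missing on some versions or builds.

```python
import hashlib
import importlib


def resolve(fullname):
    """Import a constructor from a '<module_name>.<method_name>' string."""
    module_name, _, attr = fullname.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def sha224_constructors():
    """Collect the SHA-224 constructors that exist in this build."""
    ctors = {
        "hashlib.sha224": resolve("hashlib.sha224"),
        'hashlib.new("sha224")': lambda data=b"": hashlib.new("sha224", data),
    }
    # Named constructors from private modules, when they can be imported.
    for fullname in ("_sha2.sha224", "_hashlib.openssl_sha224"):
        try:
            ctors[fullname] = resolve(fullname)
        except (ImportError, AttributeError):
            pass
    try:
        import _hashlib
    except ImportError:
        pass
    else:
        ctors['_hashlib.new("sha224")'] = (
            lambda data=b"": _hashlib.new("sha224", data)
        )
    return ctors


if __name__ == "__main__":
    for label, ctor in sha224_constructors().items():
        print(f"{label:26} {ctor(b'abc').hexdigest()}")
```

Whichever constructors are available should all produce the same digest for the same input; only the object type and the backend differ.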
Anyway, this issue serves as a tracker for improving test.support.hashlib_helper. I couldn't find any usage in the wild, and it's not documented, so I don't think I need to strive for maintainability. I will still make a NEWS entry though, just in case.
Finally, the new helpers I introduced, and plan to introduce, are 3.15+, so they already fully conflict with 3.14. The good news is that cryptographic modules don't evolve fast, so usually, if there's a bug, either it's a security issue and I'll get conflicts up to the oldest security-only branch, or there are only 3 branches that I need to maintain, which is fine for me (we already have conflicts between 3.14 and 3.13 for those modules...).
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response