The SHA-1 hash for 64test64xa is 6779c53432b8badf049bb9d8924a5785dd887243, which is 40 characters long using only hexadecimal: 10 digits and 6 letters. How long would it be if it used all 26 letters of the Latin alphabet? What if it also differentiated between UPPER and lower case?
🇨🇦 tunetardis ( @tunetardis@lemmy.ca ) 9•1 year ago
You could try base64 maybe? The above would be: Z3nFNDK4ut8Em7nYkkpXhd2IckM= (28 characters)
base64 uses A-Z, a-z, 0-9, and the + and / characters to encode 6 bits per character. That means you can encode every 3 bytes (or 6 hex) in 4 characters (since 3 * 8 bits = 4 * 6 bits). If the data are not a perfect multiple of 3 bytes, the last group of 4 characters gets padded out with = signs.
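A minimal Python sketch of the conversion described above, assuming the hash is at hand as a hex string. The variable names are only illustrative; `bytes.fromhex` and `base64.b64encode` come from the standard library.

```python
import base64

hex_digest = "6779c53432b8badf049bb9d8924a5785dd887243"

# Turn the 40 hex characters back into the 20 raw bytes they stand for.
raw = bytes.fromhex(hex_digest)

# 20 bytes = 6 full groups of 3 bytes + 2 leftover bytes, so the final
# group of 4 characters carries one "=" of padding.
b64_digest = base64.b64encode(raw).decode("ascii")

print(len(raw), b64_digest, len(b64_digest))
```

The important step is decoding the hex first, so that base64 sees the 20 bytes and not the 40 characters of text.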
abhibeckert ( @abhibeckert@beehaw.org ) 2•1 year ago
There’s also a Base64URL variant that is a little friendlier in the modern world, where the +/= characters often need escape sequences. The first two are replaced with more sensible characters (- and _) and the third is just removed entirely. Do you really need padding?
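A rough sketch of the URL-safe variant in Python, for illustration only. `base64.urlsafe_b64encode` still emits the `=` padding, so it is stripped by hand here; `decode_unpadded` is a hypothetical helper name, not a library function.

```python
import base64

raw = bytes.fromhex("6779c53432b8badf049bb9d8924a5785dd887243")

# The URL-safe alphabet uses "-" and "_" in place of "+" and "/".
# Padding is still produced, so drop it explicitly.
url_digest = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def decode_unpadded(text: str) -> bytes:
    """Re-add the padding (length must be a multiple of 4), then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


print(url_digest, len(url_digest))
```

Dropping the padding is safe as long as the decoder knows to put it back, which is what the helper does.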
mozz ( @mozz@mbin.grits.dev ) 9•1 year ago
- It’s a 160-bit hash, so using letters and numbers, it’d be log base (10+26) of 2^160, which is roughly 31. So 31 letters.
- Using upper and lower case, it’d be log base (10+26+26) of 2^160, or 27 letters.
- Don’t use SHA-1; use SHA-256
- Upper and lower case to represent SHA-256 would be log base (10+26+26) of 2^256, 43 letters
- Internally, it’s represented as 32 bytes (32 “letters” of 8 bits each), effectively using every possible byte value rather than only printable characters. The string representation only matters when you’re exchanging it over a medium where it needs to be robust and human-readable, and the benefit of squeezing it down to fewer characters is probably not worth the cost: it becomes unclear how you’ve chosen to squeeze it, and life gets harder for people converting to and from the format. Hexadecimal is a little bigger, but it’s very clear and unambiguous what you’ve done, whereas using the full alphabet doesn’t have that property.
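The logarithms in the list above can be checked with a few lines of Python. This sketch sidesteps floating-point rounding by searching for the smallest n with alphabet_size^n >= 2^bits; `digits_needed` is a made-up name for illustration.

```python
def digits_needed(bits: int, alphabet_size: int) -> int:
    """Smallest n such that alphabet_size**n >= 2**bits."""
    n = 0
    capacity = 1
    while capacity < 2 ** bits:
        capacity *= alphabet_size
        n += 1
    return n


for bits in (160, 256):
    for size in (16, 36, 62):
        print(f"{bits}-bit hash, {size} symbols: {digits_needed(bits, size)} characters")
```

For 160 bits this gives 40, 31, and 27 characters for 16, 36, and 62 symbols, and 43 characters for a 256-bit hash with 62 symbols, matching the figures in the list.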
noot ( @noot@beehaw.org ) 5•1 year ago
It gets subtle when you consider Unicode. But you said Latin alphabet, so you can look at just the UTF-8 section of this table, and assume 1 byte = 1 letter.
https://github.com/qntm/base32768#base32768
HTH
Gamma ( @GammaGames@beehaw.org ) English9•1 year ago
I think we should consider Unicode. I want hashes that look like
lau52gj🍀pr18e🍅
xoggy ( @xoggy@programming.dev ) 4•1 year ago
https://www.unitconverters.net/numbers/decimal-to-base-36.htm
base 10 = 590741618446309885662238049322513167918815539779
base 16 = 6779C53432B8BADF049BB9D8924A5785DD887243
base 36 = C34WAO39N9K9XWPHW5W9XGRH0AHT0CG
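The same kind of conversion can be done locally instead of through a web converter. A small sketch, assuming the usual 0-9 then A-Z digit order for base 36; `to_base` is a hypothetical helper, since Python has no built-in integer-to-base-36 formatter.

```python
import string

DIGITS = string.digits + string.ascii_uppercase  # 0-9 then A-Z: 36 symbols


def to_base(n: int, base: int) -> str:
    """Render a non-negative integer in the given base (2 to 36)."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


# Treat the whole hash as one big integer.
n = int("6779c53432b8badf049bb9d8924a5785dd887243", 16)
base36 = to_base(n, 36)

print("base 10 =", n)
print("base 16 =", to_base(n, 16))
print("base 36 =", base36)
```

`int(base36, 36)` converts the result back, which is a quick way to confirm nothing was lost.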
sqgl ( @sqgl@beehaw.org ) 3•1 year ago
Hashing won’t fix sloppy typos/grammar.
How much would hash ~~digsets~~ digests be shortened if the whole ~~alphabel~~ alphabet was used?
Melody Fwygon ( @Melody@lemmy.one ) English2•1 year ago
Yeah, you can always take a hex hash output and convert it to Base64, which does compress it significantly. Apply LZ Compression and boom.
🇨🇦 tunetardis ( @tunetardis@lemmy.ca ) 2•1 year ago
> Apply LZ Compression and boom.
That would produce a binary stream. If that’s what OP wants, they could just leave the original hash in binary. And that would be unlikely to compress any further since hashes are, by their nature, high entropy already.
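A quick way to see this, sketched in Python with `zlib` standing in for "LZ compression" (that choice is an assumption; zlib's DEFLATE is LZ77-based, but the comment above does not name a specific tool):

```python
import zlib

raw = bytes.fromhex("6779c53432b8badf049bb9d8924a5785dd887243")

# Compress the 20 raw hash bytes at the highest compression level.
compressed = zlib.compress(raw, 9)

print(len(raw), "bytes before,", len(compressed), "bytes after")
```

The output is a binary stream that comes out larger than the input, because there is no redundancy to remove and the container adds its own header and checksum.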
Tried to convert to base 64 and… it actually makes it longer. Why?
Melody Fwygon ( @Melody@lemmy.one ) English3•1 year ago
You didn’t convert a hex number into Base64, you Base64 encoded the hex string.
TL;DR, you used the wrong tool.
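The difference can be sketched in Python. This assumes the tool in question encoded the 40 characters of hex text, which is presumably what happened:

```python
import base64

hex_digest = "6779c53432b8badf049bb9d8924a5785dd887243"

# Encoding the hex *text*: 40 ASCII bytes in, so the output grows.
from_text = base64.b64encode(hex_digest.encode("ascii")).decode("ascii")

# Encoding the hash *value*: decode the hex to 20 raw bytes first.
from_bytes = base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")

print(len(hex_digest), len(from_text), len(from_bytes))
```

Encoding the text yields 56 characters, longer than the 40 hex characters it started from, while encoding the underlying bytes yields 28.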
What’s the right tool? Can’t seem to find one.
Melody Fwygon ( @Melody@lemmy.one ) English1•1 year ago