Message Digest (hash) is byte in byte out
A message digest is defined as a function that takes a raw byte array and returns a raw byte array (aka byte[]). For example, SHA-1 (Secure Hash Algorithm 1) has a digest size of 160 bits, or 20 bytes. Raw byte arrays usually cannot be interpreted as text in a character encoding like UTF-8, because not every byte in every order is legal in that encoding. So converting them to a String with
new String(md.digest(subject), StandardCharsets.UTF_8)
might produce illegal sequences or code points with no defined Unicode mapping. Java silently substitutes the replacement character U+FFFD for such input, so the original bytes cannot be recovered from the string.
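To make the problem concrete, here is a minimal sketch that decodes a SHA-1 digest as UTF-8 and encodes it again. The class name DigestRoundTrip and its helper methods are made up for this illustration; only the JDK calls are real.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
final class DigestRoundTrip {

    // Computes the SHA-1 digest of the UTF-8 bytes of the input.
    static byte[] sha1(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // Java platform implementations are required to support SHA-1.
            throw new IllegalStateException(e);
        }
    }

    // True only if decoding as UTF-8 and encoding again yields the same bytes.
    static boolean survivesUtf8RoundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return Arrays.equals(raw, decoded.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] digest = sha1("abc");
        System.out.println(digest.length);                 // 20
        System.out.println(survivesUtf8RoundTrip(digest)); // false
    }
}
```

The digest of "abc" starts with the byte 0xA9, which is not a valid first byte of a UTF-8 sequence, so the decoded string already differs from the input at position zero.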
Binary-to-text Encoding
For that, a binary-to-text encoding is used. With hashes, the most common one is hex encoding, or Base16. A byte can hold a value from 0 to 255 (or -128 to 127 signed), which is equivalent to the hex representation 0x00 to 0xFF. Each byte therefore needs two hex characters, so hex doubles the length of the output: a 20-byte digest becomes a 40-character hex string.
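As a rough sketch of what hex encoding does, the following hand-written encoder maps each byte to two characters and applies it to a SHA-1 digest. HexSketch, toHex and sha1Hex are names invented for this example, and the loop is one straightforward way to do it rather than a reference implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, for illustration only.
final class HexSketch {
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Encodes every byte as exactly two lowercase hex characters.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int value = bytes[i] & 0xFF;               // map signed -128..127 to 0..255
            out[2 * i] = HEX_DIGITS[value >>> 4];      // high nibble
            out[2 * i + 1] = HEX_DIGITS[value & 0x0F]; // low nibble
        }
        return new String(out);
    }

    // SHA-1 of the UTF-8 bytes of the input, hex encoded.
    static String sha1Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            return toHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHex(new byte[] {0, 127, -128, -1})); // 007f80ff
        System.out.println(sha1Hex("abc")); // a9993e364706816aba3e25717850c26c9cd0d89d
    }
}
```

The second output is the standard SHA-1 test vector for "abc": 20 bytes in, 40 characters out.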
Note that you are not required to use hex encoding. You could also use something like Base64. Hex is often preferred because it is easier for humans to read and has a fixed output length without the need for padding.
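For comparison, a small sketch with the JDK's java.util.Base64 shows the difference in length and padding for a digest-sized input. The class EncodingLengths and its method are illustrative names, and the all-zero array merely stands in for a real 20-byte digest.

```java
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class EncodingLengths {

    // Standard Base64 with padding, as produced by the JDK's basic encoder.
    static String base64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        byte[] digestSized = new byte[20]; // placeholder for a 20-byte SHA-1 digest
        String encoded = base64(digestSized);
        System.out.println(encoded.length());      // 28
        System.out.println(encoded.endsWith("=")); // true
        System.out.println(digestSized.length * 2); // 40, the hex length
    }
}
```

Base64 is shorter (28 versus 40 characters), but 20 is not a multiple of 3, so the output ends in a padding character.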
You can convert a byte array to hex with JDK functionality alone:
new BigInteger(1, token).toString(16)
Note, however, that BigInteger interprets the given byte array as a number and not as a byte string. That means leading zeros are not output, and the resulting string may be shorter than 40 characters.
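The following sketch reproduces the leading-zero issue and shows one possible workaround, padding via String.format. The class and method names are invented for this example, and the three-byte token is just a short stand-in for a digest.

```java
import java.math.BigInteger;

// Hypothetical helper class, for illustration only.
final class BigIntegerHex {

    // The one-liner from the text: drops leading zeros.
    static String naiveHex(byte[] token) {
        return new BigInteger(1, token).toString(16);
    }

    // Pads with zeros on the left to two characters per byte.
    static String paddedHex(byte[] token) {
        if (token.length == 0) {
            return ""; // a width of 0 is not a valid format width
        }
        return String.format("%0" + (token.length * 2) + "x", new BigInteger(1, token));
    }

    public static void main(String[] args) {
        byte[] token = {0x00, 0x0a, (byte) 0xff};
        System.out.println(naiveHex(token));  // aff
        System.out.println(paddedHex(token)); // 000aff
    }
}
```

On Java 17 or newer, java.util.HexFormat.of().formatHex(token) avoids the detour through BigInteger altogether.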
Using Libraries to Encode to HEX
You could now copy and paste an untested byte-to-hex method from Stack Overflow or use massive dependencies like Guava.
To have a go-to solution for most byte-related issues, I implemented a utility to handle these cases: bytes-java (GitHub).
To convert your message digest byte array you could just do
String hex = Bytes.wrap(md.digest(subject)).encodeHex();
or you could just use the built-in hash feature
String hex = Bytes.from(subject).hashSha1().encodeHex();