xxHash 0.8.2
Extremely fast non-cryptographic hash function
Loading...
Searching...
No Matches
xxHash

xxHash is an extremely fast non-cryptographic hash algorithm, working at RAM speed limits.

It is proposed in four flavors, in three families:

  1. XXH32 family
    • Classic 32-bit hash function. Simple, compact, and runs on almost all 32-bit and 64-bit systems.
  2. XXH64 family
    • Classic 64-bit adaptation of XXH32. Just as simple, and runs well on most 64-bit systems (but not 32-bit systems).
  3. XXH3 family
    • Modern 64-bit and 128-bit hash function family which features improved strength and performance across the board, especially on smaller data. It benefits greatly from SIMD and 64-bit without requiring it.

Benchmarks

The reference system uses an Intel i7-9700K CPU, and runs Ubuntu x64 20.04. The open source benchmark program is compiled with clang v10.0 using -O3 flag.

Hash Name ISA ext Width Large Data Speed Small Data Velocity
XXH3_64bits() AVX2 64 59.4 GB/s 133.1
MeowHash AES-NI 128 58.2 GB/s 52.5
XXH3_128bits() AVX2 128 57.9 GB/s 118.1
CLHash PCLMUL 64 37.1 GB/s 58.1
XXH3_64bits() SSE2 64 31.5 GB/s 133.1
XXH3_128bits() SSE2 128 29.6 GB/s 118.1
RAM sequential read N/A 28.0 GB/s N/A
ahash AES-NI 64 22.5 GB/s 107.2
City64 64 22.0 GB/s 76.6
T1ha2 64 22.0 GB/s 99.0
City128 128 21.7 GB/s 57.7
FarmHash AES-NI 64 21.3 GB/s 71.9
XXH64() 64 19.4 GB/s 71.0
SpookyHash 64 19.3 GB/s 53.2
Mum 64 18.0 GB/s 67.0
CRC32C SSE4.2 32 13.0 GB/s 57.9
XXH32() 32 9.7 GB/s 71.9
City32 32 9.1 GB/s 66.0
Blake3* AVX2 256 4.4 GB/s 8.1
Murmur3 32 3.9 GB/s 56.1
SipHash* 64 3.0 GB/s 43.2
Blake3* SSE2 256 2.4 GB/s 8.1
HighwayHash 64 1.4 GB/s 6.0
FNV64 64 1.2 GB/s 62.7
Blake2* 256 1.1 GB/s 5.1
SHA1* 160 0.8 GB/s 5.6
MD5* 128 0.6 GB/s 7.8
Note
  • Hashes which require a specific ISA extension are noted. SSE2 is also noted, even though it is mandatory on x64.
  • Hashes with an asterisk are cryptographic. Note that MD5 is non-cryptographic by modern standards.
  • Small data velocity is a rough average of algorithm's efficiency for small data. For more accurate information, see the wiki.
  • More benchmarks and strength tests are found on the wiki: https://github.com/Cyan4973/xxHash/wiki

Usage

All xxHash variants use a similar API. Changing the algorithm is a trivial substitution.

Precondition
For functions which take an input and length parameter, the following requirements are assumed:
  • The range from [input, input + length) is valid, readable memory.
    • The only exception is if the length is 0, input may be NULL.
  • For C++, the objects must have the TriviallyCopyable property, as the functions access bytes directly as if it was an array of unsigned char.

Single Shot

These functions are stateless functions which hash a contiguous block of memory, immediately returning the result. They are the easiest and usually the fastest option.

XXH32(), XXH64(), XXH3_64bits(), XXH3_128bits()

#include <string.h>
#include "xxhash.h"
// Example for a function which hashes a null terminated string with XXH32().
XXH32_hash_t hash_string(const char* string, XXH32_hash_t seed)
{
// NULL pointers are only valid if the length is zero
size_t length = (string == NULL) ? 0 : strlen(string);
return XXH32(string, length, seed);
}
XXH32_hash_t XXH32(const void *input, size_t length, XXH32_hash_t seed)
Calculates the 32-bit hash of input using xxHash32.
Definition xxhash.h:2792
uint32_t XXH32_hash_t
An unsigned 32-bit integer.
Definition xxhash.h:515

Streaming

These groups of functions allow incremental hashing of unknown size, even more than what would fit in a size_t.

XXH32_reset(), XXH64_reset(), XXH3_64bits_reset(), XXH3_128bits_reset()

#include <stdio.h>
#include <assert.h>
#include "xxhash.h"
// Example for a function which hashes a FILE incrementally with XXH3_64bits().
XXH64_hash_t hashFile(FILE* f)
{
// Allocate a state struct. Do not just use malloc() or new.
assert(state != NULL && "Out of memory!");
// Reset the state to start a new hashing session.
char buffer[4096];
size_t count;
// Read the file in chunks
while ((count = fread(buffer, 1, sizeof(buffer), f)) != 0) {
// Run update() as many times as necessary to process the data
XXH3_64bits_update(state, buffer, count);
}
// Retrieve the finalized hash. This will not change the state.
// Free the state. Do not use free().
return result;
}
XXH64_hash_t XXH3_64bits_digest(XXH_NOESCAPE const XXH3_state_t *statePtr)
Returns the calculated XXH3 64-bit hash value from an XXH3_state_t.
Definition xxhash.h:6108
XXH3_state_t * XXH3_createState(void)
Allocate an XXH3_state_t.
Definition xxhash.h:5821
XXH_errorcode XXH3_64bits_reset(XXH_NOESCAPE XXH3_state_t *statePtr)
Resets an XXH3_state_t to begin a new hash.
Definition xxhash.h:5879
XXH_errorcode XXH3_64bits_update(XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *input, size_t length)
Consumes a block of input to an XXH3_state_t.
Definition xxhash.h:6063
XXH_errorcode XXH3_freeState(XXH3_state_t *statePtr)
Frees an XXH3_state_t.
Definition xxhash.h:5837
uint64_t XXH64_hash_t
An unsigned 64-bit integer.
Definition xxhash.h:820
Definition xxhash.h:1539