Message authentication code
Fundamentals
Definition
A message authentication code (MAC) is a cryptographic primitive that provides assurance of the integrity and authenticity of a message by generating a short, fixed-length value known as a tag, computed from the message and a shared secret key. This tag allows a recipient who possesses the same key to verify whether the message has been altered or originated from an unauthorized party. Formally, a MAC is defined as a family of functions parameterized by a symmetric key, where each function processes an input message of arbitrary length to produce an authenticator (the MAC tag) that is computationally infeasible to forge without knowledge of the key.[1] A MAC scheme is typically specified as a tuple of three probabilistic polynomial-time algorithms: . The key generation algorithm takes a security parameter and outputs a secret key from a key space . The tagging algorithm takes the key and a message from the message space (usually ) and outputs a tag from a tag space . The verification algorithm takes the key , message , and tag , and outputs 1 if the tag is valid for the message under the key (accepting it as authentic), or 0 otherwise (rejecting it). For security, the scheme requires that with overwhelming probability for honestly generated keys and messages.[8] Unlike digital signatures, which rely on asymmetric key pairs to enable public verification and provide non-repudiation, MACs utilize symmetric keys shared exclusively between the communicating parties, making them suitable for scenarios where key distribution is secure but public key infrastructure is unnecessary.[9]Terminology
In cryptographic literature, the output of a message authentication code (MAC) algorithm is commonly referred to as a "tag," a short fixed-length string appended to the message to enable verification, though it is sometimes interchangeably called a "code" or "MAC value" in older texts.[10][11] The term "tag" has become prevalent in modern standards and papers to emphasize its role as an authenticating appendage, distinct from the MAC scheme itself. Certain MAC constructions, especially those built on hash functions, are described as "keyed hashes," where a secret key is integrated into the hashing process to produce the tag, ensuring the output depends on both the message and the key. In MAC designs relying on universal hashing, the term "universal hash" denotes a family of hash functions with provable collision resistance properties under random key selection, foundational to unconditionally secure authentication schemes.[12][13] The nomenclature has evolved historically from "authentication code," as used in seminal works on universal hashing in the late 1970s and early 1980s, to the more precise "message authentication code" adopted in international standards beginning with the first edition of ISO/IEC 9797 in 1989, which formalized MAC algorithms for data integrity.[13] This shift reflects a focus on message-specific authentication in symmetric cryptography.[14] The primary acronym MAC expands to "message authentication code," denoting a symmetric-key primitive that generates a verifiable tag for a given message. HMAC, standing for "hash-based message authentication code," is a widely adopted variant that applies a cryptographic hash function (such as SHA-256) in a key-dependent manner to produce the tag. In MAC definitions, the term "message" encompasses any arbitrary binary string or data block, including non-textual content like files or network packets, rather than being restricted to human-readable text.[1] This broad scope allows MACs to protect diverse data formats in practical applications.[4]Security Properties
Integrity and Authenticity
A message authentication code (MAC) ensures integrity by generating a tag that depends on the entire message content and a secret key, such that any modification to the message will result in a mismatch during verification, thereby detecting alterations.[4] This protection arises because the MAC algorithm is designed to produce a unique output for each distinct input-message pair under the same key, making unauthorized changes computationally infeasible to conceal without knowledge of the key.[4] MACs also provide authenticity by confirming that the message originates from a party possessing the shared secret key, as only the legitimate sender can generate a valid tag that the receiver will accept upon verification.[4] This prevents impersonation attacks, where an adversary attempts to forge a message-tag pair, since producing a correct tag without the key violates the underlying cryptographic assumptions of the MAC construction.[15] The primary formal security property for MACs is existential unforgeability under chosen-message attack (EUF-CMA), which captures both integrity and authenticity in an adversarial setting.[15] In this model, an adversary is given oracle access to a MAC generation function for adaptively chosen messages and succeeds by producing a valid (message, tag) pair for a new message not previously queried; a secure MAC ensures no efficient adversary achieves this with more than negligible probability.[15] The adversary's resources, including running time, query count, and total message length, bound the insecurity probability, emphasizing resistance even against powerful attackers with partial information.[15] In probabilistic MACs, randomness plays a crucial role through the use of a nonce or initialization vector (IV), which is included in the input to ensure tag uniqueness and prevent replay attacks where an intercepted valid message-tag pair is reused. By requiring unique nonces for each message, these MACs render replays invalid, as verification fails if the nonce does not match the expected fresh value, while maintaining EUF-CMA security under the condition that nonces remain distinct.Key Security Goals
Message authentication codes (MACs) are designed to ensure the integrity and authenticity of messages but do not provide confidentiality, as the underlying message remains unencrypted and visible to any observer, with the MAC tag serving only as a verification mechanism rather than an obfuscation tool.[2] This limitation means that MACs must be combined with separate encryption schemes, such as in authenticated encryption modes, to protect message secrecy. A critical security goal for MACs is resistance to key recovery attacks, where an adversary observing multiple valid message-tag pairs cannot computationally feasibly extract the secret key used to generate the tags.[16] This property is foundational to MAC security models, ensuring that even under adaptive chosen-message attacks, the key remains indistinguishable from random, preventing the generation of arbitrary forgeries beyond the unforgeability bound.[17] To address replay attacks, where an adversary resubmits a previously valid message-tag pair to deceive the verifier, MAC constructions often incorporate nonces or timestamps into the authenticated data, ensuring that each tag is unique to a specific context and cannot be reused effectively.[18] Without such mechanisms, a deterministic MAC would accept replayed tags as valid, compromising the protocol's freshness guarantees.[18] In the context of quantum threats, MACs based on symmetric primitives like hash functions maintain security against Grover's algorithm by doubling the effective key and tag lengths to preserve pre-quantum security levels, as quantum search reduces brute-force complexity quadratically but does not break the core pseudorandomness assumptions. As of 2025, standards such as NIST's guidelines for post-quantum cryptography recommend these adjustments for hash-based MACs like HMAC, without requiring entirely new constructions for most symmetric authentication needs.[19][20] MAC designers face trade-offs in tag length, where shorter tags—while improving efficiency—reduce the security margin against brute-force forgery attempts, potentially allowing an adversary to guess a valid tag with non-negligible probability after sufficiently many trials.[21] For instance, tags below 128 bits may expose systems to practical attacks in high-volume scenarios, necessitating longer outputs to align with desired security levels like 128-bit resistance.[22]Constructions
Hash-Based MACs
Hash-based message authentication codes (MACs) construct authentication tags by leveraging cryptographic hash functions as the primary primitive, typically involving a secret key to ensure security. A simple approach nests the key and message within the hash, such as $ \text{MAC}_k(m) = H(k | m) $, where $ H $ denotes the hash function, $ k $ is the secret key, $ m $ is the message, and $ | $ represents concatenation.[16] However, this naive construction is vulnerable to length-extension attacks, where an adversary, given a valid tag for message $ m $, can compute a tag for an extended message $ m' = m | \text{pad} | \text{extension} $ without knowledge of $ k $, exploiting the iterative structure of Merkle-Damgård hash functions like MD5 or SHA-1.[16] To address these vulnerabilities, the HMAC construction was developed, providing a robust method for keying hash functions. HMAC is defined as
where $ \overline{k} $ is the key $ k $ padded to the block size of $ H $, $ \text{ipad} = 0x36 $ repeated across the block, and $ \text{opad} = 0x5c $ repeated across the block.[23] The inner hash computes $ H((\overline{k} \oplus \text{ipad}) | m) $, which processes the message after key-dependent padding, while the outer hash applies the outer padding to produce the final tag. This double-nesting with XOR-based padding prevents length-extension attacks by ensuring that the intermediate value is not directly reusable for extensions, as the outer key modification disrupts the hash's internal state.[16]
The security of HMAC reduces to the collision resistance of the underlying hash function $ H $. Specifically, if $ H $ is weakly collision-resistant, then HMAC is a secure MAC, with the proof establishing that any forgery against HMAC implies a collision in $ H $, under the assumption that the hash behaves as a pseudorandom function when keyed appropriately.[16] This reduction holds for iterative hash functions, making HMAC provably secure as long as $ H $ resists collisions, even if it is not a full ideal hash.[16]
Hash-based MACs like HMAC offer significant advantages in efficiency, particularly for messages of arbitrary length, as they avoid the block size limitations inherent in cipher-based alternatives and require only a single pass over the data after the inner hash.[23] Their simplicity and compatibility with widely available hash functions further enhance their practicality for software implementations.[16]
Block Cipher-Based MACs
Block cipher-based message authentication codes (MACs) adapt standard encryption modes to produce authentication tags from a secret key and a message, leveraging the underlying block cipher's pseudorandom properties. The most straightforward construction is the cipher block chaining (CBC) MAC, designed for fixed-length messages that are exact multiples of the block size. In CBC-MAC, the message $ m = m_1 \parallel m_2 \parallel \cdots \parallel m_t $, where each $ m_i $ is a block, is processed sequentially: the first block is encrypted directly with the key $ k $, and each subsequent block is XORed with the previous ciphertext before encryption. The final ciphertext block serves as the MAC tag: $ \text{MAC}k(m) = C_t $, where $ C_1 = E_k(m_1) $, $ C_i = E_k(m_i \oplus C{i-1}) $ for $ i = 2, \dots, t $, and $ E $ is the block cipher.[24] This mode initializes with a zero IV and requires no padding, ensuring efficiency for predefined message lengths.[24] However, CBC-MAC is insecure for variable-length messages due to length-extension attacks, where an adversary can forge tags by appending blocks without knowing the key. To address this, extensions like CMAC (also known as OMAC1) introduce subkey generation and conditional padding to handle arbitrary lengths securely. In CMAC, two subkeys $ K_1 $ and $ K_2 $ are derived from the master key $ k $ by encrypting a zero block and applying left shifts with XORed constants based on carry bits, typically using a 128-bit block cipher like AES for derivation. The message is padded to the block length: if the last block is full, it is XORed with $ K_1 $; if partial, it is right-padded with a '1' bit and zeros, then XORed with $ K_2 $. The CBC chaining then proceeds as in basic CBC-MAC, with the final block as the tag, truncated if needed. This construction provably achieves security for messages of any length, with a birthday bound on forgery probability when the tag is shorter than the block size.[25][26] For scenarios demanding higher throughput, such as hardware implementations with multiple processing units, parallelizable variants like PMAC offer an alternative to sequential chaining modes. PMAC partitions the message into blocks and applies the block cipher in parallel to each, using a key-derived tweak sequence (often based on a Gray code progression) to XOR with blocks before encryption; the results are then summed modulo 2 in the finite field of the block size to produce the tag. This allows all but the final (padded) block to be computed concurrently, reducing latency in parallel environments while maintaining provable security. PMAC requires approximately one block-cipher invocation per message block, similar to CBC-MAC, but its parallelism suits high-speed applications.[27] The security of these block cipher-based MACs reduces to the pseudorandom function (PRF) security of the underlying block cipher, ensuring that distinguishing the MAC from a random function requires roughly $ 2^{b/2} $ queries, where $ b $ is the block size (typically 128 bits for modern ciphers like AES). Forgery attacks are thus infeasible under standard models, with the authentication tag length usually matching the block size at 128 bits to achieve full security strength, though truncation to 64 or 96 bits is sometimes used with adjusted bounds.[24][25][27]Specialized Variants
One-Time MACs
One-time message authentication codes (MACs) are cryptographic primitives designed to ensure message integrity and authenticity under the restriction that each key is used to authenticate at most one message, providing security guarantees that hold only for this single-use scenario.[28] This approach contrasts with multi-use MACs by relying on information-theoretic principles rather than computational hardness assumptions, making it suitable for settings where unconditional security is desired.[13] The foundational construction, introduced by Wegman and Carter, employs a strongly universal hash family $ H $ to compute a tag for a message $ m $.[28] Specifically, given a key $ k $ that selects a hash function $ h_k \in H $, and a random one-time pad $ s $ from the tag space $ T $, the tag is generated aswhere $ \oplus $ denotes bitwise XOR, and $ h_k(m) $ maps the message to an element in $ T $.[28] Verification by the receiver, who shares $ k $ and $ s $, recomputes $ h_k(m) \oplus s $ and checks equality with the received tag.[28] The strongly universal property ensures that for any two distinct messages, the hash values are nearly uniformly random and independent, bounding the adversary's forgery probability.[28] In theoretical applications, one-time MACs achieve information-theoretic security, where the probability of an adversary forging a valid tag for a new message is at most $ \epsilon + \frac{1}{|T|} $, with $ \epsilon $ determined by the hash family's universality parameter and $ |T| $ the tag space size, without relying on unproven computational assumptions.[28] This makes them ideal for analyzing secure communication in models with unlimited computational power, such as in quantum-resistant cryptography or protocol proofs requiring perfect security.[13] However, their single-use requirement leads to key exhaustion after one authentication per key pair, rendering them impractical for scenarios involving multiple messages without key generation overhead.[28] Additionally, the tag length must match the hash output size, often necessitating large keys and pads that increase communication costs.[28]