Why but Why?
Ascon is the winner of the NIST Lightweight Cryptography (LWC) competition. I thought a good project would be to implement this scheme in VHDL, both as a refresher on my digital logic design skills and as a way to apply what I learned from a partial implementation of the NIST LWC finalist Romulus-N, which I worked on about two years ago.
Introduction
The NIST Lightweight Cryptography Competition was created to select a new standard for Authenticated Encryption with Associated Data (AEAD) and hashing aimed at constrained environments. A constrained environment is a computing platform that needs a certain level of security but lacks the computational power that heavier cryptographic schemes assume. Hence these schemes build their permutations from a limited set of cheap operations, typically substitutions, additions, and XORs. This is in heavy contrast to, say, the candidates in the NIST Post-Quantum Cryptography Competition, which can at times use DSP units for conventional multiplication, a large mathematical operation that increases critical path delay significantly. Given that lightweight computing systems are typically strapped for resources, it is no surprise that they call for a different approach to implementation.
Specification
An LWC specification typically describes the cipher suite from the high level all the way down to the low-level operations that the cipher implements. It begins with a description of the entire "family" of cryptographic functions that the candidate offers, which typically includes hashing and/or authenticated encryption. Within each family of functions there may also be parameters that can be changed to accommodate different lightweight target environments. The spec also defines the notation that the designers use throughout the document. For hardware designers, the specification is the meat of the document: this is what we read and focus on in order to implement the cipher suite. Typically the specification includes a high-level diagram, like the one seen below, which ties all the low-level operations together and shows how the plaintext/ciphertext is generated.
Figure 1. ASCON Encryption and Decryption Scheme Diagram
The p^b and p^a functions denoted in the diagram are the intermediate and finalization permutations, respectively, used for encryption and decryption. The letters a and b indicate different numbers of rounds of processing, which vary with your choice of the offered encryption schemes, Ascon-128 and Ascon-128a, the primary and secondary recommended configurations, respectively. The permutation is the most fundamental of these operations. There are of course other operations that are important to understand, such as the Initialization and Finalization stages; however, those are typically quite easy to implement compared to the permutation, which is quite complex.
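To make the round counts concrete, here is a minimal sketch in Python (a software model, not VHDL) of the two configurations. The values a = 12, b = 6 for Ascon-128 and a = 12, b = 8 for Ascon-128a, along with the rates, are taken from the Ascon specification. The names AsconParams and total_rounds are hypothetical helpers of mine. The round total assumes one p^a call at initialization, one p^b call after each padded associated data block, one p^b call between plaintext blocks, and one p^a call at finalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsconParams:
    """Parameter set for one AEAD configuration (hypothetical helper)."""
    name: str
    key_bits: int
    rate_bits: int   # bits absorbed per block
    rounds_a: int    # rounds in p^a (initialization and finalization)
    rounds_b: int    # rounds in p^b (between blocks)


ASCON_128 = AsconParams("Ascon-128", key_bits=128, rate_bits=64, rounds_a=12, rounds_b=6)
ASCON_128A = AsconParams("Ascon-128a", key_bits=128, rate_bits=128, rounds_a=12, rounds_b=8)


def total_rounds(params: AsconParams, ad_blocks: int, pt_blocks: int) -> int:
    """Permutation rounds needed for one encryption.

    ad_blocks and pt_blocks are padded block counts. Padding always
    yields at least one plaintext block, so pt_blocks must be >= 1.
    """
    if ad_blocks < 0 or pt_blocks < 1:
        raise ValueError("need ad_blocks >= 0 and pt_blocks >= 1")
    p_b_calls = ad_blocks + (pt_blocks - 1)
    return 2 * params.rounds_a + p_b_calls * params.rounds_b


if __name__ == "__main__":
    for cfg in (ASCON_128, ASCON_128A):
        print(cfg.name, total_rounds(cfg, ad_blocks=1, pt_blocks=2))
```

In a design that computes one round per clock cycle, this total is a rough lower bound on the permutation cycles for a message, which helps when comparing the two configurations.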
Permutation
The most difficult component of the LWC cipher suite to implement is the permutation, denoted p as discussed previously. In subsequent blogs we will look at things in more depth. I will focus on the implementation of the permutation function and on describing the main high-level operations that it performs.
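As a preview of what the permutation involves, below is a small Python reference model of the kind I would use to check a hardware design against. It is a sketch under a few assumptions: the 320-bit state is held as five 64-bit words x0..x4, each round is constant addition followed by the 5-bit substitution layer and the linear diffusion layer, and a permutation with fewer than 12 rounds uses the last constants in the list. The round constants and rotation amounts are those of the Ascon specification as I understand them; the function names are my own and it is not the VHDL implementation.

```python
MASK64 = (1 << 64) - 1

# Round constants for the 12-round permutation; shorter variants use the tail.
ROUND_CONSTANTS = [0xF0, 0xE1, 0xD2, 0xC3, 0xB4, 0xA5,
                   0x96, 0x87, 0x78, 0x69, 0x5A, 0x4B]

# Right-rotation amounts for the linear layer, one pair per state word.
ROTATIONS = [(19, 28), (61, 39), (1, 6), (10, 17), (7, 41)]


def rotr(x, n):
    """Rotate a 64-bit word right by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK64


def substitution_layer(s):
    """Bitsliced 5-bit S-box applied across all 64 bit positions."""
    x = list(s)
    x[0] ^= x[4]
    x[4] ^= x[3]
    x[2] ^= x[1]
    t = [(x[i] ^ MASK64) & x[(i + 1) % 5] for i in range(5)]
    x = [x[i] ^ t[(i + 1) % 5] for i in range(5)]
    x[1] ^= x[0]
    x[0] ^= x[4]
    x[3] ^= x[2]
    x[2] ^= MASK64
    return x


def linear_layer(s):
    """XOR each word with two rotated copies of itself."""
    return [w ^ rotr(w, r0) ^ rotr(w, r1) for w, (r0, r1) in zip(s, ROTATIONS)]


def ascon_round(s, constant):
    """One round: constant addition on x2, substitution, linear diffusion."""
    x = list(s)
    x[2] ^= constant
    return linear_layer(substitution_layer(x))


def ascon_permutation(s, rounds):
    """Apply `rounds` rounds (1..12) to a list of five 64-bit words."""
    if not 1 <= rounds <= 12:
        raise ValueError("rounds must be between 1 and 12")
    x = list(s)
    for constant in ROUND_CONSTANTS[12 - rounds:]:
        x = ascon_round(x, constant)
    return x


def sbox5(v):
    """S-box as a 5-bit lookup (x0 is the most significant bit)."""
    out = substitution_layer([(v >> (4 - i)) & 1 for i in range(5)])
    return sum((out[i] & 1) << (4 - i) for i in range(5))
```

The sbox5 helper is only there for cross-checking: a hardware S-box built as a lookup table or as gates can be compared against it for all 32 inputs.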
Controller and High-Level Wrapper
In all honesty, the most difficulty I have had in implementing LWC systems has been getting the high-level Algorithmic State Machine to work properly. This is the state machine that handles all intermediate control signals in the datapath. In my design method for LWC cryptosystems, I separate the datapath (combinational logic, registers, and the main logic components) from the controller, the state machine that drives the control signals to datapath components such as registers, multiplexers, and so on. There is one exception to this rule: sometimes the permutation function gets its own nested datapath and controller, which keeps the high-level controller simple and avoids adding a plethora of additional states. More will come on this later as we delve into the methods of madness.
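The controller/datapath split can be illustrated with a small cycle-by-cycle model in Python. This is only a behavioral sketch, not my VHDL: the state names, the signal names (load_en, round_en, last_round), and the toy_round function are all hypothetical, and toy_round is a placeholder standing in for a real permutation round. The point is the division of labor: the datapath holds registers and does the arithmetic, while the controller only looks at status signals and decides which enables to assert.

```python
from enum import Enum, auto

MASK64 = (1 << 64) - 1


def toy_round(x, rnd):
    """Placeholder round function (NOT a real cipher round)."""
    return ((x << 1 | x >> 63) & MASK64) ^ rnd


class State(Enum):
    IDLE = auto()
    LOAD = auto()
    PERMUTE = auto()
    DONE = auto()


class Datapath:
    """Registers and logic only; it makes no decisions of its own."""

    def __init__(self, rounds):
        self.rounds = rounds
        self.state_reg = 0
        self.round_cnt = 0

    @property
    def last_round(self):
        # Status signal back to the controller.
        return self.round_cnt == self.rounds - 1

    def clock(self, data_in, load_en, round_en):
        # Register updates on the clock edge, gated by control signals.
        if load_en:
            self.state_reg = data_in & MASK64
            self.round_cnt = 0
        elif round_en:
            self.state_reg = toy_round(self.state_reg, self.round_cnt)
            self.round_cnt += 1


class Controller:
    """Moore-style state machine: outputs depend only on the state."""

    def __init__(self):
        self.state = State.IDLE

    def outputs(self):
        return {"load_en": self.state is State.LOAD,
                "round_en": self.state is State.PERMUTE,
                "done": self.state is State.DONE}

    def next_state(self, start, last_round):
        if self.state is State.IDLE:
            return State.LOAD if start else State.IDLE
        if self.state is State.LOAD:
            return State.PERMUTE
        if self.state is State.PERMUTE:
            return State.DONE if last_round else State.PERMUTE
        return State.IDLE


def run(data_in, rounds):
    """Simulate until done; returns (result, clock cycles used)."""
    ctrl, dp = Controller(), Datapath(rounds)
    cycles, start = 0, True
    while not ctrl.outputs()["done"]:
        out = ctrl.outputs()
        nxt = ctrl.next_state(start, dp.last_round)
        dp.clock(data_in, out["load_en"], out["round_en"])
        ctrl.state = nxt          # both blocks update on the same edge
        start = False
        cycles += 1
    return dp.state_reg, cycles
```

With this structure, a nested permutation block like the one above needs only a start input and a done output from the high-level controller's point of view, which is what keeps the top-level state count down.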