Before we start, you should know that WAF evasion is a hard topic to describe properly. WAFs are very diverse and research into them is sparse. The main reason is that a WAF can be configured just like any other networking component, so the configuration can differ from target to target, which is a real challenge. We will first explore how WAFs work so we can design a proper attack technique. You need to know your enemy before you can fight it.
WAFs usually consist of several stages, but not all of them have the same stages. Some WAFs don't have a normalization stage, for example, which makes them vulnerable to simple encodings of the payload such as base64 or hex. Less advanced WAFs might even lack the pre-processor and only perform input validation.
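To make the effect of a missing stage concrete, here is a minimal sketch of a toy inspection pipeline. Everything in it is an assumption for illustration: the `inspect` function, the `use_normalization` switch, and the single `<script` check are hypothetical, and it uses URL encoding as the example encoding because that is the easiest one to show in a few lines.

```python
from urllib.parse import unquote


def inspect(raw, use_normalization=True):
    """Toy WAF pipeline. Returns True if the value is allowed through."""
    value = raw.strip()  # stand-in for the pre-processor stage
    if use_normalization:
        value = unquote(value)  # stand-in for the normalization stage
    # Stand-in for input validation: one hypothetical check.
    return "<script" not in value.lower()


payload = "%3Cscript%3Ealert(1)%3C/script%3E"
print(inspect(payload))                           # blocked when normalization runs
print(inspect(payload, use_normalization=False))  # allowed when the stage is missing
```

The same encoded payload is caught or missed depending only on whether the normalization stage exists, which is why it pays to find out which stages a target WAF actually has.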
The pre-processor stage puts all data into the same format and tries to understand what we are dealing with. A WAF sees several kinds of traffic passing through its filters: HTTP or HTTPS, GET or POST, and bodies made of JSON data or plain parameters. All of these requests need to be analysed to find where the user-submitted data is, so that it can be checked later on. The pre-processor will also try to detect outliers such as malformed URLs and discard them before they reach input validation. All of this happens because input validation is an intensive process that takes a long time, and we want to make sure we only validate what we need to.
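A pre-processor along these lines could be sketched as follows. This is a simplified illustration and not how any particular WAF does it: the function name `extract_user_input`, the tuple layout, and the "path must start with /" malformed-URL check are all assumptions, and `url` is assumed to be the request target (path plus query string). Only flat JSON objects are handled.

```python
import json
from urllib.parse import urlsplit, parse_qsl


def extract_user_input(method, url, content_type="", body=""):
    """Return a list of (location, name, value) tuples, or None to discard the request."""
    parts = urlsplit(url)
    # Hypothetical outlier check: a request target must start with "/".
    if not parts.path.startswith("/"):
        return None

    # Query-string parameters are present for GET and POST alike.
    fields = [("query", k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)]

    if method.upper() == "POST":
        if content_type.startswith("application/json"):
            try:
                data = json.loads(body)
            except json.JSONDecodeError:
                return None  # malformed JSON never reaches input validation
            if isinstance(data, dict):
                fields += [("json", str(k), str(v)) for k, v in data.items()]
        elif content_type.startswith("application/x-www-form-urlencoded"):
            fields += [("form", k, v) for k, v in parse_qsl(body, keep_blank_values=True)]
    return fields


print(extract_user_input("GET", "/search?q=test"))
print(extract_user_input("POST", "/login", "application/json", '{"user": "bob"}'))
```

Whatever the pre-processor fails to extract never reaches input validation, so a parser that disagrees with the backend about where the data lives is a gap worth looking for.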
Normalization tries to decode any UTF-encoded or otherwise encoded values back to plain text. Researchers found that they could trick the input validation by encoding their attack vectors, in base64 for example. WAF builders caught on and started implementing normalization stages to combat this, but that doesn't mean there are no vulnerable WAFs left out there. Even though normalization is a known technique, some WAF developers choose not to implement it to keep their costs down, and it will take a long time before all the old WAFs out there are updated to include current normalization techniques.
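As a rough sketch of what a normalization stage might do, the snippet below decodes URL encoding and HTML entities repeatedly until the value stops changing, and separately attempts a base64 decode. The function names `normalize` and `try_base64`, the `max_rounds` limit, and the choice of encodings are assumptions made for illustration; real WAFs handle many more encodings and edge cases.

```python
import base64
import binascii
import html
from urllib.parse import unquote_plus


def normalize(value, max_rounds=5):
    """Decode URL encoding and HTML entities until the value stops changing."""
    for _ in range(max_rounds):  # bounded, so nested encodings cannot loop forever
        decoded = html.unescape(unquote_plus(value))
        if decoded == value:
            break
        value = decoded
    return value.lower()  # lowercase so later checks are case-insensitive


def try_base64(value):
    """Return the base64-decoded text, or None if the value is not valid base64 text."""
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None


print(normalize("%253Cscript%253E"))  # double URL-encoded input
print(try_base64("PHNjcmlwdD4="))     # base64-encoded input
```

Decoding in a loop matters because a single pass turns `%253C` into `%3C`, which still hides the `<` from the validator. A WAF that only decodes once can be tested with double-encoded payloads.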
Input validation sounds straightforward since it simply validates our input, but validating input is never simple. For example, we might ban the word "javascript" completely, but then anyone who wants to write a text about JavaScript is going to have a bad time.
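The "javascript" problem is easy to reproduce. The sketch below compares a hypothetical word blocklist with a slightly more targeted check that only looks for the `javascript:` URI scheme. The names `BLOCKED_WORDS`, `naive_validate`, and `scheme_validate` are made up for this example.

```python
import re

BLOCKED_WORDS = ["javascript"]  # hypothetical blocklist

# Hypothetical narrower check: only the "javascript:" URI scheme.
SCHEME_PATTERN = re.compile(r"javascript\s*:", re.IGNORECASE)


def naive_validate(value):
    """Return True if the value contains none of the blocked words."""
    lowered = value.lower()
    return not any(word in lowered for word in BLOCKED_WORDS)


def scheme_validate(value):
    """Return True unless the value contains a javascript: scheme."""
    return SCHEME_PATTERN.search(value) is None


comment = "I am learning JavaScript this year"
attack = '<a href="javascript:alert(1)">click</a>'

print(naive_validate(comment), scheme_validate(comment))  # innocent comment
print(naive_validate(attack), scheme_validate(attack))    # actual attack vector
```

The narrower check lets the innocent comment through and still stops the attack, but it is far from perfect: a sentence containing "JavaScript: the basics" would still be rejected. Every rule is a trade-off between false positives and missed attacks.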
Until recently, input validators were pretty dumb. They would only check what they were told to check, and while humans have become pretty good at setting up the rules that the validators follow, those rules miss a lot of context. Every company is different and may need a slightly different approach. More recently, AI technologies have been on the rise, and these look for patterns instead. Those are the cutting-edge WAFs, and that's where we can get some cutting-edge research done if that interests us.
Input validation can be done by checking every request and response against either a ruleset or a set of signatures. A ruleset is simply a config file containing our rules. For signature-based input validation, we check every request and response against all available signatures. As soon as a new attack is detected, a new signature is sent out to all WAFs that are subscribed to the service. More about this later.
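The difference between the two approaches can be sketched like this. The ruleset keys, the `Signature` fields, the signature IDs, and the regular expressions are all invented for illustration and do not come from any real WAF product or signature feed.

```python
import re
from dataclasses import dataclass


@dataclass
class Signature:
    sig_id: str
    pattern: str
    description: str


# Hypothetical ruleset, as it might look after loading a config file.
RULESET = {"max_param_length": 256, "blocked_methods": ["TRACE"]}

# Hypothetical signatures, as they might arrive from a subscription service.
SIGNATURES = [
    Signature("SIG-001", r"<script\b", "inline script tag"),
    Signature("SIG-002", r"\bunion\s+select\b", "SQL UNION injection"),
]


def check_rules(method, value, ruleset=RULESET):
    """Return the names of the locally configured rules the request violates."""
    violations = []
    if method.upper() in ruleset["blocked_methods"]:
        violations.append("blocked_method")
    if len(value) > ruleset["max_param_length"]:
        violations.append("param_too_long")
    return violations


def match_signatures(value, signatures=SIGNATURES):
    """Return the IDs of all signatures whose pattern occurs in the value."""
    return [s.sig_id for s in signatures if re.search(s.pattern, value, re.IGNORECASE)]


print(check_rules("TRACE", "x"))
print(match_signatures("1 UNION SELECT password FROM users"))
```

In this sketch, receiving a new signature is just an append to `SIGNATURES`, for example `SIGNATURES.append(Signature("SIG-003", r"\.\./", "path traversal"))`. Until that update arrives, the WAF does not catch the new attack, which gives fresh payloads a window of opportunity.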
A WAF can be deployed in several modes. Many hunters do not know this yet, but the way a WAF is deployed will also slightly alter our attack technique (more on this later).
The big advantage of this method is that we can easily create a secure environment without having all the knowledge about WAFs in-house. We also don't need to worry about hosting another server since the WAF operates in the cloud.