Since headless Chrome was released in 2017, I have been creating fingerprinting tests to distinguish real Chrome browsers from headless (and often automated) ones. I published two blog posts on this topic, in August 2017 and in January 2018. While I tend to play this game on the defender side, my two blog posts also started a cat-and-mouse game in which companies and developers try to bypass the different techniques by overriding their crawler's fingerprint.
In the spirit of this cat-and-mouse game, I have recently come up with a new detection technique that renders useless all the countermeasures presented in these blog posts (post 1, post 2) and these libraries (library 1, library 2). I may be overconfident about the false positive and false negative rates of my test, but I challenge you to prove me wrong by testing it on your most advanced crawlers.
Under the hood, I only verify whether browsers that claim to be Chromium-based are what they claim to be. Thus, if your headless Chrome pretends to be Safari, I won't catch it with my technique, but because of the differences between a real Safari and a Chromium-based browser, it could easily be caught using many other fingerprinting techniques.
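To make the scope of the test concrete, here is a minimal sketch of the scoping step only, not of the detection technique itself. The function names `claimsChromium` and `scopeOf` are hypothetical, and the sketch assumes that a browser's claim is read from its user-agent string and that Chromium-based browsers advertise a `Chrome/`, `HeadlessChrome/` or `Chromium/` version token.

```typescript
// Hypothetical helper (not from the actual test): decides whether a
// user-agent string claims to be a Chromium-based browser.
// Assumption: such browsers advertise a "Chrome/<version>",
// "HeadlessChrome/<version>" or "Chromium/<version>" token.
function claimsChromium(userAgent: string): boolean {
  return /\b(?:Headless)?Chrom(?:e|ium)\/\d+/.test(userAgent);
}

// Hypothetical scoping step: only browsers that claim to be Chromium-based
// would be handed to the consistency test; everything else is out of scope.
type Scope = "verify-claim" | "out-of-scope";

function scopeOf(userAgent: string): Scope {
  return claimsChromium(userAgent) ? "verify-claim" : "out-of-scope";
}
```

With this sketch, a headless Chrome that keeps a Chrome-like user agent falls into `"verify-claim"`, while one that spoofs a Safari user agent falls into `"out-of-scope"` and would have to be caught by other fingerprinting checks.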
Ph.D. opportunity: if you are interested in doing a Ph.D. on browser fingerprinting, privacy, bot detection and/or web security, feel free to contact me by email.