Yet another interesting PDF obfuscation
Release date: 09-01-2010
Some time ago I was researching techniques for hiding the JavaScript in malicious PDF files from antivirus engines (see this one). I came to the conclusion that unescape() is a very important keyword that AV products key on. I was able to reduce the number of unescape() calls to one, and this was enough to fool all of the AV products. I was unable to eliminate the last unescape() though, because the eval() function did not work for me (although I saw it being used in other PDF JavaScript exploits), and I was unaware of other methods to build the function name dynamically. But it turns out there are methods to do it; we can learn them from the black hat community.
Below you will see a sample of JavaScript carved out of a PDF file, probably generated by the LuckySploit exploit kit.

The first obvious cleanup step is to add a newline after each semicolon, which renders the code as something like the below.
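The semicolon step can be automated with a few lines of JavaScript. What follows is only a minimal sketch, not the tool used for this analysis: the function name splitStatements is made up for illustration, and the sketch assumes the carved script uses plain quoted strings. It skips semicolons inside string literals but does not treat for(;;) headers or comments specially.

```javascript
// Hypothetical helper: insert a newline after every statement-ending semicolon.
// Semicolons inside '...' or "..." string literals are left alone.
function splitStatements(source) {
  let out = "";
  let quote = null; // current string delimiter, or null when outside a string
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    out += ch;
    if (quote) {
      if (ch === "\\") {
        // Copy the escaped character verbatim so \" does not end the string.
        i++;
        if (i < source.length) out += source[i];
      } else if (ch === quote) {
        quote = null;
      }
    } else if (ch === '"' || ch === "'") {
      quote = ch;
    } else if (ch === ";") {
      out += "\n";
    }
  }
  return out;
}
```

For example, splitStatements('var a="x;y";var b=1;') puts the two var statements on separate lines and keeps the string "x;y" intact.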
As we can see, the first part of the code is a sort of "dictionary" where strings and numbers are assigned to variables. These variables (often after some concatenations) are then used in the proper code (you see the "unes" and "cape"?). Let's use a script to automatically substitute those variables so the second part of the code is more readable.
Now we see much more. At the end of the code we can see that the code is trying to exploit a vulnerability in the util.printf() function (CVE-2008-2992), but the code is still somewhat obfuscated and we don't see how the basic functions are being called. To clean this up, we have to remove the obfuscation where the a?b:c operator is used. On the screen capture, you can see a lot of them, and they are easy to remove, because constants are used to evaluate the first expression (something like 1<2?a:b). Let's clean them up and we will see the following:
Now we can clearly see that even if the eval() function was not available for some reason, we can still call basic functions (like our lovely unescape()) by using the following construction:
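The substitution script itself is not shown here, so the following is only a sketch of how such a script might look. It assumes the code has already been split to one statement per line and that dictionary entries have the form var name = "string"; or var name = 123;. The name substituteDictionary is hypothetical. Entries whose right-hand side is not a plain literal are left in place.

```javascript
// Hypothetical helper: inline "dictionary" variables into the rest of the code.
function substituteDictionary(source) {
  // Matches lines such as:  var a = "unes";   or   var n = 3;
  const entry =
    /^\s*var\s+([A-Za-z_$][\w$]*)\s*=\s*("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|\d+)\s*;\s*$/;
  const table = new Map();
  const rest = [];
  for (const line of source.split("\n")) {
    const m = entry.exec(line);
    if (m) table.set(m[1], m[2]);
    else rest.push(line);
  }
  // One pass over string literals and identifiers: literals are matched as
  // whole tokens so their contents are never rewritten.
  const token = /"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|[A-Za-z_$][\w$]*/g;
  return rest.join("\n").replace(token, (t, offset, whole) => {
    if (offset > 0 && whole[offset - 1] === ".") return t; // property name
    return table.has(t) ? table.get(t) : t;
  });
}
```

Given the four lines var a = "unes"; var b = "cape"; var n = 3; and x = app[a + b](s, n); the sketch returns x = app["unes" + "cape"](s, 3); which already makes the hidden function name visible.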
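Because the conditions are numeric constants, this cleanup can also be scripted. The sketch below is an assumption about how one might do it, not the method actually used for the screenshots: foldConstantTernaries is a made-up name, and it only handles conditions that compare two integer literals, with branches that are a double-quoted string, a number, or a plain identifier. Anything more complex is left untouched, and a condition that is really part of a longer arithmetic expression would be misread, so the output should be checked by eye.

```javascript
// Hypothetical helper: replace  1<2?a:b  style expressions with the branch
// that the constant condition always selects.
function foldConstantTernaries(source) {
  // A branch: a double-quoted string, or a run of identifier/number characters.
  const operand = String.raw`"(?:[^"\\]|\\.)*"|[\w$.]+`;
  const pattern = new RegExp(
    String.raw`(?<![\w$.])(\d+)\s*(<=|>=|==|!=|<|>)\s*(\d+)\s*\?\s*(${operand})\s*:\s*(${operand})(?![\w$.(\[])`,
    "g"
  );
  const compare = {
    "<": (a, b) => a < b,
    ">": (a, b) => a > b,
    "<=": (a, b) => a <= b,
    ">=": (a, b) => a >= b,
    "==": (a, b) => a === b,
    "!=": (a, b) => a !== b,
  };
  let previous;
  do {
    // Repeat until nothing changes, so nested constant ternaries collapse too.
    previous = source;
    source = source.replace(pattern, (_, left, op, right, thenPart, elsePart) =>
      compare[op](Number(left), Number(right)) ? thenPart : elsePart
    );
  } while (source !== previous);
  return source;
}
```

Applied to sc = app[1<2?"doc":"x"][3>5?"y":"unescape"](s); the sketch yields sc = app["doc"]["unescape"](s);.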
sc = app["doc"]["unescape"](shellcode);
Of course, we can obfuscate the "unescape" string in whatever way we want, for example just like in the above code. This could be very helpful for AV evasion.
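For analysts who want to convince themselves that the bracket form really is the same call, it can be tried outside a PDF reader. The snippet below is an illustration only: the app object here is a stand-in built for the test (the real one exists only inside the reader's JavaScript engine), and the "unes" + "cape" split simply mirrors the strings seen in the sample.

```javascript
// Stand-in for the reader's `app` object (assumption: built here for testing).
const app = { doc: { unescape: unescape } };

// Ordinary dotted call.
const direct = app.doc.unescape("%41%42");

// Same call through bracket notation, with the name assembled at run time.
// A scanner that only searches for the literal text "unescape(" will not
// find this call, which is why deobfuscation has to come before matching.
const viaBrackets = app["doc"]["unes" + "cape"]("%41%42");
```

Both variables end up holding the same decoded string, which shows that a signature has to consider the property name after concatenation rather than the raw source text.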