Inside Technique : Hiding HTML/SCRIPT... I think it IS possible! : Covering the Tracks

You may be thinking to yourself, "This is all well and good: it's difficult to make a proper request to obtain the real code, and it looks a lot like the fake code. But the browser has already done the work of downloading the real code, so I don't need to go back out to the website." Along the same lines is the famous Holophrastic quote (regarding image protection): "if I can see the image, it's already mine." This is true, especially from a practical standpoint. Let's pretend for a second I did come up with a way to hide JavaScript source with no chance of cracking it. So what? You can still see what the code does, you can always obtain the document's current HTML and work your way back to equivalent source code, and images can always be screen captured. However, this isn't real life, it's a game. And I control the rules! Rule #1: the objective is not to recreate the hidden code's functionality. It is to find the original code in its complete unencoded form. Rule #2: it is my unquestionable prerogative to write any code in such a way that it will execute properly only in the environment of my choosing. And I choose Internet Explorer 6. Obviously my rules cannot and do not prevent challengers from trying with another user agent, but I'm 99.99% sure that it will not properly execute in any other user agent without alteration*.

* Actually the code executes perfectly fine on all Win32 versions IE5.5 and above, you just have to change IE's user agent string to contain "MSIE 6" so that the right code is downloaded. I could have even made it work with IE5.0 but I only wanted to run extensive tests with 1 browser version so I made the arbitrary decision of IE6 only :)

There are 2 primary reasons I chose MSIE6 as the only browser with the ability to run the hidden code:

  1. I know what to expect. I've done a lot of work with Internet Explorer, and I know its Scripting Object Model very well. I was confident that when limited to Internet Explorer I could erase all traces of the real code and replace them with the fake code. As it turns out I was wrong: I overlooked a <script> element that contained the JScript.Encode version of the real code. BachusII discovered this, which ultimately led him to the solution. At the beginning of the development of this game I had attempted to make it work with Mozilla (I tested with Firefox 0.8), but could not make it uncache the real script once it was done executing. I knew then that I had to make it Microsoft specific.
  2. Internet Explorer is a Microsoft Owned, closed source browser. Custom debug build versions of Internet Explorer and JScript engine don't (or at least shouldn't) exist outside of Microsoft*. With Mozilla for instance, a log of every JavaScript statement parsed by the engine could be made. Maybe the same thing could be done with the JScript engine, but that information is not readily available. In fact there doesn't seem to be any public documentation on the inner workings of JScript, unlike MSHTML. This mysterious void works heavily in my favor.

    * No debug versions of IE are available, but an IE debugger is. I have a defense against that too though which I'll cover later on.

There are a couple of ways I attempt to thwart other browsers from downloading, thus caching the real code. The first level of defense is also the easiest to bypass: Server Check of the HTTP User-Agent variable. All popular browsers have a way of changing the User Agent that isn't too difficult. It's even easier to set when working with a programmatic HTTP component/library. Nevertheless, every bit helps and a check for "MSIE 6" is in place.
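A check of this kind might look roughly like the sketch below. The article does not show the actual server code, so the function name `looksLikeIE6` is hypothetical; the only detail taken from the text is the test for the substring "MSIE 6". It is written as plain JavaScript to match the client code in the rest of the article.

```javascript
// Hypothetical helper (not the author's code): returns true when the
// User-Agent header value contains "MSIE 6".
// A missing or non-string header is treated as a failed check.
function looksLikeIE6(userAgent) {
    if (typeof userAgent !== 'string') {
        return false;
    }
    return userAgent.indexOf('MSIE 6') !== -1;
}
```

A spoofed header passes this test just as easily as a real one, which is why it is only the first level of defense.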

The second check happens client side in hide.asp. I don't know if anyone thought to do this, but I was afraid that someone would attempt to take a Mozilla browser and configure it to appear as though it were Internet Explorer. Then they could bypass all my IE hacks and have a cached version of the real code. Aside from the user agent string and other client-side identification properties that can be spoofed by Mozilla, Internet Explorer specific objects like document.all and swapNode() could also be easily added to Mozilla's object model. I needed something that would be very complicated to replicate, something that would be dynamic. I ended up querying the highly proprietary document.namespaces collection for a namespace with a randomly generated urn property. It's not foolproof, but it would take a lot of work to make a non Internet Explorer browser evaluate this statement as true without throwing exceptions.

if(document.namespaces[0].urn=="urn:pRandomNumber")

So that's the deal with the custom XML namespace. I just didn't want the real code to have a chance to be downloaded by a non Internet Explorer browser.
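As a sketch only, the namespace test could be wrapped so that a browser without document.namespaces simply fails the check rather than throwing. The function name `hasExpectedNamespace` and the `expectedUrn` parameter are assumptions made for illustration; the article only shows the bare `if` statement above.

```javascript
// Hypothetical wrapper around the namespace check shown above.
// doc         - the document object to inspect
// expectedUrn - stands in for the randomly generated urn value
function hasExpectedNamespace(doc, expectedUrn) {
    try {
        // Throws if doc.namespaces is missing or empty, which is the
        // expected outcome in a non Internet Explorer browser.
        return doc.namespaces[0].urn == expectedUrn;
    } catch (e) {
        return false;
    }
}
```

In the challenge itself this would presumably be called as `hasExpectedNamespace(document, "urn:pRandomNumber")` before the request to hidden.asp is made.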

It could be argued that using Windows Script Encoded JScript also adds a level (in fact, the ultimate level) of defense against non IE browsers running the real code. However, this does me no good. If a non Internet Explorer browser even has the chance to evaluate the real code, I've already lost. The browser will cache it, throw some exceptions over the JScript.Encode encoded script and leave it in memory for easy viewing. I designed the client-side script of hide.asp so that only Internet Explorer has the chance to make a proper request to hidden.asp, thus downloading the real code.

So now that I've gained at least a small amount of assurance that the real code will only run in Internet Explorer, how do I make IE cover my tracks? Let's take a closer look at the real code.

var x = document.body.appendChild(document.createElement('div'));

x.addBehavior('#default#download');

x.startDownload(wtf.src,function(){});

This is a simple use of the Download Behavior, one of the default behaviors introduced with Internet Explorer 5. Out of all the default behaviors, I never could really figure out why Microsoft felt they had to include it. Its functionality is duplicated, even superseded, by common techniques like dynamically changing the src attribute of IFRAMEs and SCRIPTs. And the entire purpose of the MSXML XMLHTTP component is to provide an easy to use yet powerful object to make HTTP download requests. XMLHTTP allows you to download just about any URL, whereas the download behavior can only be used to download within the same domain.

Having said that I recently discovered some unique aspects of the download behavior that are especially useful for Script hiding. I was going to try and explain these features but as I tried to describe them I realized it was really long and didn't make much sense. So instead I'll use a code Illustration! Play along at home if you have IIS. I'll take you through a partial evolution of the challenge's "real code". Along the way you'll discover how to force the Temporary Internet Files cache to behave! I actually tried dozens of other ideas, but none of them produced anything significantly different from the following examples we're gonna run. On with the code!

Follow these easy steps!

  • Create a new IIS Application. Yes it has to be an application, not just a virtual directory! We're partially recreating the challenge and need to make use of our own global.asa and Session object. I don't know about you but I'm calling mine "cachetest" You don't have to call yours "cachetest" but I'm not responsible for changing the URLs I'll be providing. That's your job.

  • All these files should be placed right in the root:

  • This is cache test #1, follow these easy steps

    1. Fire up Internet Explorer and navigate to about:blank.
    2. Clear your Temporary Internet Files.
    3. Ensure the setting "Check for newer versions of stored pages:" (Tools->Internet Options->"Settings" Button located in the "Temporary Internet Files" grouping) is set to "Always".
    4. Rename "SimpleHidden1.asp" to "SimpleHidden.asp"
    5. Now navigate to http://localhost/cachetest/simplehide.asp

    At this point your browser should be locked. Sorry, but I warned you ahead of time. You did read through the code and comments before blindly executing it, didn't you? :)

    Here is what caused this behavior:

    var s = wtf.src;
    wtf.src = '';
    wtf.src = s;

    So why does it lock up? The src attribute on the script element is blanked, then set back to its original value. This makes IE request the script again, resulting in fake code being sent, right? Wrong. It turns out that if you change the src property to something and then set it back to its original value within the same synchronous process, Internet Explorer will never attempt to re-download the script. It merely reexecutes what was already there. In this case it means an infinite loop is created, hence the lockup. Now look in your Temporary Internet Files. If you take a look in hidden.asp?pass=guid you'll see that the real code is cached, not the fake code.

    This is cache test #2, follow these steps

    1. Close all Internet Explorer instances that refer to simplehide.asp or simplehidden.asp
    2. Navigate to about:blank.
    3. Clear your Temporary Internet Files.
    4. Rename "SimpleHidden.asp" to "SimpleHidden1.asp"
    5. Rename "SimpleHidden2.asp" to "SimpleHidden.asp"
    6. Now navigate to http://localhost/cachetest/simplehide.asp

    The browser didn't lock up this time. Why? In Internet Explorer, when something doesn't work as expected, a setTimeout is often needed to break up a synchronous process. This break gives Internet Explorer a chance to do some things that it can only do when script is not actively running.

    var s = wtf.src;
    wtf.src = '';
    setTimeout(function(){wtf.src=s;},0);

    In this case it gives IE a chance to realize that the src attribute has been set to ''. So now when the src attribute is set back to its original value, IE doesn't just take the shortcut approach of reexecuting the already resident script. Instead it obeys the caching rules. Since the current rule is "Always", a new version is obtained from the server, overwriting the in-memory script of wtf and replacing the contents of the cached file residing in "Temporary Internet Files".
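The ordering can be made visible with a small sketch. The helper name `deferSrcReset` and the optional `schedule` parameter are hypothetical; the parameter exists only so the two phases (blank now, restore on a later turn) can be observed in isolation. Left out, the helper falls back to setTimeout exactly as in the snippet above.

```javascript
// Hypothetical helper: blank the src immediately, restore it later.
// scriptEl - any object with a writable src property (e.g. wtf)
// schedule - optional stand-in for setTimeout, taking (callback, delay)
function deferSrcReset(scriptEl, schedule) {
    var original = scriptEl.src;
    var run = schedule || function (fn, ms) { return setTimeout(fn, ms); };

    // Phase 1: happens synchronously, while script is still running.
    scriptEl.src = '';

    // Phase 2: happens only after the current script has finished,
    // which is the point where IE gets to notice the blanked src.
    run(function () {
        scriptEl.src = original;
    }, 0);
}
```

Called as `deferSrcReset(wtf)`, this is equivalent to the three lines from SimpleHidden2.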

    Unfortunately this only works with cache settings set to "Always". While not uncommon, it isn't the default setting. setTimeout is also prone to another weakness: to see it, hold down the F5 key to force a lot of refreshes in a small period of time. Look in your Temporary Internet Files; there should be many hidden.asp's. For me only about 1 out of every 3 was the fake code. Because setTimeout is asynchronous, its callback isn't guaranteed to execute and can be preempted by a refresh. In order for this to work for our purposes we need a new version downloaded every time, no matter what.
    Meet cache test #3.

    1. Close all Internet Explorer instances that refer to simplehide.asp or simplehidden.asp
    2. Navigate to about:blank.
    3. Clear your Temporary Internet Files.
    4. Rename "SimpleHidden.asp" to "SimpleHidden2.asp"
    5. Rename "SimpleHidden3.asp" to "SimpleHidden.asp"
    6. Now navigate to http://localhost/cachetest/simplehide.asp

    No locking up this time either; that's a good sign. Take a look in your Temporary Internet Files. What's this? There is no cached file. Check out the code:

    var x = document.body.appendChild(document.createElement('div'));
    x.addBehavior('#default#download');
    x.startDownload(wtf.src,function(){});

    I can't explain it either... Well, I can explain that instead of setting the src attribute of the script element, the download behavior forces a re-download. There's no no-cache header set, no expires header set. But for some reason the cached file is not there. Another oddity: open another instance of Internet Explorer and point it to http://localhost/cachetest/simplehide.asp. Now look back at your Temporary Internet Files; an entry for hidden.asp does exist now. Open it up and see that in fact the fake code has been cached. Go ahead and change your cache settings: Automatically, Never, it doesn't matter. The behavior is exactly the same. Interestingly, no matter how many instances you have open, 1 and only 1 cached file exists. It's very consistent, but I have no idea why it works the way it does.

    Another thing that strikes me as odd, most times when a callback function is specified instead of a return value it indicates an asynchronous process. setTimeout is a perfect example of this. However unlike simplehidden2, holding down F5 does not preempt the callback function from running (or at the very least, the download process).
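The download-behavior snippet from cache test #3 can be wrapped in a small helper that refuses to run where the behavior isn't available. This is a sketch under assumptions: the name `forceRedownload` and its return value are hypothetical, and only addBehavior and startDownload come from the code above.

```javascript
// Hypothetical wrapper around the download behavior.
// Returns true if a download was started, false if the element has no
// addBehavior method (i.e. the browser is not Internet Explorer).
function forceRedownload(doc, url, onDone) {
    var holder = doc.createElement('div');

    // Compare against 'undefined' rather than 'function': IE reports
    // host methods such as addBehavior as typeof 'object'.
    if (typeof holder.addBehavior === 'undefined') {
        return false;
    }

    doc.body.appendChild(holder);
    holder.addBehavior('#default#download');
    holder.startDownload(url, onDone || function () {});
    return true;
}
```

With this in place, the real code's three lines reduce to `forceRedownload(document, wtf.src)`.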

    That does it for the code illustration. We now have a neat little trick for guaranteeing that the cache won't give away the real code and may even help lend credibility to the code deception (If 2 are open and the fake code is cached).

    Next we have to take care of in-memory variables that may give the real code away. DOM viewers, debuggers, Bookmarklets, even statements typed directly into the address bar can all peer into document properties.

    By resetting the src property of the script element "wtf", the script text is also reset. Though in hindsight this would have been more sneaky:

    var x = document.body.appendChild(document.createElement('div'));
    x.addBehavior('#default#download');
    x.startDownload(wtf.src,function(s){wtf.text = s;});

    Instead of blanking the text/innerHTML property of the script element, the fake code takes its place, leaving the attacker thinking that the fake code was there all along rather than being suspicious that the script element's code has been erased (this actually happened). Of course this means another script element will be created. As it is, I overlooked one more script element that was created by the encoding process:

    document.write('<script language=jscript.encode>#@ ...

    This creates a script element that contains the JSCRIPT.ENCODE version of the real script. This could have been taken care of with a simple

    document.scripts[3].removeNode(true);
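A less position-dependent cleanup is sketched below. Instead of relying on the index 3, it removes every script element whose language is jscript.encode. The function name `removeEncodedScripts` is hypothetical, and the sketch assumes the language attribute is readable as a `language` property on each script element.

```javascript
// Hypothetical cleanup helper: removes all JScript.Encode script
// elements from the document and returns how many were removed.
function removeEncodedScripts(doc) {
    var removed = 0;

    // Walk backwards because removing a node shrinks the collection.
    for (var i = doc.scripts.length - 1; i >= 0; i--) {
        var el = doc.scripts[i];
        var lang = String(el.language || '').toLowerCase();
        if (lang === 'jscript.encode') {
            el.removeNode(true);   // IE-specific, as in the line above
            removed++;
        }
    }
    return removed;
}
```

Calling `removeEncodedScripts(document)` at the end of the real code would cover the script element that was overlooked, wherever it ends up in document.scripts.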
