On Script Loaders

Two recent projects have come out that attempt to address the “dynamic script loader” use case: HeadJS and ControlJS. I’m the creator of LABjs, a general-purpose, performance-oriented dynamic script loader that’s been around for about a year and a half now and is in use on several major sites, including Twitter, Vimeo, and Zappos. So, many people ask my opinion when new entries into this space arise.

I’ve been hesitant to rush to negative judgement on either one, because I believe it’s important to encourage experimentation and progress in this area, for the sake of the greater web. But I do think it’s important to shed a little bit of light onto both projects and explain some concerns I have with their approach. I respect the authors of both libraries, so I hope they will take my ramblings here as constructive criticism rather than an attack.

ControlJS: backstory

ControlJS comes from the immensely experienced and talented Steve Souders, who leads performance efforts for Google. For readers who aren’t aware, when I was first building LABjs back in 2009, Steve stepped in and offered some very helpful and timely advice and collaboration, and I credit much of LABjs’ success since to Steve’s wisdom and experience.

Recently, though, Steve has shifted his focus from what it was back then. His aim is now to delay all script loading until after the rest of the page content has loaded, whereas LABjs’ goal was simply to allow easy parallel loading of scripts (instead of blocking behavior) right alongside the rest of the page content. In my opinion, there are some scripts which are less important, like Google Analytics, social sharing buttons, etc., and I wholeheartedly agree with “deferring” that code’s loading until “later”, so as not to take up precious bandwidth/connections/CPU.

But there’s also a lot of JavaScript that is just as important as the content it decorates, and to me, the idea of delaying that code by a noticeable amount will lead, in general, to a proliferation of a distasteful visual effect I call “FUBC” (Flash of Un-Behaviored Content). In other words, we’ll see pages that flash up raw text content, only to swap a moment or two later into the widgetized, JavaScript-stylized version of the page, where the content is in fancy tab-sets, modal dialogs, etc.

To be clear, any dynamic script loader can create this effect unintentionally, including LABjs. But what ControlJS appears to be intentionally doing is delaying scripts even longer past when content is visible/useable, which will exacerbate this FUBC problem quite a bit.

There are of course ways to mitigate this, using default/inline CSS rules and <noscript> tags (a technique I talk about on the LABjs site), but if you hide all the raw content with CSS and don’t re-display it until the JavaScript is present, you lose ALL of the benefit of deferring that JavaScript logic to let the content load quicker. The goal is admirable, but I think it will end up being neutral or worse UX for most sites.
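To illustrate that mitigation technique (this is just one common shape of it; the class name is purely illustrative), you hide the to-be-enhanced content by default with CSS, and use a <noscript> override so that non-JavaScript visitors still see the raw content immediately:

<style>
/* hidden until the JavaScript enhancements arrive */
.enhanced { visibility: hidden; }
</style>
<noscript>
<style>
/* no JavaScript will ever run, so show the raw content right away */
.enhanced { visibility: visible; }
</style>
</noscript>

Your loaded JavaScript is then responsible for revealing (and decorating) those regions, which is exactly the trade-off just described.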

This is a tenuous and difficult UX balance that sites need to consider carefully. I do not endorse Steve’s suggestion that basically all sites should move to this model. It makes sense for some of them, if they intentionally design their UX that way. But it’s far from optimal for a lot of sites, and following his suggestions without caution and reserve will trade away the page-load user experience for raw performance.

So, that is the context under which Steve presents us ControlJS. It’s an attempt to create a script loader that more closely models his view of how page-loads should work (that scripts should all defer loading and execution until all content is finished). If you agree with him, and aren’t worried about FUBC (or have already carefully thought about and designed around it), ControlJS is something to at least consider. But if you’re just planning to drop in ControlJS to an existing site, I think this is possibly a big mistake and the wrong direction for the web to head in.

So, I simply chose to disagree with Steve on this point, and we’re now focusing on different goals.

ControlJS: approach

In addition to my UX concerns with Steve’s approach to page-load optimization that’s embodied in ControlJS, I have some problems with the functional approach he’s taken as well.

User-agent

First, ControlJS relies on browser user-agent sniffing to choose different loading techniques for different browsers. I have been a very vocal critic of user-agent sniffing, even in opposition to brilliant guys like Nicholas Zakas.

I’m not going to rehash the whole debate over user-agent sniffing. I simply won’t use it, and I think most people agree that it’s a bad choice. Feature-detection is far preferable. In between the two, but still better than user-agent sniffing by an important amount, are browser inferences: basically, testing for a feature known to be characteristic of only one browser (or family of browsers). LABjs uses a couple of browser inferences, very reluctantly, as a temporary stop-gap until such time as browsers support the dynamic script loading use-cases with feature-testable functionality. ControlJS makes no such attempt to be robust or future-thinking on this topic, relying simply on basic user-agent sniffing. This definitely worries me.
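To make the distinction concrete, here’s a rough sketch of the three approaches (the inference shown is similar in spirit to what LABjs actually uses; the exact tests are illustrative):

// user-agent sniffing: fragile, trivially spoofed, breaks on new browsers
var isIE = /MSIE/.test(navigator.userAgent);

// browser inference: still indirect, but tests a characteristic behavior
// rather than a branding string
var isOpera = typeof window.opera == "object" &&
    Object.prototype.toString.call(window.opera) == "[object Opera]";

// feature detection: directly tests the capability you actually need
var hasOrderedAsync = (document.createElement("script").async === true);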

Moreover, I’ve been heavily engaged in trying to petition browsers and the W3C to support native functionality that supports the dynamic script loading use-cases, in a feature-testable way. If you’re unfamiliar with that proposal/effort, take a look at this WHATWG Wiki Page: Dynamic Script Execution Order.

Mozilla (starting with FF 4b8, due out in a few days) and Webkit (very shortly in their Nightlies) have implemented the `async=false` proposal made there, and done so in a way that’s feature testable. LABjs 1.0.4 was released a few weeks ago with the new feature-test in it to take advantage of that new functionality when browsers implement it. Eventually, the goal is to deprecate and remove the old hacky browser inferences (and the old hacky browser behavior it activated) in favor of this new (and hopefully soon standardized) behavior.
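For the curious, that feature-test is essentially this (paraphrasing; LABjs’ actual code has more plumbing around it). Browsers implementing the proposal default the `async` property to `true` on dynamically-created script elements, which is exactly what makes the behavior detectable:

var script = document.createElement("script");

if (script.async === true) {
    // this browser implements the proposal: dynamically-created scripts
    // default to async, and setting async=false opts this script back
    // into ordered (in-order) execution
    script.async = false;
}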

I asked Steve several times to join in the efforts to advocate for this new feature-testable native functionality to be standardized and adopted by browsers, which I believe will DRASTICALLY improve script-loaders’ ability to performantly load scripts into pages. He politely declined to participate, and suggested my efforts were misguided. And instead, he released ControlJS using old and sub-optimal user-agent sniffing in its place. This is obviously not how I hoped things would progress.

Preloading

During the development of the original 1.0 release of LABjs back in 2009, I consulted several times with Steve on various trade-offs that had to be made to get a generalized script loader to address all the various use-cases.

The biggest problem I faced was that some browsers (namely IE and Webkit) offered no direct/reliable way to load scripts in parallel but have them execute in order, which is important if you have dependencies (like jQuery, jQuery-UI, and plugins, etc). This is especially difficult to do if you are loading any or all of those scripts from a remote domain (like a CDN), which is of course quite prevalent on the web these days.

I developed and tested a trick I call “cache-preloading” (something which is now getting a lot of attention from various script loaders), which was basically a way to handle this. It was openly admitted to be a hack, sub-optimal, and hopefully something that would be eventually removed from LABjs.

There are various ways to approach this trick, but the fundamental characteristic is that you “preload” a bunch of scripts into the browser cache without execution, and then you make a second round of requests for those scripts, pulling them from cache, to then execute them. The reason this trick is important to consider is that it carries a rather large (and potentially fatal) assumption: that all the scripts being requested are served with proper, future-dated expiration headers, so that they can actually be cached.

Importantly to this post, relying on this assumption was very worrisome to Steve in our discussions back then. He asserted that as much as 70% (Update: maybe 38%, 51%, …) of script resources on the internet are not sent with proper caching headers (or are sent with none at all), and so for all of those scripts, if they are loaded with a script loader using a “preloading” trick, the first load won’t successfully cache, and the second request will cause a second full load. Not only is that terrible for performance; depending on the script loader’s internal logic, this failed assumption can create hazardous race conditions.

In addition to the possible double-load, Steve was also worried that even if a script was properly cached, the delay for the second cache hit could be noticeable and bad for large scripts. Again, large script files are very common these days, with the proliferation of advice from people like Steve who suggest concatenating all files into one file (to reduce HTTP requests).

After much discussion and back-and-forth, I reluctantly decided that Steve’s concerns were valid, and so I added more complexity to LABjs (increasing its size and introducing several ugly hacks) to try to abate this problem. We tenuously agreed that the likelihood was that most of the scripts that are being served with improper headers are being self-hosted (thus on local domains), and that remote domains (like CDNs) would be much more likely to be correctly configured.

I devised a solution where LABjs uses XHR to “preload” local scripts, and only falls back to the “cache-preload” trick for remote scripts. In addition, I gave people the ability, through config values in the LABjs API, to easily (with a single boolean value) turn off either or both of those preloading tricks, so that developers would be much less likely to accidentally trip over this problem if they indeed had to load a script that had improper caching behavior.
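For those who haven’t seen it, the XHR half of that strategy looks roughly like this (a bare sketch with an illustrative URL; LABjs’ real code has queueing and error handling around all of it):

// phase 1: fetch the script's text via XHR (same-domain only, pre-CORS)
var scriptText;
var xhr = new XMLHttpRequest();
xhr.open("GET", "/js/myscript.js", true);
xhr.onreadystatechange = function () {
    if (xhr.readyState == 4) {
        scriptText = xhr.responseText; // hold it until it's this script's turn
    }
};
xhr.send(null);

// phase 2 (later, when it's this script's turn in the execution order):
var script = document.createElement("script");
script.text = scriptText; // inline script text executes upon insertion
document.getElementsByTagName("head")[0].appendChild(script);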

I’m sure you can imagine my surprise when, this morning, I cracked open Steve’s code and saw that he’s now using the “cache-preloading” tricks as the only method, without XHR. In other words, he was really concerned about this problem when he helped me design (and complicate) LABjs, but now it’s apparently not a concern of his. He mentions the assumption of cacheability only briefly and off-handedly in his blog post, buried in a lot of other text.

I haven’t seen any research to suggest a radical shift since late 2009 that has most or all script resources now being properly cacheable. So I still consider the concerns that Steve voiced to be quite real and important.

Again, I always considered the “cache-preload” to be a hacky last-ditch fallback, and always have intended to remove it as soon as browsers provided a better solution (that’s happening now, slowly but surely). So, I’m quite concerned to say the least that ControlJS (and HeadJS and others) are latching onto the “cache-preload” as their primary (or only) means of handling parallel-load-serial-execute.

Not only are they white-washing over the cacheability assumption, they’re also completely ignoring the recent movement by Mozilla and Webkit to implement the far-preferable `async=false` functionality. I consider this quite unfortunate.

Brittle Cache-Preloading

One last note on the “cache-preload” tricks: LABjs’ version of the “cache-preload” trick was to use a non-standard and undocumented behavior of certain browsers (IE and Webkit) that would fetch a script resource into cache, but not execute it, if the script element’s `type` value was something fake like “script/cache” (instead of “text/javascript”).
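Stripped of all the surrounding bookkeeping, that trick looked roughly like this (illustrative URL):

// phase 1: "preload" -- IE and (then-current) Webkit would fetch the file
// into cache, but not execute it, because of the fake type value
var s1 = document.createElement("script");
s1.type = "script/cache";
s1.src = "http://cdn.example.com/jquery.js";
document.getElementsByTagName("head")[0].appendChild(s1);

// phase 2: once all preloads finish, re-request with a real type; IF the
// caching headers were proper, this is served from cache and executes
var s2 = document.createElement("script");
s2.type = "text/javascript";
s2.src = "http://cdn.example.com/jquery.js";
document.getElementsByTagName("head")[0].appendChild(s2);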

However, the HTML spec now says that such resources should NOT be fetched. So, Webkit (about a month or two ago) dutifully obeyed the spec and patched their browser to stop fetching them. This means that LABjs’ “cache-preload” trick broke entirely in Webkit nightlies. Now, thankfully, Webkit is moving rapidly to adopt the more preferred `async=false` in its place, so hopefully LABjs won’t be broken in any major Webkit-based browser release.

But there’s a SUPER IMPORTANT lesson that must be learned here. LABjs got by with non-standard behavior for awhile, but browsers are quickly catching up to the spec/standards. Had `async=false` not gotten the attention it did, LABjs could have died a very public death, precisely because it relied on hacky non-standard behavior.

But ControlJS, HeadJS, and many other script loaders like them are doing the same thing. They aren’t necessarily using the exact same trick as LABjs used, but they are pinning their entire loading functionality on hacky, non-standard behavior. ControlJS uses the <object> preloading hack for some browsers and the `new Image()` hack for other browsers.
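As I read the code, the essence of those two hacks is something like this (a sketch; illustrative URL):

// <object> variant: the browser fetches the data URL into cache, but
// can't interpret JavaScript as an <object>, so nothing executes
var obj = document.createElement("object");
obj.data = "http://cdn.example.com/jquery.js";
obj.width = 0;
obj.height = 0;
document.body.appendChild(obj);

// Image() variant: same idea -- the fetch happens, the image decode fails,
// but (today, at least) the response still lands in the browser cache
new Image().src = "http://cdn.example.com/jquery.js";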

Essentially, they’re relying on the fact that these browsers will fetch the script resource content into cache but not execute it, because the container used to do the fetching doesn’t understand JavaScript. I don’t know about you, but this sounds to me dangerously close to the spec’s statement telling browsers not to fetch content with a declared MIME-type the browser can’t interpret. How long do you think it’ll be before Webkit, Mozilla, Opera, or IE patch to stop fetching/caching content via <object> or `Image()` that they can’t interpret?

Some have argued that <object> and `Image()` will always have to fetch such content, because the browser can’t know the URL won’t return valid image or object content until after it receives the response. This may be true; the browser may always have to fetch it. But the browser might choose to discard the contents (and not cache them) if it sees a content-type that it won’t consider valid for the requesting container. If I were on a browser dev team, and were coding with the intent of following the spirit of the spec, that’s exactly the logic I’d implement.

And I’d especially do that, even though it may break script loaders, because the spec process (and browsers) are already starting to implement a more direct and reliable approach: `async=false`.

DOM-ready

Steve writes in his post:

I think it’s ironic that JavaScript modules for loading scripts asynchronously have to be loaded in a blocking manner. From the beginning I wanted to make sure that ControlJS itself could be loaded asynchronously.

I understand and empathize completely with Steve’s sentiment here. He and I both agreed from day one of LABjs that it’s unfortunate that you have to “load some JavaScript so that you can load more JavaScript”. But at its heart, this is how bootstrapping works, and it’s a reality.

However, I believe Steve has completely glossed over a really important point (and it kind of ties back to my earlier section on the UX of FUBC): if you asynchronously and dynamically load the loader, and don’t take special precautions, the loader has no way of knowing (in some browsers, including FF3.5 and before) if the page’s DOMContentLoaded (aka “DOM-ready”) has passed or not.

While the script loader itself doesn’t really need to care too much about DOMContentLoaded, some of the scripts that you may be loading very much do. As I wrote about last year regarding jQuery, DOM-ready, and dynamic script loading, it’s very common for people to write jQuery code that looks like this:

$(document).ready(function(){
   // I know the DOMContentLoaded/DOM-ready event has passed! wee!
});

The problem is, jQuery can’t reliably detect DOMContentLoaded in FF3.5 and before (and a few other obscure browsers) if jQuery itself is not loaded statically/synchronously (that is, in a blocking way, so that it’s loaded before DOMContentLoaded can pass). So, if you just blindly load jQuery dynamically, and it happens to finish loading after DOMContentLoaded/DOM-ready has passed, any code you have in your page that looks like that snippet above will just sit forever waiting, and never fire!

So, LABjs takes advantage of the fact that you are almost certainly going to load the LAB.js file itself with a normal blocking script tag (I know, counter-intuitive, right!?). Inside of LAB.js, at the very end, is a small little snippet that detects if the page doesn’t properly have a `document.readyState` property on it (which is what jQuery uses for detecting DOMContentLoaded in those browsers), and if so, it patches the page.

This has the effect that a few hundred milliseconds later, when jQuery finishes loading (if you’re using jQuery of course!), even if DOMContentLoaded has already passed, jQuery will properly see the right state, and your code will fire immediately. The page-level hack looks like this:


// required: shim for FF <= 3.5 not having document.readyState
if (document.readyState == null && document.addEventListener) {
    var handler; // declared so the handler reference isn't an implicit global
    document.readyState = "loading";
    document.addEventListener("DOMContentLoaded", handler = function () {
        document.removeEventListener("DOMContentLoaded", handler, false);
        document.readyState = "complete";
    }, false);
}

This is an unfortunate page-level hack that's necessary for FF3.5 and before, for jQuery to be properly loaded dynamically and still have code like that snippet above operate as expected. You can see the code is pretty small and compact, so including it in LABjs doesn't hurt the size/complexity too much. But it works because LABjs typically is loaded statically.

So, does that mean LABjs can't be loaded dynamically and still have jQuery and DOM-ready code work? No! It just means that the page-level hack in that snippet needs to be part of whatever little "bootstrapper" code you use to dynamically load LABjs to a page. And it means that THAT code that you use as a bootstrapper for LABjs must itself load statically (like in an inline script-block, etc).

In fact, I wrote up a little while back exactly how to dynamically load LABjs and still preserve this whole DOM-ready business, in a small bootstrapper snippet.
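The shape of that bootstrapper is something like the following (a sketch, not the exact gist; the path and callback handling are illustrative):

// inline (blocking) bootstrapper script block in the page

// 1) the page-level document.readyState patch from above
var handler;
if (document.readyState == null && document.addEventListener) {
    document.readyState = "loading";
    document.addEventListener("DOMContentLoaded", handler = function () {
        document.removeEventListener("DOMContentLoaded", handler, false);
        document.readyState = "complete";
    }, false);
}

// 2) now it's safe to dynamically load LAB.js itself
var script = document.createElement("script");
script.src = "/js/LAB.js"; // illustrative path
script.onload = function () {
    // $LAB is now available; start your chains here
    // (older IE would need an onreadystatechange equivalent)
};
document.getElementsByTagName("head")[0].appendChild(script);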

What concerns me about ControlJS, HeadJS, and almost every other loader out there is that they completely ignore this important point about DOMContentLoaded detection for FF3.5 and below. For my part, I still regularly use LABjs to dynamically load jQuery, I still use `$(document).ready(...)` to safely wrap code, and I still support FF3.5. So for me, this DOM-ready protection code is important. Whether it appears in the loader code itself or in the snippet of bootstrapper code, I think those loaders are really missing something important if they don’t include that snippet (or something like it).

Invalid Markup

This post is already getting really long (again! I can't be short-spoken even if I try!). So I'm going to at least keep this final section brief, and try to wrap up quickly.

ControlJS is able to achieve its "deferral" of script loading and execution (until after window.onload in fact!) by suggesting that you must change all your script elements in your code to be unrecognizable by the browser loading mechanism.

You change a script tag’s `src` attribute to be `cjssrc` instead. This is an invalid attribute name and will be ignored by the browser, but it also invalidates your markup. I consider this a bad practice. At the very least, the attribute could have been `data-src` or something similar, so that it fits with valid spec attribute naming.
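Concretely, the markup transformation looks roughly like this, as I read Steve’s post (the `type` change is the second issue, discussed next):

<!-- before: a normal, browser-recognized script tag -->
<script type="text/javascript" src="main.js"></script>

<!-- after: the ControlJS version, invisible to the browser's loader -->
<script type="text/cjs" cjssrc="main.js"></script>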

Secondly, ControlJS requires that you change a script block's `type` value to "text/cjs" (from "text/javascript"). I have two problems with this approach. Firstly, as we discussed above, the spec and standards bodies are moving to explicitly de-supporting invalid types. What if sometime soon, the spec says that any element with an unrecognized type should not even be added to the DOM but should be entirely ignored?

Or what if some browser just interprets what the spec currently says to mean that? If I were a browser developer, I could easily argue that the ignoring of such an element (not adding it to the actual DOM) would help improve the page's performance by taking up less memory and having fewer DOM nodes to inspect during DOM traversal/manipulation.

Also, he chose “text/cjs”. This bets that no future MIME-type will ever take the name “text/cjs”. Notice that LABjs at least chose a very different kind of value, “script/cache”, which is far less likely to have future collisions. Steve chose a “text/xxxx” format for the value, which fits closely with how such values are currently assigned, but that also makes it more likely that there will be a conflict someday.

Moreover, in HTML5, the type value is now optional, and for performance reasons, I (and most people) now omit that from our markup. ControlJS will require us to go back and add `type` attributes to all our script tags, making the markup slightly larger. That's pretty minor, but it bugs me nonetheless.

Lastly, it is unclear from the code or the documentation what the expected/suggested behavior should be if I have some script elements in my markup with CJS bindings and other script tags that I leave alone. How will CJS-loaded scripts interact (execute before? after?) with scripts that are loaded using the browser’s native mechanism? If there’s some reason I need a mixture of CJS and non-CJS script tags in my page, it will make the interpretation of my markup (specifically, the implied execution order of the script elements) a lot more confusing.

Summary

This is part 1 of this post. I'm going to make another post soon (part 2) where I shift my focus from ControlJS to HeadJS (and possibly other script loaders).

Again, I apologize to Steve (and to anyone else) if the tone of this post seems to be overly harsh. But I think it's important that the other side of the coin be put out there for developers to consider as they compare how ControlJS differs from something like LABjs.

Moreover, I reiterate what I've said a dozen times already: instead of creating more and different script loaders, I think a better use of our time would be to consolidate efforts in getting the browsers and the spec/W3C to give us a reliable native mechanism for managing dynamic script loading. Since `async=false` is already moving along nicely, I encourage and invite anyone who's interested to join that discussion.

I hope very soon that proper and simple handling of dynamic script loading will be very easy and straightforward, and well-conforming script loaders (which I can say LABjs will certainly be) will eventually be free from a lot of hacky, legacy junk that weighs them down. I call on all other script loaders to join that effort for a better script loading future. (ok, yeah, that was lame. :))

This entry was written by getify, posted on Thursday, December 16, 2010 at 02:12 pm, filed under JavaScript and Performance Optimization.

27 Responses to “On Script Loaders”

  • Aaron Peters says:

    Kyle,

    you really can’t do a short blog post, can you? ;-)
    Nice writeup, insightful for me. Learned more about LABjs and the making of.

    About Dom-Ready and LABjs.
    From what you write, is this a correct conclusion:
    – I use LABjs to load jQuery, jQuery UI and some plugins.
    – I do this by loading LAB.js itself statically, and I put all this code at bottom of the BODY
    – and I can still put those little code blocks with $(document).ready(function(){ do something } *anywhere in my HTML*, because of your code that ‘patches the page': even if DOMContentLoaded has already passed, jQuery will properly see the right state…

    Also, can I have those little code blocks with $(document).ready(function(){ do something } *anywhere in my HTML*, even if I load LABjs itself dynamically?

    I was under the impression that those little code blocks had to become part of the LABjs code block(s), you know, with the .wait()

  • kl says:

    You’re nitpicking a bit (e.g. script nodes with invalid type will be added to DOM, because that’s done at earlier stage than script execution, and it doesn’t make sense to remove them later, exactly because it affects scripts, CSS and DOM traversal).

    However, I agree with most points. As user of browser that’s not on Google’s favourite list I hate their ignorance in this area.

  • getify says:

    @Aaron–
    I apologize, I guess my wording in this post caused a little confusion on that issue of jQuery and `document.ready`.

    and I can still put those little code blocks with $(document).ready(function(){ do something } *anywhere in my HTML*, because of your code that ‘patches the page’: even if DOMContentLoaded has already passed, jQuery will properly see the right state…

    No, such code that normally would appear in your HTML would have to be moved to a function wrapper that is passed to `.wait()`. Of course, you could do just that: wrap an inline block in some named function wrapper like `function inline_block_1() { … }` and not have to move its HTML position at all. And then, wherever in your code your $LAB chain happens, when you want to call your `.wait()`, just pass in the reference to the appropriate “inline block wrapper” function.

    But you do have to make sure it won’t execute right away, for no other reason than that `$` needs to have been defined by the time such a piece of code is executed, obviously.

    However, just as you’d expect, `document.ready` blocks can appear without such a wrapper anywhere inside an external script file that is guaranteed to be executed after jQuery has loaded.

    Also, can I have those little code blocks with $(document).ready(function(){ do something } *anywhere in my HTML*, even if I load LABjs itself dynamically?

    The same answer applies from above even if LABjs is loaded dynamically. The key thing to remember is, if you are going to load LABjs dynamically, whatever little snippet of bootstrapper you have doing the loading of LABjs needs to have the little page-level patch in it. I showed what such a bootstrapper snippet might need to look like (including the page-level patch) in this gist.

    NOTE: Even if you duplicate that little snippet into your bootstrapper that loads LABjs, you don’t have to modify LABjs, because its copy of that page-hack will just transparently be skipped.

  • getify says:

    @kl–

    Thanks for your comment!

    You’re nitpicking a bit

    Yeah, I freely admit some of my ramblings above are not iron-clad arguments but just some free-form thoughts. I think more than such points being salient in and of themselves, they should just be taken as part of my overall tone of reaction to CJS — basically: “meh”.

    Specifically, my suggestion that perhaps it’s dangerous to use an invalid script element type because the W3C spec is definitely moving toward “ignoring” such content (for performance reasons) was really just a “what if”.

    It goes to the overall point that we should use standards-compliant code wherever possible, and any time there’s a willful violation of spec, there needs to be a really good reason and it needs to be carefully considered. I was pointing out that perhaps CJS did not, in this respect.

    Just as purely a technical aside:

    script nodes with invalid type will be added to DOM, because that’s done at earlier stage than script execution

    I certainly won’t claim any knowledge/specialty over how the DOM parsing/building process works (and it sounds like you are more aware of it). But just speaking in terms of how normal code compilation works, it seems plausible to me that a DOM element with an invalid attribute value could be excluded from the DOM tree.

    Speaking completely conceptually:
    In a simple/standard compiler, there’s a parsing/tokenization phase, then an AST phase, where the “tree” in this case seems like it would conceptually be built. Couldn’t the AST phase, passing through the token list item by item as it builds the DOM, simply skip inserting a node if its child `type` attribute-node had an invalid value? That doesn’t require waiting for the “script execution” phase much later; it’s simply interpreting the markup and attribute values inline during DOM parsing. Right?

    In the same way, couldn’t a node that has an invalid/unrecognizable DOM name (like “foobar”) also be skipped from being added to the DOM?

    :)

    Anyway, thanks for your comment (and for reading that whole long post).

  • Aaron Peters says:

    Thanks for your reply, Kyle.

  • @Aaron @Kyle

    I usually use this shim code for any global functions I want to utilize ahead of time. It could be tweaked to maintain “this” integrity and handle chaining probably, maybe I’ll do that this weekend:

    shim.js

  • Kyle,
    You make some good points and, as usual, your arguments are measured and balanced.

    Re FUBC: I’d argue that, on 80% of sites, 90+% of the functionality is provided natively by the HTML. While that ratio may be a little high, I would not say it’s outrageously so, especially given (most) sites make some effort to support noscript browsers. As long as statically included CSS is sufficient to maintain overall geometry, I think the benefits of post-load rendering more than make up for it. Also, I think ControlJS is more of an attempt to dynamically add behavior: instead of always loading the Menu control code, add it when the user provides some indication they will need it. To me this makes more sense than relying on a statically defined order, and it mitigates many of the FUBC issues. I see the biggest issues here as dynamic CSS and IE’s silly behavior when it encounters unknown tags, but ControlJS doesn’t require code to late-load, it simply strongly urges that you do.

    RE invalid type attributes: Invalid script types are becoming quite common as a form of client-side HTML template definition. Any change in the standard would affect much more than just script loaders (not that similar things haven’t happened before ;-) Nevertheless, it would seem unwise for the spec to dictate browsers simply not process any element: We are explicitly relying on this fact to transition to new HTML 5 elements (requiring inline script execution)

    RE the text/cache hack: The very pre-fetching that LABjs and ControlJS implement using essentially the same hack, IE has natively supported through its readyState property and change event since version 4. IE simply re-uses the same conventions everyone relies on to perform image pre-fetching: the img URL is fetched immediately when the src attribute is assigned, but obviously not displayed to the user until it’s inserted in the DOM.

    Similarly, a script url is fetched when it’s src attribute assigned but the script is not executed until the element is added to the DOM. Once the script finishes downloading (or if it’s in the cache) IE will set the readyState to loaded and fire the change event. The trick is to ensure the readystatechange handler is installed before the src is set. Otherwise, when the script is in cache, you will not receive the “loaded” transition event.
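    In code, the pattern is roughly this (a sketch, with an illustrative URL):

    // IE-only sketch: setting src starts the fetch; execution waits for insertion
    var script = document.createElement("script");

    // install the handler BEFORE assigning src, or a cache hit may transition
    // to "loaded" before we're listening
    script.onreadystatechange = function () {
        if (script.readyState == "loaded") {
            script.onreadystatechange = null;
            // fetched (or already cached) but not yet executed; insert it
            // whenever you actually want it to run
            document.getElementsByTagName("head")[0].appendChild(script);
        }
    };

    script.src = "http://cdn.example.com/jquery.js";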

    I’m stupefied as to why the other browsers have not implemented this same thing. We already see how useful the document.readyState property is. The concept of state applies to more than just XHR requests and the document. IMO, the spec should embrace the state concept and provide formal definitions for state values and their transition events. It would greatly simplify event management (do we really need a DOMContentReady event?) and provide a means for caching one-time event occurrences.

  • getify says:

    @Will-
    Thanks for your comments.

    Re FUBC: I’d argue that, on 80% of sites, 90+% of the functionality is provided natively by the HTML. While that ratio may be a little high, I would not say it’s outrageously so especially given (most) sites make some effort to support noscript browsers.

    I don’t know what the percentages are, but I think you may have missed my point slightly.

    Even if I concede that only 20% of the sites on the internet are using significant JavaScript “presentational behavior” (that is, JavaScript that alters the presentation on page-load), it’s the fact that most of them are not currently designed for a gradual progression from the “noscript” version of their site to the JavaScript version that worries me, regarding Steve’s approach. And btw, I think the number is a lot more like 40%, given the popularity of jQuery, jQuery-UI, YUI, and many other widget-rich plugin frameworks.

    When all that radical presentational logic is guaranteed to run at (or right after) DOMContentLoaded (and usually right as things are being rendered), there’s a very low chance that a JavaScript user sees the transformation. And for 99% of the visitors in that category, they’ll never see the un-behaviored content. So the extent to which the switch is radical doesn’t affect them.

    If we start making most users on 20%-40% of the web seeing these radical alterations to pages (think of how radical the shift is when a tabset widget is rendered) in a very noticeable and jarring way, I think we’ll have a lot of users hating the UX (even if it’s purely faster).

    The right way to do it, if you’re really concerned about that extra 100ms that your JS takes to alter your page, is to break up the alteration into several “layers”, and load each layer one at a time, a few seconds apart. That way, the user would see small chunks of changes, a little bit at a time. If they left the page idle for the first 10 seconds (while they read a news headline), the page would sorta just change gradually around them.
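    Something like this, conceptually (the function names are purely hypothetical):

    // progressively enhance in stages, cheapest/most-visible first
    $(document).ready(function () {
        enhanceNavigation();                    // immediately at DOM-ready
        setTimeout(enhanceTabsets, 2000);       // heavier widgets a bit later
        setTimeout(enhanceSocialWidgets, 5000); // least important, last
    });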

    But this type of progressive UX is by far not the norm on sites. And it’s going to take a lot of careful planning to get right. And I don’t want to endorse Steve’s approach of just making a radical shift to having sites all start loading in a way that’s going to make the jarring UX reconfiguration more obvious and disturbing. His loader (and his advice) should be part of a very comprehensive UX overhaul of the site’s page-load, and should be applied in a measured and cautious way. But Steve makes none of those points clear… he’s only concerned with making the page faster.

    And I’d argue that the 100-200ms he might be saving in terms of when content is visible is not worth the radical jarring of that content into some other configuration. I as a user would rather just not see the content until it’s mostly ready, unless you can gracefully transition from what I currently see to something more decorated.

  • getify says:

    @Will (part 2)-

    RE invalid type attributes: Invalid script types are becoming quite common as a form of client-side HTML template definition. Any change in the standard would affect much more than just script loaders

    I know it’s happening a lot more these days. But I don’t think that makes it a safe or right thing to do just because everyone’s doing it. I saw how easily the spec and browsers were willing to yank the rug out from under me with how LABjs’ loading tricks were broken in browsers, and it’s really changed my perspective on how we should, as web authors, be using web technology. If we know something is not in the spec, or is something of even questionable “support” in the spec, it’s a good idea to be REALLY cautious about that thing. We shouldn’t just plow forward roughshod and expect that the spec and browsers will just bend to our practices. I never want to repeat the experience I had in the past 2 months of wondering if all my hard work on LABjs was going to be lost because I’d pinned it on non-standard and unreliable browser behavior.

    RE the text/cache hack: The very pre-fetching LABjs and ControlJS implement using essentially the same hack, IE has natively supported through it’s readyState property and change event since version 4. IE simply re-uses the same conventions everyone relies on to perform image pre-fetching: The img URL is fetched immediately when the src attribute is assigned but obviously not displayed to the user until it’s inserted in the DOM.

    Yes, on the WHATWG Wiki Dynamic Script Execution Order site, it’s been brought to my attention that this behavior has been in IE for awhile.

    While that works in IE, I’d give the same caution as above, but even stronger: it is NOT in the spec. Pinning any behavior in your page on non-standard behavior is only fine as a short-term hack while you advocate for long-term solutions that are standardized. Unfortunately, most people using short-term hacks are just content to leave the status quo… until something breaks.

    I know this because it’s exactly how LABjs was for its first year. That doesn’t make it wise. I learned my lesson.

    While I don’t necessarily think the “readyState” behavior in IE is the right way to handle preload-defer-execution, I recognize that it’s a valid approach, and that’s why it’s chronicled on the wiki page. However, it doesn’t (yet) have the support of several browsers as does my `async=false` proposal.

    Admittedly, my proposal isn’t intended to solve the “load now, execute-on-demand at some much later time” behavior that Steve is advocating for. But we’d need to standardize the `readyState` proposal and get it cross-browser implemented before we could adequately do what Steve wants to do.

    As I described in painful detail above, relying on non-standard hacks (which are subject to breakage if the caching is wrong) to simulate that behavior now is dangerous, especially because Steve is not participating in the standards process to develop and codify a long-term solution.

  • getify says:

    For posterity sake, here’s a thread I’ve started on the W3C public-html list regarding the behavior that Steve is going for (load now, parse/execute on-demand later). I’ve extended the discussion to include both CSS and JS resources.

    http://lists.w3.org/Archives/Public/public-html/2010Dec/0174.html

    As always, I highly encourage my readers to follow/join in that discussion.

  • Sniffing for particular browsers (either by looking for substrings in the UA string or by checking for browser-specific objects) and assuming that behaviors that aren’t currently interoperable will persist makes me sad.

    Minor nitpick: It is extremely unlikely that browser would stop putting the cjssrc attribute in the DOM. Your conclusion is correct though: ControlJS shouldn’t be squatting names reserved for future HTML specs but should use something like data-src or data-cjs-src instead.

    Also, after reading your post, I took a look at the source of jQuery. The code that assumes a relationship between readyState and DOMContentLoaded is not thoroughly robust, because there is a period during the lifetime of a document when DOMContentLoaded has fired but readyState is not yet “complete”. Unfortunately, there’s also a period when readyState is “interactive” but DOMContentLoaded has not yet been fired.

  • getify says:

    @Henri-
    Thanks for your comment!

    Minor nitpick: It is extremely unlikely that browser would stop putting the cjssrc attribute in the DOM.

    This was not really what I was conjecturing about. I was actually saying that perhaps a browser might at some point leave an entire element (like the <script> element) out of the DOM if its `type` value is unrecognized.

    I recognize that’s extremely unlikely, if for no other reason than legacy content, but it was an argument I could see being made, for DOM performance reasons. If we got to a point in time (especially on resource-limited devices like mobile) where having a lot of those “invalid” elements in the DOM was really slowing down the DOM API (traversal/manipulation), having a more advanced algorithm that either removes such elements from the DOM tree, or is at least able to dynamically “ignore” (skip over) them, might be a plausible advanced optimization.

    And I could see possibly the “justification” for that because a) the spec doesn’t say not to, and b) because the spec does sorta lay the precedent with its wording of telling the browser that it should not fetch content with a declared type that isn’t valid.

    In other words, flip this around. Why should the browser keep the external <script> element with an invalid `type` in the DOM if it’s effectively ignoring that element by never fetching its content? Similarly, for consistency’s sake, why should an inline <script> element with an invalid type be kept, when an external <script> element of that same type would be ignored?

    I’m not saying browsers should (or ever will) do this. But I’m saying I could see it as a plausible step at some point. And the main point is, if there’s such a plausible reasoning to be made, use of such a feature should be done with great caution.

    The code that assumes a relationship between readyState and DOMContentLoaded is not thoroughly robust, because there is a period during the lifetime of a document when DOMContentLoaded has fired but readyState is not yet “complete”. Unfortunately, there’s also a period when readyState is “interactive” but DOMContentLoaded has not yet been fired.

    Very interesting, indeed. Do you have any documentation on this? I assume you mean this from the Mozilla/Firefox perspective, right? Or is there something in the spec that spells this out explicitly?

    AFAIK, jQuery’s code is intended to use DOMContentLoaded, when it’s present, as the preferred signal for “DOM ready”, and it falls back on examining `readyState` if the DOMContentLoaded event isn’t available to listen to. So, it’s admittedly an imperfect system of fallbacks (even falling back to the `doScroll` hack from Diego Perini for IE), but has it ever been shown that $(document).ready(...) fires too early (and thus in an unsafe way)?

    I’m curious if this is a case where, because of ambiguity in the spec, or incomplete/divergent browser implementations, it’s simply impossible to have a completely robust implementation of “DOM-ready”? Or is this something that jQuery could/should be patched to fix? I’m especially curious because I’m also currently working to help advocate for Prototype to fix the same logic in their core, and I don’t want to guide them wrong!

    It seems your stance on readyState has softened since I posted to WHATWG. LABjs already implements readyState support in its XHR fetcher, as do most loaders, and it seems wise to make use of existing concepts, especially when they solve such a wide range of use-cases.

    While I understand your once-bitten-twice-shy mentality, and I certainly am “in favor of standards,” I believe it’s natural for the spec to lag reality, lest we risk letting the tail wag the dog. (When you think about it, LABjs’ fallback position would simply have been to load sequentially. Optimal? No, but it doesn’t take us back to 2006 either.) I don’t hear cries to stop using innerHTML, and I’m not sure it was ever standardized (if it was, they left out script tag handling)

    Your point about removing elements w/ invalid attributes seems like a stretch. Were the text/cache scripts not added to the DOM, or just not loaded?

    Finally, we have a Fiddler plug-in for Ghostwriter that converts pages on-the-fly to use deferred JS execution. The number of non-functional sites is _really_ small, and the delay is negligible for the most part. I can provide it if you’re interested. The point is not that 100 ms is too long, it’s that it might be better spent elsewhere. Presuming we know the proper order in which to enhance is foolish. Deferral allows enhancements to be event-driven, not static, and prevents the FUBC problems you describe, since it can be reactive (assuming your CSS styles your initial tabview, which I’ve found is overwhelmingly the case)

  • Why should the browser keep the external <script> element with invalid `type`in the DOM if it’s effectively ignoring that element, by never fetching its content? Similarly, for consistency sake, why should an inline <script> element with an invalid type be kept, when an external <script> element of that same type would be ignored?

    For the external script case, the main reason not to start dropping it from the DOM is that pages tend to assume particular stuff is in the DOM, so dropping stuff that’s currently interoperable could well break pages. The trend has actually been in the other direction: Safari started out not putting comment nodes into the DOM, but now Safari puts comments in the DOM.

    The inline case is used by WebGL shaders, Silverlight, AmpleSDK, etc., and is now a spec feature called “data blocks”. http://www.whatwg.org/specs/web-apps/current-work/#script

    Do you have any documentation on this? I assume you mean this from the Mozilla/Firefox perspective, right? Or is there something in the spec that spells this out explicitly?

    It’s in the spec. http://www.whatwg.org/specs/web-apps/current-work/#the-end By code inspection, it looks pretty close to what Firefox does.

    By code inspection, it seems to me that jQuery’s “ready” stuff works if called before DOMContentLoaded has fired or after onload has fired, but I think (didn’t test) that if called in between, you end up waiting for an event that fired already. I can’t think of a thoroughly robust way of asking a document if its DOMContentLoaded has fired already.

  • getify says:

    @Will-

    It seems your stance on readystate has softened since I posted to WHATWG.

    My stance has flipped completely because I found the wording for the underlying “preloading” behavior (not the `readyState` thing) in the spec, which I had missed before:

    HTML Spec, 4.3.1, ‘Running a script’ algorithm

    Specifically, in step 12 (which is about fetching a `src` URL):

    For performance reasons, user agents may start fetching the script as soon as the attribute is set, instead, in the hope that the element will be inserted into the document. Either way, once the element is inserted into the document, the load must have started. If the UA performs such prefetching, but the element is never inserted in the document, or the src attribute is dynamically changed, then the user agent will not execute the script, and the fetching process will have been effectively wasted.

    I assume at some point the spec incorporated what IE did (or the other way around), but none of the other browsers have followed suit, yet.

    The spec is still lacking in that it doesn’t define the `readyState` stuff (or something equivalent) so that the page author can properly detect when “preload” has finished. For this to be a viable thing that other browsers could implement, they’d have to copy IE’s behavior, or the spec would need to be extended here.

    In a future release of LABjs, I intend to feature-detect for `document.createElement("script").readyState` (basically, only IE at this point), and if it’s found, use that method instead of the uglier “script/cache” hack. Since IE supports this in all viable browser versions, and the spec supports it, I think that’s enough to justify that this “preloading” hack is preferable (at least in IE) to “script/cache”. At that point, LABjs will have feature-tests for both methods, falling back to the hacks for other browsers.

    Now, as to whether I think this `readyState` behavior is better than the main `async=false` proposal… At this point I’m not sure which one I prefer. Certainly, the fact that `async=false` is now implemented in two browsers (FF4 and Webkit), where `readyState` preloading is only in IE, means that they both need to co-exist for now.

    Perhaps we’ll converge on one or the other that all browsers support, someday. `readyState` preloading has more browsers to convince, but at least it’s already in the spec. `async=false` isn’t in the spec yet, but has gotten more browser support. So at this point, it’s a wash as to which one would be the front-runner.

    I do like that `readyState` preloading is more flexible than `async=false`. If it eventually wins and is in all browsers, that’ll make me happy. But there’s a long road ahead for that. Until then, both will have to be in use.

    The good news is, if all browsers could get to the point where they used one or the other, that’d be a HUGE win because it’d mean we were totally feature-testable. IE using `readyState`, FF/Webkit using `async=false`… that leaves Opera…?

  • getify says:

    @Henri-
    Thanks for the clarifications. I’ll look further into the dom-ready thing for jQuery. It’d be great if we could construct a test-case where jQuery is loaded at exactly the right time to exploit this problem you point out.

    It would seem a fallback (which I think Prototype does) would be to hook onto window.onload, so that in the odd case that you missed DOMContentLoaded, but window.onload was still to come, you could be sure your code would “eventually” run.

    @getify: unfortunately, feature detection of preloading via readyState prop detection would fail in Opera, which has readyState but no prefetch. Without readyState, the spec’s suggestion to prefetch would be useful for deferred execution; ordered exec is not going to work.

    If you’re working on Prototype’s DOM-ready detection, can I suggest also looking at newer (>1.4.2) jQuery features which allow configured ready callback delays? Late-loading scripts that assume ready() indicates all scripts have loaded can be difficult w/o this.

  • getify says:

    @Will — I have a running theory that the feature test will need to be this:

    var rs_preloading = document.createElement("script").readyState == "uninitialized";

    It appears that IE defaults the value to “uninitialized” and Opera defaults it to “loaded”. Still researching to try to confirm.

    @Kyle — If the suggestion to download the src prior to insertion “for performance reasons” is already in the spec, then the “standards-approved” way to do this is through onload-chaining. Create the elements, install an onload handler that adds the next script in the chain, assign the src, and attach the first one. Parallel downloads occur in browsers that choose to be performant, LABjs becomes standards-compliant, and no further modification to the spec is required. (Plus, we can “hush that fuss” around the semantics of async and boolean values.) This is how we (are going to) do it in Ghostwriter.
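    Roughly (a sketch, not Ghostwriter’s actual code):

    // onload-chaining: assign all src values up front (so spec-following
    // browsers may prefetch in parallel), but insert each script only after
    // the previous one has executed, guaranteeing order
    function chainLoad(urls) {
        var head = document.getElementsByTagName("head")[0],
            scripts = [], i;
        for (i = 0; i < urls.length; i++) {
            scripts[i] = document.createElement("script");
            scripts[i].src = urls[i]; // the fetch may begin here, per the spec
        }
        (function insertNext(idx) {
            if (idx >= scripts.length) return;
            scripts[idx].onload = function () { insertNext(idx + 1); };
            head.appendChild(scripts[idx]); // executes (in order) once downloaded
        })(0);
    }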

    You’ll need to add some checks for browsers that use readyState (since, unlike today, a value of “loaded” no longer indicates execution has occurred) but you should be able to work around that by testing for a null parentNode or simply delaying the handler installation until right *before* attaching. (If you install after attaching, there’s a chance you’ll miss it)

    As a means of pre-fetching/defering execution, readyState has value, but neither it nor async=true are required to achieve ordered execution. Parallel downloads, as the spec states, are a *suggested* means for achieving better performance; however, they are not required to provide ordered execution.

  • getify says:

    @Will-
    Yes, “onload chaining” as you suggest would work in LABjs. And maybe it’s indeed better than my previous `async=false` proposal, because it’s more flexible.

    But async=false is:

    1) a temporary stop-gap to prevent FF4 or Webkit breakage of LABjs… so it’s still quite useful in that it accomplished that goal quite well. It’s unlikely `readyState` would have moved through that quickly.

    2) implemented in two browsers (and possibly a third soon), so it’s currently moving along much quicker than `readyState`, which is only in one browser (IE) thus far, and has been that way for a long time.

    As it stands, it appears to me like I will need both async=false AND readyState preloading to be the primary techniques in LABjs for a good long while, until one of those two techniques “wins” and is reliable in all browsers (maybe someday!?).

    The hacky “cache-preload” fallbacks (and their ugly cousin browser inferences) might have to remain in LABjs for a little while, but eventually I will just phase them out, and the “fallback” for older browsers will at that point simply be worse performance through serial loading.

    ——-
    However, as Steve’s post points out, there IS in fact a decent use-case for controlling execution (deferring it, etc) separately from loading, and in that case, “onload chaining” won’t be sufficient. We’ll need the readyState behavior as part of the equation to fully enable that use-case without hacks. So I still think there’s value in getting the spec to add that wording, and in getting browsers to implement the full enchilada.

  • alexander farkas says:

    Some good points. I saw a lot of people putting their scripts at the bottom and then setting the visibility to hidden until all JS was loaded. I really hate developers not understanding those performance rules. Using a small “seeding” file in the head (after styles), which loads 2-4 files immediately, has the best “cost-value ratio”.

    What I don’t really like about LABjs are the workarounds for script execution in a specific order, including XHR. There is a simple solution, called [ready-]events/callbacks, for this, which also works without a scriptloader, but with async = true.

    In a current project of mine, my whole code in each file is wrapped in an anonymous callback-function which looks like this:

    jQuery.webshims.ready('ready es5 forms', function($, webshims, window, document, undefined){
       // run the function if 
          // DOM ('ready'), 
          // ES5 ('es5')
          // and HTML5 form features ('forms')
       // are loaded/ready (to use)
    });
    

    Due to the fact that most developers already wrap their code in a function, it’s quite a simple solution + everything can load simultaneously + a developer can decide to remove the scriptloader and use script-tags with the async attribute.

  • getify says:

    @alexander-

    What I don’t really like about LABjs are the workarounds for script execution in a specific order, including XHR.

    I don’t like the hacks either. Unfortunately, for the generalized script loading use-case that LABjs is focused on, there is no direct way (without the hacks) to accomplish it in the browsers.

    When you reduce the functional use-case for a script loader, there’s lots of other ways you can devise to approach things. But for what I’m doing with LABjs, they are still necessary. I hope soon we have the benefit of `async=false` and `readyState preloading` in all browsers, as feature-testable, so that those hacks can be deprecated and eventually removed.

    There is a simple solution, called [ready-]events/callbacks, for this, which also works without a scriptloader, but with async = true.

    Due to the fact that most developers already wrap their code in a function

    Unfortunately, this is exactly the narrowing assumption you’ve made which works (only) for exactly what you want/need (and thus makes LABjs look like the ugly step-sister). But there’s several script loader use cases this solution does not work for.

    The primary (although not only) drawback to what you suggest is that all scripts you load with such a system must “register” themselves with the global loader mechanism. What if you need to also load (at the same time as these other specially wrapped scripts) a third-party script or library (like jQuery) which isn’t wrapped like that?

    Are you going to self-host and wrap all such scripts yourself? This is an ugly road to start down.

    And what if you need to use a script loader to load things on-demand well after the page has loaded? Those “ready” events are not reliable in that case.

    I appreciate that in narrower cases, like yours, a solution like what you suggest works. But please understand that LABjs is focused on a broader set of use-cases and doesn’t have the same luxuries. And to me, I’d rather use the same generalized/multi-purpose tool for all my script loading, than a different custom strategy (that only works) for each different scenario.

  • alexander farkas says:

    @Kyle

    You are right about the fact that this approach means a script has to be wrapped and somehow register itself. And I know that you wanted to have a scriptloader which has a) simultaneous loading and b) ordered script execution, without the necessity of touching the loaded scripts. Well, this seems impossible without hacks, but you know that.

    But registering is already done if you tell the loader: please load this file. The only extra is the wrapper. And there are ways to make it possible that this wrapper works with and without the base loader.

    And what if you need to use a script loader to load things on-demand well after the page has loaded? Those “ready” events are not reliable in that case.

    I don’t exactly know what you mean, but I think you mean something like a race condition? Script a should be executed after script b, but script b already fired its ready event before script a was loaded? If you mean this: no, I take care of this. (https://gist.github.com/707837#LID125)

    I don’t want to use my own scriptloader in my project. The main problem my project wants to solve is how a developer can extend the interface of DOM elements to get all the nice HTML5 stuff in all browsers (absolutely accurate implementations, including DOM accessors etc. (https://github.com/aFarkas/x-browser-accessors/blob/master/index.html)). I have only written my own scriptloader because I couldn’t decide which scriptloader is more “industry standard”. (LABjs seemed to be the winner, but had this upcoming browser issue (FF4 and Webkit).)

    I thought I could use yepnope. Due to the fact that it is a capability-based loading script, it would match my needs. Then I saw that they start queueing script-loading (instead of simultaneous loading) if you want to use a “ready-callback” (currently they use LABjs, but there is an upcoming version 0.5). This is technically not necessary and wouldn’t fit my needs. So I’m still waiting. My hope is that there will be an official jQuery plugin (my project already has a jQuery dependency, so this would make much sense).

  • Patrick says:

    Just in case someone wants to see the effect if the scripts get loaded twice because of the miserable preloading by HeadJS: http://patrickburke.de/script_loader/
    If you disable the cache in Firefox (I’ve done this with the Web Developer toolbar), the page takes about twice as long to load when using HeadJS instead of the blocking script tags. In the network tab you can also see that the scripts are loaded twice.
    This is sad, because with the additional features HeadJS offers (HTML5 enabler, feature detection, etc.) I’d prefer that one. But since false caching headers or some setting in the browser may lead to such an effect…
    But it seems like Tero is about to implement the loading methods described by you: “Eventually all this wisdom will be merged to Head JS” (http://headjs.com/#download)

  • I’m the author of Head JS.

    I’ll definitely implement readyState and async=false tricks as well as DOM ready fix as suggested.

    However I don’t quite understand how this async=false trick works and how it’s supposed to be used. Can anyone help me? Any good links with working code?

  • Gobezu says:

    Kyle,

    interesting read. Although I am kinda 2 months behind its initial post, this has been worthy input as I continue looking at the various libraries I should support while improving a performance optimization plugin I have developed for the Joomla! CMS, jbetolo.

    Shall we still expect the second part of this post?

    All the best!

  • Alex Weber says:

    Wow… I must have read this article about 3 times now, but I keep coming back to it. I committed to HeadJS a while ago and maintain a Drupal module for it, and am very interested in what you have to say in the second part of this article. So congrats for the great writeup, and looking forward to the followup!

    — Alex
