I'd like to know how specific this is to Chrome. From a high-level perspective it seems like a very weird hack. |
@annevk I guess we need feedback from all implementers here, i.e. include Edge team to this conversation. Does anyone know who is working on SW in Microsoft? |
Going to tackle this in two replies, one addressing the problem & another addressing the proposed solution - because I don't think they match up right now. Back when we started work on SW there were calls for higher level APIs, manifests and such, because SW may present performance issues in some cases, and your response to that was roughly "We shouldn't try and solve performance problems until we know we have them, and what shape & size they are". This is still a good plan. I want us to present solid data here that shows the size & shape of the problem, and I want other vendors to verify that SW startup is the bottleneck here, and not something specific to Chrome. There are multiple ways to solve this problem, and the way to pick the best solution is for us all to have good visibility into the issue. |
Without concrete data on the problem, it's difficult to assess solutions. Service worker startup is not free - but we need to assess when that cost significantly detracts from the benefits. It could be: I have no fetch event. I'm paying the startup cost for no reason. We solved this in #718, but Chrome hasn't landed it yet. This is the case for at least one of the large sites reporting this regression. I have a fetch event but it doesn't We have an existing solution that we can apply here: passive listeners. Although we may need to disable reading the request body in this case. I have a fetch event, it sometimes calls I think this is the case the proposed solution is aiming for, but I really want to hear more about sites that use this pattern. self.onactivate = (e) => {
// If unset, preflight requests are sent without special marking
e.setPreflightHeader("X-Site-Specific-Header", "thinger");
// ...
} This suggests the preflight will happen for each and every request (or just navigation requests for some reason?), which feels like a huge change. If I serve a 3gb video from the cache, what happens to the preflight? Is the user going to end up downloading the 3gb again, or will the stream be aborted? Either way, as a developer, I feel like I've just lost a lot of control. self.onfetch = (e) => {
if (e.preflightResponse) {
// This is a navigation fetch which has already been issued.
// If the `preflightResponse` isn't used, then everything proceeds as
// if it hadn't been sent in the first place.
}
} If the fetch event is blocked on a preflight response, you've killed service worker as a means for creating offline-first experiences. One way around this is to make self.onfetch = event => {
event.respondWith(
event.preflightResponse || fetch(event.request)
)
} If Additionally, using the response is still blocked on the SW, as the fetch event is always consulted. Is that ok? Again I want more concrete data. The preflight is always going to happen, but I have to opt into using it. It seems really weird that the preflight isn't an opt-in, there isn't an opt-out, yet using the preflight is opt-in.
I see you put that bit in because I'm in favour of a route-based solution. A route-based solution should allow a developer to declare "for requests that look like this, do this", in a way that can at the very least started while the service worker boots up. This wins over the proposed preflight solution because:
But again, I think we need clear data before we do something like this. We shouldn't rubber-stamp a scenario-solving solution, especially while we're so vague on the scenario. |
A route-based sketch: self.addEventListener('install', event => {
event.waitUntil(
// populate caches
);
event.declareRoute("/", new FallbackSources(
new CachesSource(),
new FetchSource(),
new FetchEventSource()
));
}); This API is up for debate wayyyyyy beyond the naming, and I still think we need better data before we'd proceed with this. But here's how the above sketch could work:
Attempt each source in series until an acceptable response is found. Alternatives could be
Look for a match in the cache.
Try to get a response from the network.
Defer to the fetch event. So the example above:
The whole thing can complete without the service worker starting up unless we hit step 3, and I'm pretty confident it could be polyfilled. |
Can we tie this "preflight request" to the concept of "preload"? We already have a rel=preload concept in the specs I believe. The PlzNavigate effort to preload the site before the user finishes typing could be seen as the same mechanism. This would give us a consistent mechanism to hang this optimization on without directly tying it to browser-specific mechanisms. This does some like it could be useful, but its complex enough that it would probably only be used by a minority of sites. Maybe its worth it if those sites are extremely high traffic? |
@jakearchibald So, the following would allow service worker startup to happen simultaneous with the network request?
|
But the load would be delayed by bad network connections. I guess maybe your AnySource() solution would allow you to handle this, but its not clear to me exactly how the pre-flight proposal would like to handle flaky network, either. |
I don't think this is particularly chrome specific. Starting a worker thread, parsing js, interpreting, running the jit if necessary, etc are all overhead to letting the SW handle the fetch event. Coming up with a mechanism to allow this overhead to be performed in parallel with an initial network request would benefit all browsers. I guess ultimately its a tradeoff between complexity and those overhead costs. Firefox might be slightly faster to start a service worker right now (I don't really know though), but we will likely end up with overhead similar to chrome once we fix our infrastructure to handle multiple content processes. If we can come up with something that is not too complex, then it seems like a useful addition. I do kind of agree with Jake, though, that maybe we should wait to implement these kinds of optimizations. Its still early days for people figuring out how they want to use service workers. If we implement this now we might miss out on a useful pattern people find they need or might make it more general purpose (complex) than is needed. |
With the routing proposal... self.addEventListener('install', event => {
event.waitUntil(
// populate caches
);
});
self.addEventListener('fetch', event => {
event.respondWith(
caches.match(event.request).then(r => r || fetch(event.request))
);
}); ...could be replaced with.... self.addEventListener('install', event => {
event.waitUntil(
// populate caches
);
event.declareRoute("/", new FallbackSources(
new CachesSource(),
new FetchSource()
));
}); ...and the SW wouldn't need to boot up for any fetch events. My blog currently composes a streamed response. If the cost of service worker bootup was greater than its benefit, I could do: self.addEventListener('install', event => {
event.waitUntil(
// populate caches
);
event.declareRoute({mode: "navigation"}, new AnySource(
new FetchEventSource(),
new FetchSource()
));
}); ...allowing the browser to race the network and the fetch event for navigations. |
Enabling multiple Service Workers for a single scope #921
A few things seems missing from the conversation. Many sites we've partnered with have deployed nearly-no-op SWs to do tracking and monitoring. This defeats the no-
Getting UI to users tends to involve "booting up" large amounts of JS, CSS, and associated context. This is true for FB, Inbox (in some modes), Docs (in some cases), and many others. In a no-SW world, the best strategy is to inline fresh content into the response document where the booted code can consume it and render it. With SWs in the mix or server-rendering of snapshots (Inbox & Docs, but only in certain modes), things get more complicated. A service designer of one of these systems wants to be able to:
To handle the second of those, the proposal I've provided lets the server know a few things:
It does this with minimal new infrastructure, enabling both streaming for "static" sites that want to handle HTML partials and not data, as well as allowing sites that deal in data (vs. inline HTML) to do so naturally. Would love to hear from @bmaurer on the alternatives proposed here as well. |
That's a pretty good summary for what we want to do. In the shorter term we're going to cache parts of the page markup and then make a request for the rest, in the much longer run we're going to cache most of the markup but will still need to make a request for page data. In both cases we're going to want to make a request to our server to get data as soon as we can and at the same time start getting code to the client window as soon as we can. Service worker startup isn't free: in Chrome, our in-the-field logging is telling us that service worker startup adds about 200ms to page load time on desktop. It might be slightly better or worse in other browsers, but it's still going to be a significant cost to startup a process and initialize the service worker code. Our site is pretty optimized to deliver resources quickly and in the right order, so our concern is that this extra time might mean that service workers are a regression in some cases. What we don't want to do however is have a race between network and cache without actually opening from the service worker. If we're going to build a version of the site that can be fully loaded from a service worker we will most likely still need to make a request each time the page loads to get the newest content, so simply loading the site from cache won't always be a win if we have to wait to make a request afterwards anyway. The optimization that we need allows us to get the initial request out before the service worker starts up but still allows the fetch event to get and process a request if one was sent out. Having the option to set a header or other fetch options here is important because even in the short term we aren't going to return the same kind of markup when a service worker is there vs when it isn't. Also, making this field nullable in the fetch event seems fine to me. It can be called possiblePreflightRequest or something like that and then you can wait for it just like a normal request if you want to use it. |
@slightlyoff can you answer my questions about your proposal & show why it's better than declarative routing? |
To add a bit more color to Nate's comments -- First, I think it's worth considering the position a site is in as it first adopts service worker. SW is a huge change to how people write websites. Any complex site is going to take a while to get it's feet wet. One thing that Alex mentions is that some people might use SW to do things that don't involve intercepting the root document the user navigates to (eg, caching static resources). A site might gradually use sw for only parts of the root document. Maybe first they just cache their page header. To the extent that using SW causes a perf hit (as Nate mentions there's a startup cost at least in chrome) that makes it hard for a site to dip their feet into using SW. Even if SW eventually enables great wins sites tend to develop incrementally and want each increment to be better on metrics. So in my mind an ideal solution here would be that a service worker that does nothing, has no cost Now let's think about a more advanced SW deployment, taking a site like FB as an example. FB would have two goals:
I think many apps share this set of dual goals -- for example, an email application would want to show existing emails quickly but fetch updated emails. In order to accomplish (2) one needs to communicate to the site's server "this user is opening the app". This message is generally fairly light weight -- at a minimum it needs to identify the user. But it may also need to communicate a small amount of information about the state of the cache. For example, you might need to communicate the last cached newsfeed story. As a slightly more complicated case one could imagine a large class of apps (uber, postmates, etc) that want to communicate the location as quickly as possible. On our mobile apps we go to great lengths to make this notification happen as quickly as possible. Early in the startup process we send out a UDP packet that contains an encrypted user identifier. (https://code.facebook.com/posts/1675399786008080/optimizing-facebook-for-ios-start-time/) Jake's route based proposal makes sense to me here. It seems like a generalization of Alex's preflight request. I think to make Jake's proposal work you'd need a few things:
|
I'll try (although I though I had by clarifying the requirements):
I think we should probably do declarative routing. We should probably go with something not dissimilar from your proposal, @jakearchibald, but I don't think it solves the issues @n8schloss or @bmaurer are raising nor does it really help our PlzNavigate integration. It's also quite a bit larger. As a final thought, I think these can layer together quite nicely. Having the preflight come back as a readable stream that can be directed to the document (in cases where you might use a |
I think that if we want to handle navigation differently (so sites can only do service workers for subresources or some such) it needs to be an opt-in primitive of sorts. We should not take total network control away from the site due to Chrome's PlzNavigate project. |
Trying to get a concrete picture of the problem here. Someone shout up if this is wrong. The loading strategy is:
Here's some sample code for this: https://gist.github.com/jakearchibald/4927b34bb14c3681a3c1c1ce6ede5b09/b43199815f451cadde7280aa9b8dea1f84a88387 And the problem is:
It's still unclear to me how that delay compares to the benefits of rendering without the network. Even if it's only the shell that's rendered without the network, getting the shell & JS from the cache should come with a benefit vs the network. But yes, as a site adopts SW, there may be particular navigation routes that simply defer to default browser behaviour. |
But what about a race between the fetch event and a default network response? |
This is why we need a routing approach. Making all-or-nothing behaviour changes doesn't fit into this gradual model.
Like I said in #920 (comment), the spec already caters for this, and impelementation is in progress.
Seems fair. It certainly shouldn't be slower than appcache.
Do you need something beyond
This seems like a separate thing (it's the first time posting has come up). What problem is this solving? As in, when would you want to POST along with an initial navigation? Related: there is a rough plan for background posting/downloads. |
These could easily be options to event.declareRoute({mode: 'navigation'}, new FetchSource({
applyHeaders: {
'X-Site-Specific-Header': "thinger"
}
}), {fireFetchEvent: 'yes'}); I think that's equivalent to your proposal (I don't know for sure as you haven't described how your proposal works),
Are we sure that's the case? Based on the code example above, if we had routing, the following could happen while the SW was booting up:
If that takes longer than 200ms, then we've won. Not only that, but we rendered while that was happening. Contrast this to your preflight proposal, where:
If you're racing
We're already talking about hacking around your proposal. It's too high level, and too all-encompasing. Also, you still haven't answered my questions on how it works. My questions are in #920 (comment). The outstanding questions are:
|
It's already opt-in. If you don't want it, don't add a fetch listener. |
True, we made it opt-in ex post facto through a hack. It still looks weird compared to the other APIs exposed by service workers and makes it harder to add registration options specific to fetch, such as some kind of filtering mechanism. |
Here's the current model for an "app shell" site that uses JS to populate content https://gist.github.com/jakearchibald/4927b34bb14c3681a3c1c1ce6ede5b09/b43199815f451cadde7280aa9b8dea1f84a88387 Here's what it would look like with the routing proposal https://gist.github.com/jakearchibald/4927b34bb14c3681a3c1c1ce6ede5b09/c8bdf11ada8a99bb506f369d7d040051ad4dda16 - this doesn't need the SW to boot up at all. @slightlyoff, can you show how your solution would work in this case? (in addition to clarifying the behaviour of your proposal) |
The problem is that even though we win on getting the shell displayed faster, we lose in terms of the amount of time that it takes for the server to start processing our request and to send more data. Keep in mind that today Facebook uses chunked encoding. Within 10 ms of a request touching our server we start sending back instructions on how to load the app shell. So for us the loading of the app shell is already happening in parallel with our server work. For many requests the critical path in displaying new content is the speed of computing that ranking. Even if SW is a win in being able to display the app shell more quickly that may not help us meet the business objective of displaying up to date newsfeed stories more quickly. What we're looking for here is the best of both worlds -- being able to display the app shell without a network connection without increasing the delay between when a user clicks facebook.com and when our servers are able to start giving them new content. |
Thanks! Is there any server rendering of dynamic content going on here, or is content added to the page using JS? If it's the former, a streaming solution seems a better fit. If it's the latter, could |
This would be content to be added via JS. The problem is that rel=preload is going to get loaded way too late. In chrome at least that requires the renderer process to be started. Also, this doesn't give the SW a great chance to save state. Your idea of using etags is interesting but I think it might not capture the kinds of data one would want to send. For example, one may wish to send information about what JS/CSS files are in the cache so the server can render markup for an appropriate version of the site. Ideally what I'd want here is something like "facebook.com/ has a route that will load a cached version of the app shell. In addition, it will send a request to facebook.com/newdata with the header CurState:XYZ where XYZ can be managed by the service worker as its cached state changes". The request to /newdata should have as little delay as possible. |
@bmaurer hmm, so there's a problem with both my & @slightlyoff's proposals for this use-case. The preloaded response that ends up in the navigation's fetch event, is actually intended for a future subresource fetch. Both designs put you in a situation where you'd have to pass the preloaded response into the global scope and hope it's still there when you need it. This is really hacky, and would break if a browser ever spun up multiple instances of the SW for parallelism (there's no plan to do this, but it'd be nice if we could at some point). It feels like Something like https://gist.github.com/jakearchibald/4927b34bb14c3681a3c1c1ce6ede5b09/cb8328fee735414ced91ff38617b48f73cfb0a1f? |
I think this is missing the current state token, though, right? The In the above comments it seems they wanted to use a POST, although I guess a cookie could be used with your proposal here? |
I captured @slightlyoff in the pub and got him to answer the questions on his proposal. Here they are (from memory):
Maybe/probably. No mechanism yet. Maybe it should only be for certain urls.
Just navigations.
Unsure.
It will make an extra request. |
Oh good point. Updated code so the preload includes credentials. https://gist.github.com/jakearchibald/4927b34bb14c3681a3c1c1ce6ede5b09/dc1b9eccb825287cda46a8defed1254d660e2675
Yeah, I'm not sure what the POST thing's for. |
Well, I think @bmaurer said they wanted the state token to be managed by the service worker directly. Right now there is no way to set a cookie in a service worker, although I seem to recall some API was under discussion there. |
It seemed to me that facebook wanted to start getting user content prepared before SW boots up. The credentialed request in my example allows that. If the state token is managed by the SW, then you're back into a position where you have to wait for SW to boot up, which is no different than today. |
Well @bmaurer can best clarify what he meant by:
But I read that as the last time the service worker changed something it would store the new XYZ header in some API that would then go out with the next preload. So it doesn't have to block the preload, but its still managed by the service worker. |
Sorry for the slow response, have been on holiday. Thanks @jakearchibald for capturing (accurately) our pub conversation. My largest concern about the declarative proposal here is that it's a high-level, open-ended design space which is something we've decided in the past not to attack without lots of evidence of need. At some level, I think these proposals are compatible. You can imagine the high-level thing being implemented in terms of the low-level sketch I outlined without much difficulty. While I accept the critique that this may be scenario solving to some degree, we've had multiple customers ask for something that's effectively this and we know that multi-process browsers will be architecturally constrained in ways that make the PlzNavigate decision common. The alternative that preserves our low-level bias in design is to simply block all navigations that have SWs on them on renderer startup and defeat PlzNavigate (and equivalents). I'm not sure this is tenable, so I'd like to see something small that's low-level in this area while we design the high-level route system. |
I think the main problem with your sketch is that it's not opt-in and uses confusing terminology. I think by default a service worker should remain in control of all networking activity. It can relinquish some of that control back to the browser, but that should be opt-in. The terminology should make it clearer that this affects navigation only (only top-level presumably?). Even then, I'd still like to hear from Apple, Microsoft, and Mozilla's e10s team what their thoughts are on this (and also Servo). Would love to have some insight as to how much this relates to Chrome architecture decisions vs overall browser architecture. |
Both proposals are declarative. With your proposal you're declaring that an additional request should happen along with navigations, and it should have particular headers.
It's trying to scenario-solve, but it's missing the mark. You're creating a preflight for the navigation request, but you're already hacking it to return different data. I agree with @annevk that I'd like to hear more from other vendors on this, but if we're going to continue to throw around ideas we can't forget about the problems actually being presented: The way things are today: Without SW:
With SW:
If I'm reading @bmaurer correctly (correct me if the above lists are wrong), the problems are:
I don't think we should try and solve these with one function call. There's two parts there, avoiding SW startup for handling responses, and providing some kind of declarative preload (maybe multiple, maybe not to the same URL). |
Speaking as a Mozilla SW dev focused on @bmaurer's use case (and with less wisdom than @wanderview), what if we hang a "session priming request" off the ServiceWorker registration? General operation:
|
Again I think this is scenario-solving to the extreme. I'd rather look at solutions that solve this issue for Facebook, but maybe some other sites too. We already have
The preloaded URL may be the same across every navigation on Facebook's origin, but I don't think that will be generally true. |
Here's a little menu of features. To those building sites: of the following, which is the minimum that solves the problems you have? Which features are more than you'll ever need?
Additionally, should 1 & 2 happen for all navigations, or just initial launches? |
Declarative routing is definitely more broadly useful. From an implementation perspective, if we're talking about minimizing latency so that a site can know ASAP when a session is starting for its app shell (and to help the site avoid going crazy with push notifications and background syncs to mitigate the latency and avoid Flash Of Stale Content on startup)[1], then from Firefox's perspective AIUI, we want to hang the info about the request off the registration. This is because we currently always have the registration around and can probably keep it that way at least on an a top/recent basis for small requests. Anything more than that currently involves us opening per-origin data which means doing a non-trivial amount of I/O that is either explicitly or implicitly serialized behind other I/O. We ask our quota manager to open the origin's directory, we open API specific SQLite databases for each origin, etc. If we had declarative routing, we could probably just extract this information up out of the routes as an optimization. And I do believe it could respond faster in many other more generic cases than we could spin up a service worker and load its routing polyfill; just not as fast as a specialized thing hung off the registration. My main concern about the declarative routing is that it seems like a huge chunk of API surface that it's premature to standardize right now, and unrealistic to expect Firefox at least to get to in the short/medium-term based on our focus on non-polyfill-able APIs and our multiprocess priorities. I think we could get to a more targeted mechanism hung off the registration sooner, but it's definitely possible as what amounts to a short-term hack it's not worth it and we should wait for declarative routing. Re: 1: And I do think this is a very important use-case. I previously worked on Firefox OS apps, primarily email. Despite our offline/client-side bias, the name of the game was always minimizing the time to useful content. Since anything the server does is inherently parallel, it's a major win to notify the server before blocking on local device I/O which can exhibit variability from contention. |
@asutherland it seems to me that all these things (declarative routing or a preflight) would be hung off a service worker rather than the registration. As in if V1 sets these things, but V2 doesn't, they're gone. That's closest to how the fetch event works today.
That's what I'm aiming to discover through my previous comment. I'm not against a focused API, but I don't want to end up creating a |
So to respond to @jakearchibald's comment above.
I'm not sure what you mean by ping, do you mean hit a url and be able to access it's response from the SW or something else? In our use case and I suspect many others we're going to be caching most of the page shell and responses from our servers to service workers will return structured data. For Facebook if we measure the time that it takes to show newsfeed a lot of this time is blocked by fetching the data on our backend, and if we look at how long it takes to start service workers then the delay introduced by having them on the critical path introduces an overall regression. This kind of API is the best solution for this problem imo, the service worker stays in charge and still can construct responses, but hitting the backend isn't blocked on the service worker starting. I suspect that for most sites this will be a common use case, you'll want to cache most of your site but you still want to have content as fresh as possiable as quickly as possiable. I think it's also okay for preflightRequest to be optionally there or not, this way if the SW is indeed started and it wants to add extra data to headers or make a request to a different endpoint all together it can. But if the SW is not started it only get's to add headers (and maybe change some other fetch options).
Nope, we ultimately want to reply with just the newsfeed content as structured data. We want the SW to be able to get this and then turn it into the webpage. We always want the service worker on the critical path in this case.
So right now we'd just want to do one preload before the sw starts and then be able to access its response in the fetch event.
Nope, at least right now we're never going to respond just from cache and will always have a hybrid kind of approach where the service worker constructs a response from both cache and network. I could see that others might find this useful though and I bet that down the line this would be useful for us as well, but this kind of declarative routing could also totally work with preflight request and it also doesn't solve the same issue that preflight request solves. |
@n8schloss given you use chunked-encoding currently, how important is a streaming render to you (in terms of the streaming HTML parser)? If you're downloading content client side then displaying it, you generally have to download it all before displaying any of it. Is this a problem? |
I had a chat with @n8schloss & co and have a much better handle on the problem. Their fetch event does something equivalent to this:
All my assumptions were based on the simpler shell model where the shell loads then it sets about getting data, which isn't the case here (also yay streams!). My idea around reusing Making the preload available along with the navigation request works here, since it's being piped into the navigation response. However, this doesn't really work with the simpler shell model, where you're left with a response that you don't need yet. Best you can do is throw it into the global scope and hope for the best. They also want to be able to set headers on the preload request outside of the SW lifecycle (such as "last post received timestamp", so setting it via the activate/install event is not the solution. Cookies aren't the prefered way of doing this, since there's no cookies API in SW, and having that data hit the server for other requests is undesirable. |
FWIW, there's another very important web site that uses a similar serving pattern and would benefit from this - my blog. |
Aside from Facebook problem. Although @jakearchibald declarative syntax is tempting, I think it is too soon for this. @slightlyoff preflight solution seems to me a very valid options but I really think it should be opt-in so, during registration, a service worker could include a preflight list ( navigator.serviceWorker.register('sw.js', { scope: '/', networkRace: ['/ping', '/shell-assets'] }); List items would act as prefixes so any request to self.onfetch = (e) => {
if (isHeavyAsset(e.request) && e.networkResponse.inProgress) {
e.networkResponse.abort();
e.respondWith(handle(e.request));
}
} If for some race condition, you end sending another request to the network as in: self.onfetch = event => {
event.respondWith(
event.preflightResponse || fetch(event.request)
)
} The second one will hit the http cache hopefully instead of reaching the actual network. With the cancellable API the former example would be something like: self.onfetch = event => {
event.respondWith(
(!event.networkResponse.inProgress && event.networkResponse.response) || fetch(event.request)
)
} Does it make sense? |
Anyway, optimal solution would be to somehow foresee the need of the service worker and reduce the startup penalty. Perhaps a more low level solution for this is a better option. |
F2F:
|
Isn’t each GET bavigate going to be wasteful and encourage a more complicated pushState() navigation scheme for subsequent navigations? (Which would be problematic if we ever manage to tie other things to navigation, such as animations, apart from making sites more complicated.) |
Realistically I think a complex application will still end up using pushState rather than doing fresh navigations on each page view. Keep in mind that the use case here is on URLs where getting the lastest server-side data is critical for the page to be "done", ie a news feed or other page that is expected to show real time information. Even if you take out network latency there's a substantial price to be paid in terms of initializing client side JS (installing a bunch of prototype objects, etc). Apps may have features that cross page navigation (eg on Facebook chat tabs) where a fresh page view could be a disruptive experience (imagine you're half way through typing a message). One thing that could mitigate this issue is to allow for a TTL for pre-flight requests -- if the page is in browser history and less than X seconds old not to issue a pre-flight. |
This feature will be opt-in, and includes the setting of some headers. A well built server could return an empty response for the preflight. Sure it's more wasteful than nothing, but it isn't much. I think there's potential for browser optimisation here too. If the concurrent fetch's response body isn't read by the time
I hope we can tackle this at some point. I hear developers going down this route because they want transitions, but it seems a shame that they have to reimplement the whole navigation state thing for that. At some point I hope we can look at navigation transitions again. I know it isn't a complex app, but I wouldn't go |
Maybe you could solve this by making the pre-flight request a cacheable resource. That way if the user has visited that specific page on your blog within a timeframe you define the request would never hit your server. |
I think I may already be doing that. But I'm not really worried about the cost of the preflight, since it's less than the cost of a regular navigation. If I've cached a given article, I guess I could set a cookie with the last-modified time, so the server could shortcut the response. The cookie is better than the header in this case as it can be scoped to a particular article. Ideally last-modified would do the trick, but I guess it's possible for the resource to be in the cache API but not the HTTP cache. |
Google Docs is experimenting with service workers and we noticed that in the 95th percentile there is a regression of ~400ms and most of it is related to starting the SW. This solution will completely eliminate the problem for us. We actually need a simpler solution, we would like the SW to make the preflight request and only if the SW decides to make the exact same request, it will be handed to the SW (the SW doesn't need to know whether it was a preflight request). If the SW does not make the request, the request that was made to the server will be ignored. |
@ykahlon Well, I'm thinking about |
(My kind-of a proposal for this case, a bit rethinked/reworked) The
|
Okay, I improved the example again and put it to a gist: https://gist.github.com/NekR/d2d70f4e34d8829ea3c078456351ddce Some notes: I do always use |
But just to clarify, the idea above works for you too?
So does that mean your implementation would look something like: self.addEventListener('fetch', event => {
event.respondWith(
event.preloadResponse || fetch(event.request)
);
}); …where |
@jakearchibald, The example above will work for us. |
I though this option would be per route. If this is the case and trying to anticipate other kind of scenarios, what about a navigator.serviceWorker.register('sw.js', {
routes: {
'some/route': new NetworkRace(),
'some/other/route': new NetworkRace({ headers: ..., ttl: ... })
}
}); Handling the fetch event, we can use something like: self.onfetch = evt => {
doThingsWith(evt.networkResponse);
}; |
Aside, @NekR I don't get why we need the And, for clarification: suppose a preflight request is so fast that it answers completely before waking the service worker up? What happen then? Is that response used ignoring to trigger navigator.serviceWorker.register('sw.js', {
routes: {
'some/route': new PingNetwork(),
'some/other/route': new PingNetwork({ headers: ..., ttl: ... })
}
}); |
@delapuente, I think that it is important that no matter what, the service worker will be the one that decides what to do with the request even if the browser already has the response ready. Otherwise, the behavior will be non deterministic and in some cases the service worker may want to load from local storage or perform some other work. |
@delapuente, at the last f2f I think we decided that preflight was pretty tangential to routing. In this case we aren't trying to define behaviors to bypass the service worker or to race the network and cache. We simply want the request to be sent out without incurring the full service worker startup cost. Then once the service worker has started it can get the request and presumably use the resulting stream and maybe mix stuff from that stream and cache. But it's still totally up to the service worker. The preflight api as talked about at the f2f is the simplest, most performant and broadest solution that we could come up with for this. Routing is another beast altogether. |
Yes, sure, I agree with @ykahlon on the service worker being the ultimate responsible for handling the request and with @n8schloss about not routing, this API proposal was shaped to allow adding routes at some point in the future. I'm fine with a declarative API too as proposed by @NekR. The thing I don't understand is the need of the |
This is why I got the idea about One real world example might be this, which I had in one of my previous projects: We had very fast newsfeed generation, but very slow generation/fetching of likes for the newsfeed. So we ended up with 2 requests: one for newsfeed on one for likes, to show content to the user as fast as possible. In the case with |
Right, let's sort out the API for this. How about: self.addEventListener('install', event => {
event.addNavigationPreload();
});
self.addEventListener('fetch', event => {
event.respondWith(event.preloadResponse || fetch(event.request));
});
How's that? @n8schloss you mentioned wanting to set a header to a particular value, but the more I think about it the more it feels like cookies with a different name. If we had a cookie API in service worker, would that be enough here? |
Hmm, I'm not totally sure that this would work for us. Being able to modify this after the SW is installed is important and I think a header makes much more sense than cookies. Header VS CookieWe're going to be serving totally different responses when there is a SW present vs when there isn't. Having a header means that we can ensure that this extra info only gets added when there's an SW, using a cookie means that it'll always be sent. This can lead to some weird things if a user uninstalls an SW without deleting cookies, if browser vendors ever want to build something that skips SWs when they are being unresponsive, if SW cleanup runs into issues, etc. Modifying After InstillationIf we go the header route then we'll likely want to change the header without updating the SW. A use case for this is if we do a code push to update Facebook, but nothing in the Service Worker code changed. In that case we may not want to push an update to the Service Worker, but we'd still want to modify the header to match what the newest version of Facebook server code is expecting. Or we may want to update the header with information about how up to date our cache of the page chrome is, etc. |
The header @jakearchibald how about change the name |
@n8schloss as @delapuente points out, you'd have the fixed |
Did not know. Thank you. |
Yes, with the "navigation-preload" header we can ensure that tell that something came from the Service Worker, but we still end up with a situation where we we're going to be sending down unneeded data on every request. There's data that only makes sense on preload from a service worker context so to me it makes sense that we'd just set and send them only from within the service worker. Also, while this won't necessarily be a problem for us, cookies can be modified from both the window and via responses from the server. For developers who don't control their entire stack this could very well cause issues. Cookies seem like a hack here, I think that being able to set data in a header makes more sense and brings preload's options more in line with what you can do with fetch today. |
I agree that's not ideal, but how much additional data are you planning on sending? And what will that data be?
If we want this header to be updated outside of the service worker lifecycle it puts developers in a weird position, as the code for activating this feature can get out of sync with the fetch handling code. If they activate this feature and their fetch event doesn't cater for it, they can end up with two HTTP requests per navigation rather than one.
Cookies already exist for storing state and sending it to the server, creating another thing which does that feels like a hack. |
Jake's cookie suggestion definitely makes a lot of sense to me. At least in Gecko, cookies are synchronously available just like SW registrations. They're also a known quantity, although things like limits do vary between browsers. And creating a less wacky API to manipulate them would be a great thing for everyone. It also avoids creating another separately managed per-TLD+1 hunk of (potentially) persistent memory. Sites get the same (memory) budget they had before with the same latency characteristics. The one downside is that without some specialization, this would end up being a "it will work most of the time" thing. In addition to the SW-specific-cookies-not-deleted-on-unregister case @n8schloss pointed out, there's also the issue that cookies are subject to eviction due to browser-wide cookie limits and per-TLD+1 base domain limits. The required specialization seems like it would be a major hack/monkeypatch not particularly justified by this use-case. For example, we could treat a cookie with the path of the SW itself as magic so that it gets deleted on The argument for leaving cookies as they are is that:
|
I still think that being able to add a custom header is preferable to cookies. To sum up my argument for a custom header:
Would there be issues with allowing custom data to be set on a X-Service-Worker-Preload header? |
// this can be called at any point
registration.active.setNavigationPreload({
value: 'true' // header value
}).then(() => console.log(`It's set!`));
registration.active.getNavigationPreload()
.then(settings => console.log(settings.value));
self.addEventListener('fetch', event => {
event.respondWith(event.navigationPreload || fetch(event.request));
}); @jakearchibald to propose API |
I was thinking on @annevk proposal for // set or reconfigure
window.fetchPolicies.setNavigationPreload({ headerValue: 'true', scope: '/one-scope' });
window.fetchPolicies.setNavigationPreload({ headerValue: 'true', scope: '/other/scope' });
// clear
window.fetchPolicies.clearNavigationPreload({ scope: '/' });
// access to response.
self.onfetch = evt => {
evt.respondWith(evt.navigationPreload || fetch(event.request));
}; With ergonomics in mind, navigation preload could be set at the same time you register a Service Worker. navigator.serviceWorker.register('sw.js', { scope: '/one/scope/', navigationPreload: 'true' }); What do you think? |
This is definately a service worker API, as it's useless without it. We decided to tie the setting to the SW to reduce the chance of enabling the feature when you didn't have a SW that knows how to handle it. The proposal we developed as a group (in my comment above) is also accessible from clients. |
Sure. +1 |
Soooooo our API design doesn't really work. We decided that this setting should have a lifetime associated with the service worker itself, meaning that it must be set per service worker. This reduces the likelihood that it'd be set and forgotten about, leaving (some?) users with double requests on navigations, since the preload is never used. This worked well with my pre-F2F proposal, but we also decided that the value for the header must be settable at any point. This is at odds with the above idea, because I'm pretty sure developers will want to treat the header as a piece of state that lives beyond the service worker. Eg: Imagine I'm updating my service worker. I use navigation preloads, and I want to continue using them: self.addEventListener('install', event => {
event.waitUntil(
new Promise(resolve => {
if (!self.registration.active) {
resolve();
return;
}
resolve(self.registration.active.getNavigationPreload());
}).then(settings => {
settings = settings || {value: 'some-default'};
return self.registration.installing.setNavigationPreload(settings);
})
);
}); The above is horrible, and doesn't even work. If the header value is updated at any point while the new service worker is waiting, its initial value will be out of date. Developers could work around this by calling Feels like we've solved one footgun and created two more. If allowing the header to be set at any time is a requirement (and Facebook says it is), I think we should just put this on the registration object. |
Here's a proposal for a registration-based API: // enabling the feature.
self.addEventListener('activate', event => {
event.waitUntil(
self.registration.navigationPreload.enable()
);
});
The preload will have the header // disabling the feature.
self.addEventListener('activate', event => {
event.waitUntil(
self.registration.navigationPreload.disable()
);
}); Both If you ever launch a service worker that enables this feature, then later decide you don't need it, you're kinda doomed to forever disable it in your activate event. I don't see a way around that. // access to response (same as previous proposals)
self.addEventListener('fetch', event => {
event.respondWith(event.navigationPreload || fetch(event.request));
});
// setting the header
self.registration.navigationPreload.setHeaderValue("hello"); The header can be set at any point. Setting returns a promise that resolves once the registration is updated. Setting the header does not auto-enable the feature. This means page code can be updating the header without enabling the feature for a service worker that isn't ready for it. Also: self.registration.navigationPreload.getState(state => {
console.log(state); // {enabled: false, headerValue: "true"}
}); Any takers? |
Everything in the comment above makes sense to me. I suspect that most people will want to keep the header somewhat in line with their caches, and this isn't all that different from managing the sw cache (in terms of which lifecycle stage you'd want to do things in). |
jakearchibald |
50c3cdf
|
I'd like us to reconsider If, in future, we have some kind of routing system (klaxon), it'd be nice to provide a per-route way to define a preload and what it'd look like. This means preloads could happen for subresources, and maybe non-GET requests. It'd be great if we could reuse |
@jakearchibald what are the advantages to allow |
None really. By having the setting on the registration we can't really restrict when it can be called without creating a confusing API. We'd have to strongly recommend calling it during
It can also be disabled at any point, it's just that
Yep, whenever you want. |
Just for my understanding, can anyone explain what is wrong with my |
@NekR I see you mentioning it a few times, but I couldn't really find a concrete & up-to-date proposal. I see https://gist.github.com/NekR/d2d70f4e34d8829ea3c078456351ddce, but it doesn't really explain how it works. I'm not against (in future) some kind of API to find out about in-flight requests in the fetch group, but using such a thing becomes really tricky. Timing is especially difficult as you're exposing a single request to multiple processes. |
@jakearchibald "the proposal" was this comment: #920 (comment) but yes, I probably had to make it in a separate repo.
I see. So it's just too complicated for a simple use case from this repo. |
@NekR No need for a separate repo, I just saw other comments saying you'd updated bits and I didn't know if you'd updated that comment or if it was spread across multiple places. A single gist may have been better. Anyway, here's a review: Triggering this feature at registration time is confusing as it isn't clear what it's tied to. If my url & scope are the same, but It doesn't really mention what It doesn't really cover which requests are in It doesn't cover what happens if multiple processes read a single response. Do additional reads fail? This is where most of the complexity happens, you'd need some sort of mutex for accessing a response body. |
Some thoughts from #whatwg IRC today: In regards to the "preload header", it feels like we are close to a somewhat generic thing that could be separated from the service worker API. Perhaps we could consider:
Then you could do things like:
Or for a preload only header:
Note, this does not provide any substring matching. So it doesn't allow you to set headers for an entire scope or collection of URLs. If that is needed we could consider adding it. For Cache API, though, we considered it a de-opt and removed it previously. Maybe its more necessary for something like this, though. Anyway, this is just an idea. I'm not objecting to the current spec proposal. |
In https://codereview.chromium.org/2416843002/ I was made aware that this proposal introduces a nullable promise type into the IDL. I guess We are actually hoping to outlaw that from Web IDL entirely: https://www.w3.org/Bugs/Public/show_bug.cgi?id=25049 Can someone summarize why a nullable promise type is needed? I don't see any IDL in this issue, and I can't even see anyone proposing |
@domenic in short, during navigate But honestly I thought that it's going to be: if (event.navigationPreload) {
event.navigationPreload.then(() => ...);
} Not nullable promise. With nullable promise it would require checking that promise all the time. |
Maybe I was unclear. By nullable promise I mean a value that is sometimes null, instead of a promise. That appears to be what your example illustrates. We should not introduce this new concept that is unused anywhere else on the platform. Everywhere else, if an attribute represents an asynchronous value, it's always accessed asynchronously---it's not sync-if-null, async-otherwise. |
@domenic, that makes sense to me. It's important that we provide a sync way to check this though. There are times where someone might not want to call respondWith if the preload was not sent. I guess maybe we'll need something like event.hasNavigationPreload which is a bool and then event.navigationPreload which is always a Promise<?Request> (ie a promise that will sometimes resolve to a request and other times resolve to null)? |
That would work. But I'm not sure why you wouldn't just do event.navigationPreload.then(res => {
if (res) {
event.respondWith(res);
} else {
// ...
}
}); Comparing it with if (event.hasNavigationPreload) {
event.respondWith(event.navigationPreload);
} else {
// ...
} it seems like at most you'd lose a single microtask in the no-navigation-preload case, and they'd be equivalent in the yes-navigation-preload case. |
I agree that makes sense indeed especially for
const wait = event.navigationPreload.then(res => {
if (res) {
event.respondWith(res);
} else {
// ...
}
});
event.waitUntil(wait); |
@domenic, wouldn't the example you posted throw an error? As soon as you've done any async work you can no longer safely call event.respondWith because any subsequent call is in a different function. (See step 1 in section 4.5.4 on https://www.w3.org/TR/service-workers/#fetch-event-respondwith-method) |
Ah yeah, I wasn't aware of that, but @NekR's got my back there :) |
@domenic, that example has the same issue :) From section 4.5.4, "If the active function is not the callback of the event handler whose type is fetch, then: 1) Throw an "InvalidStateError" exception." |
@n8schloss I expected that it may not work, that's why I tagged it "probably". But not knowing that sections of the spec, it feels like that should work. |
@domenic, I don't think that's a good idea because we still want the browser to be able to respond to the fetch event normally if respondWith wasn't called. Imo the developer really needs to decide if they are going to respond to a fetch event before any async work happens. |
@n8schloss That sounds reasonable too, but there are some moments when developer doesn't know that ahead of time. This is why |
I'm very confused. Above you said we weren't talking about the fetch event. Now we are again? Regardless, I'll leave this to those who are more involved. My only real concerns are not having a nullable promise type, and not overburdening the API because of some desire to make it sync, when in reality the worst case would be a single microtask of delay. |
@domenic, we're talking about a field on the fetch event, but sure, that sounds reasonable. I'll let others chime in with what they think should be the resolution here :) |
@n8schloss Would you be comfortable with always calling
I think people have the impression that not calling Also, I don't recall no-respondWith being any faster than pass-through fetch in my recent benchmarking. |
@wanderview, our in the field data is still showing fetch from service workers as much slower than fetch from the window (at least for Chrome). In Chrome this will likely be the case until crbug.com/450471 is solved. In addition to the perf benefits, there are also UX benifits to not responding to fetch from the SW. Many browsers will not show the default error page when there's a network error given to respondWith. The browser network page is going to be much much more useful that anything a developer makes to indicate that something is wrong with the user's network setup (and I think it's somewhat unreasonable for most developers to have to make an error page for each error condition that could happen anyway). That being said, for us, we're likely going to always enable preload, so this is not an issue for us right now. But by not having a way to get this info synchronously we're limiting the flexibility of the api and maybe making it hard for other developers and could be hurting things that we want to do in the future. When the browser is dispatching the fetch event it should already know if it sent a preload request or not, so it doesn't seem like a huge deal to make that info available to developers. |
The error page issue is fixable by the browsers. We just need to stash the reason for the NetworkError on an internal attribute that content can't see. Then we can use the correct network error page if its passed back to a navigation respondWith(). I also feel that the perf issues should be fixable long term. I guess I'm just worried about contorting the API for short term issues. If the problems can't be fixed, we could also add a sync boolean getter later. |
That's fair, on the flip side I'd argue that the browser already needs to have this data so there's minimal overhead to exposing it to the developer. Since at least for a while developers will be able to extract value from having this info, why not just expose it? I realize that at this point we're kind of nitpicking, so if implementing a sync way to get this data isn't trivial then I'm fine if we don't do it :) |
@jakearchibald now that apparently |
@jakearchibald Are we sure we will never want this for non-navigations? Should it just be called |
@domenic I thought it was going to be a promise. What have I missed? |
@jakearchibald in #983 it's apparently going to be a |
In #983 there are two objects named
|
Yeah, I'd forgotten to add Also, I've called it I've gone with @domenic's suggesting of making it resolve with undefined if it isn't a navigation or the feature hasn't been enabled. |
NavigationPreloadManager.setHeaderValue should reject invalid HTTP header field values #1000
At the risk of sounding very stupid (I do not know inner workings of browser and potentional security problems). Wouldn't it be simpler to just define a way to unsubscribe some requests from service workers (e.g. what @jakearchibald proposed with .declareRoute and FetchSource) AND creating an API that could manipulate browser HTTP cache (not cache api). |
@jmuransky, that's not solving the problem that we're trying to solve here. The idea here is that we still want the service worker code to run and be able to process the result of the network fetche. We just want said fetch to go out to the network earlier, before we've incurred the cost of starting the service worker. This is somewhat similar to the optimization talked about at https://code.facebook.com/posts/1675399786008080/optimizing-facebook-for-ios-start-time/. |
Ok. My mistake then... |
Implementor feedback from Chrome: the ability to change navigation preload state (NPS) anywhere (enable/disable/setHeaderValue) is causing some complexities with how this thing is persisted in relation to the existing Registration record.
Arguably the registration returned is a nascent one and won't have meaning until a worker successfully installs, so setters shouldn't be called yet. |
More specific feedback: Chrome's initial implementation will reject enable()/disable() if there is no active worker. We could change the behavior later to allow it, but it'd make the implementation a bit more complex. Related is w3c/push-api#221: the push spec's subscribe() waits for an active worker, but this can cause a deadlock (neither implementation currently follows this: Chrome rejects if there's no active worker and Firefox resolves without waiting). |
From a developer perspective it's much, much simpler if you are able to call enable()/disable() during the activate event. It'd be confusing to have to call enable() first thing after your SW runs once it's already been activated, or to have to write logic to deal with both the activated and unactivated cases. |
That's covered, because in onactivate the worker is already active. This will restrict you from calling it in oninstall (in the case of a new registration; during updates it's callable because there's an active worker). |
During the activate event the worker is already the active worker (at least with the terminology as used in the spec), so I'd expect that to still work. |
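A small sketch of the pattern being discussed, using the registration.navigationPreload manager from this thread and assuming enable() only rejects when there is no active worker:

self.addEventListener('activate', event => {
  event.waitUntil((async () => {
    // During 'activate' this worker is already the registration's active worker,
    // so enabling navigation preload here should not hit a "no active worker" rejection.
    if (self.registration.navigationPreload) {
      await self.registration.navigationPreload.enable();
    }
  })());
});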
I think it's important for NPS promises to resolve once the data is successfully stored. As for the timing, I was following the push API's lead here. I'm not against changing as long as we both change.

I don't fully understand the complexity in waiting. Assuming…

registration.navigationPreload.enable();

…waits for an active worker, how is this more complex than:

await registration.ready;
registration.navigationPreload.enable();

…(assuming #770)? |
I actually didn't really have waiting in mind when talking about complexity above. The complexity was about implementing resolve-after-store without waiting for active. This would likely require (for Chrome) initially storing the NPS outside the registration record if needed, then moving it into the record once that's stored (to avoid regressing the time taken to load a registration).

Implementation-wise, I think waiting is doable. The big disadvantage is the deadlock potential mentioned above.

It's still simplest to just reject when there is no active worker though. Since Chrome's push API has done this since inception, it's probably OK to keep doing it. |
But store can only happen once there's an active worker. If the tab closes before a worker becomes active, nothing is stored, so I'm not sure why we'd need to store anything outside the registration.
Hmm, yeah, the background sync spec also does the same even though it's spec'd to "wait". @wanderview, how do you feel about this change? I'm happy with it. If we agree I'll submit PRs to push and sync, and update NPS. |
Ah yes, this is getting into implementation details that are probably not worth going into. The proposals to reject on non-active or wait for active are OK with me, with a preference for rejection. Again my complexity feedback was about trying to implement a solution that neither waits for active nor rejects on non-active, yet still has the properties that upon resolution:
For this solution, it’s not enough to just set the NPS in memory on our “IO” thread (where live service worker registrations are managed) and expect it to be stored once the registration is stored because of race conditions: the store operation may already be queued up or running on the “DB” thread. So I was thinking the DB thread should always write the newest NPS state: either in the registration if it exists or outside if not, and swap in outside NPS once the registration is stored. But I guess alternatively the DB thread could keep unstored NPS in memory and check that whenever storing a registration, so yes it could be possible. Still a bit clunky though. |
I'm going to update the PR so we reject if there's no active SW. I'll submit PRs for background sync and push to do the same. Update: I'm going to tackle #965 first, then this. |
Ok, I've updated the spec to reject if there's no active worker. Also filed w3c/push-api#230 and WICG/BackgroundSync#134. |
Apologies in advance for the length of this issue.
A few weeks ago I was discussing the topic of the upcoming "PlzNavigate" feature with @naskooskov, @n8schloss, and @bmaurer.
The TL;DR of PlzNavigate is that navigation actions in Chromium will not be handled as they currently are -- sending them through a Renderer process which then routes them to the Browser process for eventual dispatch by the network stack -- and will instead be immediately routed to the Browser-side network stack, improving time-to-navigation in the common (non-SW-controlled) case. This is beneficial in the PlzNavigate world which is much more aggressively multi-process oriented. Saving the time to create processes is a big win, particularly on Android which "features" particularly slow native process creation.
In Chromium (and one assumes similarly architected browsers), this means that PlzNavigate-style request optimisation runs afoul of Service Worker handling of these requests. This isn't particularly satisfying as the SW may indeed choose to make a request for the top-level resource from the network. Indeed, waiting to issue these requests on Service Worker startup is being reported by large sites as a regression in the 10s or even hundred+ millisecond range. This is notable on sites which do not handle fetches for top-level documents but only want to use SWs for caching.
What if we could enable PlzNavigate and remove the hit generated by SW startup?
The idea in the following proposal is to allow a style of declarative navigation request decoration for these "preflight" navigation requests, allowing the Service Worker to use (or discard) the response. If no decoration is added and the site's SW decides to handle the request directly (e.g. with e.respondWith(fetch(e.request))), nothing should break. Similarly, it's a goal to avoid sending the results of the "preflight" request to the document without the Service Worker's involvement.

To accomplish this, the proposal we sketched out on the whiteboard was to allow the onfetch event that corresponds to the navigation to have access to the original (preflight) response. To enable a savvy server side to repurpose this preflight navigation to, e.g., send up-to-date data in a different format than HTML (imagine JSON or similar), we'd also allow the Service Worker to register a header to pass along with the preflight'd navigation request. All together, the strawman is roughly the onactivate / onfetch sketch quoted earlier in this thread.

Obviously the names are bike-sheddable. The goal, however, isn't to be super declarative about deciding which "routes" are handled in which style. Instead, it's to allow the maximum flexibility for cooperating servers and clients to eliminate SW startup latency.
Thoughts?
/cc @jakearchibald @wanderview @jungkees @mkruisselbrink