I have a web-based documentation searching/viewing system that I\'m developing for a client. Part of this system is a search system that allows the client to search for a t
Fetch all the data as a string, and use split()
. This is the fastest way to build an array in Javascript.
There's an excellent article a very similar problem, from the people who built the flickr search: http://code.flickr.com/blog/2009/03/18/building-fast-client-side-searches/
Instead of using $.getScript
to load JavaScript files containing function calls, consider using $.getJSON. This may boost performance. The files would now look like this:
{
"key" : 0,
"values" : [0,1,2,3,4,5,6,7,8]
}
After receiving the JSON response, you could then call AddToBookData
on it, like this:
function AddToBookData(json) {
BookData[BookIndex].push([json.key,json.values]);
}
If your files have multiple sets of calls to AddToBookData, you could structure them like this:
[
{
"key" : 0,
"values" : [0,1,2,3,4,5,6,7,8]
},
{
"key" : 1,
"values" : [0,1,2,3,4,5,6,7,8]
},
{
"key" : 2,
"values" : [0,1,2,3,4,5,6,7,8]
}
]
And then change the AddToBookData
function to compensate for the new structure:
function AddToBookData(json) {
$.each(json, function(index, data) {
BookData[BookIndex].push([data.key,data.values]);
});
}
Addendum
I suspect that regardless what method you use to transport the data from the files to the BookData
array, the true bottleneck is in the sheer number of requests. Must the files be fragmented into 40-100? If you change to JSON format, you could load a single file that looks like this:
{
"file1" : [
{
"key" : 0,
"values" : [0,1,2,3,4,5,6,7,8]
},
// all the rest...
],
"file2" : [
{
"key" : 1,
"values" : [0,1,2,3,4,5,6,7,8]
},
// yadda yadda
]
}
Then you could do one request, load all the data you need, and move on... Although the browser may initially lock up (although, maybe not), it would probably be MUCH faster this way.
Here is a nice JSON tutorial, if you're not familiar: http://www.webmonkey.com/2010/02/get_started_with_json/
I tested three methods of loading the same 9,000,000 point dataset into Firefox 3.64.
1: Stephen's GetJSON Method
2) My function based push method
3) My pre-processed array appending method:
I ran my tests two ways: The first iteration of testing I imported 100 files containing 10,000 rows of data, each row containing 9 data elements [0,1,2,3,4,5,6,7,8]
The second interation I tried combining files, so that I was importing 1 file with 9 million data points.
This was a lot larger than the dataset I'll be using, but it helps demonstrate the speed of the various import methods.
Separate files: Combined file:
JSON: 34 seconds 34
FUNC-BASED: 17.5 24
ARRAY-BASED: 23 46
Interesting results, to say the least. I closed out the browser after loading each webpage, and ran the tests 4 times each to minimize the effect of network traffic/variation. (ran across a network, using a file server). The number you see is the average, although the individual runs differed by only a second or two at most.
Looks like there are two basic areas for optimising the data loading, that can be considered and tackled separately:
But I'm not sure how far you'll really be able to go with optimising the data loading alone. To solve the actual problem with your application (browser locking up for too long) have you considered options such as?
Web Workers might not be supported by all your target browsers, but should prevent the main browser thread from locking up while it processes the data.
For browsers without workers, you could consider increasing the setTimeout
interval slightly to give the browser time to service the user as well as your JS. This will make things actually slightly slower but may increase user happiness when combined with the next point.
For both worker-capable and worker-deficient browsers, take some time to update the DOM with a progress bar. You know how many files you have left to load so progress should be fairly consistent and although things may actually be slightly slower, users will feel better if they get the feedback and don't think the browser has locked up on them.
As suggested by jira in his comment. If Google Instant can search the entire web as we type, is it really not possible to have the server return a file with all locations of the search keyword within the current book? This file should be much smaller and faster to load than the locations of all words within the book, which is what I assume you are currently trying to get loaded as quickly as you can?