We\'re trying to save the content (HTML) of WKWebView in a persistent storage (NSUserDefaults, CoreData or disk file). The user can see the same content when he re-enters th
I would recommend investigating the feasibility of using App Cache, which is now supported in WKWebView
as of iOS 10: https://stackoverflow.com/a/44333359/233602
Easiest way to use cache webpage is as following in Swift 4.0: -
/* Where isCacheLoad = true (Offline load data) & isCacheLoad = false (Normal load data) */
internal func loadWebPage(fromCache isCacheLoad: Bool = false) {
guard let url = url else { return }
let request = URLRequest(url: url, cachePolicy: (isCacheLoad ? .returnCacheDataElseLoad: .reloadRevalidatingCacheData), timeoutInterval: 50)
//URLRequest(url: url)
DispatchQueue.main.async { [weak self] in
self?.webView.load(request)
}
}
I'm not sure if you just want to cache the pages that have already been visited or if you have specific requests that you'd like to cache. I'm currently working on the latter. So I'll speak to that. My urls are dynamically generated from an api request. From this response I set requestPaths
with the non-image urls and then make a request for each of the urls and cache the response. For the image urls, I used the Kingfisher library to cache the images. I've already set up my shared cache urlCache = URLCache.shared
in my AppDelegate. And allotted the memory I need: urlCache = URLCache(memoryCapacity: <setForYourNeeds>, diskCapacity: <setForYourNeeds>, diskPath: "urlCache")
Then just call startRequest(:_)
for each of the urls in requestPaths
. (Can be done in the background if it's not needed right away)
class URLCacheManager {
static let timeout: TimeInterval = 120
static var requestPaths = [String]()
class func startRequest(for url: URL, completionWithErrorCallback: @escaping (_ error: Error?) -> Void) {
let urlRequest = URLRequest(url: url, cachePolicy: .returnCacheDataElseLoad, timeoutInterval: timeout)
WebService.sendCachingRequest(for: urlRequest) { (response) in
if let error = response.error {
DDLogError("Error: \(error.localizedDescription) from cache response url: \(String(describing: response.request?.url))")
}
else if let _ = response.data,
let _ = response.response,
let request = response.request,
response.error == nil {
guard let cacheResponse = urlCache.cachedResponse(for: request) else { return }
urlCache.storeCachedResponse(cacheResponse, for: request)
}
}
}
class func startCachingImageURLs(_ urls: [URL]) {
let imageURLs = urls.filter { $0.pathExtension.contains("png") }
let prefetcher = ImagePrefetcher.init(urls: imageURLs, options: nil, progressBlock: nil, completionHandler: { (skipped, failed, completed) in
DDLogError("Skipped resources: \(skipped.count)\nFailed: \(failed.count)\nCompleted: \(completed.count)")
})
prefetcher.start()
}
class func startCachingPageURLs(_ urls: [URL]) {
let pageURLs = urls.filter { !$0.pathExtension.contains("png") }
for url in pageURLs {
DispatchQueue.main.async {
startRequest(for: url, completionWithErrorCallback: { (error) in
if let error = error {
DDLogError("There was an error while caching request: \(url) - \(error.localizedDescription)")
}
})
}
}
}
}
I'm using Alamofire for the network request with a cachingSessionManager configured with the appropriate headers. So in my WebService class I have:
typealias URLResponseHandler = ((DataResponse<Data>) -> Void)
static let cachingSessionManager: SessionManager = {
let configuration = URLSessionConfiguration.default
configuration.httpAdditionalHeaders = cachingHeader
configuration.urlCache = urlCache
let cachingSessionManager = SessionManager(configuration: configuration)
return cachingSessionManager
}()
private static let cachingHeader: HTTPHeaders = {
var headers = SessionManager.defaultHTTPHeaders
headers["Accept"] = "text/html"
headers["Authorization"] = <token>
return headers
}()
@discardableResult
static func sendCachingRequest(for request: URLRequest, completion: @escaping URLResponseHandler) -> DataRequest {
let completionHandler: (DataResponse<Data>) -> Void = { response in
completion(response)
}
let dataRequest = cachingSessionManager.request(request).responseData(completionHandler: completionHandler)
return dataRequest
}
Then in the webview delegate method I load the cachedResponse. I use a variable handlingCacheRequest
to avoid an infinite loop.
func webView(_ webView: WKWebView, decidePolicyFor navigationAction: WKNavigationAction, decisionHandler: @escaping (WKNavigationActionPolicy) -> Void) {
if let reach = reach {
if !reach.isReachable(), !handlingCacheRequest {
var request = navigationAction.request
guard let url = request.url else {
decisionHandler(.cancel)
return
}
request.cachePolicy = .returnCacheDataDontLoad
guard let cachedResponse = urlCache.cachedResponse(for: request),
let htmlString = String(data: cachedResponse.data, encoding: .utf8),
cacheComplete else {
showNetworkUnavailableAlert()
decisionHandler(.allow)
handlingCacheRequest = false
return
}
modify(htmlString, completedModification: { modifiedHTML in
self.handlingCacheRequest = true
webView.loadHTMLString(modifiedHTML, baseURL: url)
})
decisionHandler(.cancel)
return
}
handlingCacheRequest = false
DDLogInfo("Currently requesting url: \(String(describing: navigationAction.request.url))")
decisionHandler(.allow)
}
Of course you'll want to handle it if there is a loading error as well.
func webView(_ webView: WKWebView, didFail navigation: WKNavigation!, withError error: Error) {
DDLogError("Request failed with error \(error.localizedDescription)")
if let reach = reach, !reach.isReachable() {
showNetworkUnavailableAlert()
handlingCacheRequest = true
}
webView.stopLoading()
loadingIndicator.stopAnimating()
}
I hope this helps. The only thing I'm still trying to figure out is the image assets aren't being loaded offline. I'm thinking I'll need to make a separate request for those images and keep a reference to them locally. Just a thought but I'll update this when I have that worked out.
UPDATED with images loading offline with below code
I used the Kanna library to parse my html string from my cached response, find the url embedded in the style= background-image:
attribute of the div, used regex to get the url (which is also the key for Kingfisher cached image), fetched the cached image and then modified the css to use the image data (based on this article: https://css-tricks.com/data-uris/), and then loaded the webview with the modified html. (Phew!) It was quite the process and maybe there is an easier way.. but I had not found it. My code is updated to reflect all these changes. Good luck!
func modify(_ html: String, completedModification: @escaping (String) -> Void) {
guard let doc = HTML(html: html, encoding: .utf8) else {
DDLogInfo("Couldn't parse HTML with Kannan")
completedModification(html)
return
}
var imageDiv = doc.at_css("div[class='<your_div_class_name>']")
guard let currentStyle = imageDiv?["style"],
let currentURL = urlMatch(in: currentStyle)?.first else {
DDLogDebug("Failed to find URL in div")
completedModification(html)
return
}
DispatchQueue.main.async {
self.replaceURLWithCachedImageData(inHTML: html, withURL: currentURL, completedCallback: { modifiedHTML in
completedModification(modifiedHTML)
})
}
}
func urlMatch(in text: String) -> [String]? {
do {
let urlPattern = "\\((.*?)\\)"
let regex = try NSRegularExpression(pattern: urlPattern, options: .caseInsensitive)
let nsString = NSString(string: text)
let results = regex.matches(in: text, options: [], range: NSRange(location: 0, length: nsString.length))
return results.map { nsString.substring(with: $0.range) }
}
catch {
DDLogError("Couldn't match urls: \(error.localizedDescription)")
return nil
}
}
func replaceURLWithCachedImageData(inHTML html: String, withURL key: String, completedCallback: @escaping (String) -> Void) {
// Remove parenthesis
let start = key.index(key.startIndex, offsetBy: 1)
let end = key.index(key.endIndex, offsetBy: -1)
let url = key.substring(with: start..<end)
ImageCache.default.retrieveImage(forKey: url, options: nil) { (cachedImage, _) in
guard let cachedImage = cachedImage,
let data = UIImagePNGRepresentation(cachedImage) else {
DDLogInfo("No cached image found")
completedCallback(html)
return
}
let base64String = "data:image/png;base64,\(data.base64EncodedString(options: .endLineWithCarriageReturn))"
let modifiedHTML = html.replacingOccurrences(of: url, with: base64String)
completedCallback(modifiedHTML)
}
}
I know I'm late, but I have recently been looking for a way to store web pages for offline reading, and still could't find any reliable solution that wouldn't depend on the page itself and wouldn't use the deprecated UIWebView
. A lot of people write that one should use the existing HTTP caching, but WebKit seems to do a lot of stuff out-of-process, making it virtually impossible to enforce complete caching (see here or here). However, this question guided me into the right direction. Tinkering with the web archive approach, I found that it's actually quite easy to write your own web archive exporter.
As written in the question, web archives are just plist files, so all it takes is a crawler that extracts the required resources from the HTML page, downloads them all and stores them in a big plist file. This archive file can then later be loaded into the WKWebView
via loadFileURL(URL:allowingReadAccessTo:)
.
I created a demo app that allows archiving from and restoring to a WKWebView
using this approach: https://github.com/ernesto-elsaesser/OfflineWebView
EDIT: The archive generation code is now available as standalone Swift package: https://github.com/ernesto-elsaesser/WebArchiver
The implementation only depends on Fuzi for HTML parsing.